public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support
@ 2013-12-24 21:42 Ben Widawsky
  2013-12-24 21:42 ` [PATCH 1/7] drm/i915/bdw: Split up PPGTT cleanup Ben Widawsky
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The following patches rework a lot of the PPGTT cleanup and
initialization code. In the short term, this should allow us to use all
4GB address space as well as get the code ready for adding full PPGTT
support to BDW.

Aside from making the code clearer, the more discrete functions make
finer grained page table allocation much easier to manage. Previously,
we were allocating one gigantic contiguous chunk for the GEN8 page
tables. This required a 4MB contiguous allocation. On modern Linux, such
a large allocation is impossible, and so we fell back to a 2GB address
space requiring only 2MB for page tables.

The GEN8 PPGTT code was in poor shape to begin with because the effort
was on getting functionality quickly to new platforms, and this was
achieved by largely copy and pasting the GEN6 code and altering where
needed. I initially tried to do the finer grained allocation as a single
patch, and was really unsatisfied with the results.

One nice outcome (I feel) is everything gets referenced as a PDPE/PDE/PTE, or
index thereof. It becomes easy to convert these things to the graphics virtual
address, or vice-versa. The win will be when we start to map address ranges
more dynamically, specifically, allocate the page tables, PDEs, and map the
PDEs.

The last patch is totally optional. I've or its equivalent quite a few
times in various series, but it never seems to stick.

After we get this merged, I'll submit BDW full PPGTT. I also have a blog post
about this stuff in the works.

Ben Widawsky (7):
  drm/i915/bdw: Split up PPGTT cleanup
  drm/i915/bdw: Reorganize PPGTT init
  drm/i915/bdw: Split ppgtt initialization up
  drm/i915: Make clear/insert vfuncs args absolute
  drm/i915/bdw: Reorganize PT allocations
  Revert "drm/i915/bdw: Limit GTT to 2GB"
  drm/i915: Update i915_gem_gtt.c copyright

 drivers/gpu/drm/i915/i915_drv.h     |  13 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 463 +++++++++++++++++++++++++-----------
 2 files changed, 332 insertions(+), 144 deletions(-)

-- 
1.8.5.2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/7] drm/i915/bdw: Split up PPGTT cleanup
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-24 21:42 ` [PATCH 2/7] drm/i915/bdw: Reorganize PPGTT init Ben Widawsky
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

This will make the code more readable, and extensible which is needed
for upcoming feature work. Eventually, we'll do the same for init.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 58 +++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 998f9a0..d0ee9a9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -318,36 +318,52 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 	kunmap_atomic(pt_vaddr);
 }
 
-static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
+static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
+	for (i = 0; i < ppgtt->num_pd_pages ; i++)
+		kfree(ppgtt->gen8_pt_dma_addr[i]);
+
+	__free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT));
+	__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
+}
+
+static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	int i, j;
 
-	list_del(&vm->global_link);
-	drm_mm_takedown(&vm->mm);
+	for (i = 0; i < ppgtt->num_pd_pages; i++) {
+		/* TODO: future may allow sparse mappings update this is not */
+		if (!ppgtt->pd_dma_addr[i])
+			continue;
 
-	for (i = 0; i < ppgtt->num_pd_pages ; i++) {
-		if (ppgtt->pd_dma_addr[i]) {
-			pci_unmap_page(ppgtt->base.dev->pdev,
-				       ppgtt->pd_dma_addr[i],
-				       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+		pci_unmap_page(ppgtt->base.dev->pdev,
+			       ppgtt->pd_dma_addr[i],
+			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
 
-			for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
-				dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j];
-				if (addr)
-					pci_unmap_page(ppgtt->base.dev->pdev,
-						       addr,
-						       PAGE_SIZE,
-						       PCI_DMA_BIDIRECTIONAL);
+		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
+			dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j];
+			if (addr)
+				pci_unmap_page(ppgtt->base.dev->pdev,
+				       addr,
+				       PAGE_SIZE,
+				       PCI_DMA_BIDIRECTIONAL);
 
-			}
 		}
-		kfree(ppgtt->gen8_pt_dma_addr[i]);
 	}
+}
 
-	__free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT));
-	__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
+static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+
+	list_del(&vm->global_link);
+	drm_mm_takedown(&vm->mm);
+
+	gen8_ppgtt_unmap_pages(ppgtt);
+	gen8_ppgtt_free(ppgtt);
 }
 
 /**
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/7] drm/i915/bdw: Reorganize PPGTT init
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
  2013-12-24 21:42 ` [PATCH 1/7] drm/i915/bdw: Split up PPGTT cleanup Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-25  6:58   ` [PATCH 2/7] [v2] " Ben Widawsky
  2013-12-24 21:42 ` [PATCH 3/7] drm/i915/bdw: Split ppgtt initialization up Ben Widawsky
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |   2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 125 +++++++++++++++++++++---------------
 2 files changed, 74 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cc8afff..dc21963 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -671,7 +671,7 @@ struct i915_hw_ppgtt {
 	};
 	union {
 		dma_addr_t *pt_dma_addr;
-		dma_addr_t *gen8_pt_dma_addr[4];
+		dma_addr_t **gen8_pt_dma_addr;
 	};
 
 	int (*enable)(struct i915_hw_ppgtt *ppgtt);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d0ee9a9..b0b621d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -325,12 +325,14 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
 	for (i = 0; i < ppgtt->num_pd_pages ; i++)
 		kfree(ppgtt->gen8_pt_dma_addr[i]);
 
+	kfree(ppgtt->gen8_pt_dma_addr);
 	__free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT));
 	__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
 }
 
 static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 {
+	struct pci_dev *hwdev = ppgtt->base.dev->pdev;
 	int i, j;
 
 	for (i = 0; i < ppgtt->num_pd_pages; i++) {
@@ -338,18 +340,14 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 		if (!ppgtt->pd_dma_addr[i])
 			continue;
 
-		pci_unmap_page(ppgtt->base.dev->pdev,
-			       ppgtt->pd_dma_addr[i],
-			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+		pci_unmap_page(hwdev, ppgtt->pd_dma_addr[i], PAGE_SIZE,
+			       PCI_DMA_BIDIRECTIONAL);
 
 		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
 			dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j];
 			if (addr)
-				pci_unmap_page(ppgtt->base.dev->pdev,
-				       addr,
-				       PAGE_SIZE,
-				       PCI_DMA_BIDIRECTIONAL);
-
+				pci_unmap_page(hwdev, addr, PAGE_SIZE,
+					       PCI_DMA_BIDIRECTIONAL);
 		}
 	}
 }
@@ -367,27 +365,26 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 }
 
 /**
- * GEN8 legacy ppgtt programming is accomplished through 4 PDP registers with a
- * net effect resembling a 2-level page table in normal x86 terms. Each PDP
- * represents 1GB of memory
- * 4 * 512 * 512 * 4096 = 4GB legacy 32b address space.
+ * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
+ * with a net effect resembling a 2-level page table in normal x86 terms. Each
+ * PDP represents 1GB of memory 4 * 512 * 512 * 4096 = 4GB legacy 32b address
+ * space.
  *
+ * FIXME: split allocation into smaller pieces. For now we only ever do this
+ * once, but with full PPGTT, the multiple contiguous allocations will be bad.
  * TODO: Do something with the size parameter
- **/
+ */
 static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 {
 	struct page *pt_pages;
-	int i, j, ret = -ENOMEM;
 	const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
 	const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
+	int i, j, ret;
 
 	if (size % (1<<30))
 		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
 
-	/* FIXME: split allocation into smaller pieces. For now we only ever do
-	 * this once, but with full PPGTT, the multiple contiguous allocations
-	 * will be bad.
-	 */
+	/* 1. Do all our allocations for page directories and page tables */
 	ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
 	if (!ppgtt->pd_pages)
 		return -ENOMEM;
@@ -402,52 +399,68 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
 	ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT);
 	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
-	ppgtt->enable = gen8_ppgtt_enable;
-	ppgtt->switch_mm = gen8_mm_switch;
-	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
-	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
-	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
-	ppgtt->base.start = 0;
-	ppgtt->base.total = ppgtt->num_pt_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE;
-
 	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
 
+	/* NB:From here on, ppgtt->base.cleanup() should function properly. */
+
+	ppgtt->gen8_pt_dma_addr = kcalloc(max_pdp,
+					  sizeof(*ppgtt->gen8_pt_dma_addr),
+					  GFP_KERNEL);
+	if (!ppgtt->gen8_pt_dma_addr) {
+		ret = -ENOMEM;
+		goto bail;
+	}
+
+	for (i = 0; i < max_pdp; i++) {
+		ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE,
+						     sizeof(dma_addr_t),
+						     GFP_KERNEL);
+		if (!ppgtt->gen8_pt_dma_addr[i]) {
+			ret = -ENOMEM;
+			goto bail;
+		}
+	}
+
 	/*
-	 * - Create a mapping for the page directories.
-	 * - For each page directory:
-	 *      allocate space for page table mappings.
-	 *      map each page table
+	 * 2. Create all the DMA mappings for the page directories and page
+	 * tables
 	 */
 	for (i = 0; i < max_pdp; i++) {
-		dma_addr_t temp;
-		temp = pci_map_page(ppgtt->base.dev->pdev,
-				    &ppgtt->pd_pages[i], 0,
-				    PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-		if (pci_dma_mapping_error(ppgtt->base.dev->pdev, temp))
-			goto err_out;
-
-		ppgtt->pd_dma_addr[i] = temp;
-
-		ppgtt->gen8_pt_dma_addr[i] = kmalloc(sizeof(dma_addr_t) * GEN8_PDES_PER_PAGE, GFP_KERNEL);
-		if (!ppgtt->gen8_pt_dma_addr[i])
-			goto err_out;
+		dma_addr_t pd_addr, pt_addr;
 
+		/* Get the page table mappings per page directory */
 		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
 			struct page *p = &pt_pages[i * GEN8_PDES_PER_PAGE + j];
-			temp = pci_map_page(ppgtt->base.dev->pdev,
-					    p, 0, PAGE_SIZE,
-					    PCI_DMA_BIDIRECTIONAL);
 
-			if (pci_dma_mapping_error(ppgtt->base.dev->pdev, temp))
-				goto err_out;
+			pt_addr = pci_map_page(ppgtt->base.dev->pdev,
+					       p, 0, PAGE_SIZE,
+					       PCI_DMA_BIDIRECTIONAL);
+			ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
+			if (ret)
+				goto bail;
 
-			ppgtt->gen8_pt_dma_addr[i][j] = temp;
+			ppgtt->gen8_pt_dma_addr[i][j] = pt_addr;
 		}
+
+		/* And the page directory mappings */
+		pd_addr = pci_map_page(ppgtt->base.dev->pdev,
+				       &ppgtt->pd_pages[i], 0,
+				       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+		ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
+		if (ret)
+			goto bail;
+
+		ppgtt->pd_dma_addr[i] = pd_addr;
 	}
 
-	/* For now, the PPGTT helper functions all require that the PDEs are
+	/*
+	 * 3. Map all the page directory entires to point to the page tables
+	 * we've allocated.
+	 *
+	 * For now, the PPGTT helper functions all require that the PDEs are
 	 * plugged in correctly. So we do that now/here. For aliasing PPGTT, we
-	 * will never need to touch the PDEs again */
+	 * will never need to touch the PDEs again.
+	 */
 	for (i = 0; i < max_pdp; i++) {
 		gen8_ppgtt_pde_t *pd_vaddr;
 		pd_vaddr = kmap_atomic(&ppgtt->pd_pages[i]);
@@ -459,6 +472,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 		kunmap_atomic(pd_vaddr);
 	}
 
+	ppgtt->enable = gen8_ppgtt_enable;
+	ppgtt->switch_mm = gen8_mm_switch;
+	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
+	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
+	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
+	ppgtt->base.start = 0;
+	ppgtt->base.total = ppgtt->num_pt_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE;
+
 	ppgtt->base.clear_range(&ppgtt->base, 0,
 				ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE,
 				true);
@@ -471,8 +492,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 			 size % (1<<30));
 	return 0;
 
-err_out:
-	ppgtt->base.cleanup(&ppgtt->base);
+bail:
+	gen8_ppgtt_cleanup(&ppgtt->base);
 	return ret;
 }
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/7] drm/i915/bdw: Split ppgtt initialization up
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
  2013-12-24 21:42 ` [PATCH 1/7] drm/i915/bdw: Split up PPGTT cleanup Ben Widawsky
  2013-12-24 21:42 ` [PATCH 2/7] drm/i915/bdw: Reorganize PPGTT init Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-25  7:04   ` [PATCH 3/7] [v2] " Ben Widawsky
  2013-12-24 21:42 ` [PATCH 4/7] drm/i915: Make clear/insert vfuncs args absolute Ben Widawsky
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
avoiding that to keep the cleanup function (which is required anyway) as
robust and widespread as possible.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 183 +++++++++++++++++++++++++-----------
 1 file changed, 126 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b0b621d..25687cf 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -364,93 +364,162 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 	gen8_ppgtt_free(ppgtt);
 }
 
-/**
- * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
- * with a net effect resembling a 2-level page table in normal x86 terms. Each
- * PDP represents 1GB of memory 4 * 512 * 512 * 4096 = 4GB legacy 32b address
- * space.
- *
- * FIXME: split allocation into smaller pieces. For now we only ever do this
- * once, but with full PPGTT, the multiple contiguous allocations will be bad.
- * TODO: Do something with the size parameter
- */
-static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
+static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt,
+					   const int max_pdp)
 {
 	struct page *pt_pages;
-	const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
 	const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
-	int i, j, ret;
-
-	if (size % (1<<30))
-		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
-
-	/* 1. Do all our allocations for page directories and page tables */
-	ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
-	if (!ppgtt->pd_pages)
-		return -ENOMEM;
 
 	pt_pages = alloc_pages(GFP_KERNEL, get_order(num_pt_pages << PAGE_SHIFT));
-	if (!pt_pages) {
-		__free_pages(ppgtt->pd_pages, get_order(max_pdp << PAGE_SHIFT));
+	if (!pt_pages)
 		return -ENOMEM;
-	}
 
 	ppgtt->gen8_pt_pages = pt_pages;
-	ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
 	ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT);
-	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
-	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
 
-	/* NB:From here on, ppgtt->base.cleanup() should function properly. */
+	return 0;
+}
+
+static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
 
-	ppgtt->gen8_pt_dma_addr = kcalloc(max_pdp,
+	ppgtt->gen8_pt_dma_addr = kcalloc(ppgtt->num_pd_entries,
 					  sizeof(*ppgtt->gen8_pt_dma_addr),
 					  GFP_KERNEL);
-	if (!ppgtt->gen8_pt_dma_addr) {
-		ret = -ENOMEM;
-		goto bail;
-	}
+	if (!ppgtt->gen8_pt_dma_addr)
+		return -ENOMEM;
 
-	for (i = 0; i < max_pdp; i++) {
+	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE,
 						     sizeof(dma_addr_t),
 						     GFP_KERNEL);
 		if (!ppgtt->gen8_pt_dma_addr[i]) {
-			ret = -ENOMEM;
-			goto bail;
+			kfree(ppgtt->gen8_pt_dma_addr);
+			while(i--)
+				kfree(ppgtt->gen8_pt_dma_addr[i]);
+
+			return -ENOMEM;
 		}
 	}
 
+	return 0;
+}
+
+static int gen8_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt,
+						const int max_pdp)
+{
+	ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
+	if (!ppgtt->pd_pages)
+		return -ENOMEM;
+
+	ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
+
+	return 0;
+}
+
+static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt,
+			    const int max_pdp)
+{
+	int ret;
+
+	ret = gen8_ppgtt_allocate_page_directories(ppgtt, max_pdp);
+	if (ret)
+		return ret;
+
+	ret = gen8_ppgtt_allocate_page_tables(ppgtt, max_pdp);
+	if (ret) {
+		__free_pages(ppgtt->pd_pages, get_order(max_pdp << PAGE_SHIFT));
+		return ret;
+	}
+
+	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
+	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
+
+	ret = gen8_ppgtt_allocate_dma(ppgtt);
+	if (ret)
+		gen8_ppgtt_free(ppgtt);
+
+	return ret;
+}
+
+static int gen8_ppgtt_setup_page_directories(struct i915_hw_ppgtt *ppgtt,
+					     const int pd)
+{
+	dma_addr_t pd_addr;
+	int ret;
+
+	pd_addr = pci_map_page(ppgtt->base.dev->pdev,
+			       &ppgtt->pd_pages[pd], 0,
+			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+
+	ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
+	if (ret)
+		return ret;
+
+	ppgtt->pd_dma_addr[pd] = pd_addr;
+
+	return 0;
+}
+
+static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt,
+					const int pd,
+					const int pt)
+{
+	dma_addr_t pt_addr;
+	struct page *p;
+	int ret;
+
+	p = &ppgtt->gen8_pt_pages[pd * GEN8_PDES_PER_PAGE + pt];
+	pt_addr = pci_map_page(ppgtt->base.dev->pdev,
+			       p, 0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
+	if (ret)
+		return ret;
+
+	ppgtt->gen8_pt_dma_addr[pd][pt] = pt_addr;
+
+	return 0;
+}
+
+/**
+ * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
+ * with a net effect resembling a 2-level page table in normal x86 terms. Each
+ * PDP represents 1GB of memory 4 * 512 * 512 * 4096 = 4GB legacy 32b address
+ * space.
+ *
+ * FIXME: split allocation into smaller pieces. For now we only ever do this
+ * once, but with full PPGTT, the multiple contiguous allocations will be bad.
+ * TODO: Do something with the size parameter
+ */
+static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
+{
+	const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
+	const int min_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
+	int i, j, ret;
+
+	if (size % (1<<30))
+		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
+
+	/* 1. Do all our allocations for page directories and page tables. */
+	ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
+	if (ret)
+		return ret;
+
 	/*
-	 * 2. Create all the DMA mappings for the page directories and page
-	 * tables
+	 * 2. Create DMA mappings for the page directories and page tables.
+	 * NB:From here on, ppgtt->base.cleanup() should function properly.
 	 */
 	for (i = 0; i < max_pdp; i++) {
-		dma_addr_t pd_addr, pt_addr;
-
-		/* Get the page table mappings per page directory */
 		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
-			struct page *p = &pt_pages[i * GEN8_PDES_PER_PAGE + j];
-
-			pt_addr = pci_map_page(ppgtt->base.dev->pdev,
-					       p, 0, PAGE_SIZE,
-					       PCI_DMA_BIDIRECTIONAL);
-			ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
+			ret = gen8_ppgtt_setup_page_tables(ppgtt, i, j);
 			if (ret)
 				goto bail;
-
-			ppgtt->gen8_pt_dma_addr[i][j] = pt_addr;
 		}
 
-		/* And the page directory mappings */
-		pd_addr = pci_map_page(ppgtt->base.dev->pdev,
-				       &ppgtt->pd_pages[i], 0,
-				       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-		ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
+		ret = gen8_ppgtt_setup_page_directories(ppgtt, i);
 		if (ret)
 			goto bail;
-
-		ppgtt->pd_dma_addr[i] = pd_addr;
 	}
 
 	/*
@@ -488,7 +557,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 			 ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);
 	DRM_DEBUG_DRIVER("Allocated %d pages for page tables (%lld wasted)\n",
 			 ppgtt->num_pt_pages,
-			 (ppgtt->num_pt_pages - num_pt_pages) +
+			 (ppgtt->num_pt_pages - min_pt_pages) +
 			 size % (1<<30));
 	return 0;
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/7] drm/i915: Make clear/insert vfuncs args absolute
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-12-24 21:42 ` [PATCH 3/7] drm/i915/bdw: Split ppgtt initialization up Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-24 21:42 ` [PATCH 5/7] drm/i915/bdw: Reorganize PT allocations Ben Widawsky
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  6 +--
 drivers/gpu/drm/i915/i915_gem_gtt.c | 80 +++++++++++++++++++++----------------
 2 files changed, 49 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dc21963..43c2ee7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -614,12 +614,12 @@ struct i915_address_space {
 				     enum i915_cache_level level,
 				     bool valid); /* Create a valid PTE */
 	void (*clear_range)(struct i915_address_space *vm,
-			    unsigned int first_entry,
-			    unsigned int num_entries,
+			    uint64_t start,
+			    size_t length,
 			    bool use_scratch);
 	void (*insert_entries)(struct i915_address_space *vm,
 			       struct sg_table *st,
-			       unsigned int first_entry,
+			       uint64_t start,
 			       enum i915_cache_level cache_level);
 	void (*cleanup)(struct i915_address_space *vm);
 };
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 25687cf..af07371 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -253,13 +253,15 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
-				   unsigned first_entry,
-				   unsigned num_entries,
+				   uint64_t start,
+				   size_t length,
 				   bool use_scratch)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t *pt_vaddr, scratch_pte;
+	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE;
 	unsigned first_pte = first_entry % GEN8_PTES_PER_PAGE;
 	unsigned last_pte, i;
@@ -289,12 +291,13 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 
 static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct sg_table *pages,
-				      unsigned first_entry,
+				      uint64_t start,
 				      enum i915_cache_level cache_level)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t *pt_vaddr;
+	unsigned first_entry = start >> PAGE_SHIFT;
 	unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE;
 	unsigned act_pte = first_entry % GEN8_PTES_PER_PAGE;
 	struct sg_page_iter sg_iter;
@@ -864,13 +867,15 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 
 /* PPGTT support for Sandybdrige/Gen6 and later */
 static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
-				   unsigned first_entry,
-				   unsigned num_entries,
+				   uint64_t start,
+				   size_t length,
 				   bool use_scratch)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
+	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 	unsigned last_pte, i;
@@ -897,12 +902,13 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct sg_table *pages,
-				      unsigned first_entry,
+				      uint64_t start,
 				      enum i915_cache_level cache_level)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_gtt_pte_t *pt_vaddr;
+	unsigned first_entry = start >> PAGE_SHIFT;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 	struct sg_page_iter sg_iter;
@@ -1033,8 +1039,7 @@ alloc:
 		ppgtt->pt_dma_addr[i] = pt_addr;
 	}
 
-	ppgtt->base.clear_range(&ppgtt->base, 0,
-				ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES, true);
+	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
 	ppgtt->debug_dump = gen6_dump_ppgtt;
 
 	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
@@ -1098,20 +1103,17 @@ ppgtt_bind_vma(struct i915_vma *vma,
 	       enum i915_cache_level cache_level,
 	       u32 flags)
 {
-	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
-
 	WARN_ON(flags);
 
-	vma->vm->insert_entries(vma->vm, vma->obj->pages, entry, cache_level);
+	vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
+				cache_level);
 }
 
 static void ppgtt_unbind_vma(struct i915_vma *vma)
 {
-	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
-
 	vma->vm->clear_range(vma->vm,
-			     entry,
-			     vma->obj->base.size >> PAGE_SHIFT,
+			     vma->node.start,
+			     vma->obj->base.size,
 			     true);
 }
 
@@ -1272,10 +1274,11 @@ static inline void gen8_set_pte(void __iomem *addr, gen8_gtt_pte_t pte)
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
-				     unsigned int first_entry,
+				     uint64_t start,
 				     enum i915_cache_level level)
 {
 	struct drm_i915_private *dev_priv = vm->dev->dev_private;
+	unsigned first_entry = start >> PAGE_SHIFT;
 	gen8_gtt_pte_t __iomem *gtt_entries =
 		(gen8_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
 	int i = 0;
@@ -1319,10 +1322,11 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
-				     unsigned int first_entry,
+				     uint64_t start,
 				     enum i915_cache_level level)
 {
 	struct drm_i915_private *dev_priv = vm->dev->dev_private;
+	unsigned first_entry = start >> PAGE_SHIFT;
 	gen6_gtt_pte_t __iomem *gtt_entries =
 		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
 	int i = 0;
@@ -1354,11 +1358,13 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 }
 
 static void gen8_ggtt_clear_range(struct i915_address_space *vm,
-				  unsigned int first_entry,
-				  unsigned int num_entries,
+				  uint64_t start,
+				  size_t length,
 				  bool use_scratch)
 {
 	struct drm_i915_private *dev_priv = vm->dev->dev_private;
+	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned num_entries = length >> PAGE_SHIFT;
 	gen8_gtt_pte_t scratch_pte, __iomem *gtt_base =
 		(gen8_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
@@ -1378,11 +1384,13 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 }
 
 static void gen6_ggtt_clear_range(struct i915_address_space *vm,
-				  unsigned int first_entry,
-				  unsigned int num_entries,
+				  uint64_t start,
+				  size_t length,
 				  bool use_scratch)
 {
 	struct drm_i915_private *dev_priv = vm->dev->dev_private;
+	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned num_entries = length >> PAGE_SHIFT;
 	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
 		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
@@ -1415,10 +1423,12 @@ static void i915_ggtt_bind_vma(struct i915_vma *vma,
 }
 
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
-				  unsigned int first_entry,
-				  unsigned int num_entries,
+				  uint64_t start,
+				  size_t length,
 				  bool unused)
 {
+	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned num_entries = length >> PAGE_SHIFT;
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
@@ -1439,7 +1449,6 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	struct drm_device *dev = vma->vm->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj = vma->obj;
-	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
 
 	/* If there is no aliasing PPGTT, or the caller needs a global mapping,
 	 * or we have a global mapping already but the cacheability flags have
@@ -1455,7 +1464,8 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
 		if (!obj->has_global_gtt_mapping ||
 		    (cache_level != obj->cache_level)) {
-			vma->vm->insert_entries(vma->vm, obj->pages, entry,
+			vma->vm->insert_entries(vma->vm, obj->pages,
+						vma->node.start,
 						cache_level);
 			obj->has_global_gtt_mapping = 1;
 		}
@@ -1466,7 +1476,9 @@ static void ggtt_bind_vma(struct i915_vma *vma,
 	     (cache_level != obj->cache_level))) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.insert_entries(&appgtt->base,
-					    vma->obj->pages, entry, cache_level);
+					    vma->obj->pages,
+					    vma->node.start,
+					    cache_level);
 		vma->obj->has_aliasing_ppgtt_mapping = 1;
 	}
 }
@@ -1476,11 +1488,11 @@ static void ggtt_unbind_vma(struct i915_vma *vma)
 	struct drm_device *dev = vma->vm->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj = vma->obj;
-	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
 
 	if (obj->has_global_gtt_mapping) {
-		vma->vm->clear_range(vma->vm, entry,
-				     vma->obj->base.size >> PAGE_SHIFT,
+		vma->vm->clear_range(vma->vm,
+				     vma->node.start,
+				     obj->base.size,
 				     true);
 		obj->has_global_gtt_mapping = 0;
 	}
@@ -1488,8 +1500,8 @@ static void ggtt_unbind_vma(struct i915_vma *vma)
 	if (obj->has_aliasing_ppgtt_mapping) {
 		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
 		appgtt->base.clear_range(&appgtt->base,
-					 entry,
-					 obj->base.size >> PAGE_SHIFT,
+					 vma->node.start,
+					 obj->base.size,
 					 true);
 		obj->has_aliasing_ppgtt_mapping = 0;
 	}
@@ -1574,14 +1586,14 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	/* Clear any non-preallocated blocks */
 	drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
-		const unsigned long count = (hole_end - hole_start) / PAGE_SIZE;
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		ggtt_vm->clear_range(ggtt_vm, hole_start / PAGE_SIZE, count, true);
+		ggtt_vm->clear_range(ggtt_vm, hole_start,
+				     hole_end - hole_start, true);
 	}
 
 	/* And finally clear the reserved guard page */
-	ggtt_vm->clear_range(ggtt_vm, end / PAGE_SIZE - 1, 1, true);
+	ggtt_vm->clear_range(ggtt_vm, end - PAGE_SIZE, PAGE_SIZE, true);
 }
 
 void i915_gem_init_global_gtt(struct drm_device *dev)
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/7] drm/i915/bdw: Reorganize PT allocations
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-12-24 21:42 ` [PATCH 4/7] drm/i915: Make clear/insert vfuncs args absolute Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-24 21:42 ` [PATCH 6/7] Revert "drm/i915/bdw: Limit GTT to 2GB" Ben Widawsky
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 1MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |   5 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 135 +++++++++++++++++++++++++++---------
 2 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 43c2ee7..6b23b13 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -653,6 +653,7 @@ struct i915_gtt {
 };
 #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
 
+#define GEN8_LEGACY_PDPS 4
 struct i915_hw_ppgtt {
 	struct i915_address_space base;
 	struct kref ref;
@@ -660,14 +661,14 @@ struct i915_hw_ppgtt {
 	unsigned num_pd_entries;
 	union {
 		struct page **pt_pages;
-		struct page *gen8_pt_pages;
+		struct page **gen8_pt_pages[GEN8_LEGACY_PDPS];
 	};
 	struct page *pd_pages;
 	int num_pd_pages;
 	int num_pt_pages;
 	union {
 		uint32_t pd_offset;
-		dma_addr_t pd_dma_addr[4];
+		dma_addr_t pd_dma_addr[GEN8_LEGACY_PDPS];
 	};
 	union {
 		dma_addr_t *pt_dma_addr;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index af07371..8946b1c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -63,7 +63,19 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t;
 
 #define GEN8_PTES_PER_PAGE		(PAGE_SIZE / sizeof(gen8_gtt_pte_t))
 #define GEN8_PDES_PER_PAGE		(PAGE_SIZE / sizeof(gen8_ppgtt_pde_t))
-#define GEN8_LEGACY_PDPS		4
+
+/* GEN8 legacy style addressis defined as a 3 level page table:
+ * 31:30 | 29:21 | 20:12 |  11:0
+ * PDPE  |  PDE  |  PTE  | offset
+ * The difference as compared to normal x86 3 level page table is the PDPEs are
+ * programmed via register.
+ */
+#define GEN8_PDPE_SHIFT			30
+#define GEN8_PDPE_MASK			0x3
+#define GEN8_PDE_SHIFT			21
+#define GEN8_PDE_MASK			0x1ff
+#define GEN8_PTE_SHIFT			12
+#define GEN8_PTE_MASK			0x1ff
 
 #define PPAT_UNCACHED_INDEX		(_PAGE_PWT | _PAGE_PCD)
 #define PPAT_CACHED_PDE_INDEX		0 /* WB LLC */
@@ -260,32 +272,36 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t *pt_vaddr, scratch_pte;
-	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned which_pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
+	unsigned which_pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
+	unsigned which_pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
 	unsigned num_entries = length >> PAGE_SHIFT;
-	unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE;
-	unsigned first_pte = first_entry % GEN8_PTES_PER_PAGE;
 	unsigned last_pte, i;
 
 	scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
 				      I915_CACHE_LLC, use_scratch);
 
 	while (num_entries) {
-		struct page *page_table = &ppgtt->gen8_pt_pages[act_pt];
+		struct page *page_table = ppgtt->gen8_pt_pages[which_pdpe][which_pde];
 
-		last_pte = first_pte + num_entries;
+		last_pte = which_pte + num_entries;
 		if (last_pte > GEN8_PTES_PER_PAGE)
 			last_pte = GEN8_PTES_PER_PAGE;
 
 		pt_vaddr = kmap_atomic(page_table);
 
-		for (i = first_pte; i < last_pte; i++)
+		for (i = which_pte; i < last_pte; i++) {
 			pt_vaddr[i] = scratch_pte;
+			num_entries--;
+			BUG_ON(num_entries < 0);
+		}
 
 		kunmap_atomic(pt_vaddr);
 
-		num_entries -= last_pte - first_pte;
-		first_pte = 0;
-		act_pt++;
+		which_pte = 0;
+		if (which_pde + 1 == GEN8_PDES_PER_PAGE)
+			which_pdpe++;
+		which_pde = (which_pde + 1) & GEN8_PDE_MASK;
 	}
 }
 
@@ -297,39 +313,57 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_gtt_pte_t *pt_vaddr;
-	unsigned first_entry = start >> PAGE_SHIFT;
-	unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE;
-	unsigned act_pte = first_entry % GEN8_PTES_PER_PAGE;
+	unsigned which_pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
+	unsigned which_pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
+	unsigned which_pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
 	struct sg_page_iter sg_iter;
 
-	pt_vaddr = kmap_atomic(&ppgtt->gen8_pt_pages[act_pt]);
+	pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[which_pdpe][which_pde]);
+
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
 		dma_addr_t page_addr;
 
 		page_addr = sg_dma_address(sg_iter.sg) +
-				(sg_iter.sg_pgoffset << PAGE_SHIFT);
-		pt_vaddr[act_pte] = gen8_pte_encode(page_addr, cache_level,
-						    true);
-		if (++act_pte == GEN8_PTES_PER_PAGE) {
-			kunmap_atomic(pt_vaddr);
-			act_pt++;
-			pt_vaddr = kmap_atomic(&ppgtt->gen8_pt_pages[act_pt]);
-			act_pte = 0;
+			(sg_iter.sg_pgoffset << PAGE_SHIFT);
 
+		pt_vaddr[which_pte] =
+			gen8_pte_encode(page_addr, cache_level, true);
+
+		which_pte = (which_pte + 1) & GEN8_PTE_MASK;
+		if (which_pte == 0) {
+			kunmap_atomic(pt_vaddr);
+			if (which_pde + 1 == GEN8_PDES_PER_PAGE)
+				which_pdpe++;
+			which_pde = (which_pde + 1) & GEN8_PDE_MASK;
+			pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[which_pdpe][which_pde]);
 		}
 	}
+
 	kunmap_atomic(pt_vaddr);
 }
 
-static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
+static void gen8_free_page_tables(struct page **pt_pages)
 {
 	int i;
 
-	for (i = 0; i < ppgtt->num_pd_pages ; i++)
+	if (pt_pages == NULL)
+		return;
+
+	for (i = 0; i < GEN8_PDES_PER_PAGE; i++)
+		if (pt_pages[i])
+			__free_pages(pt_pages[i], 0);
+}
+
+static void gen8_ppgtt_free(const struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
+	for (i = 0; i < ppgtt->num_pd_pages; i++) {
+		gen8_free_page_tables(ppgtt->gen8_pt_pages[i]);
 		kfree(ppgtt->gen8_pt_dma_addr[i]);
+	}
 
 	kfree(ppgtt->gen8_pt_dma_addr);
-	__free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT));
 	__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
 }
 
@@ -367,20 +401,59 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 	gen8_ppgtt_free(ppgtt);
 }
 
+static struct page **__gen8_alloc_page_tables(void)
+{
+	struct page **pt_pages;
+	int i;
+
+	pt_pages = kcalloc(GEN8_PDES_PER_PAGE, sizeof(struct page *), GFP_KERNEL);
+	if (!pt_pages)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < GEN8_PDES_PER_PAGE; i++) {
+		pt_pages[i] = alloc_page(GFP_KERNEL);
+		if (!pt_pages[i])
+			goto bail;
+	}
+
+	return pt_pages;
+
+bail:
+	gen8_free_page_tables(pt_pages);
+	kfree(pt_pages);
+	return ERR_PTR(-ENOMEM);
+}
+
 static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt,
 					   const int max_pdp)
 {
-	struct page *pt_pages;
+	struct page **pt_pages[GEN8_LEGACY_PDPS];
 	const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
+	int i, ret;
 
-	pt_pages = alloc_pages(GFP_KERNEL, get_order(num_pt_pages << PAGE_SHIFT));
-	if (!pt_pages)
-		return -ENOMEM;
+	for (i = 0; i < max_pdp; i++) {
+		pt_pages[i] = __gen8_alloc_page_tables();
+		if (IS_ERR(pt_pages[i])) {
+			ret = PTR_ERR(pt_pages[i]);
+			goto unwind_out;
+		}
+	}
+
+	/* NB: Avoid touching gen8_pt_pages until last to keep the allocation,
+	 * "atomic" - for cleanup purposes.
+	 */
+	for (i = 0; i < max_pdp; i++)
+		ppgtt->gen8_pt_pages[i] = pt_pages[i];
 
-	ppgtt->gen8_pt_pages = pt_pages;
 	ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT);
 
 	return 0;
+
+unwind_out:
+	while(i--)
+		gen8_free_page_tables(pt_pages[i]);
+
+	return ret;
 }
 
 static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt)
@@ -473,7 +546,7 @@ static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt,
 	struct page *p;
 	int ret;
 
-	p = &ppgtt->gen8_pt_pages[pd * GEN8_PDES_PER_PAGE + pt];
+	p = ppgtt->gen8_pt_pages[pd][pt];
 	pt_addr = pci_map_page(ppgtt->base.dev->pdev,
 			       p, 0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
 	ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/7] Revert "drm/i915/bdw: Limit GTT to 2GB"
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-12-24 21:42 ` [PATCH 5/7] drm/i915/bdw: Reorganize PT allocations Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-24 21:42 ` [PATCH 7/7] drm/i915: Update i915_gem_gtt.c copyright Ben Widawsky
  2013-12-26  2:17 ` [PATCH 8/9] drm/i915: Split GEN6 PPGTT cleanup Ben Widawsky
  7 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

This reverts commit 3a2ffb65eec6dbda2fd8151894f51c18b42c8d41.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on DW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8946b1c..37955db 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1731,11 +1731,6 @@ static inline unsigned int gen8_get_total_gtt_size(u16 bdw_gmch_ctl)
 	bdw_gmch_ctl &= BDW_GMCH_GGMS_MASK;
 	if (bdw_gmch_ctl)
 		bdw_gmch_ctl = 1 << bdw_gmch_ctl;
-	if (bdw_gmch_ctl > 4) {
-		WARN_ON(!i915_preliminary_hw_support);
-		return 4<<20;
-	}
-
 	return bdw_gmch_ctl << 20;
 }
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 7/7] drm/i915: Update i915_gem_gtt.c copyright
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-12-24 21:42 ` [PATCH 6/7] Revert "drm/i915/bdw: Limit GTT to 2GB" Ben Widawsky
@ 2013-12-24 21:42 ` Ben Widawsky
  2013-12-26  2:17 ` [PATCH 8/9] drm/i915: Split GEN6 PPGTT cleanup Ben Widawsky
  7 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-24 21:42 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 751 bytes --]

I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 37955db..c21641f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1,5 +1,6 @@
 /*
  * Copyright © 2010 Daniel Vetter
+ * Copyright © 2011-2013 Intel Corporation
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
-- 
1.8.5.2


[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/7] [v2] drm/i915/bdw: Reorganize PPGTT init
  2013-12-24 21:42 ` [PATCH 2/7] drm/i915/bdw: Reorganize PPGTT init Ben Widawsky
@ 2013-12-25  6:58   ` Ben Widawsky
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-25  6:58 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Create 3 clear stages in PPGTT init. This will help with upcoming
changes be more readable. The 3 stages are, allocation, dma mapping, and
writing the P[DT]Es

One nice benefit to the patches is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patches also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |   2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 124 +++++++++++++++++++++---------------
 2 files changed, 73 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cc8afff..dc21963 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -671,7 +671,7 @@ struct i915_hw_ppgtt {
 	};
 	union {
 		dma_addr_t *pt_dma_addr;
-		dma_addr_t *gen8_pt_dma_addr[4];
+		dma_addr_t **gen8_pt_dma_addr;
 	};
 
 	int (*enable)(struct i915_hw_ppgtt *ppgtt);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d0ee9a9..502b1dc 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -325,12 +325,14 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
 	for (i = 0; i < ppgtt->num_pd_pages ; i++)
 		kfree(ppgtt->gen8_pt_dma_addr[i]);
 
+	kfree(ppgtt->gen8_pt_dma_addr);
 	__free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT));
 	__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
 }
 
 static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 {
+	struct pci_dev *hwdev = ppgtt->base.dev->pdev;
 	int i, j;
 
 	for (i = 0; i < ppgtt->num_pd_pages; i++) {
@@ -338,18 +340,14 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 		if (!ppgtt->pd_dma_addr[i])
 			continue;
 
-		pci_unmap_page(ppgtt->base.dev->pdev,
-			       ppgtt->pd_dma_addr[i],
-			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+		pci_unmap_page(hwdev, ppgtt->pd_dma_addr[i], PAGE_SIZE,
+			       PCI_DMA_BIDIRECTIONAL);
 
 		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
 			dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j];
 			if (addr)
-				pci_unmap_page(ppgtt->base.dev->pdev,
-				       addr,
-				       PAGE_SIZE,
-				       PCI_DMA_BIDIRECTIONAL);
-
+				pci_unmap_page(hwdev, addr, PAGE_SIZE,
+					       PCI_DMA_BIDIRECTIONAL);
 		}
 	}
 }
@@ -367,27 +365,26 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 }
 
 /**
- * GEN8 legacy ppgtt programming is accomplished through 4 PDP registers with a
- * net effect resembling a 2-level page table in normal x86 terms. Each PDP
- * represents 1GB of memory
- * 4 * 512 * 512 * 4096 = 4GB legacy 32b address space.
+ * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
+ * with a net effect resembling a 2-level page table in normal x86 terms. Each
+ * PDP represents 1GB of memory 4 * 512 * 512 * 4096 = 4GB legacy 32b address
+ * space.
  *
+ * FIXME: split allocation into smaller pieces. For now we only ever do this
+ * once, but with full PPGTT, the multiple contiguous allocations will be bad.
  * TODO: Do something with the size parameter
- **/
+ */
 static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 {
 	struct page *pt_pages;
-	int i, j, ret = -ENOMEM;
 	const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
 	const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
+	int i, j, ret;
 
 	if (size % (1<<30))
 		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
 
-	/* FIXME: split allocation into smaller pieces. For now we only ever do
-	 * this once, but with full PPGTT, the multiple contiguous allocations
-	 * will be bad.
-	 */
+	/* 1. Do all our allocations for page directories and page tables */
 	ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
 	if (!ppgtt->pd_pages)
 		return -ENOMEM;
@@ -402,52 +399,66 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
 	ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT);
 	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
-	ppgtt->enable = gen8_ppgtt_enable;
-	ppgtt->switch_mm = gen8_mm_switch;
-	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
-	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
-	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
-	ppgtt->base.start = 0;
-	ppgtt->base.total = ppgtt->num_pt_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE;
-
 	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
 
+	ppgtt->gen8_pt_dma_addr = kcalloc(max_pdp,
+					  sizeof(*ppgtt->gen8_pt_dma_addr),
+					  GFP_KERNEL);
+	if (!ppgtt->gen8_pt_dma_addr) {
+		ret = -ENOMEM;
+		goto bail;
+	}
+
+	for (i = 0; i < max_pdp; i++) {
+		ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE,
+						     sizeof(dma_addr_t),
+						     GFP_KERNEL);
+		if (!ppgtt->gen8_pt_dma_addr[i]) {
+			ret = -ENOMEM;
+			goto bail;
+		}
+	}
+
 	/*
-	 * - Create a mapping for the page directories.
-	 * - For each page directory:
-	 *      allocate space for page table mappings.
-	 *      map each page table
+	 * 2. Create all the DMA mappings for the page directories and page
+	 * tables
 	 */
 	for (i = 0; i < max_pdp; i++) {
-		dma_addr_t temp;
-		temp = pci_map_page(ppgtt->base.dev->pdev,
-				    &ppgtt->pd_pages[i], 0,
-				    PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-		if (pci_dma_mapping_error(ppgtt->base.dev->pdev, temp))
-			goto err_out;
-
-		ppgtt->pd_dma_addr[i] = temp;
-
-		ppgtt->gen8_pt_dma_addr[i] = kmalloc(sizeof(dma_addr_t) * GEN8_PDES_PER_PAGE, GFP_KERNEL);
-		if (!ppgtt->gen8_pt_dma_addr[i])
-			goto err_out;
+		dma_addr_t pd_addr, pt_addr;
 
+		/* Get the page table mappings per page directory */
 		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
 			struct page *p = &pt_pages[i * GEN8_PDES_PER_PAGE + j];
-			temp = pci_map_page(ppgtt->base.dev->pdev,
-					    p, 0, PAGE_SIZE,
-					    PCI_DMA_BIDIRECTIONAL);
 
-			if (pci_dma_mapping_error(ppgtt->base.dev->pdev, temp))
-				goto err_out;
+			pt_addr = pci_map_page(ppgtt->base.dev->pdev,
+					       p, 0, PAGE_SIZE,
+					       PCI_DMA_BIDIRECTIONAL);
+			ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
+			if (ret)
+				goto bail;
 
-			ppgtt->gen8_pt_dma_addr[i][j] = temp;
+			ppgtt->gen8_pt_dma_addr[i][j] = pt_addr;
 		}
+
+		/* And the page directory mappings */
+		pd_addr = pci_map_page(ppgtt->base.dev->pdev,
+				       &ppgtt->pd_pages[i], 0,
+				       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+		ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
+		if (ret)
+			goto bail;
+
+		ppgtt->pd_dma_addr[i] = pd_addr;
 	}
 
-	/* For now, the PPGTT helper functions all require that the PDEs are
+	/*
+	 * 3. Map all the page directory entires to point to the page tables
+	 * we've allocated.
+	 *
+	 * For now, the PPGTT helper functions all require that the PDEs are
 	 * plugged in correctly. So we do that now/here. For aliasing PPGTT, we
-	 * will never need to touch the PDEs again */
+	 * will never need to touch the PDEs again.
+	 */
 	for (i = 0; i < max_pdp; i++) {
 		gen8_ppgtt_pde_t *pd_vaddr;
 		pd_vaddr = kmap_atomic(&ppgtt->pd_pages[i]);
@@ -459,6 +470,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 		kunmap_atomic(pd_vaddr);
 	}
 
+	ppgtt->enable = gen8_ppgtt_enable;
+	ppgtt->switch_mm = gen8_mm_switch;
+	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
+	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
+	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
+	ppgtt->base.start = 0;
+	ppgtt->base.total = ppgtt->num_pt_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE;
+
 	ppgtt->base.clear_range(&ppgtt->base, 0,
 				ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE,
 				true);
@@ -471,8 +490,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 			 size % (1<<30));
 	return 0;
 
-err_out:
-	ppgtt->base.cleanup(&ppgtt->base);
+bail:
+	gen8_ppgtt_unmap_pages(ppgtt);
+	gen8_ppgtt_free(ppgtt);
 	return ret;
 }
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/7] [v2] drm/i915/bdw: Split ppgtt initialization up
  2013-12-24 21:42 ` [PATCH 3/7] drm/i915/bdw: Split ppgtt initialization up Ben Widawsky
@ 2013-12-25  7:04   ` Ben Widawsky
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-25  7:04 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 182 +++++++++++++++++++++++++-----------
 1 file changed, 126 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 502b1dc..a19cc73 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -364,91 +364,161 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 	gen8_ppgtt_free(ppgtt);
 }
 
-/**
- * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
- * with a net effect resembling a 2-level page table in normal x86 terms. Each
- * PDP represents 1GB of memory 4 * 512 * 512 * 4096 = 4GB legacy 32b address
- * space.
- *
- * FIXME: split allocation into smaller pieces. For now we only ever do this
- * once, but with full PPGTT, the multiple contiguous allocations will be bad.
- * TODO: Do something with the size parameter
- */
-static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
+static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt,
+					   const int max_pdp)
 {
 	struct page *pt_pages;
-	const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
 	const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
-	int i, j, ret;
-
-	if (size % (1<<30))
-		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
-
-	/* 1. Do all our allocations for page directories and page tables */
-	ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
-	if (!ppgtt->pd_pages)
-		return -ENOMEM;
 
 	pt_pages = alloc_pages(GFP_KERNEL, get_order(num_pt_pages << PAGE_SHIFT));
-	if (!pt_pages) {
-		__free_pages(ppgtt->pd_pages, get_order(max_pdp << PAGE_SHIFT));
+	if (!pt_pages)
 		return -ENOMEM;
-	}
 
 	ppgtt->gen8_pt_pages = pt_pages;
-	ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
 	ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT);
-	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
-	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
 
-	ppgtt->gen8_pt_dma_addr = kcalloc(max_pdp,
+	return 0;
+}
+
+static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
+	ppgtt->gen8_pt_dma_addr = kcalloc(ppgtt->num_pd_entries,
 					  sizeof(*ppgtt->gen8_pt_dma_addr),
 					  GFP_KERNEL);
-	if (!ppgtt->gen8_pt_dma_addr) {
-		ret = -ENOMEM;
-		goto bail;
-	}
+	if (!ppgtt->gen8_pt_dma_addr)
+		return -ENOMEM;
 
-	for (i = 0; i < max_pdp; i++) {
+	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE,
 						     sizeof(dma_addr_t),
 						     GFP_KERNEL);
 		if (!ppgtt->gen8_pt_dma_addr[i]) {
-			ret = -ENOMEM;
-			goto bail;
+			kfree(ppgtt->gen8_pt_dma_addr);
+			while(i--)
+				kfree(ppgtt->gen8_pt_dma_addr[i]);
+
+			return -ENOMEM;
 		}
 	}
 
+	return 0;
+}
+
+static int gen8_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt,
+						const int max_pdp)
+{
+	ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
+	if (!ppgtt->pd_pages)
+		return -ENOMEM;
+
+	ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
+
+	return 0;
+}
+
+static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt,
+			    const int max_pdp)
+{
+	int ret;
+
+	ret = gen8_ppgtt_allocate_page_directories(ppgtt, max_pdp);
+	if (ret)
+		return ret;
+
+	ret = gen8_ppgtt_allocate_page_tables(ppgtt, max_pdp);
+	if (ret) {
+		__free_pages(ppgtt->pd_pages, get_order(max_pdp << PAGE_SHIFT));
+		return ret;
+	}
+
+	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
+	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
+
+	ret = gen8_ppgtt_allocate_dma(ppgtt);
+	if (ret)
+		gen8_ppgtt_free(ppgtt);
+
+	return ret;
+}
+
+static int gen8_ppgtt_setup_page_directories(struct i915_hw_ppgtt *ppgtt,
+					     const int pd)
+{
+	dma_addr_t pd_addr;
+	int ret;
+
+	pd_addr = pci_map_page(ppgtt->base.dev->pdev,
+			       &ppgtt->pd_pages[pd], 0,
+			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+
+	ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
+	if (ret)
+		return ret;
+
+	ppgtt->pd_dma_addr[pd] = pd_addr;
+
+	return 0;
+}
+
+static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt,
+					const int pd,
+					const int pt)
+{
+	dma_addr_t pt_addr;
+	struct page *p;
+	int ret;
+
+	p = &ppgtt->gen8_pt_pages[pd * GEN8_PDES_PER_PAGE + pt];
+	pt_addr = pci_map_page(ppgtt->base.dev->pdev,
+			       p, 0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
+	if (ret)
+		return ret;
+
+	ppgtt->gen8_pt_dma_addr[pd][pt] = pt_addr;
+
+	return 0;
+}
+
+/**
+ * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
+ * with a net effect resembling a 2-level page table in normal x86 terms. Each
+ * PDP represents 1GB of memory 4 * 512 * 512 * 4096 = 4GB legacy 32b address
+ * space.
+ *
+ * FIXME: split allocation into smaller pieces. For now we only ever do this
+ * once, but with full PPGTT, the multiple contiguous allocations will be bad.
+ * TODO: Do something with the size parameter
+ */
+static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
+{
+	const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
+	const int min_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
+	int i, j, ret;
+
+	if (size % (1<<30))
+		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
+
+	/* 1. Do all our allocations for page directories and page tables. */
+	ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
+	if (ret)
+		return ret;
+
 	/*
-	 * 2. Create all the DMA mappings for the page directories and page
-	 * tables
+	 * 2. Create DMA mappings for the page directories and page tables.
 	 */
 	for (i = 0; i < max_pdp; i++) {
-		dma_addr_t pd_addr, pt_addr;
-
-		/* Get the page table mappings per page directory */
 		for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
-			struct page *p = &pt_pages[i * GEN8_PDES_PER_PAGE + j];
-
-			pt_addr = pci_map_page(ppgtt->base.dev->pdev,
-					       p, 0, PAGE_SIZE,
-					       PCI_DMA_BIDIRECTIONAL);
-			ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
+			ret = gen8_ppgtt_setup_page_tables(ppgtt, i, j);
 			if (ret)
 				goto bail;
-
-			ppgtt->gen8_pt_dma_addr[i][j] = pt_addr;
 		}
 
-		/* And the page directory mappings */
-		pd_addr = pci_map_page(ppgtt->base.dev->pdev,
-				       &ppgtt->pd_pages[i], 0,
-				       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-		ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
+		ret = gen8_ppgtt_setup_page_directories(ppgtt, i);
 		if (ret)
 			goto bail;
-
-		ppgtt->pd_dma_addr[i] = pd_addr;
 	}
 
 	/*
@@ -486,7 +556,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 			 ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);
 	DRM_DEBUG_DRIVER("Allocated %d pages for page tables (%lld wasted)\n",
 			 ppgtt->num_pt_pages,
-			 (ppgtt->num_pt_pages - num_pt_pages) +
+			 (ppgtt->num_pt_pages - min_pt_pages) +
 			 size % (1<<30));
 	return 0;
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 8/9] drm/i915: Split GEN6 PPGTT cleanup
  2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
                   ` (6 preceding siblings ...)
  2013-12-24 21:42 ` [PATCH 7/7] drm/i915: Update i915_gem_gtt.c copyright Ben Widawsky
@ 2013-12-26  2:17 ` Ben Widawsky
  2013-12-26  2:17   ` [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up Ben Widawsky
  7 siblings, 1 reply; 15+ messages in thread
From: Ben Widawsky @ 2013-12-26  2:17 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5aa6680..fc81daf 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1004,22 +1004,21 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	kunmap_atomic(pt_vaddr);
 }
 
-static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
+static void gen6_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	int i;
 
-	list_del(&vm->global_link);
-	drm_mm_takedown(&ppgtt->base.mm);
-	drm_mm_remove_node(&ppgtt->node);
-
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
 			pci_unmap_page(ppgtt->base.dev->pdev,
 				       ppgtt->pt_dma_addr[i],
 				       4096, PCI_DMA_BIDIRECTIONAL);
 	}
+}
+
+static void gen6_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
 
 	kfree(ppgtt->pt_dma_addr);
 	for (i = 0; i < ppgtt->num_pd_entries; i++)
@@ -1028,6 +1027,19 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 	kfree(ppgtt);
 }
 
+static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+
+	list_del(&vm->global_link);
+	drm_mm_takedown(&ppgtt->base.mm);
+	drm_mm_remove_node(&ppgtt->node);
+
+	gen6_ppgtt_unmap_pages(ppgtt);
+	gen6_ppgtt_free(ppgtt);
+}
+
 static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
 #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up
  2013-12-26  2:17 ` [PATCH 8/9] drm/i915: Split GEN6 PPGTT cleanup Ben Widawsky
@ 2013-12-26  2:17   ` Ben Widawsky
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2013-12-26  2:17 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 141 +++++++++++++++++++++++-------------
 1 file changed, 91 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fc81daf..fd53789 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1040,14 +1040,14 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 	gen6_ppgtt_free(ppgtt);
 }
 
-static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
+static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 {
 #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
 #define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	bool retried = false;
-	int i, ret;
+	int ret;
 
 	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
 	 * allocator works in address space sizes, so it's multiplied by page
@@ -1074,42 +1074,60 @@ alloc:
 	if (ppgtt->node.start < dev_priv->gtt.mappable_end)
 		DRM_DEBUG("Forced to use aperture for PDEs\n");
 
-	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
-	if (IS_GEN6(dev)) {
-		ppgtt->enable = gen6_ppgtt_enable;
-		ppgtt->switch_mm = gen6_mm_switch;
-	} else if (IS_HASWELL(dev)) {
-		ppgtt->enable = gen7_ppgtt_enable;
-		ppgtt->switch_mm = hsw_mm_switch;
-	} else if (IS_GEN7(dev)) {
-		ppgtt->enable = gen7_ppgtt_enable;
-		ppgtt->switch_mm = gen7_mm_switch;
-	} else
-		BUG();
-	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
-	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
-	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
-	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
-	ppgtt->base.start = 0;
-	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
+	return ret;
+}
+
+static int gen6_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
 	ppgtt->pt_pages = kcalloc(ppgtt->num_pd_entries, sizeof(struct page *),
 				  GFP_KERNEL);
-	if (!ppgtt->pt_pages) {
-		drm_mm_remove_node(&ppgtt->node);
+
+	if (!ppgtt->pt_pages)
 		return -ENOMEM;
-	}
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
-		if (!ppgtt->pt_pages[i])
-			goto err_pt_alloc;
+		if (!ppgtt->pt_pages[i]) {
+			gen6_ppgtt_free(ppgtt);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static int gen6_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt)
+{
+	int ret;
+
+	ret = gen6_ppgtt_allocate_page_directories(ppgtt);
+	if (ret)
+		return ret;
+
+	ret = gen6_ppgtt_allocate_page_tables(ppgtt);
+	if (ret) {
+		drm_mm_remove_node(&ppgtt->node);
+		return ret;
 	}
 
 	ppgtt->pt_dma_addr = kcalloc(ppgtt->num_pd_entries, sizeof(dma_addr_t),
 				     GFP_KERNEL);
-	if (!ppgtt->pt_dma_addr)
-		goto err_pt_alloc;
+	if (!ppgtt->pt_dma_addr) {
+		drm_mm_remove_node(&ppgtt->node);
+		gen6_ppgtt_free(ppgtt);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int gen6_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	int i;
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		dma_addr_t pt_addr;
@@ -1118,40 +1136,63 @@ alloc:
 				       PCI_DMA_BIDIRECTIONAL);
 
 		if (pci_dma_mapping_error(dev->pdev, pt_addr)) {
-			ret = -EIO;
-			goto err_pd_pin;
-
+			gen6_ppgtt_unmap_pages(ppgtt);
+			return -EIO;
 		}
+
 		ppgtt->pt_dma_addr[i] = pt_addr;
 	}
 
-	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
+	return 0;
+}
+
+static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
+	if (IS_GEN6(dev)) {
+		ppgtt->enable = gen6_ppgtt_enable;
+		ppgtt->switch_mm = gen6_mm_switch;
+	} else if (IS_HASWELL(dev)) {
+		ppgtt->enable = gen7_ppgtt_enable;
+		ppgtt->switch_mm = hsw_mm_switch;
+	} else if (IS_GEN7(dev)) {
+		ppgtt->enable = gen7_ppgtt_enable;
+		ppgtt->switch_mm = gen7_mm_switch;
+	} else
+		BUG();
+
+	ret = gen6_ppgtt_alloc(ppgtt);
+	if (ret)
+		return ret;
+
+	ret = gen6_ppgtt_setup_page_tables(ppgtt);
+	if (ret) {
+		gen6_ppgtt_free(ppgtt);
+		return ret;
+	}
+
+	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
+	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
+	ppgtt->base.start = 0;
+	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
 	ppgtt->debug_dump = gen6_dump_ppgtt;
 
-	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
-			 ppgtt->node.size >> 20,
-			 ppgtt->node.start / PAGE_SIZE);
 	ppgtt->pd_offset =
 		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
 
-	return 0;
+	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
 
-err_pd_pin:
-	if (ppgtt->pt_dma_addr) {
-		for (i--; i >= 0; i--)
-			pci_unmap_page(dev->pdev, ppgtt->pt_dma_addr[i],
-				       4096, PCI_DMA_BIDIRECTIONAL);
-	}
-err_pt_alloc:
-	kfree(ppgtt->pt_dma_addr);
-	for (i = 0; i < ppgtt->num_pd_entries; i++) {
-		if (ppgtt->pt_pages[i])
-			__free_page(ppgtt->pt_pages[i]);
-	}
-	kfree(ppgtt->pt_pages);
-	drm_mm_remove_node(&ppgtt->node);
+	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
+			 ppgtt->node.size >> 20,
+			 ppgtt->node.start / PAGE_SIZE);
 
-	return ret;
+	return 0;
 }
 
 int i915_gem_init_ppgtt(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up
  2014-02-12 22:28 ` [PATCH 0/9] [REPOST] " Ben Widawsky
@ 2014-02-12 22:28   ` Ben Widawsky
  2014-02-13 10:33     ` Chris Wilson
  2014-02-20  6:05   ` Ben Widawsky
  1 sibling, 1 reply; 15+ messages in thread
From: Ben Widawsky @ 2014-02-12 22:28 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 141 +++++++++++++++++++++++-------------
 1 file changed, 91 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index d3ee916..396c862 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1044,14 +1044,14 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 	gen6_ppgtt_free(ppgtt);
 }
 
-static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
+static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 {
 #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
 #define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	bool retried = false;
-	int i, ret;
+	int ret;
 
 	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
 	 * allocator works in address space sizes, so it's multiplied by page
@@ -1078,42 +1078,60 @@ alloc:
 	if (ppgtt->node.start < dev_priv->gtt.mappable_end)
 		DRM_DEBUG("Forced to use aperture for PDEs\n");
 
-	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
-	if (IS_GEN6(dev)) {
-		ppgtt->enable = gen6_ppgtt_enable;
-		ppgtt->switch_mm = gen6_mm_switch;
-	} else if (IS_HASWELL(dev)) {
-		ppgtt->enable = gen7_ppgtt_enable;
-		ppgtt->switch_mm = hsw_mm_switch;
-	} else if (IS_GEN7(dev)) {
-		ppgtt->enable = gen7_ppgtt_enable;
-		ppgtt->switch_mm = gen7_mm_switch;
-	} else
-		BUG();
-	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
-	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
-	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
-	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
-	ppgtt->base.start = 0;
-	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
+	return ret;
+}
+
+static int gen6_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
 	ppgtt->pt_pages = kcalloc(ppgtt->num_pd_entries, sizeof(struct page *),
 				  GFP_KERNEL);
-	if (!ppgtt->pt_pages) {
-		drm_mm_remove_node(&ppgtt->node);
+
+	if (!ppgtt->pt_pages)
 		return -ENOMEM;
-	}
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
-		if (!ppgtt->pt_pages[i])
-			goto err_pt_alloc;
+		if (!ppgtt->pt_pages[i]) {
+			gen6_ppgtt_free(ppgtt);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static int gen6_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt)
+{
+	int ret;
+
+	ret = gen6_ppgtt_allocate_page_directories(ppgtt);
+	if (ret)
+		return ret;
+
+	ret = gen6_ppgtt_allocate_page_tables(ppgtt);
+	if (ret) {
+		drm_mm_remove_node(&ppgtt->node);
+		return ret;
 	}
 
 	ppgtt->pt_dma_addr = kcalloc(ppgtt->num_pd_entries, sizeof(dma_addr_t),
 				     GFP_KERNEL);
-	if (!ppgtt->pt_dma_addr)
-		goto err_pt_alloc;
+	if (!ppgtt->pt_dma_addr) {
+		drm_mm_remove_node(&ppgtt->node);
+		gen6_ppgtt_free(ppgtt);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int gen6_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	int i;
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		dma_addr_t pt_addr;
@@ -1122,40 +1140,63 @@ alloc:
 				       PCI_DMA_BIDIRECTIONAL);
 
 		if (pci_dma_mapping_error(dev->pdev, pt_addr)) {
-			ret = -EIO;
-			goto err_pd_pin;
-
+			gen6_ppgtt_unmap_pages(ppgtt);
+			return -EIO;
 		}
+
 		ppgtt->pt_dma_addr[i] = pt_addr;
 	}
 
-	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
+	return 0;
+}
+
+static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
+	if (IS_GEN6(dev)) {
+		ppgtt->enable = gen6_ppgtt_enable;
+		ppgtt->switch_mm = gen6_mm_switch;
+	} else if (IS_HASWELL(dev)) {
+		ppgtt->enable = gen7_ppgtt_enable;
+		ppgtt->switch_mm = hsw_mm_switch;
+	} else if (IS_GEN7(dev)) {
+		ppgtt->enable = gen7_ppgtt_enable;
+		ppgtt->switch_mm = gen7_mm_switch;
+	} else
+		BUG();
+
+	ret = gen6_ppgtt_alloc(ppgtt);
+	if (ret)
+		return ret;
+
+	ret = gen6_ppgtt_setup_page_tables(ppgtt);
+	if (ret) {
+		gen6_ppgtt_free(ppgtt);
+		return ret;
+	}
+
+	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
+	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
+	ppgtt->base.start = 0;
+	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
 	ppgtt->debug_dump = gen6_dump_ppgtt;
 
-	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
-			 ppgtt->node.size >> 20,
-			 ppgtt->node.start / PAGE_SIZE);
 	ppgtt->pd_offset =
 		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
 
-	return 0;
+	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
 
-err_pd_pin:
-	if (ppgtt->pt_dma_addr) {
-		for (i--; i >= 0; i--)
-			pci_unmap_page(dev->pdev, ppgtt->pt_dma_addr[i],
-				       4096, PCI_DMA_BIDIRECTIONAL);
-	}
-err_pt_alloc:
-	kfree(ppgtt->pt_dma_addr);
-	for (i = 0; i < ppgtt->num_pd_entries; i++) {
-		if (ppgtt->pt_pages[i])
-			__free_page(ppgtt->pt_pages[i]);
-	}
-	kfree(ppgtt->pt_pages);
-	drm_mm_remove_node(&ppgtt->node);
+	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
+			 ppgtt->node.size >> 20,
+			 ppgtt->node.start / PAGE_SIZE);
 
-	return ret;
+	return 0;
 }
 
 int i915_gem_init_ppgtt(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up
  2014-02-12 22:28   ` [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up Ben Widawsky
@ 2014-02-13 10:33     ` Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2014-02-13 10:33 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky

On Wed, Feb 12, 2014 at 02:28:52PM -0800, Ben Widawsky wrote:
> Simply to match the GEN8 style of PPGTT initialization, split up the
> allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
> allocation function, as it is much simpler pre-gen8.
> 
> With this code it would be easy to make a more general PPGTT
> initialization function with per GEN alloc/map/etc. or use a common
> helper, similar to the ringbuffer code. I don't see a benefit to doing
> this just yet, but who knows...
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Had to double-check whether free_pages() safely accepted NULL, but that
was the only logic change I spotted, so
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up
  2014-02-12 22:28 ` [PATCH 0/9] [REPOST] " Ben Widawsky
  2014-02-12 22:28   ` [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up Ben Widawsky
@ 2014-02-20  6:05   ` Ben Widawsky
  1 sibling, 0 replies; 15+ messages in thread
From: Ben Widawsky @ 2014-02-20  6:05 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky

Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 141 +++++++++++++++++++++++-------------
 1 file changed, 91 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3f2b8e8..6630598 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1036,14 +1036,14 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 	gen6_ppgtt_free(ppgtt);
 }
 
-static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
+static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 {
 #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
 #define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	bool retried = false;
-	int i, ret;
+	int ret;
 
 	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
 	 * allocator works in address space sizes, so it's multiplied by page
@@ -1070,42 +1070,60 @@ alloc:
 	if (ppgtt->node.start < dev_priv->gtt.mappable_end)
 		DRM_DEBUG("Forced to use aperture for PDEs\n");
 
-	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
-	if (IS_GEN6(dev)) {
-		ppgtt->enable = gen6_ppgtt_enable;
-		ppgtt->switch_mm = gen6_mm_switch;
-	} else if (IS_HASWELL(dev)) {
-		ppgtt->enable = gen7_ppgtt_enable;
-		ppgtt->switch_mm = hsw_mm_switch;
-	} else if (IS_GEN7(dev)) {
-		ppgtt->enable = gen7_ppgtt_enable;
-		ppgtt->switch_mm = gen7_mm_switch;
-	} else
-		BUG();
-	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
-	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
-	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
-	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
-	ppgtt->base.start = 0;
-	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
+	return ret;
+}
+
+static int gen6_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
 	ppgtt->pt_pages = kcalloc(ppgtt->num_pd_entries, sizeof(struct page *),
 				  GFP_KERNEL);
-	if (!ppgtt->pt_pages) {
-		drm_mm_remove_node(&ppgtt->node);
+
+	if (!ppgtt->pt_pages)
 		return -ENOMEM;
-	}
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
-		if (!ppgtt->pt_pages[i])
-			goto err_pt_alloc;
+		if (!ppgtt->pt_pages[i]) {
+			gen6_ppgtt_free(ppgtt);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static int gen6_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt)
+{
+	int ret;
+
+	ret = gen6_ppgtt_allocate_page_directories(ppgtt);
+	if (ret)
+		return ret;
+
+	ret = gen6_ppgtt_allocate_page_tables(ppgtt);
+	if (ret) {
+		drm_mm_remove_node(&ppgtt->node);
+		return ret;
 	}
 
 	ppgtt->pt_dma_addr = kcalloc(ppgtt->num_pd_entries, sizeof(dma_addr_t),
 				     GFP_KERNEL);
-	if (!ppgtt->pt_dma_addr)
-		goto err_pt_alloc;
+	if (!ppgtt->pt_dma_addr) {
+		drm_mm_remove_node(&ppgtt->node);
+		gen6_ppgtt_free(ppgtt);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int gen6_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	int i;
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		dma_addr_t pt_addr;
@@ -1114,40 +1132,63 @@ alloc:
 				       PCI_DMA_BIDIRECTIONAL);
 
 		if (pci_dma_mapping_error(dev->pdev, pt_addr)) {
-			ret = -EIO;
-			goto err_pd_pin;
-
+			gen6_ppgtt_unmap_pages(ppgtt);
+			return -EIO;
 		}
+
 		ppgtt->pt_dma_addr[i] = pt_addr;
 	}
 
-	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
+	return 0;
+}
+
+static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
+	if (IS_GEN6(dev)) {
+		ppgtt->enable = gen6_ppgtt_enable;
+		ppgtt->switch_mm = gen6_mm_switch;
+	} else if (IS_HASWELL(dev)) {
+		ppgtt->enable = gen7_ppgtt_enable;
+		ppgtt->switch_mm = hsw_mm_switch;
+	} else if (IS_GEN7(dev)) {
+		ppgtt->enable = gen7_ppgtt_enable;
+		ppgtt->switch_mm = gen7_mm_switch;
+	} else
+		BUG();
+
+	ret = gen6_ppgtt_alloc(ppgtt);
+	if (ret)
+		return ret;
+
+	ret = gen6_ppgtt_setup_page_tables(ppgtt);
+	if (ret) {
+		gen6_ppgtt_free(ppgtt);
+		return ret;
+	}
+
+	ppgtt->base.clear_range = gen6_ppgtt_clear_range;
+	ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->base.cleanup = gen6_ppgtt_cleanup;
+	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
+	ppgtt->base.start = 0;
+	ppgtt->base.total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
 	ppgtt->debug_dump = gen6_dump_ppgtt;
 
-	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
-			 ppgtt->node.size >> 20,
-			 ppgtt->node.start / PAGE_SIZE);
 	ppgtt->pd_offset =
 		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
 
-	return 0;
+	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
 
-err_pd_pin:
-	if (ppgtt->pt_dma_addr) {
-		for (i--; i >= 0; i--)
-			pci_unmap_page(dev->pdev, ppgtt->pt_dma_addr[i],
-				       4096, PCI_DMA_BIDIRECTIONAL);
-	}
-err_pt_alloc:
-	kfree(ppgtt->pt_dma_addr);
-	for (i = 0; i < ppgtt->num_pd_entries; i++) {
-		if (ppgtt->pt_pages[i])
-			__free_page(ppgtt->pt_pages[i]);
-	}
-	kfree(ppgtt->pt_pages);
-	drm_mm_remove_node(&ppgtt->node);
+	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
+			 ppgtt->node.size >> 20,
+			 ppgtt->node.start / PAGE_SIZE);
 
-	return ret;
+	return 0;
 }
 
 int i915_gem_init_ppgtt(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-02-20  6:05 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-24 21:42 [PATCH 0/7] GEN8 PPGTT cleanups + 4GB support Ben Widawsky
2013-12-24 21:42 ` [PATCH 1/7] drm/i915/bdw: Split up PPGTT cleanup Ben Widawsky
2013-12-24 21:42 ` [PATCH 2/7] drm/i915/bdw: Reorganize PPGTT init Ben Widawsky
2013-12-25  6:58   ` [PATCH 2/7] [v2] " Ben Widawsky
2013-12-24 21:42 ` [PATCH 3/7] drm/i915/bdw: Split ppgtt initialization up Ben Widawsky
2013-12-25  7:04   ` [PATCH 3/7] [v2] " Ben Widawsky
2013-12-24 21:42 ` [PATCH 4/7] drm/i915: Make clear/insert vfuncs args absolute Ben Widawsky
2013-12-24 21:42 ` [PATCH 5/7] drm/i915/bdw: Reorganize PT allocations Ben Widawsky
2013-12-24 21:42 ` [PATCH 6/7] Revert "drm/i915/bdw: Limit GTT to 2GB" Ben Widawsky
2013-12-24 21:42 ` [PATCH 7/7] drm/i915: Update i915_gem_gtt.c copyright Ben Widawsky
2013-12-26  2:17 ` [PATCH 8/9] drm/i915: Split GEN6 PPGTT cleanup Ben Widawsky
2013-12-26  2:17   ` [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up Ben Widawsky
  -- strict thread matches above, loose matches on Subject: below --
2014-02-20  6:05 [PATCH 0/9] [v2] BDW 4G GGTT + PPGTT cleanups Ben Widawsky
     [not found] <to=1387921357-22942-1-git-send-email-benjamin.widawsky@intel.com>
2014-02-12 22:28 ` [PATCH 0/9] [REPOST] " Ben Widawsky
2014-02-12 22:28   ` [PATCH 9/9] drm/i915: Split GEN6 PPGTT initialization up Ben Widawsky
2014-02-13 10:33     ` Chris Wilson
2014-02-20  6:05   ` Ben Widawsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox