* [PATCH v5 00/19] 48-bit PPGTT
@ 2015-07-16 9:33 Michel Thierry
2015-07-16 9:33 ` [PATCH v5 01/19] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
` (19 more replies)
0 siblings, 20 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
This clean-up version delays the 48-bit work to later patches and includes
other review comments from Akash and Chris Wilson. The first 4 patches
prepare the dynamic page allocation code to handle independent pdps, but
no specific code for 48-bit mode is added before the 5th patch.
In order expand the GPU address space, a 4th level translation is added,
the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
PML4[0-511], each pointing to a PDP. All the existing "dynamic alloc
ppgtt" functions are used, only adding the 4th level changes. I also
updated some remaining variables that were 32b only.
There are 2 hardware workarounds needed to allow correct operation with
48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
This new patchset version includes the comments and suggestions from Chris
Wilson. A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given
object can be allocated outside the first 4 PDPs; if not, the end range is
forced to 4GB. Also, more objects now use the DRM_MM_CREATE_TOP flag. To
maintain compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit
function that will flag these objects, while the existing drm_intel_bo_emit_reloc
clears it.
Finally, this feature is only available in BDW and Gen9, requires LRC
submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.
Also note that this expanded address space is only available for full
PPGTT, aliasing PPGTT and Global GTT remain 32-bit.
Michel Thierry (19):
drm/i915: Remove unnecessary gen8_clamp_pd
drm/i915/gen8: Make pdp allocation more dynamic
drm/i915/gen8: Abstract PDP usage
drm/i915/gen8: Add dynamic page trace events
drm/i915/gen8: Add PML4 structure
drm/i915/gen8: implement alloc/free for 4lvl
drm/i915/gen8: Add 4 level switching infrastructure and lrc support
drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
drm/i915/gen8: Pass sg_iter through pte inserts
drm/i915/gen8: Add 4 level support in insert_entries and clear_range
drm/i915/gen8: Initialize PDPs
drm/i915: Expand error state's address width to 64b
drm/i915/gen8: Add ppgtt info and debug_dump
drm/i915: object size needs to be u64
drm/i915: batch_obj vm offset must be u64
drm/i915/userptr: Kill user_size limit check
drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
drm/i915/gen8: Flip the 48b switch
drm/i915: Save some page table setup on repeated binds
drivers/gpu/drm/i915/i915_debugfs.c | 18 +-
drivers/gpu/drm/i915/i915_drv.h | 11 +-
drivers/gpu/drm/i915/i915_gem.c | 19 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +
drivers/gpu/drm/i915/i915_gem_gtt.c | 625 ++++++++++++++++++++++++-----
drivers/gpu/drm/i915/i915_gem_gtt.h | 64 ++-
drivers/gpu/drm/i915/i915_gem_userptr.c | 4 -
drivers/gpu/drm/i915/i915_gpu_error.c | 24 +-
drivers/gpu/drm/i915/i915_params.c | 2 +-
drivers/gpu/drm/i915/i915_reg.h | 1 +
drivers/gpu/drm/i915/i915_trace.h | 32 +-
drivers/gpu/drm/i915/intel_lrc.c | 60 ++-
include/uapi/drm/i915_drm.h | 3 +-
13 files changed, 700 insertions(+), 176 deletions(-)
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH v5 01/19] drm/i915: Remove unnecessary gen8_clamp_pd
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 02/19] drm/i915/gen8: Make pdp allocation more dynamic Michel Thierry
` (18 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
gen8_clamp_pd clamps to the next page directory boundary, but the macro
gen8_for_each_pde already has a check to stop at the page directory
boundary.
Furthermore, i915_pte_count also restricts to the next page table
boundary.
v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 2 +-
drivers/gpu/drm/i915/i915_gem_gtt.h | 11 -----------
2 files changed, 1 insertion(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4142561..0a68343 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -955,7 +955,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
gen8_pde_t *const page_directory = kmap_px(pd);
struct i915_page_table *pt;
- uint64_t pd_len = gen8_clamp_pd(start, length);
+ uint64_t pd_len = length;
uint64_t pd_start = start;
uint32_t pde;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e1cfa29..d5bf953 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -444,17 +444,6 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
temp = min(temp, length), \
start += temp, length -= temp)
-/* Clamp length to the next page_directory boundary */
-static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
-{
- uint64_t next_pd = ALIGN(start + 1, 1 << GEN8_PDPE_SHIFT);
-
- if (next_pd > (start + length))
- return length;
-
- return next_pd - start;
-}
-
static inline uint32_t gen8_pte_index(uint64_t address)
{
return i915_pte_index(address, GEN8_PDE_SHIFT);
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 02/19] drm/i915/gen8: Make pdp allocation more dynamic
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
2015-07-16 9:33 ` [PATCH v5 01/19] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 03/19] drm/i915/gen8: Abstract PDP usage Michel Thierry
` (17 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
This transitional patch doesn't do much for the existing code. However,
it should make upcoming patches to use the full 48b address space a bit
easier.
v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp).
v3: To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v4: Rebase after s/page_tables/page_table/, added extra information
about 4-level page table formats and use IS_ENABLED macro.
v5: Check CONFIG_X86_64 instead of CONFIG_64BIT.
v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and
follow
his nomenclature in pdp functions (there is no alloc_pdp yet).
v7: Rebase after merged version of Mika's ppgtt cleanup patch series.
v8: Rebase after final merged version of Mika's ppgtt/scratch patches.
v9: Introduce PML4 (and 48-bit checks) until next patch (Akash).
v10: Also use test_bit to detect when pd/pt are already allocated (Akash)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 86 +++++++++++++++++++++++++++++--------
drivers/gpu/drm/i915/i915_gem_gtt.h | 17 +++++---
2 files changed, 80 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0a68343..5913653 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -522,6 +522,43 @@ static void gen8_initialize_pd(struct i915_address_space *vm,
fill_px(vm->dev, pd, scratch_pde);
}
+static int __pdp_init(struct drm_device *dev,
+ struct i915_page_directory_pointer *pdp)
+{
+ size_t pdpes = I915_PDPES_PER_PDP(dev);
+
+ pdp->used_pdpes = kcalloc(BITS_TO_LONGS(pdpes),
+ sizeof(unsigned long),
+ GFP_KERNEL);
+ if (!pdp->used_pdpes)
+ return -ENOMEM;
+
+ pdp->page_directory = kcalloc(pdpes, sizeof(*pdp->page_directory),
+ GFP_KERNEL);
+ if (!pdp->page_directory) {
+ kfree(pdp->used_pdpes);
+ /* the PDP might be the statically allocated top level. Keep it
+ * as clean as possible */
+ pdp->used_pdpes = NULL;
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static void __pdp_fini(struct i915_page_directory_pointer *pdp)
+{
+ kfree(pdp->used_pdpes);
+ kfree(pdp->page_directory);
+ pdp->page_directory = NULL;
+}
+
+static void free_pdp(struct drm_device *dev,
+ struct i915_page_directory_pointer *pdp)
+{
+ __pdp_fini(pdp);
+}
+
/* Broadwell Page Directory Pointer Descriptors */
static int gen8_write_pdp(struct drm_i915_gem_request *req,
unsigned entry,
@@ -720,7 +757,8 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
container_of(vm, struct i915_hw_ppgtt, base);
int i;
- for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
+ for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+ I915_PDPES_PER_PDP(ppgtt->base.dev)) {
if (WARN_ON(!ppgtt->pdp.page_directory[i]))
continue;
@@ -729,6 +767,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
}
+ free_pdp(ppgtt->base.dev, &ppgtt->pdp);
gen8_free_scratch(vm);
}
@@ -763,7 +802,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
gen8_for_each_pde(pt, pd, start, length, temp, pde) {
/* Don't reallocate page tables */
- if (pt) {
+ if (test_bit(pde, pd->used_pdes)) {
/* Scratch is never allocated this way */
WARN_ON(pt == ppgtt->base.scratch_pt);
continue;
@@ -820,11 +859,12 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
struct i915_page_directory *pd;
uint64_t temp;
uint32_t pdpe;
+ uint32_t pdpes = I915_PDPES_PER_PDP(dev);
- WARN_ON(!bitmap_empty(new_pds, GEN8_LEGACY_PDPES));
+ WARN_ON(!bitmap_empty(new_pds, pdpes));
gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
- if (pd)
+ if (test_bit(pdpe, pdp->used_pdpes))
continue;
pd = alloc_pd(dev);
@@ -839,18 +879,19 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
return 0;
unwind_out:
- for_each_set_bit(pdpe, new_pds, GEN8_LEGACY_PDPES)
+ for_each_set_bit(pdpe, new_pds, pdpes)
free_pd(dev, pdp->page_directory[pdpe]);
return -ENOMEM;
}
static void
-free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
+free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts,
+ uint32_t pdpes)
{
int i;
- for (i = 0; i < GEN8_LEGACY_PDPES; i++)
+ for (i = 0; i < pdpes; i++)
kfree(new_pts[i]);
kfree(new_pts);
kfree(new_pds);
@@ -861,23 +902,24 @@ free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
*/
static
int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
- unsigned long ***new_pts)
+ unsigned long ***new_pts,
+ uint32_t pdpes)
{
int i;
unsigned long *pds;
unsigned long **pts;
- pds = kcalloc(BITS_TO_LONGS(GEN8_LEGACY_PDPES), sizeof(unsigned long), GFP_KERNEL);
+ pds = kcalloc(BITS_TO_LONGS(pdpes), sizeof(unsigned long), GFP_KERNEL);
if (!pds)
return -ENOMEM;
- pts = kcalloc(GEN8_LEGACY_PDPES, sizeof(unsigned long *), GFP_KERNEL);
+ pts = kcalloc(pdpes, sizeof(unsigned long *), GFP_KERNEL);
if (!pts) {
kfree(pds);
return -ENOMEM;
}
- for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
+ for (i = 0; i < pdpes; i++) {
pts[i] = kcalloc(BITS_TO_LONGS(I915_PDES),
sizeof(unsigned long), GFP_KERNEL);
if (!pts[i])
@@ -890,7 +932,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
return 0;
err_out:
- free_gen8_temp_bitmaps(pds, pts);
+ free_gen8_temp_bitmaps(pds, pts, pdpes);
return -ENOMEM;
}
@@ -916,6 +958,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
const uint64_t orig_length = length;
uint64_t temp;
uint32_t pdpe;
+ uint32_t pdpes = I915_PDPES_PER_PDP(ppgtt->base.dev);
int ret;
/* Wrap is never okay since we can only represent 48b, and we don't
@@ -927,7 +970,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
if (WARN_ON(start + length > ppgtt->base.total))
return -ENODEV;
- ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables);
+ ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
if (ret)
return ret;
@@ -935,7 +978,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
new_page_dirs);
if (ret) {
- free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+ free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
return ret;
}
@@ -989,7 +1032,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
__set_bit(pdpe, ppgtt->pdp.used_pdpes);
}
- free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+ free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
mark_tlbs_dirty(ppgtt);
return 0;
@@ -999,10 +1042,10 @@ err_out:
free_pt(vm->dev, ppgtt->pdp.page_directory[pdpe]->page_table[temp]);
}
- for_each_set_bit(pdpe, new_page_dirs, GEN8_LEGACY_PDPES)
+ for_each_set_bit(pdpe, new_page_dirs, pdpes)
free_pd(vm->dev, ppgtt->pdp.page_directory[pdpe]);
- free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+ free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
mark_tlbs_dirty(ppgtt);
return ret;
}
@@ -1040,7 +1083,16 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->switch_mm = gen8_mm_switch;
+ ret = __pdp_init(false, &ppgtt->pdp);
+
+ if (ret)
+ goto free_scratch;
+
return 0;
+
+free_scratch:
+ gen8_free_scratch(&ppgtt->base);
+ return ret;
}
static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d5bf953..87e389c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -98,6 +98,9 @@ typedef uint64_t gen8_pde_t;
#define GEN8_LEGACY_PDPES 4
#define GEN8_PTES I915_PTES(sizeof(gen8_pte_t))
+/* FIXME: Next patch will use dev */
+#define I915_PDPES_PER_PDP(dev) GEN8_LEGACY_PDPES
+
#define PPAT_UNCACHED_INDEX (_PAGE_PWT | _PAGE_PCD)
#define PPAT_CACHED_PDE_INDEX 0 /* WB LLC */
#define PPAT_CACHED_INDEX _PAGE_PAT /* WB LLCeLLC */
@@ -241,9 +244,10 @@ struct i915_page_directory {
};
struct i915_page_directory_pointer {
- /* struct page *page; */
- DECLARE_BITMAP(used_pdpes, GEN8_LEGACY_PDPES);
- struct i915_page_directory *page_directory[GEN8_LEGACY_PDPES];
+ struct i915_page_dma base;
+
+ unsigned long *used_pdpes;
+ struct i915_page_directory **page_directory;
};
struct i915_address_space {
@@ -436,9 +440,10 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
temp = min(temp, length), \
start += temp, length -= temp)
-#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \
- for (iter = gen8_pdpe_index(start); \
- pd = (pdp)->page_directory[iter], length > 0 && iter < GEN8_LEGACY_PDPES; \
+#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \
+ for (iter = gen8_pdpe_index(start); \
+ pd = (pdp)->page_directory[iter], \
+ length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
iter++, \
temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \
temp = min(temp, length), \
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 03/19] drm/i915/gen8: Abstract PDP usage
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
2015-07-16 9:33 ` [PATCH v5 01/19] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
2015-07-16 9:33 ` [PATCH v5 02/19] drm/i915/gen8: Make pdp allocation more dynamic Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 04/19] drm/i915/gen8: Add dynamic page trace events Michel Thierry
` (16 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.
In preparation for 4 level page tables, we need to stop use ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp is just one of the entries
being pointed to by a pml4e. Places where this is not yet possible use a
temporal pdp local variable.
v2: Updated after dynamic page allocation changes.
v3: Rebase after s/page_tables/page_table/.
v4: Rebase after changes in "Dynamic page table allocations" patch.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after final merged version of Mika's ppgtt/scratch patches.
v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde
loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash)
v8: Fix text indentation in _alloc_pagetabs/page_directories (Chris)
v9: Defer gen8_alloc_va_range_4lvl definition until 4lvl is implemented,
clean-up gen8_ppgtt_cleanup [pun intended] (Akash).
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 84 +++++++++++++++++++------------------
1 file changed, 44 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5913653..cc575c21 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -607,6 +607,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
+ struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
gen8_pte_t *pt_vaddr, scratch_pte;
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -621,10 +622,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
struct i915_page_directory *pd;
struct i915_page_table *pt;
- if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
+ if (WARN_ON(!pdp->page_directory[pdpe]))
break;
- pd = ppgtt->pdp.page_directory[pdpe];
+ pd = pdp->page_directory[pdpe];
if (WARN_ON(!pd->page_table[pde]))
break;
@@ -662,6 +663,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
+ struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
gen8_pte_t *pt_vaddr;
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -675,7 +677,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
break;
if (pt_vaddr == NULL) {
- struct i915_page_directory *pd = ppgtt->pdp.page_directory[pdpe];
+ struct i915_page_directory *pd = pdp->page_directory[pdpe];
struct i915_page_table *pt = pd->page_table[pde];
pt_vaddr = kmap_px(pt);
}
@@ -755,28 +757,29 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
+ struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+ struct drm_device *dev = ppgtt->base.dev;
int i;
- for_each_set_bit(i, ppgtt->pdp.used_pdpes,
- I915_PDPES_PER_PDP(ppgtt->base.dev)) {
- if (WARN_ON(!ppgtt->pdp.page_directory[i]))
+ for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
+ if (WARN_ON(!pdp->page_directory[i]))
continue;
- gen8_free_page_tables(ppgtt->base.dev,
- ppgtt->pdp.page_directory[i]);
- free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
+ gen8_free_page_tables(dev, pdp->page_directory[i]);
+ free_pd(dev, pdp->page_directory[i]);
}
- free_pdp(ppgtt->base.dev, &ppgtt->pdp);
+ free_pdp(dev, pdp);
+
gen8_free_scratch(vm);
}
/**
* gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
- * @ppgtt: Master ppgtt structure.
- * @pd: Page directory for this address range.
+ * @vm: Master vm structure.
+ * @pd: Page directory for this address range.
* @start: Starting virtual address to begin allocations.
- * @length Size of the allocations.
+ * @length: Size of the allocations.
* @new_pts: Bitmap set by function with new allocations. Likely used by the
* caller to free on error.
*
@@ -789,13 +792,13 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
*
* Return: 0 if success; negative error code otherwise.
*/
-static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
+static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
struct i915_page_directory *pd,
uint64_t start,
uint64_t length,
unsigned long *new_pts)
{
- struct drm_device *dev = ppgtt->base.dev;
+ struct drm_device *dev = vm->dev;
struct i915_page_table *pt;
uint64_t temp;
uint32_t pde;
@@ -804,7 +807,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
/* Don't reallocate page tables */
if (test_bit(pde, pd->used_pdes)) {
/* Scratch is never allocated this way */
- WARN_ON(pt == ppgtt->base.scratch_pt);
+ WARN_ON(pt == vm->scratch_pt);
continue;
}
@@ -812,7 +815,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
if (IS_ERR(pt))
goto unwind_out;
- gen8_initialize_pt(&ppgtt->base, pt);
+ gen8_initialize_pt(vm, pt);
pd->page_table[pde] = pt;
__set_bit(pde, new_pts);
}
@@ -828,11 +831,11 @@ unwind_out:
/**
* gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
- * @ppgtt: Master ppgtt structure.
+ * @vm: Master vm structure.
* @pdp: Page directory pointer for this address range.
* @start: Starting virtual address to begin allocations.
- * @length Size of the allocations.
- * @new_pds Bitmap set by function with new allocations. Likely used by the
+ * @length: Size of the allocations.
+ * @new_pds: Bitmap set by function with new allocations. Likely used by the
* caller to free on error.
*
* Allocate the required number of page directories starting at the pde index of
@@ -849,13 +852,14 @@ unwind_out:
*
* Return: 0 if success; negative error code otherwise.
*/
-static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
- struct i915_page_directory_pointer *pdp,
- uint64_t start,
- uint64_t length,
- unsigned long *new_pds)
+static int
+gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
+ struct i915_page_directory_pointer *pdp,
+ uint64_t start,
+ uint64_t length,
+ unsigned long *new_pds)
{
- struct drm_device *dev = ppgtt->base.dev;
+ struct drm_device *dev = vm->dev;
struct i915_page_directory *pd;
uint64_t temp;
uint32_t pdpe;
@@ -871,7 +875,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
if (IS_ERR(pd))
goto unwind_out;
- gen8_initialize_pd(&ppgtt->base, pd);
+ gen8_initialize_pd(vm, pd);
pdp->page_directory[pdpe] = pd;
__set_bit(pdpe, new_pds);
}
@@ -947,18 +951,19 @@ static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
}
static int gen8_alloc_va_range(struct i915_address_space *vm,
- uint64_t start,
- uint64_t length)
+ uint64_t start, uint64_t length)
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
unsigned long *new_page_dirs, **new_page_tables;
+ struct drm_device *dev = vm->dev;
+ struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
struct i915_page_directory *pd;
const uint64_t orig_start = start;
const uint64_t orig_length = length;
uint64_t temp;
uint32_t pdpe;
- uint32_t pdpes = I915_PDPES_PER_PDP(ppgtt->base.dev);
+ uint32_t pdpes = I915_PDPES_PER_PDP(dev);
int ret;
/* Wrap is never okay since we can only represent 48b, and we don't
@@ -967,7 +972,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
if (WARN_ON(start + length < start))
return -ENODEV;
- if (WARN_ON(start + length > ppgtt->base.total))
+ if (WARN_ON(start + length > vm->total))
return -ENODEV;
ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
@@ -975,16 +980,16 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
return ret;
/* Do the allocations first so we can easily bail out */
- ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
- new_page_dirs);
+ ret = gen8_ppgtt_alloc_page_directories(vm, pdp, start, length,
+ new_page_dirs);
if (ret) {
free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
return ret;
}
/* For every page directory referenced, allocate page tables */
- gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
- ret = gen8_ppgtt_alloc_pagetabs(ppgtt, pd, start, length,
+ gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+ ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
new_page_tables[pdpe]);
if (ret)
goto err_out;
@@ -995,7 +1000,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
/* Allocations have completed successfully, so set the bitmaps, and do
* the mappings. */
- gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
+ gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
gen8_pde_t *const page_directory = kmap_px(pd);
struct i915_page_table *pt;
uint64_t pd_len = length;
@@ -1028,8 +1033,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
}
kunmap_px(ppgtt, page_directory);
-
- __set_bit(pdpe, ppgtt->pdp.used_pdpes);
+ __set_bit(pdpe, pdp->used_pdpes);
}
free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1039,11 +1043,11 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
err_out:
while (pdpe--) {
for_each_set_bit(temp, new_page_tables[pdpe], I915_PDES)
- free_pt(vm->dev, ppgtt->pdp.page_directory[pdpe]->page_table[temp]);
+ free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]);
}
for_each_set_bit(pdpe, new_page_dirs, pdpes)
- free_pd(vm->dev, ppgtt->pdp.page_directory[pdpe]);
+ free_pd(dev, pdp->page_directory[pdpe]);
free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
mark_tlbs_dirty(ppgtt);
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 04/19] drm/i915/gen8: Add dynamic page trace events
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (2 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 03/19] drm/i915/gen8: Abstract PDP usage Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 05/19] drm/i915/gen8: Add PML4 structure Michel Thierry
` (15 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
The dynamic page allocation patch series added it for GEN6, this patch
adds them for GEN8.
v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.
v4: Rebase after s/page_tables/page_table/.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after gen8_map_pagetable_range removal.
v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash)
v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++++++
drivers/gpu/drm/i915/i915_trace.h | 24 ++++++++++++++++--------
2 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cc575c21..4583a54 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -818,6 +818,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
gen8_initialize_pt(vm, pt);
pd->page_table[pde] = pt;
__set_bit(pde, new_pts);
+ trace_i915_page_table_entry_alloc(vm, pde, start, GEN8_PDE_SHIFT);
}
return 0;
@@ -878,6 +879,7 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
gen8_initialize_pd(vm, pd);
pdp->page_directory[pdpe] = pd;
__set_bit(pdpe, new_pds);
+ trace_i915_page_directory_entry_alloc(vm, pdpe, start, GEN8_PDPE_SHIFT);
}
return 0;
@@ -1027,6 +1029,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
/* Map the PDE to the page table */
page_directory[pde] = gen8_pde_encode(px_dma(pt),
I915_CACHE_LLC);
+ trace_i915_page_table_entry_map(&ppgtt->base, pde, pt,
+ gen8_pte_index(start),
+ gen8_pte_count(start, length),
+ GEN8_PTES);
/* NB: We haven't yet mapped ptes to pages. At this
* point we're still relying on insert_entries() */
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 2f34c47..f230d76 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -186,33 +186,41 @@ DEFINE_EVENT(i915_va, i915_va_alloc,
TP_ARGS(vm, start, length, name)
);
-DECLARE_EVENT_CLASS(i915_page_table_entry,
- TP_PROTO(struct i915_address_space *vm, u32 pde, u64 start, u64 pde_shift),
- TP_ARGS(vm, pde, start, pde_shift),
+DECLARE_EVENT_CLASS(i915_px_entry,
+ TP_PROTO(struct i915_address_space *vm, u32 px, u64 start, u64 px_shift),
+ TP_ARGS(vm, px, start, px_shift),
TP_STRUCT__entry(
__field(struct i915_address_space *, vm)
- __field(u32, pde)
+ __field(u32, px)
__field(u64, start)
__field(u64, end)
),
TP_fast_assign(
__entry->vm = vm;
- __entry->pde = pde;
+ __entry->px = px;
__entry->start = start;
- __entry->end = ((start + (1ULL << pde_shift)) & ~((1ULL << pde_shift)-1)) - 1;
+ __entry->end = ((start + (1ULL << px_shift)) & ~((1ULL << px_shift)-1)) - 1;
),
TP_printk("vm=%p, pde=%d (0x%llx-0x%llx)",
- __entry->vm, __entry->pde, __entry->start, __entry->end)
+ __entry->vm, __entry->px, __entry->start, __entry->end)
);
-DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc,
+DEFINE_EVENT(i915_px_entry, i915_page_table_entry_alloc,
TP_PROTO(struct i915_address_space *vm, u32 pde, u64 start, u64 pde_shift),
TP_ARGS(vm, pde, start, pde_shift)
);
+DEFINE_EVENT_PRINT(i915_px_entry, i915_page_directory_entry_alloc,
+ TP_PROTO(struct i915_address_space *vm, u32 pdpe, u64 start, u64 pdpe_shift),
+ TP_ARGS(vm, pdpe, start, pdpe_shift),
+
+ TP_printk("vm=%p, pdpe=%d (0x%llx-0x%llx)",
+ __entry->vm, __entry->px, __entry->start, __entry->end)
+);
+
/* Avoid extra math because we only support two sizes. The format is defined by
* bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */
#define TRACE_PT_SIZE(bits) \
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 05/19] drm/i915/gen8: Add PML4 structure
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (3 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 04/19] drm/i915/gen8: Add dynamic page trace events Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 06/19] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
` (14 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
Introduces the Page Map Level 4 (PML4), ie. the new top level structure
of the page tables.
To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already
32/64-bit safe (Chris).
v3: Add goto free_scratch in temp 48-bit mode init code (Akash).
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 3 ++-
drivers/gpu/drm/i915/i915_gem_gtt.c | 38 ++++++++++++++++++++++++-------------
drivers/gpu/drm/i915/i915_gem_gtt.h | 26 ++++++++++++++++++++-----
3 files changed, 48 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 38cfd53..550b243 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2500,7 +2500,8 @@ struct drm_i915_cmd_table {
#define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6)
#define HAS_LOGICAL_RING_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 8)
#define USES_PPGTT(dev) (i915.enable_ppgtt)
-#define USES_FULL_PPGTT(dev) (i915.enable_ppgtt == 2)
+#define USES_FULL_PPGTT(dev) (i915.enable_ppgtt >= 2)
+#define USES_FULL_48BIT_PPGTT(dev) (i915.enable_ppgtt == 3)
#define HAS_OVERLAY(dev) (INTEL_INFO(dev)->has_overlay)
#define OVERLAY_NEEDS_PHYSICAL(dev) (INTEL_INFO(dev)->overlay_needs_physical)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4583a54..59b7ff5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -104,9 +104,12 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
{
bool has_aliasing_ppgtt;
bool has_full_ppgtt;
+ bool has_full_64bit_ppgtt;
has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
+ has_full_64bit_ppgtt = (IS_BROADWELL(dev) ||
+ INTEL_INFO(dev)->gen >= 9) && false; /* FIXME: 64b */
if (intel_vgpu_active(dev))
has_full_ppgtt = false; /* emulation is too hard */
@@ -125,6 +128,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
if (enable_ppgtt == 2 && has_full_ppgtt)
return 2;
+ if (enable_ppgtt == 3 && has_full_64bit_ppgtt)
+ return 3;
+
#ifdef CONFIG_INTEL_IOMMU
/* Disable ppgtt on SNB if VT-d is on. */
if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
@@ -557,6 +563,8 @@ static void free_pdp(struct drm_device *dev,
struct i915_page_directory_pointer *pdp)
{
__pdp_fini(pdp);
+ if (USES_FULL_48BIT_PPGTT(dev))
+ kfree(pdp);
}
/* Broadwell Page Directory Pointer Descriptors */
@@ -673,9 +681,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
pt_vaddr = NULL;
for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
- if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES))
- break;
-
if (pt_vaddr == NULL) {
struct i915_page_directory *pd = pdp->page_directory[pdpe];
struct i915_page_table *pt = pd->page_table[pde];
@@ -1076,14 +1081,6 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
return ret;
ppgtt->base.start = 0;
- ppgtt->base.total = 1ULL << 32;
- if (IS_ENABLED(CONFIG_X86_32))
- /* While we have a proliferation of size_t variables
- * we cannot represent the full ppgtt size on 32bit,
- * so limit it to the same size as the GGTT (currently
- * 2GiB).
- */
- ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
ppgtt->base.cleanup = gen8_ppgtt_cleanup;
ppgtt->base.allocate_va_range = gen8_alloc_va_range;
ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
@@ -1093,10 +1090,25 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->switch_mm = gen8_mm_switch;
- ret = __pdp_init(false, &ppgtt->pdp);
+ if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+ ret = __pdp_init(false, &ppgtt->pdp);
- if (ret)
+ if (ret)
+ goto free_scratch;
+
+ ppgtt->base.total = 1ULL << 32;
+ if (IS_ENABLED(CONFIG_X86_32))
+ /* While we have a proliferation of size_t variables
+ * we cannot represent the full ppgtt size on 32bit,
+ * so limit it to the same size as the GGTT (currently
+ * 2GiB).
+ */
+ ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
+ } else {
+ ppgtt->base.total = 1ULL << 48;
+ ret = -EPERM; /* Not yet implemented */
goto free_scratch;
+ }
return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 87e389c..04bc66f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -88,9 +88,17 @@ typedef uint64_t gen8_pde_t;
* PDPE | PDE | PTE | offset
* The difference as compared to normal x86 3 level page table is the PDPEs are
* programmed via register.
+ *
+ * GEN8 48b legacy style address is defined as a 4 level page table:
+ * 47:39 | 38:30 | 29:21 | 20:12 | 11:0
+ * PML4E | PDPE | PDE | PTE | offset
*/
+#define GEN8_PML4ES_PER_PML4 512
+#define GEN8_PML4E_SHIFT 39
#define GEN8_PDPE_SHIFT 30
-#define GEN8_PDPE_MASK 0x3
+/* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
+ * tables */
+#define GEN8_PDPE_MASK 0x1ff
#define GEN8_PDE_SHIFT 21
#define GEN8_PDE_MASK 0x1ff
#define GEN8_PTE_SHIFT 12
@@ -98,8 +106,8 @@ typedef uint64_t gen8_pde_t;
#define GEN8_LEGACY_PDPES 4
#define GEN8_PTES I915_PTES(sizeof(gen8_pte_t))
-/* FIXME: Next patch will use dev */
-#define I915_PDPES_PER_PDP(dev) GEN8_LEGACY_PDPES
+#define I915_PDPES_PER_PDP(dev) (USES_FULL_48BIT_PPGTT(dev) ?\
+ GEN8_PML4ES_PER_PML4 : GEN8_LEGACY_PDPES)
#define PPAT_UNCACHED_INDEX (_PAGE_PWT | _PAGE_PCD)
#define PPAT_CACHED_PDE_INDEX 0 /* WB LLC */
@@ -250,6 +258,13 @@ struct i915_page_directory_pointer {
struct i915_page_directory **page_directory;
};
+struct i915_pml4 {
+ struct i915_page_dma base;
+
+ DECLARE_BITMAP(used_pml4es, GEN8_PML4ES_PER_PML4);
+ struct i915_page_directory_pointer *pdps[GEN8_PML4ES_PER_PML4];
+};
+
struct i915_address_space {
struct drm_mm mm;
struct drm_device *dev;
@@ -345,8 +360,9 @@ struct i915_hw_ppgtt {
struct drm_mm_node node;
unsigned long pd_dirty_rings;
union {
- struct i915_page_directory_pointer pdp;
- struct i915_page_directory pd;
+ struct i915_pml4 pml4; /* GEN8+ & 48b PPGTT */
+ struct i915_page_directory_pointer pdp; /* GEN8+ */
+ struct i915_page_directory pd; /* GEN6-7 */
};
struct drm_i915_file_private *file_priv;
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 06/19] drm/i915/gen8: implement alloc/free for 4lvl
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (4 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 05/19] drm/i915/gen8: Add PML4 structure Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-29 14:34 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 07/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
` (13 subsequent siblings)
19 siblings, 1 reply; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
PML4 has no special attributes, and there will always be a PML4.
So simply initialize it at creation, and destroy it at the end.
The code for 4lvl is able to call into the existing 3lvl page table code
to handle all of the lower levels.
v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
v3: Use i915_dma_unmap_single instead of pci API. Fix a
couple of incorrect checks when unmapping pdp and pd pages (Akash).
v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
paths. (Akash)
v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
v8: Change location of pml4_init/fini. It will make next patches
cleaner.
v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
trying to reuse as much as possible for pdp alloc. pml4_init/fini
replaced by setup/cleanup_px macros.
v10: Rebase after Mika's merged ppgtt cleanup patch series.
v11: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v12: Fix pdpe start value in trace (Akash)
v13: Define all 4lvl functions in this patch directly, instead of
previous patches, add i915_page_directory_pointer_entry_alloc here,
use test_bit to detect when pdp is already allocated (Akash).
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 163 ++++++++++++++++++++++++++++++++----
drivers/gpu/drm/i915/i915_gem_gtt.h | 13 ++-
drivers/gpu/drm/i915/i915_trace.h | 8 ++
3 files changed, 167 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 59b7ff5..5901810 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -559,12 +559,44 @@ static void __pdp_fini(struct i915_page_directory_pointer *pdp)
pdp->page_directory = NULL;
}
+static struct
+i915_page_directory_pointer *alloc_pdp(struct drm_device *dev)
+{
+ struct i915_page_directory_pointer *pdp;
+ int ret = -ENOMEM;
+
+ WARN_ON(!USES_FULL_48BIT_PPGTT(dev));
+
+ pdp = kzalloc(sizeof(*pdp), GFP_KERNEL);
+ if (!pdp)
+ return ERR_PTR(-ENOMEM);
+
+ ret = __pdp_init(dev, pdp);
+ if (ret)
+ goto fail_bitmap;
+
+ ret = setup_px(dev, pdp);
+ if (ret)
+ goto fail_page_m;
+
+ return pdp;
+
+fail_page_m:
+ __pdp_fini(pdp);
+fail_bitmap:
+ kfree(pdp);
+
+ return ERR_PTR(ret);
+}
+
static void free_pdp(struct drm_device *dev,
struct i915_page_directory_pointer *pdp)
{
__pdp_fini(pdp);
- if (USES_FULL_48BIT_PPGTT(dev))
+ if (USES_FULL_48BIT_PPGTT(dev)) {
+ cleanup_px(dev, pdp);
kfree(pdp);
+ }
}
/* Broadwell Page Directory Pointer Descriptors */
@@ -758,12 +790,9 @@ static void gen8_free_scratch(struct i915_address_space *vm)
free_scratch_page(dev, vm->scratch_page);
}
-static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
+static void gen8_ppgtt_cleanup_3lvl(struct drm_device *dev,
+ struct i915_page_directory_pointer *pdp)
{
- struct i915_hw_ppgtt *ppgtt =
- container_of(vm, struct i915_hw_ppgtt, base);
- struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
- struct drm_device *dev = ppgtt->base.dev;
int i;
for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
@@ -775,6 +804,31 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
}
free_pdp(dev, pdp);
+}
+
+static void gen8_ppgtt_cleanup_4lvl(struct i915_hw_ppgtt *ppgtt)
+{
+ int i;
+
+ for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
+ if (WARN_ON(!ppgtt->pml4.pdps[i]))
+ continue;
+
+ gen8_ppgtt_cleanup_3lvl(ppgtt->base.dev, ppgtt->pml4.pdps[i]);
+ }
+
+ cleanup_px(ppgtt->base.dev, &ppgtt->pml4);
+}
+
+static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
+{
+ struct i915_hw_ppgtt *ppgtt =
+ container_of(vm, struct i915_hw_ppgtt, base);
+
+ if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+ gen8_ppgtt_cleanup_3lvl(ppgtt->base.dev, &ppgtt->pdp);
+ else
+ gen8_ppgtt_cleanup_4lvl(ppgtt);
gen8_free_scratch(vm);
}
@@ -957,14 +1011,15 @@ static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask;
}
-static int gen8_alloc_va_range(struct i915_address_space *vm,
- uint64_t start, uint64_t length)
+static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
+ struct i915_page_directory_pointer *pdp,
+ uint64_t start,
+ uint64_t length)
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
unsigned long *new_page_dirs, **new_page_tables;
struct drm_device *dev = vm->dev;
- struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
struct i915_page_directory *pd;
const uint64_t orig_start = start;
const uint64_t orig_length = length;
@@ -1065,6 +1120,79 @@ err_out:
return ret;
}
+static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
+ struct i915_pml4 *pml4,
+ uint64_t start,
+ uint64_t length)
+{
+ DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
+ struct i915_page_directory_pointer *pdp;
+ const uint64_t orig_start = start;
+ const uint64_t orig_length = length;
+ uint64_t temp, pml4e;
+ int ret = 0;
+
+ /* Do the pml4 allocations first, so we don't need to track the newly
+ * allocated tables below the pdp */
+ bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
+
+ /* The pagedirectory and pagetable allocations are done in the shared 3
+ * and 4 level code. Just allocate the pdps.
+ */
+ gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+ if (!test_bit(pml4e, pml4->used_pml4es)) {
+ pdp = alloc_pdp(vm->dev);
+ if (IS_ERR(pdp))
+ goto err_out;
+
+ pml4->pdps[pml4e] = pdp;
+ __set_bit(pml4e, new_pdps);
+ trace_i915_page_directory_pointer_entry_alloc(vm,
+ pml4e,
+ start,
+ GEN8_PML4E_SHIFT);
+ }
+ }
+
+ WARN(bitmap_weight(new_pdps, GEN8_PML4ES_PER_PML4) > 2,
+ "The allocation has spanned more than 512GB. "
+ "It is highly likely this is incorrect.");
+
+ start = orig_start;
+ length = orig_length;
+
+ gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+ WARN_ON(!pdp);
+
+ ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
+ if (ret)
+ goto err_out;
+ }
+
+ bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
+ GEN8_PML4ES_PER_PML4);
+
+ return 0;
+
+err_out:
+ for_each_set_bit(pml4e, new_pdps, GEN8_PML4ES_PER_PML4)
+ gen8_ppgtt_cleanup_3lvl(vm->dev, pml4->pdps[pml4e]);
+
+ return ret;
+}
+
+static int gen8_alloc_va_range(struct i915_address_space *vm,
+ uint64_t start, uint64_t length)
+{
+ struct i915_hw_ppgtt *ppgtt =
+ container_of(vm, struct i915_hw_ppgtt, base);
+
+ if (USES_FULL_48BIT_PPGTT(vm->dev))
+ return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
+ else
+ return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
+}
+
/*
* GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
* with a net effect resembling a 2-level page table in normal x86 terms. Each
@@ -1090,9 +1218,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->switch_mm = gen8_mm_switch;
- if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
- ret = __pdp_init(false, &ppgtt->pdp);
+ if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+ ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
+ if (ret)
+ goto free_scratch;
+ ppgtt->base.total = 1ULL << 48;
+ } else {
+ ret = __pdp_init(false, &ppgtt->pdp);
if (ret)
goto free_scratch;
@@ -1104,10 +1237,10 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
* 2GiB).
*/
ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
- } else {
- ppgtt->base.total = 1ULL << 48;
- ret = -EPERM; /* Not yet implemented */
- goto free_scratch;
+
+ trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base,
+ 0, 0,
+ GEN8_PML4E_SHIFT);
}
return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 04bc66f..e63ef07 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -95,6 +95,7 @@ typedef uint64_t gen8_pde_t;
*/
#define GEN8_PML4ES_PER_PML4 512
#define GEN8_PML4E_SHIFT 39
+#define GEN8_PML4E_MASK (GEN8_PML4ES_PER_PML4 - 1)
#define GEN8_PDPE_SHIFT 30
/* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
* tables */
@@ -465,6 +466,15 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
temp = min(temp, length), \
start += temp, length -= temp)
+#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \
+ for (iter = gen8_pml4e_index(start); \
+ pdp = (pml4)->pdps[iter], \
+ length > 0 && iter < GEN8_PML4ES_PER_PML4; \
+ iter++, \
+ temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \
+ temp = min(temp, length), \
+ start += temp, length -= temp)
+
static inline uint32_t gen8_pte_index(uint64_t address)
{
return i915_pte_index(address, GEN8_PDE_SHIFT);
@@ -482,8 +492,7 @@ static inline uint32_t gen8_pdpe_index(uint64_t address)
static inline uint32_t gen8_pml4e_index(uint64_t address)
{
- WARN_ON(1); /* For 64B */
- return 0;
+ return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
}
static inline size_t gen8_pte_count(uint64_t address, uint64_t length)
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index f230d76..e6b5c74 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -221,6 +221,14 @@ DEFINE_EVENT_PRINT(i915_px_entry, i915_page_directory_entry_alloc,
__entry->vm, __entry->px, __entry->start, __entry->end)
);
+DEFINE_EVENT_PRINT(i915_px_entry, i915_page_directory_pointer_entry_alloc,
+ TP_PROTO(struct i915_address_space *vm, u32 pml4e, u64 start, u64 pml4e_shift),
+ TP_ARGS(vm, pml4e, start, pml4e_shift),
+
+ TP_printk("vm=%p, pml4e=%d (0x%llx-0x%llx)",
+ __entry->vm, __entry->px, __entry->start, __entry->end)
+);
+
/* Avoid extra math because we only support two sizes. The format is defined by
* bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */
#define TRACE_PT_SIZE(bits) \
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 07/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (5 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 06/19] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-29 14:34 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 08/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
` (12 subsequent siblings)
19 siblings, 1 reply; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
the base address to PML4, while the other PDP registers are ignored.
In LRC, the addressing mode must be specified in every context
descriptor, and the base address to PML4 is stored in the reg state.
v2: PML4 update in legacy context switch is left for historic reasons,
the preferred mode of operation is with lrc context based submission.
v3: s/gen8_map_page_directory/gen8_setup_page_directory and
s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer.
Also, clflush will be needed for bxt. (Akash)
v4: Squashed lrc-specific code and use a macro to set PML4 register.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
PDP update in bb_start is only for legacy 32b mode.
v6: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v7: There is no need to update the pml4 register value in
execlists_update_context (Akash)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 54 +++++++++++++++++++++++++++++----
drivers/gpu/drm/i915/i915_gem_gtt.h | 2 ++
drivers/gpu/drm/i915/i915_reg.h | 1 +
drivers/gpu/drm/i915/intel_lrc.c | 60 ++++++++++++++++++++++++++-----------
4 files changed, 94 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5901810..8bcd328 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -210,6 +210,9 @@ static gen8_pde_t gen8_pde_encode(const dma_addr_t addr,
return pde;
}
+#define gen8_pdpe_encode gen8_pde_encode
+#define gen8_pml4e_encode gen8_pde_encode
+
static gen6_pte_t snb_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
bool valid, u32 unused)
@@ -599,6 +602,35 @@ static void free_pdp(struct drm_device *dev,
}
}
+static void
+gen8_setup_page_directory(struct i915_hw_ppgtt *ppgtt,
+ struct i915_page_directory_pointer *pdp,
+ struct i915_page_directory *pd,
+ int index)
+{
+ gen8_ppgtt_pdpe_t *page_directorypo;
+
+ if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+ return;
+
+ page_directorypo = kmap_px(pdp);
+ page_directorypo[index] = gen8_pdpe_encode(px_dma(pd), I915_CACHE_LLC);
+ kunmap_px(ppgtt, page_directorypo);
+}
+
+static void
+gen8_setup_page_directory_pointer(struct i915_hw_ppgtt *ppgtt,
+ struct i915_pml4 *pml4,
+ struct i915_page_directory_pointer *pdp,
+ int index)
+{
+ gen8_ppgtt_pml4e_t *pagemap = kmap_px(pml4);
+
+ WARN_ON(!USES_FULL_48BIT_PPGTT(ppgtt->base.dev));
+ pagemap[index] = gen8_pml4e_encode(px_dma(pdp), I915_CACHE_LLC);
+ kunmap_px(ppgtt, pagemap);
+}
+
/* Broadwell Page Directory Pointer Descriptors */
static int gen8_write_pdp(struct drm_i915_gem_request *req,
unsigned entry,
@@ -624,8 +656,8 @@ static int gen8_write_pdp(struct drm_i915_gem_request *req,
return 0;
}
-static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
- struct drm_i915_gem_request *req)
+static int gen8_legacy_mm_switch(struct i915_hw_ppgtt *ppgtt,
+ struct drm_i915_gem_request *req)
{
int i, ret;
@@ -640,6 +672,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
return 0;
}
+static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
+ struct drm_i915_gem_request *req)
+{
+ return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4));
+}
+
static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
uint64_t start,
uint64_t length,
@@ -1100,6 +1138,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
kunmap_px(ppgtt, page_directory);
__set_bit(pdpe, pdp->used_pdpes);
+ gen8_setup_page_directory(ppgtt, pdp, pd, pdpe);
}
free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1126,6 +1165,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
uint64_t length)
{
DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
+ struct i915_hw_ppgtt *ppgtt =
+ container_of(vm, struct i915_hw_ppgtt, base);
struct i915_page_directory_pointer *pdp;
const uint64_t orig_start = start;
const uint64_t orig_length = length;
@@ -1167,6 +1208,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
if (ret)
goto err_out;
+
+ gen8_setup_page_directory_pointer(ppgtt, pml4, pdp, pml4e);
}
bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
@@ -1216,14 +1259,13 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.unbind_vma = ppgtt_unbind_vma;
ppgtt->base.bind_vma = ppgtt_bind_vma;
- ppgtt->switch_mm = gen8_mm_switch;
-
if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
if (ret)
goto free_scratch;
ppgtt->base.total = 1ULL << 48;
+ ppgtt->switch_mm = gen8_48b_mm_switch;
} else {
ret = __pdp_init(false, &ppgtt->pdp);
if (ret)
@@ -1238,6 +1280,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
*/
ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
+ ppgtt->switch_mm = gen8_legacy_mm_switch;
trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base,
0, 0,
GEN8_PML4E_SHIFT);
@@ -1435,8 +1478,9 @@ static void gen8_ppgtt_enable(struct drm_device *dev)
int j;
for_each_ring(ring, dev_priv, j) {
+ u32 four_level = USES_FULL_48BIT_PPGTT(dev) ? GEN8_GFX_PPGTT_48B : 0;
I915_WRITE(RING_MODE_GEN7(ring),
- _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
+ _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE | four_level));
}
}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e63ef07..11d44b3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -39,6 +39,8 @@ struct drm_i915_file_private;
typedef uint32_t gen6_pte_t;
typedef uint64_t gen8_pte_t;
typedef uint64_t gen8_pde_t;
+typedef uint64_t gen8_ppgtt_pdpe_t;
+typedef uint64_t gen8_ppgtt_pml4e_t;
#define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 059de0f..152ad9b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1669,6 +1669,7 @@ enum skl_disp_power_wells {
#define GFX_REPLAY_MODE (1<<11)
#define GFX_PSMI_GRANULARITY (1<<10)
#define GFX_PPGTT_ENABLE (1<<9)
+#define GEN8_GFX_PPGTT_48B (1<<7)
#define VLV_DISPLAY_BASE 0x180000
#define VLV_MIPI_BASE VLV_DISPLAY_BASE
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0007d45..ccf3b7a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -195,13 +195,21 @@
reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \
}
+#define ASSIGN_CTX_PML4(ppgtt, reg_state) { \
+ reg_state[CTX_PDP0_UDW + 1] = upper_32_bits(px_dma(&ppgtt->pml4)); \
+ reg_state[CTX_PDP0_LDW + 1] = lower_32_bits(px_dma(&ppgtt->pml4)); \
+}
+
enum {
ADVANCED_CONTEXT = 0,
- LEGACY_CONTEXT,
+ LEGACY_32B_CONTEXT,
ADVANCED_AD_CONTEXT,
LEGACY_64B_CONTEXT
};
-#define GEN8_CTX_MODE_SHIFT 3
+#define GEN8_CTX_ADDRESSING_MODE_SHIFT 3
+#define GEN8_CTX_ADDRESSING_MODE(dev) (USES_FULL_48BIT_PPGTT(dev) ?\
+ LEGACY_64B_CONTEXT :\
+ LEGACY_32B_CONTEXT)
enum {
FAULT_AND_HANG = 0,
FAULT_AND_HALT, /* Debug only */
@@ -272,7 +280,7 @@ static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq)
WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
desc = GEN8_CTX_VALID;
- desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
+ desc |= GEN8_CTX_ADDRESSING_MODE(dev) << GEN8_CTX_ADDRESSING_MODE_SHIFT;
if (IS_GEN8(ctx_obj->base.dev))
desc |= GEN8_CTX_L3LLC_COHERENT;
desc |= GEN8_CTX_PRIVILEGE;
@@ -347,10 +355,12 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
reg_state[CTX_RING_TAIL+1] = rq->tail;
reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
- /* True PPGTT with dynamic page allocation: update PDP registers and
- * point the unallocated PDPs to the scratch page
- */
- if (ppgtt) {
+ if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+ /* True 32b PPGTT with dynamic page allocation: update PDP
+ * registers and point the unallocated PDPs to scratch page.
+ * PML4 is allocated during ppgtt init, so this is not needed
+ * in 48-bit mode.
+ */
ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
@@ -1433,12 +1443,15 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
* Ideally, we should set Force PD Restore in ctx descriptor,
* but we can't. Force Restore would be a second option, but
* it is unsafe in case of lite-restore (because the ctx is
- * not idle). */
+ * not idle). PML4 is allocated during ppgtt init so this is
+ * not needed in 48-bit.*/
if (req->ctx->ppgtt &&
(intel_ring_flag(req->ring) & req->ctx->ppgtt->pd_dirty_rings)) {
- ret = intel_logical_ring_emit_pdps(req);
- if (ret)
- return ret;
+ if (GEN8_CTX_ADDRESSING_MODE(req->i915) == LEGACY_32B_CONTEXT) {
+ ret = intel_logical_ring_emit_pdps(req);
+ if (ret)
+ return ret;
+ }
req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
}
@@ -2105,13 +2118,24 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
- /* With dynamic page allocation, PDPs may not be allocated at this point,
- * Point the unallocated PDPs to the scratch page
- */
- ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
- ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
- ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
- ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
+ if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+ /* 64b PPGTT (48bit canonical)
+ * PDP0_DESCRIPTOR contains the base address to PML4 and
+ * other PDP Descriptors are ignored.
+ */
+ ASSIGN_CTX_PML4(ppgtt, reg_state);
+ } else {
+ /* 32b PPGTT
+ * PDP*_DESCRIPTOR contains the base address of space supported.
+ * With dynamic page allocation, PDPs may not be allocated at
+ * this point. Point the unallocated PDPs to the scratch page
+ */
+ ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
+ ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
+ ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
+ ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
+ }
+
if (ring->id == RCS) {
reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
reg_state[CTX_R_PWR_CLK_STATE] = GEN8_R_PWR_CLK_STATE;
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 08/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (6 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 07/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-29 14:35 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 09/19] drm/i915/gen8: Pass sg_iter through pte inserts Michel Thierry
` (11 subsequent siblings)
19 siblings, 1 reply; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
The insert_entries function was the function used to write PTEs. For the
PPGTT it was "hardcoded" to only understand two level page tables, which
was the case for GEN7. We can reuse this for 4 level page tables, and
remove the concept of insert_entries, which was never viable past 2
level page tables anyway, but it requires a bit of rework to make the
function a bit more generic.
This patch begins the generalization work, and it will be heavily used
upon when the 48b code is complete. The patch series attempts to make
each function which touches a part of code specific to the page table
level and here is no exception.
v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v3: Rebase after final merged version of Mika's ppgtt/scratch patches.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 52 +++++++++++++++++++++++++++----------
1 file changed, 39 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8bcd328..ba41b01 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -678,24 +678,21 @@ static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4));
}
-static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
- uint64_t start,
- uint64_t length,
- bool use_scratch)
+static void gen8_ppgtt_clear_pte_range(struct i915_address_space *vm,
+ struct i915_page_directory_pointer *pdp,
+ uint64_t start,
+ uint64_t length,
+ gen8_pte_t scratch_pte)
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
- struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
- gen8_pte_t *pt_vaddr, scratch_pte;
+ gen8_pte_t *pt_vaddr;
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
unsigned num_entries = length >> PAGE_SHIFT;
unsigned last_pte, i;
- scratch_pte = gen8_pte_encode(px_dma(ppgtt->base.scratch_page),
- I915_CACHE_LLC, use_scratch);
-
while (num_entries) {
struct i915_page_directory *pd;
struct i915_page_table *pt;
@@ -734,14 +731,30 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
}
}
-static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
- struct sg_table *pages,
- uint64_t start,
- enum i915_cache_level cache_level, u32 unused)
+static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
+ uint64_t start,
+ uint64_t length,
+ bool use_scratch)
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+
+ gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
+ I915_CACHE_LLC, use_scratch);
+
+ gen8_ppgtt_clear_pte_range(vm, pdp, start, length, scratch_pte);
+}
+
+static void
+gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
+ struct i915_page_directory_pointer *pdp,
+ struct sg_table *pages,
+ uint64_t start,
+ enum i915_cache_level cache_level)
+{
+ struct i915_hw_ppgtt *ppgtt =
+ container_of(vm, struct i915_hw_ppgtt, base);
gen8_pte_t *pt_vaddr;
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -775,6 +788,19 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
kunmap_px(ppgtt, pt_vaddr);
}
+static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
+ struct sg_table *pages,
+ uint64_t start,
+ enum i915_cache_level cache_level,
+ u32 unused)
+{
+ struct i915_hw_ppgtt *ppgtt =
+ container_of(vm, struct i915_hw_ppgtt, base);
+ struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+
+ gen8_ppgtt_insert_pte_entries(vm, pdp, pages, start, cache_level);
+}
+
static void gen8_free_page_tables(struct drm_device *dev,
struct i915_page_directory *pd)
{
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 09/19] drm/i915/gen8: Pass sg_iter through pte inserts
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (7 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 08/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 10/19] drm/i915/gen8: Add 4 level support in insert_entries and clear_range Michel Thierry
` (10 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
As a step towards implementing 4 levels, while not discarding the
existing pte insert functions, we need to pass the sg_iter through.
The current function understands to the page directory granularity.
An object's pages may span the page directory, and so using the iter
directly as we write the PTEs allows the iterator to stay coherent
through a VMA insert operation spanning multiple page table levels.
v2: Rebase after s/page_tables/page_table/.
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series;
updated commit message (s/map/insert).
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ba41b01..ee19473 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -749,7 +749,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
static void
gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
struct i915_page_directory_pointer *pdp,
- struct sg_table *pages,
+ struct sg_page_iter *sg_iter,
uint64_t start,
enum i915_cache_level cache_level)
{
@@ -759,11 +759,10 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
- struct sg_page_iter sg_iter;
pt_vaddr = NULL;
- for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
+ while (__sg_page_iter_next(sg_iter)) {
if (pt_vaddr == NULL) {
struct i915_page_directory *pd = pdp->page_directory[pdpe];
struct i915_page_table *pt = pd->page_table[pde];
@@ -771,7 +770,7 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
}
pt_vaddr[pte] =
- gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
+ gen8_pte_encode(sg_page_iter_dma_address(sg_iter),
cache_level, true);
if (++pte == GEN8_PTES) {
kunmap_px(ppgtt, pt_vaddr);
@@ -797,8 +796,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+ struct sg_page_iter sg_iter;
- gen8_ppgtt_insert_pte_entries(vm, pdp, pages, start, cache_level);
+ __sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
+ gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, start, cache_level);
}
static void gen8_free_page_tables(struct drm_device *dev,
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 10/19] drm/i915/gen8: Add 4 level support in insert_entries and clear_range
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (8 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 09/19] drm/i915/gen8: Pass sg_iter through pte inserts Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 11/19] drm/i915/gen8: Initialize PDPs Michel Thierry
` (9 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map
Level 4 (PML4), before it selects which Page Directory Pointer (PDP)
it will write to.
Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.
This patch was inspired by Ben's "Depend exclusively on map and
unmap_vma".
v2: Rebase after s/page_tables/page_table/.
v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use
clamp_pdp in gen8_ppgtt_insert_entries (Akash).
v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to
maintain symmetry with gen8_ppgtt_insert_entries (Akash).
v5: Do not mix pages and bytes in insert_entries (Akash).
v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at
once.
v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Use gen8_px_index functions, and remove unnecessary number of pages
parameter in insert_pte_entries.
v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of
adding and extra clamp function; remove unnecessary pdp_start/pdp_len
variables (Akash).
v9: Use pages->orig_nents instead of sg_nents(pages->sgl) to get the
length of the buffer being mapped to the PPGTT (Akash).
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 49 +++++++++++++++++++++++++++----------
1 file changed, 36 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ee19473..c3ab9a5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -687,9 +687,9 @@ static void gen8_ppgtt_clear_pte_range(struct i915_address_space *vm,
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
gen8_pte_t *pt_vaddr;
- unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
- unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
- unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
+ unsigned pdpe = gen8_pdpe_index(start);
+ unsigned pde = gen8_pde_index(start);
+ unsigned pte = gen8_pte_index(start);
unsigned num_entries = length >> PAGE_SHIFT;
unsigned last_pte, i;
@@ -725,7 +725,8 @@ static void gen8_ppgtt_clear_pte_range(struct i915_address_space *vm,
pte = 0;
if (++pde == I915_PDES) {
- pdpe++;
+ if (++pdpe == I915_PDPES_PER_PDP(vm->dev))
+ break;
pde = 0;
}
}
@@ -738,12 +739,21 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
- struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
-
gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
I915_CACHE_LLC, use_scratch);
- gen8_ppgtt_clear_pte_range(vm, pdp, start, length, scratch_pte);
+ if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+ gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length,
+ scratch_pte);
+ } else {
+ uint64_t templ4, pml4e;
+ struct i915_page_directory_pointer *pdp;
+
+ gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+ gen8_ppgtt_clear_pte_range(vm, pdp, start, length,
+ scratch_pte);
+ }
+ }
}
static void
@@ -756,9 +766,9 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
gen8_pte_t *pt_vaddr;
- unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
- unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
- unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
+ unsigned pdpe = gen8_pdpe_index(start);
+ unsigned pde = gen8_pde_index(start);
+ unsigned pte = gen8_pte_index(start);
pt_vaddr = NULL;
@@ -776,7 +786,8 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
kunmap_px(ppgtt, pt_vaddr);
pt_vaddr = NULL;
if (++pde == I915_PDES) {
- pdpe++;
+ if (++pdpe == I915_PDPES_PER_PDP(vm->dev))
+ break;
pde = 0;
}
pte = 0;
@@ -795,11 +806,23 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
{
struct i915_hw_ppgtt *ppgtt =
container_of(vm, struct i915_hw_ppgtt, base);
- struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
struct sg_page_iter sg_iter;
__sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
- gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, start, cache_level);
+
+ if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+ gen8_ppgtt_insert_pte_entries(vm, &ppgtt->pdp, &sg_iter, start,
+ cache_level);
+ } else {
+ struct i915_page_directory_pointer *pdp;
+ uint64_t templ4, pml4e;
+ uint64_t length = (uint64_t)pages->orig_nents << PAGE_SHIFT;
+
+ gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+ gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter,
+ start, cache_level);
+ }
+ }
}
static void gen8_free_page_tables(struct drm_device *dev,
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 11/19] drm/i915/gen8: Initialize PDPs
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (9 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 10/19] drm/i915/gen8: Add 4 level support in insert_entries and clear_range Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-29 14:35 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 12/19] drm/i915: Expand error state's address width to 64b Michel Thierry
` (8 subsequent siblings)
19 siblings, 1 reply; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
Similar to PDs, while setting up a page directory pointer, make all entries
of the pdp point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or proactive prefetch.
v2: Handle scratch_pdp allocation failure correctly, and keep
initialize_px functions together (Akash)
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on
the added macros to initialize the pdps.
v4: Rebase after final merged version of Mika's ppgtt/scratch patches
(and removed commit message part related to v3).
Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 38 +++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_gtt.h | 1 +
2 files changed, 39 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c3ab9a5..ca121eb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -602,6 +602,27 @@ static void free_pdp(struct drm_device *dev,
}
}
+static void gen8_initialize_pdp(struct i915_address_space *vm,
+ struct i915_page_directory_pointer *pdp)
+{
+ gen8_ppgtt_pdpe_t scratch_pdpe;
+
+ scratch_pdpe = gen8_pdpe_encode(px_dma(vm->scratch_pd), I915_CACHE_LLC);
+
+ fill_px(vm->dev, pdp, scratch_pdpe);
+}
+
+static void gen8_initialize_pml4(struct i915_address_space *vm,
+ struct i915_pml4 *pml4)
+{
+ gen8_ppgtt_pml4e_t scratch_pml4e;
+
+ scratch_pml4e = gen8_pml4e_encode(px_dma(vm->scratch_pdp),
+ I915_CACHE_LLC);
+
+ fill_px(vm->dev, pml4, scratch_pml4e);
+}
+
static void
gen8_setup_page_directory(struct i915_hw_ppgtt *ppgtt,
struct i915_page_directory_pointer *pdp,
@@ -863,8 +884,20 @@ static int gen8_init_scratch(struct i915_address_space *vm)
return PTR_ERR(vm->scratch_pd);
}
+ if (USES_FULL_48BIT_PPGTT(dev)) {
+ vm->scratch_pdp = alloc_pdp(dev);
+ if (IS_ERR(vm->scratch_pdp)) {
+ free_pd(dev, vm->scratch_pd);
+ free_pt(dev, vm->scratch_pt);
+ free_scratch_page(dev, vm->scratch_page);
+ return PTR_ERR(vm->scratch_pdp);
+ }
+ }
+
gen8_initialize_pt(vm, vm->scratch_pt);
gen8_initialize_pd(vm, vm->scratch_pd);
+ if (USES_FULL_48BIT_PPGTT(dev))
+ gen8_initialize_pdp(vm, vm->scratch_pdp);
return 0;
}
@@ -873,6 +906,8 @@ static void gen8_free_scratch(struct i915_address_space *vm)
{
struct drm_device *dev = vm->dev;
+ if (USES_FULL_48BIT_PPGTT(dev))
+ free_pdp(dev, vm->scratch_pdp);
free_pd(dev, vm->scratch_pd);
free_pt(dev, vm->scratch_pt);
free_scratch_page(dev, vm->scratch_page);
@@ -1236,6 +1271,7 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
if (IS_ERR(pdp))
goto err_out;
+ gen8_initialize_pdp(vm, pdp);
pml4->pdps[pml4e] = pdp;
__set_bit(pml4e, new_pdps);
trace_i915_page_directory_pointer_entry_alloc(vm,
@@ -1314,6 +1350,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
if (ret)
goto free_scratch;
+ gen8_initialize_pml4(&ppgtt->base, &ppgtt->pml4);
+
ppgtt->base.total = 1ULL << 48;
ppgtt->switch_mm = gen8_48b_mm_switch;
} else {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 11d44b3..70c50e7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -278,6 +278,7 @@ struct i915_address_space {
struct i915_page_scratch *scratch_page;
struct i915_page_table *scratch_pt;
struct i915_page_directory *scratch_pd;
+ struct i915_page_directory_pointer *scratch_pdp; /* GEN8+ & 48b PPGTT */
/**
* List of objects currently involved in rendering.
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 12/19] drm/i915: Expand error state's address width to 64b
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (10 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 11/19] drm/i915/gen8: Initialize PDPs Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 13/19] drm/i915/gen8: Add ppgtt info and debug_dump Michel Thierry
` (7 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
v2: For semaphore errors, object is mapped to GGTT and offset will not
be > 4GB, print only lower 32-bits (Akash)
v3: Print gtt_offset in groups of 32-bit (Chris)
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 4 ++--
drivers/gpu/drm/i915/i915_gpu_error.c | 24 ++++++++++++++----------
2 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 550b243..649408a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -546,7 +546,7 @@ struct drm_i915_error_state {
struct drm_i915_error_object {
int page_count;
- u32 gtt_offset;
+ u64 gtt_offset;
u32 *pages[0];
} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
@@ -572,7 +572,7 @@ struct drm_i915_error_state {
u32 size;
u32 name;
u32 rseqno[I915_NUM_RINGS], wseqno;
- u32 gtt_offset;
+ u64 gtt_offset;
u32 read_domains;
u32 write_domain;
s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6f42569..f79c952 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -197,8 +197,9 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
err_printf(m, " %s [%d]:\n", name, count);
while (count--) {
- err_printf(m, " %08x %8u %02x %02x [ ",
- err->gtt_offset,
+ err_printf(m, " %08x_%08x %8u %02x %02x [ ",
+ upper_32_bits(err->gtt_offset),
+ lower_32_bits(err->gtt_offset),
err->size,
err->read_domains,
err->write_domain);
@@ -426,15 +427,17 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
err_printf(m, " (submitted by %s [%d])",
error->ring[i].comm,
error->ring[i].pid);
- err_printf(m, " --- gtt_offset = 0x%08x\n",
- obj->gtt_offset);
+ err_printf(m, " --- gtt_offset = 0x%08x %08x\n",
+ upper_32_bits(obj->gtt_offset),
+ lower_32_bits(obj->gtt_offset));
print_error_obj(m, obj);
}
obj = error->ring[i].wa_batchbuffer;
if (obj) {
err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
- dev_priv->ring[i].name, obj->gtt_offset);
+ dev_priv->ring[i].name,
+ lower_32_bits(obj->gtt_offset));
print_error_obj(m, obj);
}
@@ -453,14 +456,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
if ((obj = error->ring[i].ringbuffer)) {
err_printf(m, "%s --- ringbuffer = 0x%08x\n",
dev_priv->ring[i].name,
- obj->gtt_offset);
+ lower_32_bits(obj->gtt_offset));
print_error_obj(m, obj);
}
if ((obj = error->ring[i].hws_page)) {
err_printf(m, "%s --- HW Status = 0x%08x\n",
dev_priv->ring[i].name,
- obj->gtt_offset);
+ lower_32_bits(obj->gtt_offset));
offset = 0;
for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
err_printf(m, "[%04x] %08x %08x %08x %08x\n",
@@ -476,13 +479,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
if ((obj = error->ring[i].ctx)) {
err_printf(m, "%s --- HW Context = 0x%08x\n",
dev_priv->ring[i].name,
- obj->gtt_offset);
+ lower_32_bits(obj->gtt_offset));
print_error_obj(m, obj);
}
}
if ((obj = error->semaphore_obj)) {
- err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
+ err_printf(m, "Semaphore page = 0x%08x\n",
+ lower_32_bits(obj->gtt_offset));
for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
err_printf(m, "[%04x] %08x %08x %08x %08x\n",
elt * 4,
@@ -590,7 +594,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
int num_pages;
bool use_ggtt;
int i = 0;
- u32 reloc_offset;
+ u64 reloc_offset;
if (src == NULL || src->pages == NULL)
return NULL;
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 13/19] drm/i915/gen8: Add ppgtt info and debug_dump
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (11 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 12/19] drm/i915: Expand error state's address width to 64b Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 14/19] drm/i915: object size needs to be u64 Michel Thierry
` (6 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
v2: Clean up patch after rebases.
v3: gen8_dump_ppgtt for 32b and 48b PPGTT.
v4: Use used_pml4es/pdpes (Akash).
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rely on used_px bits instead of null checking (Akash)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
drivers/gpu/drm/i915/i915_debugfs.c | 18 ++++----
drivers/gpu/drm/i915/i915_gem_gtt.c | 83 +++++++++++++++++++++++++++++++++++++
2 files changed, 93 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b2766a8..f05c386 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2248,7 +2248,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_engine_cs *ring;
- struct drm_file *file;
int i;
if (INTEL_INFO(dev)->gen == 6)
@@ -2271,13 +2270,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
ppgtt->debug_dump(ppgtt, m);
}
- list_for_each_entry_reverse(file, &dev->filelist, lhead) {
- struct drm_i915_file_private *file_priv = file->driver_priv;
-
- seq_printf(m, "proc: %s\n",
- get_pid_task(file->pid, PIDTYPE_PID)->comm);
- idr_for_each(&file_priv->context_idr, per_file_ctx, m);
- }
seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
}
@@ -2286,6 +2278,7 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
struct drm_info_node *node = m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_file *file;
int ret = mutex_lock_interruptible(&dev->struct_mutex);
if (ret)
@@ -2297,6 +2290,15 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
else if (INTEL_INFO(dev)->gen >= 6)
gen6_ppgtt_info(m, dev);
+ list_for_each_entry_reverse(file, &dev->filelist, lhead) {
+ struct drm_i915_file_private *file_priv = file->driver_priv;
+
+ seq_printf(m, "\nproc: %s\n",
+ get_pid_task(file->pid, PIDTYPE_PID)->comm);
+ idr_for_each(&file_priv->context_idr, per_file_ctx,
+ (void *)(unsigned long)m);
+ }
+
intel_runtime_pm_put(dev_priv);
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ca121eb..1435701 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1322,6 +1322,89 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
}
+static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
+ uint64_t start, uint64_t length,
+ gen8_pte_t scratch_pte,
+ struct seq_file *m)
+{
+ struct i915_page_directory *pd;
+ uint64_t temp;
+ uint32_t pdpe;
+
+ gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+ struct i915_page_table *pt;
+ uint64_t pd_len = length;
+ uint64_t pd_start = start;
+ uint32_t pde;
+
+ if (!test_bit(pdpe, pdp->used_pdpes))
+ continue;
+
+ seq_printf(m, "\tPDPE #%d\n", pdpe);
+ gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
+ uint32_t pte;
+ gen8_pte_t *pt_vaddr;
+
+ if (!test_bit(pde, pd->used_pdes))
+ continue;
+
+ pt_vaddr = kmap_px(pt);
+ for (pte = 0; pte < GEN8_PTES; pte += 4) {
+ uint64_t va =
+ (pdpe << GEN8_PDPE_SHIFT) |
+ (pde << GEN8_PDE_SHIFT) |
+ (pte << GEN8_PTE_SHIFT);
+ int i;
+ bool found = false;
+
+ for (i = 0; i < 4; i++)
+ if (pt_vaddr[pte + i] != scratch_pte)
+ found = true;
+ if (!found)
+ continue;
+
+ seq_printf(m, "\t\t0x%llx [%03d,%03d,%04d]: =", va, pdpe, pde, pte);
+ for (i = 0; i < 4; i++) {
+ if (pt_vaddr[pte + i] != scratch_pte)
+ seq_printf(m, " %llx", pt_vaddr[pte + i]);
+ else
+ seq_puts(m, " SCRATCH ");
+ }
+ seq_puts(m, "\n");
+ }
+ /* don't use kunmap_px, it could trigger
+ * an unnecessary flush.
+ */
+ kunmap_atomic(pt_vaddr);
+ }
+ }
+}
+
+static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
+{
+ struct i915_address_space *vm = &ppgtt->base;
+ uint64_t start = ppgtt->base.start;
+ uint64_t length = ppgtt->base.total;
+ gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
+ I915_CACHE_LLC, true);
+
+ if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+ gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m);
+ } else {
+ uint64_t templ4, pml4e;
+ struct i915_pml4 *pml4 = &ppgtt->pml4;
+ struct i915_page_directory_pointer *pdp;
+
+ gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) {
+ if (!test_bit(pml4e, pml4->used_pml4es))
+ continue;
+
+ seq_printf(m, " PML4E #%llu\n", pml4e);
+ gen8_dump_pdp(pdp, start, length, scratch_pte, m);
+ }
+ }
+}
+
/*
* GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
* with a net effect resembling a 2-level page table in normal x86 terms. Each
@@ -1344,6 +1426,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
ppgtt->base.clear_range = gen8_ppgtt_clear_range;
ppgtt->base.unbind_vma = ppgtt_unbind_vma;
ppgtt->base.bind_vma = ppgtt_bind_vma;
+ ppgtt->debug_dump = gen8_dump_ppgtt;
if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 14/19] drm/i915: object size needs to be u64
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (12 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 13/19] drm/i915/gen8: Add ppgtt info and debug_dump Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 15/19] drm/i915: batch_obj vm offset must " Michel Thierry
` (5 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
In a 48b world, users can try to allocate buffers bigger than 4GB; in
these cases it is important that size is a 64b variable.
v2: Drop the warning about bind with size 0, it shouldn't happen anyway.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4025a3b..76b7612 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3726,7 +3726,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
{
struct drm_device *dev = obj->base.dev;
struct drm_i915_private *dev_priv = dev->dev_private;
- u32 size, fence_size, fence_alignment, unfenced_alignment;
+ u32 fence_alignment, unfenced_alignment;
+ u64 size, fence_size;
u64 start =
flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
u64 end =
@@ -3785,7 +3786,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
* attempt to find space.
*/
if (size > end) {
- DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
+ DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
ggtt_view ? ggtt_view->type : 0,
size,
flags & PIN_MAPPABLE ? "mappable" : "total",
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 15/19] drm/i915: batch_obj vm offset must be u64
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (13 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 14/19] drm/i915: object size needs to be u64 Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 16/19] drm/i915/userptr: Kill user_size limit check Michel Thierry
` (4 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
Otherwise it can overflow in 48-bit mode, and cause an incorrect
exec_start.
Before commit 5f19e2bffa63a91cd4ac1adcec648e14a44277ce ("drm/i915: Merged
the many do_execbuf() parameters into a structure"), it was already an u64.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 649408a..1dbbbf0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1674,7 +1674,7 @@ struct i915_execbuffer_params {
struct drm_file *file;
uint32_t dispatch_flags;
uint32_t args_batch_start_offset;
- uint32_t batch_obj_vm_offset;
+ uint64_t batch_obj_vm_offset;
struct intel_engine_cs *ring;
struct drm_i915_gem_object *batch_obj;
struct intel_context *ctx;
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 16/19] drm/i915/userptr: Kill user_size limit check
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (14 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 15/19] drm/i915: batch_obj vm offset must " Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
` (3 subsequent siblings)
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
GTT was only 32b and its max value is 4GB. In order to allow objects
bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check
against max 48b range (1ULL << 48).
But since the check no longer applies, just kill the limit.
v2: Use the default ctx to infer the ppgtt max size (Akash).
v3: Just kill the limit, it was only there for early detection of an
error when used for execbuffer (Chris).
Cc: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_userptr.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 8fd431b..d11901d 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -813,7 +813,6 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
int
i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
{
- struct drm_i915_private *dev_priv = dev->dev_private;
struct drm_i915_gem_userptr *args = data;
struct drm_i915_gem_object *obj;
int ret;
@@ -826,9 +825,6 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
if (offset_in_page(args->user_ptr | args->user_size))
return -EINVAL;
- if (args->user_size > dev_priv->gtt.base.total)
- return -E2BIG;
-
if (!access_ok(args->flags & I915_USERPTR_READ_ONLY ? VERIFY_READ : VERIFY_WRITE,
(char __user *)(unsigned long)args->user_ptr, args->user_size))
return -EFAULT;
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (15 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 16/19] drm/i915/userptr: Kill user_size limit check Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-27 14:34 ` Goel, Akash
2015-07-27 21:11 ` Chris Wilson
2015-07-16 9:33 ` [PATCH v5 18/19] drm/i915/gen8: Flip the 48b switch Michel Thierry
` (2 subsequent siblings)
19 siblings, 2 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
There are some allocations that must be only referenced by 32-bit
offsets. To limit the chances of having the first 4GB already full,
objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
DRM_MM_CREATE_TOP flags
In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Instruction State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State
Offset are limited to 32-bits.
Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
they can be allocated above the 32-bit address range. To limit the
chances of having the first 4GB already full, objects will use
DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
v2: Changed flag logic from neeeds_32b, to supports_48b.
v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
to use last PIN_ defined instead of hard-coded value; use correct limit
check in eb_vma_misplaced. (Chris)
v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
v6: Apply pin-high for ggtt too (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 2 ++
drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
include/uapi/drm/i915_drm.h | 3 ++-
4 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1dbbbf0..f79cc7b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2771,6 +2771,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
#define PIN_OFFSET_BIAS (1<<3)
#define PIN_USER (1<<4)
#define PIN_UPDATE (1<<5)
+#define PIN_ZONE_4G (1<<6)
+#define PIN_HIGH (1<<7)
#define PIN_OFFSET_MASK (~4095)
int __must_check
i915_gem_object_pin(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 76b7612..cd7e4b6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3728,6 +3728,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
struct drm_i915_private *dev_priv = dev->dev_private;
u32 fence_alignment, unfenced_alignment;
u64 size, fence_size;
+ u32 search_flag = DRM_MM_SEARCH_DEFAULT;
+ u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
u64 start =
flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
u64 end =
@@ -3771,6 +3773,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
}
+ if (flags & PIN_HIGH) {
+ search_flag = DRM_MM_SEARCH_BELOW;
+ alloc_flag = DRM_MM_CREATE_TOP;
+ }
+
+ if (flags & PIN_ZONE_4G)
+ end = (1ULL << 32);
+
if (alignment == 0)
alignment = flags & PIN_MAPPABLE ? fence_alignment :
unfenced_alignment;
@@ -3811,8 +3821,8 @@ search_free:
size, alignment,
obj->cache_level,
start, end,
- DRM_MM_SEARCH_DEFAULT,
- DRM_MM_CREATE_DEFAULT);
+ search_flag,
+ alloc_flag);
if (ret) {
ret = i915_gem_evict_something(dev, vm, size, alignment,
obj->cache_level,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 923a3c4..209e8e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -589,11 +589,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
flags |= PIN_GLOBAL;
+ /* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
+ * limit address to the first 4GBs for unflagged objects.
+ */
+ flags |= PIN_ZONE_4G;
+ if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
+ flags &= ~PIN_ZONE_4G;
+
if (!drm_mm_node_allocated(&vma->node)) {
if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
flags |= PIN_GLOBAL | PIN_MAPPABLE;
if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
+ if ((flags & PIN_MAPPABLE) == 0)
+ flags |= PIN_HIGH;
}
ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
@@ -671,6 +680,10 @@ eb_vma_misplaced(struct i915_vma *vma)
if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
return !only_mappable_for_reloc(entry->flags);
+ if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
+ (vma->node.start + vma->node.size) >= (1ULL << 32))
+ return true;
+
return false;
}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e7c29f1..e4471e8 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -686,7 +686,8 @@ struct drm_i915_gem_exec_object2 {
#define EXEC_OBJECT_NEEDS_FENCE (1<<0)
#define EXEC_OBJECT_NEEDS_GTT (1<<1)
#define EXEC_OBJECT_WRITE (1<<2)
-#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
+#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
+#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_SUPPORTS_48B_ADDRESS<<1)
__u64 flags;
__u64 rsvd1;
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 18/19] drm/i915/gen8: Flip the 48b switch
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (16 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 19/19] drm/i915: Save some page table setup on repeated binds Michel Thierry
2015-07-28 12:18 ` [PATCH v5 00/19] 48-bit PPGTT Chris Wilson
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
Use 48b addresses if hw supports it (i915.enable_ppgtt=3).
Note, aliasing PPGTT remains 32b only.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 5 ++---
drivers/gpu/drm/i915/i915_params.c | 2 +-
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 1435701..446bd98 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -108,8 +108,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
- has_full_64bit_ppgtt = (IS_BROADWELL(dev) ||
- INTEL_INFO(dev)->gen >= 9) && false; /* FIXME: 64b */
+ has_full_64bit_ppgtt = IS_BROADWELL(dev) || INTEL_INFO(dev)->gen >= 9;
if (intel_vgpu_active(dev))
has_full_ppgtt = false; /* emulation is too hard */
@@ -147,7 +146,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
}
if (INTEL_INFO(dev)->gen >= 8 && i915.enable_execlists)
- return 2;
+ return has_full_64bit_ppgtt ? 3 : 2;
else
return has_aliasing_ppgtt ? 1 : 0;
}
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 7983fe4..ccf3eb2 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -110,7 +110,7 @@ MODULE_PARM_DESC(enable_hangcheck,
module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
MODULE_PARM_DESC(enable_ppgtt,
"Override PPGTT usage. "
- "(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
+ "(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full_64b)");
module_param_named(enable_execlists, i915.enable_execlists, int, 0400);
MODULE_PARM_DESC(enable_execlists,
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 19/19] drm/i915: Save some page table setup on repeated binds
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (17 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 18/19] drm/i915/gen8: Flip the 48b switch Michel Thierry
@ 2015-07-16 9:33 ` Michel Thierry
2015-07-28 12:18 ` [PATCH v5 00/19] 48-bit PPGTT Chris Wilson
19 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-16 9:33 UTC (permalink / raw)
To: intel-gfx; +Cc: akash.goel
Check if the required page tables are already allocated, if so, we can
skip altogether the inner loop of pdes, and move to the next page
directory.
If the new allocation is different than the existing one (i.e. new
allocation spans more ptes than already covered from earlier allocations),
the used_ptes bitmap may not get updated correctly, but none of the
code-checks rely on this.
Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 446bd98..086c71d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1194,6 +1194,16 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
/* Every pd should be allocated, we just did that above. */
WARN_ON(!pd);
+ /* Check if the required page tables are already allocated /
+ * mapped; if so, we can skip altogether the inner loop of pdes,
+ * and move to the next page directory.
+ */
+ if (bitmap_subset(new_page_tables[pdpe], pd->used_pdes,
+ I915_PDES)) {
+ kunmap_px(ppgtt, page_directory);
+ continue;
+ }
+
gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
/* Same reasoning as pd */
WARN_ON(!pt);
--
2.4.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-16 9:33 ` [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
@ 2015-07-27 14:34 ` Goel, Akash
2015-07-27 14:46 ` Chris Wilson
2015-07-27 21:11 ` Chris Wilson
1 sibling, 1 reply; 33+ messages in thread
From: Goel, Akash @ 2015-07-27 14:34 UTC (permalink / raw)
To: Michel Thierry, intel-gfx@lists.freedesktop.org
On 7/16/2015 3:03 PM, Michel Thierry wrote:
> There are some allocations that must be only referenced by 32-bit
> offsets. To limit the chances of having the first 4GB already full,
> objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
> DRM_MM_CREATE_TOP flags
>
> In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
> General State Heap (GSH) or Instruction State Heap (ISH) must be in a
> 32-bit range, because the General State Offset and Instruction State
> Offset are limited to 32-bits.
>
> Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
> they can be allocated above the 32-bit address range. To limit the
> chances of having the first 4GB already full, objects will use
> DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
>
> v2: Changed flag logic from neeeds_32b, to supports_48b.
> v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
> v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
> to use last PIN_ defined instead of hard-coded value; use correct limit
> check in eb_vma_misplaced. (Chris)
> v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
> v6: Apply pin-high for ggtt too (Chris)
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 2 ++
> drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
> include/uapi/drm/i915_drm.h | 3 ++-
> 4 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1dbbbf0..f79cc7b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2771,6 +2771,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
> #define PIN_OFFSET_BIAS (1<<3)
> #define PIN_USER (1<<4)
> #define PIN_UPDATE (1<<5)
> +#define PIN_ZONE_4G (1<<6)
> +#define PIN_HIGH (1<<7)
> #define PIN_OFFSET_MASK (~4095)
> int __must_check
> i915_gem_object_pin(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 76b7612..cd7e4b6 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3728,6 +3728,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> struct drm_i915_private *dev_priv = dev->dev_private;
> u32 fence_alignment, unfenced_alignment;
> u64 size, fence_size;
> + u32 search_flag = DRM_MM_SEARCH_DEFAULT;
> + u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
> u64 start =
> flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> u64 end =
> @@ -3771,6 +3773,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
> }
>
> + if (flags & PIN_HIGH) {
> + search_flag = DRM_MM_SEARCH_BELOW;
> + alloc_flag = DRM_MM_CREATE_TOP;
> + }
> +
> + if (flags & PIN_ZONE_4G)
> + end = (1ULL << 32);
Would this be fine for platforms, where only 2 GB of GGTT space is
available ? For GEN7 & older platforms, only GGTT would be used.
Shouldn't this check for PIN_ZONE_4G flag, be done only for PPGTT vm ?
For GGTT we have to obey the PIN_MAPPABLE flag, if set then 'end' will
be 256 MB.
If both PIN_MAPPABLE & PIN_ZONE_4G flags are set, the 'end' should
still be 256 MB, for GGTT vm.
So we need to mindful in defining the 'end' for platforms where
only GGTT would be used.
Best Regards
Akash
> +
> if (alignment == 0)
> alignment = flags & PIN_MAPPABLE ? fence_alignment :
> unfenced_alignment;
> @@ -3811,8 +3821,8 @@ search_free:
> size, alignment,
> obj->cache_level,
> start, end,
> - DRM_MM_SEARCH_DEFAULT,
> - DRM_MM_CREATE_DEFAULT);
> + search_flag,
> + alloc_flag);
> if (ret) {
> ret = i915_gem_evict_something(dev, vm, size, alignment,
> obj->cache_level,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 923a3c4..209e8e2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -589,11 +589,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
> if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
> flags |= PIN_GLOBAL;
>
> + /* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
> + * limit address to the first 4GBs for unflagged objects.
> + */
> + flags |= PIN_ZONE_4G;
> + if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
> + flags &= ~PIN_ZONE_4G;
> +
> if (!drm_mm_node_allocated(&vma->node)) {
> if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
> flags |= PIN_GLOBAL | PIN_MAPPABLE;
> if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
> flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
> + if ((flags & PIN_MAPPABLE) == 0)
> + flags |= PIN_HIGH;
> }
>
> ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
> @@ -671,6 +680,10 @@ eb_vma_misplaced(struct i915_vma *vma)
> if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
> return !only_mappable_for_reloc(entry->flags);
>
> + if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
> + (vma->node.start + vma->node.size) >= (1ULL << 32))
> + return true;
> +
> return false;
> }
>
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index e7c29f1..e4471e8 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -686,7 +686,8 @@ struct drm_i915_gem_exec_object2 {
> #define EXEC_OBJECT_NEEDS_FENCE (1<<0)
> #define EXEC_OBJECT_NEEDS_GTT (1<<1)
> #define EXEC_OBJECT_WRITE (1<<2)
> -#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
> +#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
> +#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_SUPPORTS_48B_ADDRESS<<1)
> __u64 flags;
>
> __u64 rsvd1;
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-27 14:34 ` Goel, Akash
@ 2015-07-27 14:46 ` Chris Wilson
2015-07-27 14:53 ` Michel Thierry
0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2015-07-27 14:46 UTC (permalink / raw)
To: Goel, Akash; +Cc: intel-gfx@lists.freedesktop.org
On Mon, Jul 27, 2015 at 08:04:50PM +0530, Goel, Akash wrote:
>
>
> On 7/16/2015 3:03 PM, Michel Thierry wrote:
> >There are some allocations that must be only referenced by 32-bit
> >offsets. To limit the chances of having the first 4GB already full,
> >objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
> >DRM_MM_CREATE_TOP flags
> >
> >In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
> >General State Heap (GSH) or Instruction State Heap (ISH) must be in a
> >32-bit range, because the General State Offset and Instruction State
> >Offset are limited to 32-bits.
> >
> >Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
> >they can be allocated above the 32-bit address range. To limit the
> >chances of having the first 4GB already full, objects will use
> >DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
> >
> >v2: Changed flag logic from neeeds_32b, to supports_48b.
> >v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
> >v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
> >to use last PIN_ defined instead of hard-coded value; use correct limit
> >check in eb_vma_misplaced. (Chris)
> >v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
> >v6: Apply pin-high for ggtt too (Chris)
> >
> >Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
> >Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >---
> > drivers/gpu/drm/i915/i915_drv.h | 2 ++
> > drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
> > include/uapi/drm/i915_drm.h | 3 ++-
> > 4 files changed, 29 insertions(+), 3 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index 1dbbbf0..f79cc7b 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -2771,6 +2771,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
> > #define PIN_OFFSET_BIAS (1<<3)
> > #define PIN_USER (1<<4)
> > #define PIN_UPDATE (1<<5)
> >+#define PIN_ZONE_4G (1<<6)
> >+#define PIN_HIGH (1<<7)
> > #define PIN_OFFSET_MASK (~4095)
> > int __must_check
> > i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 76b7612..cd7e4b6 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -3728,6 +3728,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > u32 fence_alignment, unfenced_alignment;
> > u64 size, fence_size;
> >+ u32 search_flag = DRM_MM_SEARCH_DEFAULT;
> >+ u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
> > u64 start =
> > flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> > u64 end =
> >@@ -3771,6 +3773,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> > size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
> > }
> >
> >+ if (flags & PIN_HIGH) {
> >+ search_flag = DRM_MM_SEARCH_BELOW;
> >+ alloc_flag = DRM_MM_CREATE_TOP;
> >+ }
> >+
> >+ if (flags & PIN_ZONE_4G)
> >+ end = (1ULL << 32);
>
> Would this be fine for platforms, where only 2 GB of GGTT space is
> available ? For GEN7 & older platforms, only GGTT would be used.
> Shouldn't this check for PIN_ZONE_4G flag, be done only for PPGTT vm ?
> For GGTT we have to obey the PIN_MAPPABLE flag, if set then 'end'
> will be 256 MB.
> If both PIN_MAPPABLE & PIN_ZONE_4G flags are set, the 'end' should
> still be 256 MB, for GGTT vm.
> So we need to mindful in defining the 'end' for platforms where
> only GGTT would be used.
Ah, indeed. I didn't notice since this is end = min(end, 1<<32) in my
tree.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-27 14:46 ` Chris Wilson
@ 2015-07-27 14:53 ` Michel Thierry
0 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-27 14:53 UTC (permalink / raw)
To: Chris Wilson, Goel, Akash, intel-gfx@lists.freedesktop.org
On 7/27/2015 3:46 PM, Chris Wilson wrote:
> On Mon, Jul 27, 2015 at 08:04:50PM +0530, Goel, Akash wrote:
>>
>>
>> On 7/16/2015 3:03 PM, Michel Thierry wrote:
>>> There are some allocations that must be only referenced by 32-bit
>>> offsets. To limit the chances of having the first 4GB already full,
>>> objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
>>> DRM_MM_CREATE_TOP flags
>>>
>>> In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
>>> General State Heap (GSH) or Instruction State Heap (ISH) must be in a
>>> 32-bit range, because the General State Offset and Instruction State
>>> Offset are limited to 32-bits.
>>>
>>> Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
>>> they can be allocated above the 32-bit address range. To limit the
>>> chances of having the first 4GB already full, objects will use
>>> DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
>>>
>>> v2: Changed flag logic from neeeds_32b, to supports_48b.
>>> v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
>>> v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
>>> to use last PIN_ defined instead of hard-coded value; use correct limit
>>> check in eb_vma_misplaced. (Chris)
>>> v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
>>> v6: Apply pin-high for ggtt too (Chris)
>>>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_drv.h | 2 ++
>>> drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
>>> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
>>> include/uapi/drm/i915_drm.h | 3 ++-
>>> 4 files changed, 29 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 1dbbbf0..f79cc7b 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -2771,6 +2771,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
>>> #define PIN_OFFSET_BIAS (1<<3)
>>> #define PIN_USER (1<<4)
>>> #define PIN_UPDATE (1<<5)
>>> +#define PIN_ZONE_4G (1<<6)
>>> +#define PIN_HIGH (1<<7)
>>> #define PIN_OFFSET_MASK (~4095)
>>> int __must_check
>>> i915_gem_object_pin(struct drm_i915_gem_object *obj,
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 76b7612..cd7e4b6 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -3728,6 +3728,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>> struct drm_i915_private *dev_priv = dev->dev_private;
>>> u32 fence_alignment, unfenced_alignment;
>>> u64 size, fence_size;
>>> + u32 search_flag = DRM_MM_SEARCH_DEFAULT;
>>> + u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
>>> u64 start =
>>> flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
>>> u64 end =
>>> @@ -3771,6 +3773,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>> size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
>>> }
>>>
>>> + if (flags & PIN_HIGH) {
>>> + search_flag = DRM_MM_SEARCH_BELOW;
>>> + alloc_flag = DRM_MM_CREATE_TOP;
>>> + }
>>> +
>>> + if (flags & PIN_ZONE_4G)
>>> + end = (1ULL << 32);
>>
>> Would this be fine for platforms, where only 2 GB of GGTT space is
>> available ? For GEN7 & older platforms, only GGTT would be used.
>> Shouldn't this check for PIN_ZONE_4G flag, be done only for PPGTT vm ?
>> For GGTT we have to obey the PIN_MAPPABLE flag, if set then 'end'
>> will be 256 MB.
>> If both PIN_MAPPABLE & PIN_ZONE_4G flags are set, the 'end' should
>> still be 256 MB, for GGTT vm.
>> So we need to mindful in defining the 'end' for platforms where
>> only GGTT would be used.
>
> Ah, indeed. I didn't notice since this is end = min(end, 1<<32) in my
> tree.
Thanks, I'll resend with this fix.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-16 9:33 ` [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
2015-07-27 14:34 ` Goel, Akash
@ 2015-07-27 21:11 ` Chris Wilson
2015-07-28 11:12 ` Michel Thierry
1 sibling, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2015-07-27 21:11 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx, akash.goel
On Thu, Jul 16, 2015 at 10:33:29AM +0100, Michel Thierry wrote:
> + if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
> + (vma->node.start + vma->node.size) >= (1ULL << 32))
> + return true;
gcc completely screwed this up here and used 0 for 1ULL<<32.
Note that we can allow state + size == 4G (since the end is exclusive),
so I went with
if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
(vma->node.start + vma->node.size - 1) >> 32)
return true;
instead.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-27 21:11 ` Chris Wilson
@ 2015-07-28 11:12 ` Michel Thierry
2015-07-28 14:43 ` Chris Wilson
0 siblings, 1 reply; 33+ messages in thread
From: Michel Thierry @ 2015-07-28 11:12 UTC (permalink / raw)
To: Chris Wilson, intel-gfx, akash.goel
On 7/27/2015 10:11 PM, Chris Wilson wrote:
> On Thu, Jul 16, 2015 at 10:33:29AM +0100, Michel Thierry wrote:
>> + if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
>> + (vma->node.start + vma->node.size) >= (1ULL << 32))
>> + return true;
>
> gcc completely screwed this up here and used 0 for 1ULL<<32.
>
> Note that we can allow state + size == 4G (since the end is exclusive),
> so I went with
>
> if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
> (vma->node.start + vma->node.size - 1) >> 32)
> return true;
>
> instead.
> -Chris
>
Thanks, I'll include this change in the next patch version.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 00/19] 48-bit PPGTT
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
` (18 preceding siblings ...)
2015-07-16 9:33 ` [PATCH v5 19/19] drm/i915: Save some page table setup on repeated binds Michel Thierry
@ 2015-07-28 12:18 ` Chris Wilson
19 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2015-07-28 12:18 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx, akash.goel
On Thu, Jul 16, 2015 at 10:33:12AM +0100, Michel Thierry wrote:
> This clean-up version delays the 48-bit work to later patches and includes
> other review comments from Akash and Chris Wilson. The first 4 patches
> prepare the dynamic page allocation code to handle independent pdps, but
> no specific code for 48-bit mode is added before the 5th patch.
>
> In order expand the GPU address space, a 4th level translation is added,
> the Page Map Level 4 (PML4). This PML4 has 512 PML4 Entries (PML4E),
> PML4[0-511], each pointing to a PDP. All the existing "dynamic alloc
> ppgtt" functions are used, only adding the 4th level changes. I also
> updated some remaining variables that were 32b only.
>
> There are 2 hardware workarounds needed to allow correct operation with
> 48b addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset).
> This new patchset version includes the comments and suggestions from Chris
> Wilson. A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given
> object can be allocated outside the first 4 PDPs; if not, the end range is
> forced to 4GB. Also, more objects now use the DRM_MM_CREATE_TOP flag. To
> maintain compatibility, in libdrm I added a new drm_intel_bo_emit_reloc_48bit
> function that will flag these objects, while the existing drm_intel_bo_emit_reloc
> clears it.
>
> Finally, this feature is only available in BDW and Gen9, requires LRC
> submission mode (execlists) and it can be detected by i915.enable_ppgtt=3.
>
> Also note that this expanded address space is only available for full
> PPGTT, aliasing PPGTT and Global GTT remain 32-bit.
A test I just thought of is to extend gem_evict_alignment to iterate
over
for (align = 1<<12; align < 1<<48; align <<= 1)
exec(obj.align=align)
i.e. basically force the kernel to place the object in every
power-of-two zone. The idea here is to exercise and allocate as much of
the 4-level page table handling code as is trivially possible (to work
on extents tracking you could leave each level in place. Now this is
starting to feel more like a gem_ppgtt test). Using softpin we would
move control over exercising every boundary in the code (but then
requires softpin).
Also noticed that constructing the bitmaps for va_alloc_range tracking
was very expensive, even in the trivial no-op case (rebinding to the
same location). A benchmark to measure that allocation overhead would be
very useful. For that I think a synthetic like using softpin to move an
object through the entire address space or even flip between two locations
would do the job.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-28 11:12 ` Michel Thierry
@ 2015-07-28 14:43 ` Chris Wilson
2015-07-29 11:05 ` Michel Thierry
0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2015-07-28 14:43 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx, akash.goel
On Tue, Jul 28, 2015 at 12:12:11PM +0100, Michel Thierry wrote:
> On 7/27/2015 10:11 PM, Chris Wilson wrote:
> >On Thu, Jul 16, 2015 at 10:33:29AM +0100, Michel Thierry wrote:
> >>+ if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
> >>+ (vma->node.start + vma->node.size) >= (1ULL << 32))
> >>+ return true;
> >
> >gcc completely screwed this up here and used 0 for 1ULL<<32.
> >
> >Note that we can allow state + size == 4G (since the end is exclusive),
> >so I went with
> >
> > if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
> > (vma->node.start + vma->node.size - 1) >> 32)
> > return true;
> >
> >instead.
> >-Chris
> >
>
> Thanks, I'll include this change in the next patch version.
I've also got a couple of other stylistic changes, plus an earlier
request:
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1e2764a324b2..b0fe9b4124fd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3356,13 +3356,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
struct drm_device *dev = obj->base.dev;
struct drm_i915_private *dev_priv = dev->dev_private;
u32 fence_alignment, unfenced_alignment;
+ u64 start, end;
u64 size, fence_size;
- u32 search_flag = DRM_MM_SEARCH_DEFAULT;
- u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
- u64 start =
- flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
- u64 end =
- flags & PIN_MAPPABLE ? dev_priv->gtt.mappable_end : vm->total;
+ u32 search_flag, alloc_flag;
struct i915_vma *vma;
int ret;
@@ -3400,15 +3396,15 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
obj->tiling_mode,
false);
size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
+ }
- if (flags & PIN_HIGH) {
- search_flag = DRM_MM_SEARCH_BELOW;
- alloc_flag = DRM_MM_CREATE_TOP;
- }
+ start = flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
- if (flags & PIN_ZONE_4G)
- end = (1ULL << 32);
- }
+ end = vm->total;
+ if (flags & PIN_MAPPABLE)
+ end = min_t(u64, end, dev_priv->gtt.mappable_end);
+ if (flags & PIN_ZONE_4G)
+ end = min_t(u64, end, 1ULL << 32);
if (alignment == 0)
alignment = flags & PIN_MAPPABLE ? fence_alignment :
@@ -3445,6 +3441,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
if (IS_ERR(vma))
goto err_unpin;
+ if (flags & PIN_HIGH) {
+ search_flag = DRM_MM_SEARCH_BELOW;
+ alloc_flag = DRM_MM_CREATE_TOP;
+ } else {
+ search_flag = DRM_MM_SEARCH_DEFAULT;
+ alloc_flag = DRM_MM_CREATE_DEFAULT;
+ }
+
search_free:
ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
size, alignment,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 209e8e2b07be..78fc8810d6e0 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -680,8 +680,8 @@ eb_vma_misplaced(struct i915_vma *vma)
if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
return !only_mappable_for_reloc(entry->flags);
- if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
- (vma->node.start + vma->node.size) >= (1ULL << 32))
+ if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
+ (vma->node.start + vma->node.size - 1) >> 32)
return true;
return false;
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-28 14:43 ` Chris Wilson
@ 2015-07-29 11:05 ` Michel Thierry
2015-07-29 11:17 ` Chris Wilson
0 siblings, 1 reply; 33+ messages in thread
From: Michel Thierry @ 2015-07-29 11:05 UTC (permalink / raw)
To: Chris Wilson, intel-gfx, akash.goel
On 7/28/2015 3:43 PM, Chris Wilson wrote:
> On Tue, Jul 28, 2015 at 12:12:11PM +0100, Michel Thierry wrote:
>> On 7/27/2015 10:11 PM, Chris Wilson wrote:
>>> On Thu, Jul 16, 2015 at 10:33:29AM +0100, Michel Thierry wrote:
>>>> + if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
>>>> + (vma->node.start + vma->node.size) >= (1ULL << 32))
>>>> + return true;
>>>
>>> gcc completely screwed this up here and used 0 for 1ULL<<32.
>>>
>>> Note that we can allow state + size == 4G (since the end is exclusive),
>>> so I went with
>>>
>>> if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
>>> (vma->node.start + vma->node.size - 1) >> 32)
>>> return true;
>>>
>>> instead.
>>> -Chris
>>>
>>
>> Thanks, I'll include this change in the next patch version.
>
> I've also got a couple of other stylistic changes, plus an earlier
> request:
> ...
I updated my patch with these changes, thanks.
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 209e8e2b07be..78fc8810d6e0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -680,8 +680,8 @@ eb_vma_misplaced(struct i915_vma *vma)
> if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
> return !only_mappable_for_reloc(entry->flags);
>
> - if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
> - (vma->node.start + vma->node.size) >= (1ULL << 32))
> + if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
> + (vma->node.start + vma->node.size - 1) >> 32)
> return true;
>
> return false;
>
Also updated this part to follow your suggestion.
I also spent a bit more time trying to figure what gcc was doing here.
But I can't reproduce it locally, I get sizeof(1ULL<<32) = 8 and
1ULL<<32 doesn't seem to be zero (in 32 and 64 bit kernels).
Could it be related to the gcc version? I'm using 4.8.4.
-Michel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
2015-07-29 11:05 ` Michel Thierry
@ 2015-07-29 11:17 ` Chris Wilson
0 siblings, 0 replies; 33+ messages in thread
From: Chris Wilson @ 2015-07-29 11:17 UTC (permalink / raw)
To: Michel Thierry; +Cc: intel-gfx, akash.goel
On Wed, Jul 29, 2015 at 12:05:55PM +0100, Michel Thierry wrote:
> >@@ -680,8 +680,8 @@ eb_vma_misplaced(struct i915_vma *vma)
> > if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
> > return !only_mappable_for_reloc(entry->flags);
> >
> >- if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
> >- (vma->node.start + vma->node.size) >= (1ULL << 32))
> >+ if ((entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) == 0 &&
> >+ (vma->node.start + vma->node.size - 1) >> 32)
> > return true;
> >
> > return false;
> >
> Also updated this part to follow your suggestion.
>
> I also spent a bit more time trying to figure what gcc was doing
> here. But I can't reproduce it locally, I get sizeof(1ULL<<32) = 8
> and 1ULL<<32 doesn't seem to be zero (in 32 and 64 bit kernels).
>
> Could it be related to the gcc version? I'm using 4.8.4.
It looked valid to me as well, but a printk confirmed that gcc was
hitting that path for every object.
gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 07/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support
2015-07-16 9:33 ` [PATCH v5 07/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
@ 2015-07-29 14:34 ` Michel Thierry
0 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-29 14:34 UTC (permalink / raw)
To: intel-gfx@lists.freedesktop.org; +Cc: Goel, Akash
On 7/16/2015 10:33 AM, Michel Thierry wrote:
> In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
> the base address to PML4, while the other PDP registers are ignored.
>
> In LRC, the addressing mode must be specified in every context
> descriptor, and the base address to PML4 is stored in the reg state.
>
> v2: PML4 update in legacy context switch is left for historic reasons,
> the preferred mode of operation is with lrc context based submission.
> v3: s/gen8_map_page_directory/gen8_setup_page_directory and
> s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer.
> Also, clflush will be needed for bxt. (Akash)
> v4: Squashed lrc-specific code and use a macro to set PML4 register.
> v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
> PDP update in bb_start is only for legacy 32b mode.
> v6: Rebase after final merged version of Mika's ppgtt/scratch
> patches.
> v7: There is no need to update the pml4 register value in
> execlists_update_context (Akash)
>
> Cc: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 54 +++++++++++++++++++++++++++++----
> drivers/gpu/drm/i915/i915_gem_gtt.h | 2 ++
> drivers/gpu/drm/i915/i915_reg.h | 1 +
> drivers/gpu/drm/i915/intel_lrc.c | 60 ++++++++++++++++++++++++++-----------
> 4 files changed, 94 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 5901810..8bcd328 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -210,6 +210,9 @@ static gen8_pde_t gen8_pde_encode(const dma_addr_t addr,
> return pde;
> }
>
> +#define gen8_pdpe_encode gen8_pde_encode
> +#define gen8_pml4e_encode gen8_pde_encode
> +
> static gen6_pte_t snb_pte_encode(dma_addr_t addr,
> enum i915_cache_level level,
> bool valid, u32 unused)
> @@ -599,6 +602,35 @@ static void free_pdp(struct drm_device *dev,
> }
> }
>
> +static void
> +gen8_setup_page_directory(struct i915_hw_ppgtt *ppgtt,
> + struct i915_page_directory_pointer *pdp,
> + struct i915_page_directory *pd,
> + int index)
> +{
> + gen8_ppgtt_pdpe_t *page_directorypo;
> +
> + if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> + return;
> +
> + page_directorypo = kmap_px(pdp);
> + page_directorypo[index] = gen8_pdpe_encode(px_dma(pd), I915_CACHE_LLC);
> + kunmap_px(ppgtt, page_directorypo);
> +}
> +
> +static void
> +gen8_setup_page_directory_pointer(struct i915_hw_ppgtt *ppgtt,
> + struct i915_pml4 *pml4,
> + struct i915_page_directory_pointer *pdp,
> + int index)
> +{
> + gen8_ppgtt_pml4e_t *pagemap = kmap_px(pml4);
> +
> + WARN_ON(!USES_FULL_48BIT_PPGTT(ppgtt->base.dev));
> + pagemap[index] = gen8_pml4e_encode(px_dma(pdp), I915_CACHE_LLC);
> + kunmap_px(ppgtt, pagemap);
> +}
> +
> /* Broadwell Page Directory Pointer Descriptors */
> static int gen8_write_pdp(struct drm_i915_gem_request *req,
> unsigned entry,
These _setup_ functions don't belong to this patch, and should be moved
to the previous one in the patchset ("drm/i915/gen8: implement
alloc/free for 4lvl").
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 06/19] drm/i915/gen8: implement alloc/free for 4lvl
2015-07-16 9:33 ` [PATCH v5 06/19] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
@ 2015-07-29 14:34 ` Michel Thierry
0 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-29 14:34 UTC (permalink / raw)
To: intel-gfx@lists.freedesktop.org; +Cc: Goel, Akash
On 7/16/2015 10:33 AM, Michel Thierry wrote:
> PML4 has no special attributes, and there will always be a PML4.
> So simply initialize it at creation, and destroy it at the end.
>
> The code for 4lvl is able to call into the existing 3lvl page table code
> to handle all of the lower levels.
>
> v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
> compiler happy. And define ret only in one place.
> Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
> v3: Use i915_dma_unmap_single instead of pci API. Fix a
> couple of incorrect checks when unmapping pdp and pd pages (Akash).
> v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
> v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
> v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
> paths. (Akash)
> v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
> v8: Change location of pml4_init/fini. It will make next patches
> cleaner.
> v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
> trying to reuse as much as possible for pdp alloc. pml4_init/fini
> replaced by setup/cleanup_px macros.
> v10: Rebase after Mika's merged ppgtt cleanup patch series.
> v11: Rebase after final merged version of Mika's ppgtt/scratch
> patches.
> v12: Fix pdpe start value in trace (Akash)
> v13: Define all 4lvl functions in this patch directly, instead of
> previous patches, add i915_page_directory_pointer_entry_alloc here,
> use test_bit to detect when pdp is already allocated (Akash).
>
> Cc: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 163 ++++++++++++++++++++++++++++++++----
> drivers/gpu/drm/i915/i915_gem_gtt.h | 13 ++-
> drivers/gpu/drm/i915/i915_trace.h | 8 ++
> 3 files changed, 167 insertions(+), 17 deletions(-)
>
> @@ -1065,6 +1120,79 @@ err_out:
> return ret;
> }
>
> +static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> + struct i915_pml4 *pml4,
> + uint64_t start,
> + uint64_t length)
> +{
> + DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
> + struct i915_page_directory_pointer *pdp;
> + const uint64_t orig_start = start;
> + const uint64_t orig_length = length;
> + uint64_t temp, pml4e;
> + int ret = 0;
> +
> + /* Do the pml4 allocations first, so we don't need to track the newly
> + * allocated tables below the pdp */
> + bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
> +
> + /* The pagedirectory and pagetable allocations are done in the shared 3
> + * and 4 level code. Just allocate the pdps.
> + */
> + gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> + if (!test_bit(pml4e, pml4->used_pml4es)) {
> + pdp = alloc_pdp(vm->dev);
> + if (IS_ERR(pdp))
> + goto err_out;
> +
> + pml4->pdps[pml4e] = pdp;
> + __set_bit(pml4e, new_pdps);
> + trace_i915_page_directory_pointer_entry_alloc(vm,
> + pml4e,
> + start,
> + GEN8_PML4E_SHIFT);
> + }
> + }
> +
PDP allocation should be moved to a new
gen8_ppgtt_alloc_page_dirpointers function (as we do for pds and pts);
it should also have the pd and pdp setup functions which are mistakenly
added until a later patch ("drm/i915/gen8: Add 4 level switching
infrastructure and lrc support").
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 11/19] drm/i915/gen8: Initialize PDPs
2015-07-16 9:33 ` [PATCH v5 11/19] drm/i915/gen8: Initialize PDPs Michel Thierry
@ 2015-07-29 14:35 ` Michel Thierry
0 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-29 14:35 UTC (permalink / raw)
To: intel-gfx@lists.freedesktop.org; +Cc: Goel, Akash
On 7/16/2015 10:33 AM, Michel Thierry wrote:
> Similar to PDs, while setting up a page directory pointer, make all entries
> of the pdp point to the scratch pd before mapping (and make all its entries
> point to the scratch page); this is to be safe in case of out of bound
> access or proactive prefetch.
>
> v2: Handle scratch_pdp allocation failure correctly, and keep
> initialize_px functions together (Akash)
> v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on
> the added macros to initialize the pdps.
> v4: Rebase after final merged version of Mika's ppgtt/scratch patches
> (and removed commit message part related to v3).
>
> Suggested-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 38 +++++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/i915_gem_gtt.h | 1 +
> 2 files changed, 39 insertions(+)
>
This patch also initializes the PML4 so it should say so (I'll also add
text about the scratch pdp, which the PML4 entries point to).
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH v5 08/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
2015-07-16 9:33 ` [PATCH v5 08/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
@ 2015-07-29 14:35 ` Michel Thierry
0 siblings, 0 replies; 33+ messages in thread
From: Michel Thierry @ 2015-07-29 14:35 UTC (permalink / raw)
To: intel-gfx@lists.freedesktop.org; +Cc: Goel, Akash
On 7/16/2015 10:33 AM, Michel Thierry wrote:
> The insert_entries function was the function used to write PTEs. For the
> PPGTT it was "hardcoded" to only understand two level page tables, which
> was the case for GEN7. We can reuse this for 4 level page tables, and
> remove the concept of insert_entries, which was never viable past 2
> level page tables anyway, but it requires a bit of rework to make the
> function a bit more generic.
>
> This patch begins the generalization work, and it will be heavily used
> upon when the 48b code is complete. The patch series attempts to make
> each function which touches a part of code specific to the page table
> level and here is no exception.
>
> v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
> v3: Rebase after final merged version of Mika's ppgtt/scratch patches.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 52 +++++++++++++++++++++++++++----------
> 1 file changed, 39 insertions(+), 13 deletions(-)
>
Akash pointed out that this change doesn't depend of 48-bit ppgtt, so it
should be moved earlier in the patchset (before PML4 is introduced).
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2015-07-29 14:35 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-16 9:33 [PATCH v5 00/19] 48-bit PPGTT Michel Thierry
2015-07-16 9:33 ` [PATCH v5 01/19] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
2015-07-16 9:33 ` [PATCH v5 02/19] drm/i915/gen8: Make pdp allocation more dynamic Michel Thierry
2015-07-16 9:33 ` [PATCH v5 03/19] drm/i915/gen8: Abstract PDP usage Michel Thierry
2015-07-16 9:33 ` [PATCH v5 04/19] drm/i915/gen8: Add dynamic page trace events Michel Thierry
2015-07-16 9:33 ` [PATCH v5 05/19] drm/i915/gen8: Add PML4 structure Michel Thierry
2015-07-16 9:33 ` [PATCH v5 06/19] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
2015-07-29 14:34 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 07/19] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
2015-07-29 14:34 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 08/19] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
2015-07-29 14:35 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 09/19] drm/i915/gen8: Pass sg_iter through pte inserts Michel Thierry
2015-07-16 9:33 ` [PATCH v5 10/19] drm/i915/gen8: Add 4 level support in insert_entries and clear_range Michel Thierry
2015-07-16 9:33 ` [PATCH v5 11/19] drm/i915/gen8: Initialize PDPs Michel Thierry
2015-07-29 14:35 ` Michel Thierry
2015-07-16 9:33 ` [PATCH v5 12/19] drm/i915: Expand error state's address width to 64b Michel Thierry
2015-07-16 9:33 ` [PATCH v5 13/19] drm/i915/gen8: Add ppgtt info and debug_dump Michel Thierry
2015-07-16 9:33 ` [PATCH v5 14/19] drm/i915: object size needs to be u64 Michel Thierry
2015-07-16 9:33 ` [PATCH v5 15/19] drm/i915: batch_obj vm offset must " Michel Thierry
2015-07-16 9:33 ` [PATCH v5 16/19] drm/i915/userptr: Kill user_size limit check Michel Thierry
2015-07-16 9:33 ` [PATCH v5 17/19] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
2015-07-27 14:34 ` Goel, Akash
2015-07-27 14:46 ` Chris Wilson
2015-07-27 14:53 ` Michel Thierry
2015-07-27 21:11 ` Chris Wilson
2015-07-28 11:12 ` Michel Thierry
2015-07-28 14:43 ` Chris Wilson
2015-07-29 11:05 ` Michel Thierry
2015-07-29 11:17 ` Chris Wilson
2015-07-16 9:33 ` [PATCH v5 18/19] drm/i915/gen8: Flip the 48b switch Michel Thierry
2015-07-16 9:33 ` [PATCH v5 19/19] drm/i915: Save some page table setup on repeated binds Michel Thierry
2015-07-28 12:18 ` [PATCH v5 00/19] 48-bit PPGTT Chris Wilson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox