From: "Goel, Akash" <akash.goel@intel.com>
To: Michel Thierry <michel.thierry@intel.com>,
intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v3 05/17] drm/i915/gen8: implement alloc/free for 4lvl
Date: Tue, 7 Jul 2015 18:18:12 +0530
Message-ID: <559BCA8C.3010604@intel.com>
In-Reply-To: <1435764453-11954-6-git-send-email-michel.thierry@intel.com>
On 7/1/2015 8:57 PM, Michel Thierry wrote:
> PML4 has no special attributes, and there will always be a PML4.
> So simply initialize it at creation, and destroy it at the end.
>
> The code for 4lvl is able to call into the existing 3lvl page table code
> to handle all of the lower levels.
>
> v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
> compiler happy. And define ret only in one place.
> Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
> v3: Use i915_dma_unmap_single instead of pci API. Fix a
> couple of incorrect checks when unmapping pdp and pd pages (Akash).
> v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
> v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
> v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
> paths. (Akash)
> v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
> v8: Change location of pml4_init/fini. It will make next patches
> cleaner.
> v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
> trying to reuse as much as possible for pdp alloc. pml4_init/fini
> replaced by setup/cleanup_px macros.
> v10: Rebase after Mika's merged ppgtt cleanup patch series.
> v11: Rebase after final merged version of Mika's ppgtt/scratch patches.
>
> Cc: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
> ---
> drivers/gpu/drm/i915/i915_gem_gtt.c | 162 ++++++++++++++++++++++++++++++------
> drivers/gpu/drm/i915/i915_gem_gtt.h | 12 ++-
> 2 files changed, 146 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 1327e41..d23b0a8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -584,12 +584,44 @@ static void __pdp_fini(struct i915_page_directory_pointer *pdp)
> pdp->page_directory = NULL;
> }
>
> +static struct
> +i915_page_directory_pointer *alloc_pdp(struct drm_device *dev)
> +{
> + struct i915_page_directory_pointer *pdp;
> + int ret = -ENOMEM;
> +
> + WARN_ON(!USES_FULL_48BIT_PPGTT(dev));
> +
> + pdp = kzalloc(sizeof(*pdp), GFP_KERNEL);
> + if (!pdp)
> + return ERR_PTR(-ENOMEM);
> +
> + ret = __pdp_init(dev, pdp);
> + if (ret)
> + goto fail_bitmap;
> +
> + ret = setup_px(dev, pdp);
> + if (ret)
> + goto fail_page_m;
> +
> + return pdp;
> +
> +fail_page_m:
> + __pdp_fini(pdp);
> +fail_bitmap:
> + kfree(pdp);
> +
> + return ERR_PTR(ret);
> +}
> +
> static void free_pdp(struct drm_device *dev,
> struct i915_page_directory_pointer *pdp)
> {
> __pdp_fini(pdp);
> - if (USES_FULL_48BIT_PPGTT(dev))
> + if (USES_FULL_48BIT_PPGTT(dev)) {
> + cleanup_px(dev, pdp);
> kfree(pdp);
> + }
> }
>
> /* Broadwell Page Directory Pointer Descriptors */
> @@ -783,28 +815,46 @@ static void gen8_free_scratch(struct i915_address_space *vm)
> free_scratch_page(dev, vm->scratch_page);
> }
>
> -static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
> +static void gen8_ppgtt_cleanup_3lvl(struct drm_device *dev,
> + struct i915_page_directory_pointer *pdp)
> {
> - struct i915_hw_ppgtt *ppgtt =
> - container_of(vm, struct i915_hw_ppgtt, base);
> int i;
>
> - if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> - for_each_set_bit(i, ppgtt->pdp.used_pdpes,
> - I915_PDPES_PER_PDP(ppgtt->base.dev)) {
> - if (WARN_ON(!ppgtt->pdp.page_directory[i]))
> - continue;
> + for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
> + if (WARN_ON(!pdp->page_directory[i]))
> + continue;
>
> - gen8_free_page_tables(ppgtt->base.dev,
> - ppgtt->pdp.page_directory[i]);
> - free_pd(ppgtt->base.dev,
> - ppgtt->pdp.page_directory[i]);
> - }
> - free_pdp(ppgtt->base.dev, &ppgtt->pdp);
> - } else {
> - WARN_ON(1); /* to be implemented later */
> + gen8_free_page_tables(dev, pdp->page_directory[i]);
> + free_pd(dev, pdp->page_directory[i]);
> }
>
> + free_pdp(dev, pdp);
> +}
> +
> +static void gen8_ppgtt_cleanup_4lvl(struct i915_hw_ppgtt *ppgtt)
> +{
> + int i;
> +
> + for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
> + if (WARN_ON(!ppgtt->pml4.pdps[i]))
> + continue;
> +
> + gen8_ppgtt_cleanup_3lvl(ppgtt->base.dev, ppgtt->pml4.pdps[i]);
> + }
> +
> + cleanup_px(ppgtt->base.dev, &ppgtt->pml4);
> +}
> +
> +static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
> +{
> + struct i915_hw_ppgtt *ppgtt =
> + container_of(vm, struct i915_hw_ppgtt, base);
> +
> + if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
> + gen8_ppgtt_cleanup_3lvl(ppgtt->base.dev, &ppgtt->pdp);
> + else
> + gen8_ppgtt_cleanup_4lvl(ppgtt);
> +
> gen8_free_scratch(vm);
> }
>
> @@ -1087,8 +1137,62 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
> uint64_t start,
> uint64_t length)
> {
> - WARN_ON(1); /* to be implemented later */
> + DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
> + struct i915_hw_ppgtt *ppgtt =
> + container_of(vm, struct i915_hw_ppgtt, base);
> + struct i915_page_directory_pointer *pdp;
> + const uint64_t orig_start = start;
> + const uint64_t orig_length = length;
> + uint64_t temp, pml4e;
> + int ret = 0;
> +
> + /* Do the pml4 allocations first, so we don't need to track the newly
> + * allocated tables below the pdp */
> + bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
> +
> + /* The pagedirectory and pagetable allocations are done in the shared 3
> + * and 4 level code. Just allocate the pdps.
> + */
> + gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> + if (!pdp) {
> + WARN_ON(test_bit(pml4e, pml4->used_pml4es));
> + pdp = alloc_pdp(vm->dev);
> + if (IS_ERR(pdp))
> + goto err_out;
> +
> + pml4->pdps[pml4e] = pdp;
> + __set_bit(pml4e, new_pdps);
> + trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base, pml4e,
> + pml4e << GEN8_PML4E_SHIFT,
Shouldn't the 'start' variable be used here instead of
'pml4e << GEN8_PML4E_SHIFT'?
> + GEN8_PML4E_SHIFT);
> + }
> + }
> +
> + WARN(bitmap_weight(new_pdps, GEN8_PML4ES_PER_PML4) > 2,
> + "The allocation has spanned more than 512GB. "
> + "It is highly likely this is incorrect.");
> +
> + start = orig_start;
> + length = orig_length;
> +
> + gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
> + WARN_ON(!pdp);
> +
> + ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
> + if (ret)
> + goto err_out;
> + }
> +
> + bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
> + GEN8_PML4ES_PER_PML4);
> +
> return 0;
> +
> +err_out:
> + for_each_set_bit(pml4e, new_pdps, GEN8_PML4ES_PER_PML4)
> + gen8_ppgtt_cleanup_3lvl(vm->dev, pml4->pdps[pml4e]);
> +
> + return ret;
> }
>
> static int gen8_alloc_va_range(struct i915_address_space *vm,
> @@ -1097,10 +1201,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
> struct i915_hw_ppgtt *ppgtt =
> container_of(vm, struct i915_hw_ppgtt, base);
>
> - if (!USES_FULL_48BIT_PPGTT(vm->dev))
> - return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
> - else
> + if (USES_FULL_48BIT_PPGTT(vm->dev))
> return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
> + else
> + return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
> }
>
> /*
> @@ -1128,9 +1232,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>
> ppgtt->switch_mm = gen8_mm_switch;
>
> - if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> - ret = __pdp_init(false, &ppgtt->pdp);
> + if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
> + ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
> + if (ret)
> + goto free_scratch;
>
> + ppgtt->base.total = 1ULL << 48;
> + } else {
> + ret = __pdp_init(false, &ppgtt->pdp);
> if (ret)
> goto free_scratch;
>
> @@ -1142,9 +1251,10 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> * 2GiB).
> */
> ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
> - } else {
> - ppgtt->base.total = 1ULL << 48;
> - return -EPERM; /* Not yet implemented */
> +
> + trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base,
> + 0, 0,
> + GEN8_PML4E_SHIFT);
> }
>
> return 0;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index e2b684e..c8ac0b5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -95,6 +95,7 @@ typedef uint64_t gen8_pde_t;
> */
> #define GEN8_PML4ES_PER_PML4 512
> #define GEN8_PML4E_SHIFT 39
> +#define GEN8_PML4E_MASK (GEN8_PML4ES_PER_PML4 - 1)
> #define GEN8_PDPE_SHIFT 30
> /* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
> * tables */
> @@ -464,6 +465,14 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
> temp = min(temp, length), \
> start += temp, length -= temp)
>
> +#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter) \
> + for (iter = gen8_pml4e_index(start); \
> + pdp = (pml4)->pdps[iter], length > 0 && iter < GEN8_PML4ES_PER_PML4; \
> + iter++, \
> + temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start, \
> + temp = min(temp, length), \
> + start += temp, length -= temp)
> +
> #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \
> gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
>
> @@ -484,8 +493,7 @@ static inline uint32_t gen8_pdpe_index(uint64_t address)
>
> static inline uint32_t gen8_pml4e_index(uint64_t address)
> {
> - WARN_ON(1); /* For 64B */
> - return 0;
> + return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
> }
>
> static inline size_t gen8_pte_count(uint64_t address, uint64_t length)
>
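
To make the question about the trace argument concrete, here is a minimal
standalone sketch (not driver code) of how 'start' and
'pml4e << GEN8_PML4E_SHIFT' relate while walking a range. The constants and
the index/step arithmetic are copied from the patch; struct pml4e_step and
walk_pml4es() are hypothetical names made up for this illustration, and the
ALIGN() step is written out by hand on the assumption that it rounds up to
the next multiple of the PML4E span.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants copied from the patch. */
#define GEN8_PML4ES_PER_PML4	512
#define GEN8_PML4E_SHIFT	39
#define GEN8_PML4E_MASK		(GEN8_PML4ES_PER_PML4 - 1)

/* Same arithmetic as gen8_pml4e_index() in the patch. */
static uint32_t pml4e_index(uint64_t address)
{
	return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
}

/* Hypothetical record of one loop step, for illustration only. */
struct pml4e_step {
	uint32_t pml4e;	/* index of the PML4 entry being visited */
	uint64_t start;	/* value of 'start' when the loop body runs */
	uint64_t base;	/* pml4e << GEN8_PML4E_SHIFT */
};

/*
 * Hypothetical helper: steps through [start, start + length) the way
 * gen8_for_each_pml4e does and records up to 'max' steps.
 * Returns the number of steps recorded.
 */
static size_t walk_pml4es(uint64_t start, uint64_t length,
			  struct pml4e_step *out, size_t max)
{
	const uint64_t span = 1ULL << GEN8_PML4E_SHIFT;
	uint32_t iter;
	size_t n = 0;

	assert(out != NULL);

	for (iter = pml4e_index(start);
	     length > 0 && iter < GEN8_PML4ES_PER_PML4 && n < max;
	     iter++) {
		uint64_t temp;

		out[n].pml4e = iter;
		out[n].start = start;
		out[n].base = (uint64_t)iter << GEN8_PML4E_SHIFT;
		n++;

		/* Distance to the next PML4E boundary, clamped to length. */
		temp = ((start + span) & ~(span - 1)) - start;
		if (temp > length)
			temp = length;
		start += temp;
		length -= temp;
	}

	return n;
}
```

In this sketch the two values differ only on the first step, and only when
the range does not begin on a PML4E boundary; on every later step 'start'
has already been advanced to the boundary and equals the shifted index.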