Intel-GFX Archive on lore.kernel.org
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>,
	intel-gfx@lists.freedesktop.org,
	 dri-devel@lists.freedesktop.org
Cc: maarten.lankhorst@linux.intel.com
Subject: Re: [Intel-gfx] [PATCH v3 6/6] drm/i915: Reduce the number of objects subject to memcpy recover
Date: Mon, 20 Sep 2021 13:09:08 +0200
Message-ID: <b489b872d20113f38a1b6f9a74ac60e29793e5dd.camel@linux.intel.com>
In-Reply-To: <22f443a0-d740-8337-1311-d18a31e4f6b0@intel.com>

On Mon, 2021-09-20 at 12:05 +0100, Matthew Auld wrote:
> On 14/09/2021 20:31, Thomas Hellström wrote:
> > We really only need memcpy restore for objects that affect the
> > operability of the migrate context. That is, primarily the
> > page-table objects of the migrate VM.
> > 
> > Add an object flag, I915_BO_ALLOC_PM_EARLY, for objects that need
> > early restores using memcpy and a way to assign LMEM page-table
> > object flags to be used by the vms.
> > 
> > Restore objects without this flag with the gpu blitter and only
> > objects carrying the flag using TTM memcpy.
> > 
> > Initially mark the migrate, gt, gtt and vgpu vms to use this flag,
> > and defer for a later audit which vms actually need it. Most
> > importantly, user-allocated vms with pinned page-table objects can
> > be restored using the blitter.
> > 
> > Performance-wise memcpy restore is probably as fast as gpu restore
> > if not faster, but using gpu restore will help tackle future
> > restrictions in mappable LMEM size.
> > 
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c      |  4 ++--
> >   drivers/gpu/drm/i915/gem/i915_gem_object_types.h |  9 ++++++---
> >   drivers/gpu/drm/i915/gem/i915_gem_pm.c           |  5 ++++-
> >   drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c       |  6 ++++--
> >   drivers/gpu/drm/i915/gem/selftests/huge_pages.c  |  2 +-
> >   drivers/gpu/drm/i915/gt/gen6_ppgtt.c             |  2 +-
> >   drivers/gpu/drm/i915/gt/gen8_ppgtt.c             |  5 +++--
> >   drivers/gpu/drm/i915/gt/gen8_ppgtt.h             |  4 +++-
> >   drivers/gpu/drm/i915/gt/intel_ggtt.c             |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_gt.c               |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_gtt.c              |  3 ++-
> >   drivers/gpu/drm/i915/gt/intel_gtt.h              |  9 +++++++--
> >   drivers/gpu/drm/i915/gt/intel_migrate.c          |  2 +-
> >   drivers/gpu/drm/i915/gt/intel_ppgtt.c            | 13 ++++++++-----
> >   drivers/gpu/drm/i915/gt/selftest_hangcheck.c     |  2 +-
> >   drivers/gpu/drm/i915/gvt/scheduler.c             |  2 +-
> >   drivers/gpu/drm/i915/selftests/i915_gem_gtt.c    |  4 ++--
> >   17 files changed, 48 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index c2ab0e22db0a..8208fd5b72c3 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1287,7 +1287,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
> >         } else if (HAS_FULL_PPGTT(i915)) {
> >                 struct i915_ppgtt *ppgtt;
> >   
> > -               ppgtt = i915_ppgtt_create(&i915->gt);
> > +               ppgtt = i915_ppgtt_create(&i915->gt, 0);
> >                 if (IS_ERR(ppgtt)) {
> >                         drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
> >                                 PTR_ERR(ppgtt));
> > @@ -1465,7 +1465,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
> >         if (args->flags)
> >                 return -EINVAL;
> >   
> > -       ppgtt = i915_ppgtt_create(&i915->gt);
> > +       ppgtt = i915_ppgtt_create(&i915->gt, 0);
> >         if (IS_ERR(ppgtt))
> >                 return PTR_ERR(ppgtt);
> >   
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > index 118691ce81d7..fa2ba9e2a4d0 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > @@ -294,13 +294,16 @@ struct drm_i915_gem_object {
> >   #define I915_BO_ALLOC_USER        BIT(3)
> >   /* Object is allowed to lose its contents on suspend / resume, even if pinned */
> >   #define I915_BO_ALLOC_PM_VOLATILE BIT(4)
> > +/* Object needs to be restored early using memcpy during resume */
> > +#define I915_BO_ALLOC_PM_EARLY    BIT(5)
> >   #define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \
> >                              I915_BO_ALLOC_VOLATILE | \
> >                              I915_BO_ALLOC_CPU_CLEAR | \
> >                              I915_BO_ALLOC_USER | \
> > -                            I915_BO_ALLOC_PM_VOLATILE)
> > -#define I915_BO_READONLY          BIT(5)
> > -#define I915_TILING_QUIRK_BIT     6 /* unknown swizzling; do not release! */
> > +                            I915_BO_ALLOC_PM_VOLATILE | \
> > +                            I915_BO_ALLOC_PM_EARLY)
> > +#define I915_BO_READONLY          BIT(6)
> > +#define I915_TILING_QUIRK_BIT     7 /* unknown swizzling; do not release! */
> >   
> >         /**
> >          * @mem_flags - Mutable placement-related flags
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > index 8736ae1dfbb2..c4a75e1c12ee 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > @@ -98,8 +98,11 @@ int i915_gem_backup_suspend(struct drm_i915_private *i915)
> >          * More objects may have become unpinned as requests were
> >          * retired. Now try to evict again. The gt may be wedged here
> >          * in which case we automatically fall back to memcpy.
> > +        * We allow also backing up pinned objects that have not been
> > +        * marked for early recover, and that may contain, for example,
> > +        * page-tables for the migrate context.
> >          */
> > -       ret = lmem_suspend(i915, true, false);
> > +       ret = lmem_suspend(i915, true, true);
> 
> I guess we could have made these flags instead of bools, for better 
> readability. I've already forgotten which is which :)
> 
> >         if (ret)
> >                 goto out_recover;
> >   
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
> > index a87419755d43..2684daaa2f22 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
> > @@ -57,7 +57,8 @@ static int i915_ttm_backup(struct i915_gem_apply_to_region *apply,
> >         if (pm_apply->allow_gpu && i915_gem_object_evictable(obj))
> >                 return ttm_bo_validate(bo, i915_ttm_sys_placement(), &ctx);
> >   
> > -       if (!pm_apply->backup_pinned)
> > +       if (!pm_apply->backup_pinned ||
> > +           (pm_apply->allow_gpu && (obj->flags & I915_BO_ALLOC_PM_EARLY)))
> >                 return 0;
> >   
> >         if (obj->flags & I915_BO_ALLOC_PM_VOLATILE)
> > @@ -156,7 +157,8 @@ static int i915_ttm_restore(struct i915_gem_apply_to_region *apply,
> >         if (!backup)
> >                 return 0;
> >   
> > -       if (!pm_apply->allow_gpu && (obj->flags & I915_BO_ALLOC_USER))
> > +       if (!pm_apply->allow_gpu && ((obj->flags & I915_BO_ALLOC_USER) ||
> > +                                    !(obj->flags & I915_BO_ALLOC_PM_EARLY)))
> >                 return 0;
> >   
> >         err = i915_gem_object_lock(backup, apply->ww);
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > index 0827634c842c..77d84a9e8789 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > @@ -1645,7 +1645,7 @@ int i915_gem_huge_page_mock_selftests(void)
> >         mkwrite_device_info(dev_priv)->ppgtt_type = INTEL_PPGTT_FULL;
> >         mkwrite_device_info(dev_priv)->ppgtt_size = 48;
> >   
> > -       ppgtt = i915_ppgtt_create(&dev_priv->gt);
> > +       ppgtt = i915_ppgtt_create(&dev_priv->gt, 0);
> >         if (IS_ERR(ppgtt)) {
> >                 err = PTR_ERR(ppgtt);
> >                 goto out_unlock;
> > diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> > index 1aee5e6b1b23..890191f286e3 100644
> > --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> > +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> > @@ -429,7 +429,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
> >         mutex_init(&ppgtt->flush);
> >         mutex_init(&ppgtt->pin_mutex);
> >   
> > -       ppgtt_init(&ppgtt->base, gt);
> > +       ppgtt_init(&ppgtt->base, gt, 0);
> >         ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t));
> >         ppgtt->base.vm.top = 1;
> >   
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > index 6a5af995f5b1..037a9a6e4889 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> > @@ -753,7 +753,8 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
> >    * space.
> >    *
> >    */
> > -struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
> > +struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
> > +                                    unsigned long lmem_pt_obj_flags)
> >   {
> >         struct i915_ppgtt *ppgtt;
> >         int err;
> > @@ -762,7 +763,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
> >         if (!ppgtt)
> >                 return ERR_PTR(-ENOMEM);
> >   
> > -       ppgtt_init(ppgtt, gt);
> > +       ppgtt_init(ppgtt, gt, lmem_pt_obj_flags);
> >         ppgtt->vm.top = i915_vm_is_4lvl(&ppgtt->vm) ? 3 : 2;
> >         ppgtt->vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen8_pte_t));
> >   
> > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> > index b9028c2ad3c7..f541d19264b4 100644
> > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.h
> > @@ -12,7 +12,9 @@ struct i915_address_space;
> >   struct intel_gt;
> >   enum i915_cache_level;
> >   
> > -struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt);
> > +struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
> > +                                    unsigned long lmem_pt_obj_flags);
> > +
> >   u64 gen8_ggtt_pte_encode(dma_addr_t addr,
> >                          enum i915_cache_level level,
> >                          u32 flags);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index 8d71f67926f1..b99b26201b67 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -644,7 +644,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
> >         struct i915_ppgtt *ppgtt;
> >         int err;
> >   
> > -       ppgtt = i915_ppgtt_create(ggtt->vm.gt);
> > +       ppgtt = i915_ppgtt_create(ggtt->vm.gt, I915_BO_ALLOC_PM_EARLY);
> 
> I guess we could leave this as flags=0, since appgtt is not relevant on
> discrete/modern hw(?).
> 
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>

True, I'll fix these up.
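
On the bools-versus-flags remark for lmem_suspend(): a minimal sketch of
what a flags word could look like. The flag names and the pm_apply_sketch
struct are hypothetical and not part of this patch; only the allow_gpu and
backup_pinned field names come from the hunks above.

```c
#include <stdbool.h>

#define BIT(n) (1u << (n))

/* Hypothetical flag names; the quoted hunk passes two bare bools. */
#define PM_SUSPEND_ALLOW_GPU     BIT(0)
#define PM_SUSPEND_BACKUP_PINNED BIT(1)

/* Mirrors the two pm_apply fields tested in i915_ttm_backup(). */
struct pm_apply_sketch {
	bool allow_gpu;
	bool backup_pinned;
};

/* Decode a flags word into the two booleans the backup path tests. */
static struct pm_apply_sketch pm_apply_from_flags(unsigned int flags)
{
	struct pm_apply_sketch pm = {
		.allow_gpu = flags & PM_SUSPEND_ALLOW_GPU,
		.backup_pinned = flags & PM_SUSPEND_BACKUP_PINNED,
	};

	return pm;
}
```

With something along these lines, the lmem_suspend(i915, true, true) call
would instead pass PM_SUSPEND_ALLOW_GPU | PM_SUSPEND_BACKUP_PINNED, so the
call site says which behaviours are enabled.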

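For reference, the two conditions changed in i915_ttm_backup() and
i915_ttm_restore() can be pulled out as standalone predicates. This is a
sketch for reasoning about the truth table, not driver code: the helper
names are made up, the flag values are copied from the
i915_gem_object_types.h hunk, and reading allow_gpu == false as "the early
memcpy pass" is an interpretation of the commit message.

```c
#include <stdbool.h>

#define BIT(n) (1u << (n))

/* Bit positions as defined in the quoted i915_gem_object_types.h hunk. */
#define I915_BO_ALLOC_USER     BIT(3)
#define I915_BO_ALLOC_PM_EARLY BIT(5)

/*
 * True if a pinned, non-evictable object is skipped by this backup pass,
 * following the condition the patch adds to i915_ttm_backup().
 */
static bool backup_skips_pinned(bool allow_gpu, bool backup_pinned,
				unsigned int flags)
{
	return !backup_pinned ||
	       (allow_gpu && (flags & I915_BO_ALLOC_PM_EARLY));
}

/*
 * True if an object that has a backup is skipped by this restore pass,
 * following the condition the patch adds to i915_ttm_restore().
 */
static bool restore_skips(bool allow_gpu, unsigned int flags)
{
	return !allow_gpu &&
	       ((flags & I915_BO_ALLOC_USER) ||
		!(flags & I915_BO_ALLOC_PM_EARLY));
}
```

Under that reading, a pass with allow_gpu == false restores only
kernel-owned objects carrying I915_BO_ALLOC_PM_EARLY, and a pass with
allow_gpu == true does not skip anything on the basis of these flags.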

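Since I915_BO_ALLOC_PM_EARLY takes BIT(5), I915_BO_READONLY and
I915_TILING_QUIRK_BIT each move up by one. Below is a sketch of
compile-time guards that would catch a future overlap, assuming C11
_Static_assert (in the kernel this would more likely be BUILD_BUG_ON or
static_assert). ALLOC_FLAGS_SKETCH is a hypothetical partial mask holding
only the flags visible in the hunk.

```c
#define BIT(n) (1u << (n))

/* Only the flags visible in the quoted hunk; lower alloc bits are omitted. */
#define I915_BO_ALLOC_USER        BIT(3)
#define I915_BO_ALLOC_PM_VOLATILE BIT(4)
#define I915_BO_ALLOC_PM_EARLY    BIT(5)
#define I915_BO_READONLY          BIT(6)
#define I915_TILING_QUIRK_BIT     7

/* Partial mask for illustration; the real I915_BO_ALLOC_FLAGS has more bits. */
#define ALLOC_FLAGS_SKETCH (I915_BO_ALLOC_USER | \
			    I915_BO_ALLOC_PM_VOLATILE | \
			    I915_BO_ALLOC_PM_EARLY)

/* Compile-time guards: non-allocation bits must stay outside the mask. */
_Static_assert(!(ALLOC_FLAGS_SKETCH & I915_BO_READONLY),
	       "READONLY overlaps an allocation flag");
_Static_assert(!(ALLOC_FLAGS_SKETCH & BIT(I915_TILING_QUIRK_BIT)),
	       "tiling quirk bit overlaps an allocation flag");
```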
