From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/3] drm/i915: Use RPM as the barrier for controlling user mmap access
Date: Tue, 11 Oct 2016 15:25:24 +0200
Message-ID: <20161011132524.GC20761@phenom.ffwll.local>
In-Reply-To: <20161010215842.7898-1-chris@chris-wilson.co.uk>

On Mon, Oct 10, 2016 at 10:58:40PM +0100, Chris Wilson wrote:
> We can remove the false coupling between RPM and struct mutex by the
> observation that we can use the RPM wakeref as the barrier around user
> mmap access. That is as we tear down the user's PTE atomically from
> within rpm suspend and then to fault in new PTE requires the rpm
> wakeref, means that no user access is possible through those PTE without
> RPM being awake. Having made that observation, we can then remove the
> presumption of having to take rpm outside of struct_mutex and so allow
> fine grained acquisition of a wakeref around hw access rather than
> having to remember to acquire the wakeref early on.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c    | 22 ----------------------
>  drivers/gpu/drm/i915/i915_drv.c        | 19 -------------------
>  drivers/gpu/drm/i915/i915_gem.c        | 29 ++++++++++++++++++-----------
>  drivers/gpu/drm/i915/i915_gem_gtt.c    | 17 +++++++++++++----
>  drivers/gpu/drm/i915/i915_gem_tiling.c |  4 ----
>  drivers/gpu/drm/i915/intel_drv.h       |  6 ++++++
>  drivers/gpu/drm/i915/intel_uncore.c    |  6 +++---
>  7 files changed, 40 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 20689f1cd719..793b1816f700 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1396,14 +1396,9 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
>  static int ironlake_drpc_info(struct seq_file *m)
>  {
>  	struct drm_i915_private *dev_priv = node_to_i915(m->private);
> -	struct drm_device *dev = &dev_priv->drm;
>  	u32 rgvmodectl, rstdbyctl;
>  	u16 crstandvid;
> -	int ret;
>  
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> -	if (ret)
> -		return ret;
>  	intel_runtime_pm_get(dev_priv);
>  
>  	rgvmodectl = I915_READ(MEMMODECTL);
> @@ -1411,7 +1406,6 @@ static int ironlake_drpc_info(struct seq_file *m)
>  	crstandvid = I915_READ16(CRSTANDVID);
>  
>  	intel_runtime_pm_put(dev_priv);
> -	mutex_unlock(&dev->struct_mutex);
>  
>  	seq_printf(m, "HD boost: %s\n", yesno(rgvmodectl & MEMMODE_BOOST_EN));
>  	seq_printf(m, "Boost freq: %d\n",
> @@ -2089,12 +2083,7 @@ static const char *swizzle_string(unsigned swizzle)
>  static int i915_swizzle_info(struct seq_file *m, void *data)
>  {
>  	struct drm_i915_private *dev_priv = node_to_i915(m->private);
> -	struct drm_device *dev = &dev_priv->drm;
> -	int ret;
>  
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> -	if (ret)
> -		return ret;
>  	intel_runtime_pm_get(dev_priv);
>  
>  	seq_printf(m, "bit6 swizzle for X-tiling = %s\n",
> @@ -2134,7 +2123,6 @@ static int i915_swizzle_info(struct seq_file *m, void *data)
>  		seq_puts(m, "L-shaped memory detected\n");
>  
>  	intel_runtime_pm_put(dev_priv);
> -	mutex_unlock(&dev->struct_mutex);
>  
>  	return 0;
>  }
> @@ -4793,13 +4781,9 @@ i915_wedged_set(void *data, u64 val)
>  	if (i915_reset_in_progress(&dev_priv->gpu_error))
>  		return -EAGAIN;
>  
> -	intel_runtime_pm_get(dev_priv);
> -
>  	i915_handle_error(dev_priv, val,
>  			  "Manually setting wedged to %llu", val);
>  
> -	intel_runtime_pm_put(dev_priv);
> -
>  	return 0;
>  }
>  
> @@ -5034,22 +5018,16 @@ static int
>  i915_cache_sharing_get(void *data, u64 *val)
>  {
>  	struct drm_i915_private *dev_priv = data;
> -	struct drm_device *dev = &dev_priv->drm;
>  	u32 snpcr;
> -	int ret;
>  
>  	if (!(IS_GEN6(dev_priv) || IS_GEN7(dev_priv)))
>  		return -ENODEV;
>  
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> -	if (ret)
> -		return ret;
>  	intel_runtime_pm_get(dev_priv);
>  
>  	snpcr = I915_READ(GEN6_MBCUNIT_SNPCR);
>  
>  	intel_runtime_pm_put(dev_priv);
> -	mutex_unlock(&dev->struct_mutex);
>  
>  	*val = (snpcr & GEN6_MBC_SNPCR_MASK) >> GEN6_MBC_SNPCR_SHIFT;
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 89d322215c84..31eee32fcf6d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -2313,24 +2313,6 @@ static int intel_runtime_suspend(struct device *kdev)
>  
>  	DRM_DEBUG_KMS("Suspending device\n");
>  
> -	/*
> -	 * We could deadlock here in case another thread holding struct_mutex
> -	 * calls RPM suspend concurrently, since the RPM suspend will wait
> -	 * first for this RPM suspend to finish. In this case the concurrent
> -	 * RPM resume will be followed by its RPM suspend counterpart. Still
> -	 * for consistency return -EAGAIN, which will reschedule this suspend.
> -	 */
> -	if (!mutex_trylock(&dev->struct_mutex)) {
> -		DRM_DEBUG_KMS("device lock contention, deffering suspend\n");
> -		/*
> -		 * Bump the expiration timestamp, otherwise the suspend won't
> -		 * be rescheduled.
> -		 */
> -		pm_runtime_mark_last_busy(kdev);
> -
> -		return -EAGAIN;
> -	}
> -
>  	disable_rpm_wakeref_asserts(dev_priv);
>  
>  	/*
> @@ -2338,7 +2320,6 @@ static int intel_runtime_suspend(struct device *kdev)
>  	 * an RPM reference.
>  	 */
>  	i915_gem_release_all_mmaps(dev_priv);
> -	mutex_unlock(&dev->struct_mutex);

i915_gem_release_all_mmaps() walks the bound_list, which afaict is still
protected by dev->struct_mutex. A separate lock for that list would be ok
I guess, but I don't see one anywhere in this patch.

Requiring that we hold a full rpm reference before touching this list
would be another approach, but imo that defeats the point of this patch,
which is to decouple rpm and struct_mutex locking.
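
To illustrate the separate-lock option, here is a minimal userspace sketch,
not driver code. It assumes a hypothetical dedicated lock (bound_list_lock)
so the suspend path can walk the list without struct_mutex. A pthread mutex
stands in for a kernel lock, and every name below (struct model_dev,
struct model_obj, bind_obj, release_all_mmaps) is made up for illustration.

```c
/* Userspace model only; all types and function names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_obj {
	struct model_obj *next;
	bool fault_mappable;	/* models obj->fault_mappable */
};

struct model_dev {
	pthread_mutex_t bound_list_lock;	/* hypothetical dedicated list lock */
	struct model_obj *bound_list;		/* models dev_priv->mm.bound_list */
};

static void model_dev_init(struct model_dev *dev)
{
	pthread_mutex_init(&dev->bound_list_lock, NULL);
	dev->bound_list = NULL;
}

/* Bind path: every list manipulation takes the list lock. */
static void bind_obj(struct model_dev *dev, struct model_obj *obj)
{
	pthread_mutex_lock(&dev->bound_list_lock);
	obj->next = dev->bound_list;
	dev->bound_list = obj;
	pthread_mutex_unlock(&dev->bound_list_lock);
}

/*
 * Suspend path: walks the list under the list lock only, so it never
 * needs the big mutex. Returns how many mappings were revoked.
 */
static int release_all_mmaps(struct model_dev *dev)
{
	struct model_obj *obj;
	int revoked = 0;

	pthread_mutex_lock(&dev->bound_list_lock);
	for (obj = dev->bound_list; obj; obj = obj->next) {
		if (obj->fault_mappable) {
			obj->fault_mappable = false;
			revoked++;
		}
	}
	pthread_mutex_unlock(&dev->bound_list_lock);

	return revoked;
}
```

The point of the sketch is only that the walker and the list mutators agree
on one lock that is independent of struct_mutex. Whether a mutex or a
spinlock is appropriate in the real suspend path is left open here.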

Otherwise makes sense I think, same for patch2.
-Daniel

>  
>  	intel_guc_suspend(dev);
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a89a88922448..ee920b0ec258 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1469,7 +1469,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>  	 */
>  	if (!i915_gem_object_has_struct_page(obj) ||
>  	    cpu_write_needs_clflush(obj)) {
> +		intel_runtime_pm_get(dev_priv);
>  		ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);
> +		intel_runtime_pm_put(dev_priv);
>  		/* Note that the gtt paths might fail with non-page-backed user
>  		 * pointers (e.g. gtt mappings when moving data between
>  		 * textures). Fallback to the shmem path in that case. */
> @@ -1922,9 +1924,13 @@ i915_gem_release_mmap(struct drm_i915_gem_object *obj)
>  	/* Serialisation between user GTT access and our code depends upon
>  	 * revoking the CPU's PTE whilst the mutex is held. The next user
>  	 * pagefault then has to wait until we release the mutex.
> +	 *
> +	 * Note that RPM complicates somewhat by adding an additional
> +	 * requirement that operations to the GGTT be made holding the RPM
> +	 * wakeref. This in turns allow us to release the mmap from within
> +	 * the RPM suspend code ignoring the struct_mutex serialisation in
> +	 * lieu of the RPM barriers.
>  	 */
> -	lockdep_assert_held(&obj->base.dev->struct_mutex);
> -
>  	if (!obj->fault_mappable)
>  		return;
>  
> @@ -1948,6 +1954,11 @@ i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv)
>  {
>  	struct drm_i915_gem_object *obj;
>  
> +	/* This should only be called by RPM as we require the bound_list
> +	 * to be protected by the RPM barriers and not struct_mutex.
> +	 * We check that we are holding the wakeref whenever we manipulate
> +	 * the dev_priv->mm.bound_list (via assert_rpm_release_all_mmaps).
> +	 */
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
>  		i915_gem_release_mmap(obj);
>  }
> @@ -2911,9 +2922,11 @@ int i915_vma_unbind(struct i915_vma *vma)
>  
>  	/* Since the unbound list is global, only move to that list if
>  	 * no more VMAs exist. */
> -	if (--obj->bind_count == 0)
> +	if (--obj->bind_count == 0) {
> +		assert_rpm_release_all_mmaps(to_i915(obj->base.dev));
>  		list_move_tail(&obj->global_list,
>  			       &to_i915(obj->base.dev)->mm.unbound_list);
> +	}
>  
>  	/* And finally now the object is completely decoupled from this vma,
>  	 * we can drop its hold on the backing storage and allow it to be
> @@ -3099,6 +3112,7 @@ search_free:
>  	}
>  	GEM_BUG_ON(!i915_gem_valid_gtt_space(vma, obj->cache_level));
>  
> +	assert_rpm_release_all_mmaps(dev_priv);
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
>  	list_move_tail(&vma->vm_link, &vma->vm->inactive_list);
>  	obj->bind_count++;
> @@ -3446,7 +3460,6 @@ int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
>  int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file)
>  {
> -	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct drm_i915_gem_caching *args = data;
>  	struct drm_i915_gem_object *obj;
>  	enum i915_cache_level level;
> @@ -3475,11 +3488,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> -	intel_runtime_pm_get(dev_priv);
> -
>  	ret = i915_mutex_lock_interruptible(dev);
>  	if (ret)
> -		goto rpm_put;
> +		return ret;
>  
>  	obj = i915_gem_object_lookup(file, args->handle);
>  	if (!obj) {
> @@ -3488,13 +3499,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
>  	}
>  
>  	ret = i915_gem_object_set_cache_level(obj, level);
> -
>  	i915_gem_object_put(obj);
>  unlock:
>  	mutex_unlock(&dev->struct_mutex);
> -rpm_put:
> -	intel_runtime_pm_put(dev_priv);
> -
>  	return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 0bb4232f66bc..e1109d723060 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2596,6 +2596,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
>  			 enum i915_cache_level cache_level,
>  			 u32 flags)
>  {
> +	struct drm_i915_private *i915 = to_i915(vma->vm->dev);
>  	struct drm_i915_gem_object *obj = vma->obj;
>  	u32 pte_flags = 0;
>  	int ret;
> @@ -2608,8 +2609,10 @@ static int ggtt_bind_vma(struct i915_vma *vma,
>  	if (obj->gt_ro)
>  		pte_flags |= PTE_READ_ONLY;
>  
> +	intel_runtime_pm_get(i915);
>  	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
>  				cache_level, pte_flags);
> +	intel_runtime_pm_put(i915);
>  
>  	/*
>  	 * Without aliasing PPGTT there's no difference between
> @@ -2625,6 +2628,7 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
>  				 enum i915_cache_level cache_level,
>  				 u32 flags)
>  {
> +	struct drm_i915_private *i915 = to_i915(vma->vm->dev);
>  	u32 pte_flags;
>  	int ret;
>  
> @@ -2639,14 +2643,15 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
>  
>  
>  	if (flags & I915_VMA_GLOBAL_BIND) {
> +		intel_runtime_pm_get(i915);
>  		vma->vm->insert_entries(vma->vm,
>  					vma->pages, vma->node.start,
>  					cache_level, pte_flags);
> +		intel_runtime_pm_put(i915);
>  	}
>  
>  	if (flags & I915_VMA_LOCAL_BIND) {
> -		struct i915_hw_ppgtt *appgtt =
> -			to_i915(vma->vm->dev)->mm.aliasing_ppgtt;
> +		struct i915_hw_ppgtt *appgtt = i915->mm.aliasing_ppgtt;
>  		appgtt->base.insert_entries(&appgtt->base,
>  					    vma->pages, vma->node.start,
>  					    cache_level, pte_flags);
> @@ -2657,13 +2662,17 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
>  
>  static void ggtt_unbind_vma(struct i915_vma *vma)
>  {
> -	struct i915_hw_ppgtt *appgtt = to_i915(vma->vm->dev)->mm.aliasing_ppgtt;
> +	struct drm_i915_private *i915 = to_i915(vma->vm->dev);
> +	struct i915_hw_ppgtt *appgtt = i915->mm.aliasing_ppgtt;
>  	const u64 size = min(vma->size, vma->node.size);
>  
> -	if (vma->flags & I915_VMA_GLOBAL_BIND)
> +	if (vma->flags & I915_VMA_GLOBAL_BIND) {
> +		intel_runtime_pm_get(i915);
>  		vma->vm->clear_range(vma->vm,
>  				     vma->node.start, size,
>  				     true);
> +		intel_runtime_pm_put(i915);
> +	}
>  
>  	if (vma->flags & I915_VMA_LOCAL_BIND && appgtt)
>  		appgtt->base.clear_range(&appgtt->base,
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index a14b1e3d4c78..08f796a4f5f6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -204,8 +204,6 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> -	intel_runtime_pm_get(dev_priv);
> -
>  	mutex_lock(&dev->struct_mutex);
>  	if (obj->pin_display || obj->framebuffer_references) {
>  		err = -EBUSY;
> @@ -301,8 +299,6 @@ err:
>  	i915_gem_object_put(obj);
>  	mutex_unlock(&dev->struct_mutex);
>  
> -	intel_runtime_pm_put(dev_priv);
> -
>  	return err;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index f48e79ae2ac6..9b0a2fa7984b 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1648,6 +1648,12 @@ assert_rpm_wakelock_held(struct drm_i915_private *dev_priv)
>  		DRM_DEBUG_DRIVER("RPM wakelock ref not held during HW access");
>  }
>  
> +static inline void
> +assert_rpm_release_all_mmaps(struct drm_i915_private *dev_priv)
> +{
> +	assert_rpm_wakelock_held(dev_priv);
> +}
> +
>  static inline int
>  assert_rpm_atomic_begin(struct drm_i915_private *dev_priv)
>  {
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index e2b188dcf908..e0c3bd941e38 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1364,7 +1364,7 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>  	struct register_whitelist const *entry = whitelist;
>  	unsigned size;
>  	i915_reg_t offset_ldw, offset_udw;
> -	int i, ret = 0;
> +	int i, ret;
>  
>  	for (i = 0; i < ARRAY_SIZE(whitelist); i++, entry++) {
>  		if (i915_mmio_reg_offset(entry->offset_ldw) == (reg->offset & -entry->size) &&
> @@ -1386,6 +1386,7 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>  
>  	intel_runtime_pm_get(dev_priv);
>  
> +	ret = 0;
>  	switch (size) {
>  	case 8 | 1:
>  		reg->val = I915_READ64_2x32(offset_ldw, offset_udw);
> @@ -1404,10 +1405,9 @@ int i915_reg_read_ioctl(struct drm_device *dev,
>  		break;
>  	default:
>  		ret = -EINVAL;
> -		goto out;
> +		break;
>  	}
>  
> -out:
>  	intel_runtime_pm_put(dev_priv);
>  	return ret;
>  }
> -- 
> 2.9.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
