public inbox for linux-media@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe
@ 2023-03-20 14:43 Rob Clark
  2023-03-20 14:43 ` [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence Rob Clark
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Rob Clark @ 2023-03-20 14:43 UTC (permalink / raw)
  To: dri-devel
  Cc: Daniel Vetter, freedreno, linux-arm-msm, Rob Clark,
	Akhil P Oommen, Chia-I Wu, Dmitry Baryshkov, Douglas Anderson,
	Geert Uytterhoeven, Joel Fernandes (Google), Konrad Dybcio,
	Konrad Dybcio, moderated list:DMA BUFFER SHARING FRAMEWORK,
	open list, open list:DMA BUFFER SHARING FRAMEWORK,
	open list:DEVICE FREQUENCY (DEVFREQ), Luca Weiss, Maximilian Luz,
	Nathan Chancellor, Rafael J. Wysocki, Sean Paul

From: Rob Clark <robdclark@chromium.org>

Inspired by https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffwll.ch/
it seemed like a good idea to get rid of memory allocation in job_run()
fence signaling path, and use lockdep annotations to yell at us about
anything that could deadlock against shrinker/reclaim.  Anything that
can trigger reclaim, or block on any other thread that has triggered
reclaim, can block the GPU shrinker from releasing memory if it is
waiting the job to complete, causing deadlock.

The first patch pre-allocates the hw_fence, splitting allocation and
initialization, to avoid allocation in the job_run() path.  The next
eight decouple the obj lock from job_run(), as the obj lock is required
to pin/unpin backing pages (ie. holding an obj lock in job_run() could
deadlock the shrinker by blocking forward progress towards pinned buffers
becoming idle).  Followed by two so that we could idr_preload() in order
to avoid memory allocations under locks indirectly connected to the
shrinker path.

Next are three paths to decouple initialization (where allocations are
needed) from GPU runpm and devfreq, to avoid allocations in the fence
signaling path.  Followed by various PM devfreq/qos and interconnect
locking fixes to decouple initialization (allocation) from runtime.

And finally, the last patch is a modified version of danvet's patch to
add lockdep annotations to gpu scheduler, but does so conditionally so
that drivers can opt-in.

v2: Switch from embedding hw_fence in submit/job object to preallocating
    the hw_fence.  Rework "fenced unpin" locking to drop obj lock from
    fence signaling path (ie. the part that was still WIP in the first
    iteration of the patchset).  Adds the final patch to enable fence
    signaling annotations now that job_run() and job_free() are safe.
    The PM devfreq/QoS and interconnect patches are unchanged.

Rob Clark (23):
  drm/msm: Pre-allocate hw_fence
  drm/msm: Move submit bo flags update from obj lock
  drm/msm/gem: Tidy up VMA API
  drm/msm: Decouple vma tracking from obj lock
  drm/msm/gem: Simplify vmap vs LRU tracking
  drm/gem: Export drm_gem_lru_move_tail_locked()
  drm/msm/gem: Move update_lru()
  drm/msm/gem: Protect pin_count/madv by LRU lock
  drm/msm/gem: Avoid obj lock in job_run()
  drm/msm: Switch idr_lock to spinlock
  drm/msm: Use idr_preload()
  drm/msm/gpu: Move fw loading out of hw_init() path
  drm/msm/gpu: Move BO allocation out of hw_init
  drm/msm/a6xx: Move ioremap out of hw_init path
  PM / devfreq: Drop unneed locking to appease lockdep
  PM / devfreq: Teach lockdep about locking order
  PM / QoS: Fix constraints alloc vs reclaim locking
  PM / QoS: Decouple request alloc from dev_pm_qos_mtx
  PM / QoS: Teach lockdep about dev_pm_qos_mtx locking order
  soc: qcom: smd-rpm: Use GFP_ATOMIC in write path
  interconnect: Fix locking for runpm vs reclaim
  interconnect: Teach lockdep about icc_bw_lock order
  drm/sched: Add (optional) fence signaling annotation

 drivers/base/power/qos.c                   |  83 +++++++++---
 drivers/devfreq/devfreq.c                  |  52 ++++----
 drivers/gpu/drm/drm_gem.c                  |  11 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c      |  48 ++++---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c      |  18 ++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c      |  46 ++++---
 drivers/gpu/drm/msm/adreno/adreno_device.c |   6 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    |   9 +-
 drivers/gpu/drm/msm/msm_drv.c              |   6 +-
 drivers/gpu/drm/msm/msm_fence.c            |  12 +-
 drivers/gpu/drm/msm/msm_fence.h            |   3 +-
 drivers/gpu/drm/msm/msm_gem.c              | 145 ++++++++++++++-------
 drivers/gpu/drm/msm/msm_gem.h              |  29 +++--
 drivers/gpu/drm/msm/msm_gem_submit.c       |  27 ++--
 drivers/gpu/drm/msm/msm_gem_vma.c          |  91 ++++++++++---
 drivers/gpu/drm/msm/msm_gpu.h              |   8 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c       |   9 +-
 drivers/gpu/drm/msm/msm_submitqueue.c      |   2 +-
 drivers/gpu/drm/scheduler/sched_main.c     |   9 ++
 drivers/interconnect/core.c                |  18 ++-
 drivers/soc/qcom/smd-rpm.c                 |   2 +-
 include/drm/drm_gem.h                      |   1 +
 include/drm/gpu_scheduler.h                |   2 +
 23 files changed, 416 insertions(+), 221 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence
  2023-03-20 14:43 [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
@ 2023-03-20 14:43 ` Rob Clark
  2023-03-20 16:52   ` Christian König
  2023-03-20 14:43 ` [PATCH v2 09/23] drm/msm/gem: Avoid obj lock in job_run() Rob Clark
  2023-04-07 17:41 ` (subset) [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Bjorn Andersson
  2 siblings, 1 reply; 6+ messages in thread
From: Rob Clark @ 2023-03-20 14:43 UTC (permalink / raw)
  To: dri-devel
  Cc: Daniel Vetter, freedreno, linux-arm-msm, Rob Clark, Rob Clark,
	Abhinav Kumar, Dmitry Baryshkov, Sean Paul, David Airlie,
	Sumit Semwal, Christian König, open list,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

From: Rob Clark <robdclark@chromium.org>

Avoid allocating memory in job_run() by pre-allocating the hw_fence.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_fence.c      | 12 +++++++++---
 drivers/gpu/drm/msm/msm_fence.h      |  3 ++-
 drivers/gpu/drm/msm/msm_gem_submit.c |  7 +++++++
 drivers/gpu/drm/msm/msm_ringbuffer.c |  2 +-
 4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 56641408ea74..bab3d84f1686 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -99,7 +99,7 @@ static const struct dma_fence_ops msm_fence_ops = {
 };
 
 struct dma_fence *
-msm_fence_alloc(struct msm_fence_context *fctx)
+msm_fence_alloc(void)
 {
 	struct msm_fence *f;
 
@@ -107,10 +107,16 @@ msm_fence_alloc(struct msm_fence_context *fctx)
 	if (!f)
 		return ERR_PTR(-ENOMEM);
 
+	return &f->base;
+}
+
+void
+msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
+{
+	struct msm_fence *f = to_msm_fence(fence);
+
 	f->fctx = fctx;
 
 	dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
 		       fctx->context, ++fctx->last_fence);
-
-	return &f->base;
 }
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 7f1798c54cd1..f913fa22d8fe 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -61,7 +61,8 @@ void msm_fence_context_free(struct msm_fence_context *fctx);
 bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence);
 void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
 
-struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx);
+struct dma_fence * msm_fence_alloc(void);
+void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx);
 
 static inline bool
 fence_before(uint32_t a, uint32_t b)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index be4bf77103cd..2570c018b0cb 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -41,6 +41,13 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 	if (!submit)
 		return ERR_PTR(-ENOMEM);
 
+	submit->hw_fence = msm_fence_alloc();
+	if (IS_ERR(submit->hw_fence)) {
+		ret = PTR_ERR(submit->hw_fence);
+		kfree(submit);
+		return ERR_PTR(ret);
+	}
+
 	ret = drm_sched_job_init(&submit->base, queue->entity, queue);
 	if (ret) {
 		kfree(submit);
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 57a8e9564540..a62b45e5a8c3 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -18,7 +18,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job)
 	struct msm_gpu *gpu = submit->gpu;
 	int i;
 
-	submit->hw_fence = msm_fence_alloc(fctx);
+	msm_fence_init(submit->hw_fence, fctx);
 
 	for (i = 0; i < submit->nr_bos; i++) {
 		struct drm_gem_object *obj = &submit->bos[i].obj->base;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 09/23] drm/msm/gem: Avoid obj lock in job_run()
  2023-03-20 14:43 [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
  2023-03-20 14:43 ` [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence Rob Clark
@ 2023-03-20 14:43 ` Rob Clark
  2023-04-07 17:41 ` (subset) [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Bjorn Andersson
  2 siblings, 0 replies; 6+ messages in thread
From: Rob Clark @ 2023-03-20 14:43 UTC (permalink / raw)
  To: dri-devel
  Cc: Daniel Vetter, freedreno, linux-arm-msm, Rob Clark, Rob Clark,
	Abhinav Kumar, Dmitry Baryshkov, Sean Paul, David Airlie,
	Sumit Semwal, Christian König, open list,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

From: Rob Clark <robdclark@chromium.org>

Now that everything that controls which LRU an obj lives in *except* the
backing pages is protected by the LRU lock, add a special path to unpin
in the job_run() path, we we are assured that we already have backing
pages and will not be racing against eviction (because the GEM object's
dma_resv contains the fence that will be signaled when the submit/job
completes).

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_gem.c        | 44 +++++++++++++++++++++++-----
 drivers/gpu/drm/msm/msm_gem.h        |  1 +
 drivers/gpu/drm/msm/msm_ringbuffer.c |  4 +--
 3 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index d0ac3e704b66..9628e8d8dd02 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -61,18 +61,14 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
 	dma_unmap_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
 }
 
-static void update_lru_locked(struct drm_gem_object *obj)
+static void update_lru_active(struct drm_gem_object *obj)
 {
 	struct msm_drm_private *priv = obj->dev->dev_private;
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);
 
-	msm_gem_assert_locked(&msm_obj->base);
-
-	if (!msm_obj->pages) {
-		GEM_WARN_ON(msm_obj->pin_count);
+	GEM_WARN_ON(!msm_obj->pages);
 
-		drm_gem_lru_move_tail_locked(&priv->lru.unbacked, obj);
-	} else if (msm_obj->pin_count) {
+	if (msm_obj->pin_count) {
 		drm_gem_lru_move_tail_locked(&priv->lru.pinned, obj);
 	} else if (msm_obj->madv == MSM_MADV_WILLNEED) {
 		drm_gem_lru_move_tail_locked(&priv->lru.willneed, obj);
@@ -83,6 +79,22 @@ static void update_lru_locked(struct drm_gem_object *obj)
 	}
 }
 
+static void update_lru_locked(struct drm_gem_object *obj)
+{
+	struct msm_drm_private *priv = obj->dev->dev_private;
+	struct msm_gem_object *msm_obj = to_msm_bo(obj);
+
+	msm_gem_assert_locked(&msm_obj->base);
+
+	if (!msm_obj->pages) {
+		GEM_WARN_ON(msm_obj->pin_count);
+
+		drm_gem_lru_move_tail_locked(&priv->lru.unbacked, obj);
+	} else {
+		update_lru_active(obj);
+	}
+}
+
 static void update_lru(struct drm_gem_object *obj)
 {
 	struct msm_drm_private *priv = obj->dev->dev_private;
@@ -489,6 +501,24 @@ void msm_gem_unpin_locked(struct drm_gem_object *obj)
 	mutex_unlock(&priv->lru.lock);
 }
 
+/* Special unpin path for use in fence-signaling path, avoiding the need
+ * to hold the obj lock by only depending on things that a protected by
+ * the LRU lock.  In particular we know that that we already have backing
+ * and and that the object's dma_resv has the fence for the current
+ * submit/job which will prevent us racing against page eviction.
+ */
+void msm_gem_unpin_active(struct drm_gem_object *obj)
+{
+	struct msm_drm_private *priv = obj->dev->dev_private;
+	struct msm_gem_object *msm_obj = to_msm_bo(obj);
+
+	mutex_lock(&priv->lru.lock);
+	msm_obj->pin_count--;
+	GEM_WARN_ON(msm_obj->pin_count < 0);
+	update_lru_active(obj);
+	mutex_unlock(&priv->lru.lock);
+}
+
 struct msm_gem_vma *msm_gem_get_vma_locked(struct drm_gem_object *obj,
 					   struct msm_gem_address_space *aspace)
 {
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 0057e8e8fa13..2bd6846c83a9 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -128,6 +128,7 @@ struct msm_gem_object {
 uint64_t msm_gem_mmap_offset(struct drm_gem_object *obj);
 int msm_gem_pin_vma_locked(struct drm_gem_object *obj, struct msm_gem_vma *vma);
 void msm_gem_unpin_locked(struct drm_gem_object *obj);
+void msm_gem_unpin_active(struct drm_gem_object *obj);
 struct msm_gem_vma *msm_gem_get_vma_locked(struct drm_gem_object *obj,
 					   struct msm_gem_address_space *aspace);
 int msm_gem_get_iova(struct drm_gem_object *obj,
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 31b4fbf96c36..b60199184409 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -24,9 +24,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job)
 		struct drm_gem_object *obj = &submit->bos[i].obj->base;
 
 		msm_gem_vma_unpin_fenced(submit->bos[i].vma, fctx);
-		msm_gem_lock(obj);
-		msm_gem_unpin_locked(obj);
-		msm_gem_unlock(obj);
+		msm_gem_unpin_active(obj);
 		submit->bos[i].flags &= ~(BO_VMA_PINNED | BO_OBJ_PINNED);
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence
  2023-03-20 14:43 ` [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence Rob Clark
@ 2023-03-20 16:52   ` Christian König
  2023-03-20 20:32     ` Rob Clark
  0 siblings, 1 reply; 6+ messages in thread
From: Christian König @ 2023-03-20 16:52 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: Daniel Vetter, freedreno, linux-arm-msm, Rob Clark, Abhinav Kumar,
	Dmitry Baryshkov, Sean Paul, David Airlie, Sumit Semwal,
	open list, open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK



Am 20.03.23 um 15:43 schrieb Rob Clark:
> From: Rob Clark <robdclark@chromium.org>
>
> Avoid allocating memory in job_run() by pre-allocating the hw_fence.
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>   drivers/gpu/drm/msm/msm_fence.c      | 12 +++++++++---
>   drivers/gpu/drm/msm/msm_fence.h      |  3 ++-
>   drivers/gpu/drm/msm/msm_gem_submit.c |  7 +++++++
>   drivers/gpu/drm/msm/msm_ringbuffer.c |  2 +-
>   4 files changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> index 56641408ea74..bab3d84f1686 100644
> --- a/drivers/gpu/drm/msm/msm_fence.c
> +++ b/drivers/gpu/drm/msm/msm_fence.c
> @@ -99,7 +99,7 @@ static const struct dma_fence_ops msm_fence_ops = {
>   };
>   
>   struct dma_fence *
> -msm_fence_alloc(struct msm_fence_context *fctx)
> +msm_fence_alloc(void)
>   {
>   	struct msm_fence *f;
>   
> @@ -107,10 +107,16 @@ msm_fence_alloc(struct msm_fence_context *fctx)
>   	if (!f)
>   		return ERR_PTR(-ENOMEM);
>   
> +	return &f->base;
> +}
> +
> +void
> +msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
> +{
> +	struct msm_fence *f = to_msm_fence(fence);
> +
>   	f->fctx = fctx;
>   
>   	dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
>   		       fctx->context, ++fctx->last_fence);
> -
> -	return &f->base;
>   }
> diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
> index 7f1798c54cd1..f913fa22d8fe 100644
> --- a/drivers/gpu/drm/msm/msm_fence.h
> +++ b/drivers/gpu/drm/msm/msm_fence.h
> @@ -61,7 +61,8 @@ void msm_fence_context_free(struct msm_fence_context *fctx);
>   bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence);
>   void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
>   
> -struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx);
> +struct dma_fence * msm_fence_alloc(void);
> +void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx);
>   
>   static inline bool
>   fence_before(uint32_t a, uint32_t b)
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> index be4bf77103cd..2570c018b0cb 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -41,6 +41,13 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
>   	if (!submit)
>   		return ERR_PTR(-ENOMEM);
>   
> +	submit->hw_fence = msm_fence_alloc();
> +	if (IS_ERR(submit->hw_fence)) {
> +		ret = PTR_ERR(submit->hw_fence);
> +		kfree(submit);
> +		return ERR_PTR(ret);
> +	}
> +
>   	ret = drm_sched_job_init(&submit->base, queue->entity, queue);
>   	if (ret) {
>   		kfree(submit);

You probably need some error handling here or otherwise leak 
submit->hw_fence.

Apart from that looks good to me.

Christian.

> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index 57a8e9564540..a62b45e5a8c3 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -18,7 +18,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job)
>   	struct msm_gpu *gpu = submit->gpu;
>   	int i;
>   
> -	submit->hw_fence = msm_fence_alloc(fctx);
> +	msm_fence_init(submit->hw_fence, fctx);
>   
>   	for (i = 0; i < submit->nr_bos; i++) {
>   		struct drm_gem_object *obj = &submit->bos[i].obj->base;


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence
  2023-03-20 16:52   ` Christian König
@ 2023-03-20 20:32     ` Rob Clark
  0 siblings, 0 replies; 6+ messages in thread
From: Rob Clark @ 2023-03-20 20:32 UTC (permalink / raw)
  To: Christian König
  Cc: dri-devel, Daniel Vetter, freedreno, linux-arm-msm, Rob Clark,
	Abhinav Kumar, Dmitry Baryshkov, Sean Paul, David Airlie,
	Sumit Semwal, open list, open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Mon, Mar 20, 2023 at 9:52 AM Christian König
<christian.koenig@amd.com> wrote:
>
>
>
> Am 20.03.23 um 15:43 schrieb Rob Clark:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Avoid allocating memory in job_run() by pre-allocating the hw_fence.
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >   drivers/gpu/drm/msm/msm_fence.c      | 12 +++++++++---
> >   drivers/gpu/drm/msm/msm_fence.h      |  3 ++-
> >   drivers/gpu/drm/msm/msm_gem_submit.c |  7 +++++++
> >   drivers/gpu/drm/msm/msm_ringbuffer.c |  2 +-
> >   4 files changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> > index 56641408ea74..bab3d84f1686 100644
> > --- a/drivers/gpu/drm/msm/msm_fence.c
> > +++ b/drivers/gpu/drm/msm/msm_fence.c
> > @@ -99,7 +99,7 @@ static const struct dma_fence_ops msm_fence_ops = {
> >   };
> >
> >   struct dma_fence *
> > -msm_fence_alloc(struct msm_fence_context *fctx)
> > +msm_fence_alloc(void)
> >   {
> >       struct msm_fence *f;
> >
> > @@ -107,10 +107,16 @@ msm_fence_alloc(struct msm_fence_context *fctx)
> >       if (!f)
> >               return ERR_PTR(-ENOMEM);
> >
> > +     return &f->base;
> > +}
> > +
> > +void
> > +msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
> > +{
> > +     struct msm_fence *f = to_msm_fence(fence);
> > +
> >       f->fctx = fctx;
> >
> >       dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
> >                      fctx->context, ++fctx->last_fence);
> > -
> > -     return &f->base;
> >   }
> > diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
> > index 7f1798c54cd1..f913fa22d8fe 100644
> > --- a/drivers/gpu/drm/msm/msm_fence.h
> > +++ b/drivers/gpu/drm/msm/msm_fence.h
> > @@ -61,7 +61,8 @@ void msm_fence_context_free(struct msm_fence_context *fctx);
> >   bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence);
> >   void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
> >
> > -struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx);
> > +struct dma_fence * msm_fence_alloc(void);
> > +void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx);
> >
> >   static inline bool
> >   fence_before(uint32_t a, uint32_t b)
> > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> > index be4bf77103cd..2570c018b0cb 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > @@ -41,6 +41,13 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
> >       if (!submit)
> >               return ERR_PTR(-ENOMEM);
> >
> > +     submit->hw_fence = msm_fence_alloc();
> > +     if (IS_ERR(submit->hw_fence)) {
> > +             ret = PTR_ERR(submit->hw_fence);
> > +             kfree(submit);
> > +             return ERR_PTR(ret);
> > +     }
> > +
> >       ret = drm_sched_job_init(&submit->base, queue->entity, queue);
> >       if (ret) {
> >               kfree(submit);
>
> You probably need some error handling here or otherwise leak
> submit->hw_fence.

ah, right.. thx

BR,
-R

> Apart from that looks good to me.
>
> Christian.
>
> > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > index 57a8e9564540..a62b45e5a8c3 100644
> > --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> > @@ -18,7 +18,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job)
> >       struct msm_gpu *gpu = submit->gpu;
> >       int i;
> >
> > -     submit->hw_fence = msm_fence_alloc(fctx);
> > +     msm_fence_init(submit->hw_fence, fctx);
> >
> >       for (i = 0; i < submit->nr_bos; i++) {
> >               struct drm_gem_object *obj = &submit->bos[i].obj->base;
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: (subset) [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe
  2023-03-20 14:43 [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
  2023-03-20 14:43 ` [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence Rob Clark
  2023-03-20 14:43 ` [PATCH v2 09/23] drm/msm/gem: Avoid obj lock in job_run() Rob Clark
@ 2023-04-07 17:41 ` Bjorn Andersson
  2 siblings, 0 replies; 6+ messages in thread
From: Bjorn Andersson @ 2023-04-07 17:41 UTC (permalink / raw)
  To: dri-devel, Rob Clark
  Cc: Geert Uytterhoeven, Maximilian Luz, Dmitry Baryshkov,
	Akhil P Oommen, Konrad Dybcio, Douglas Anderson, Chia-I Wu,
	Luca Weiss, Sean Paul, Nathan Chancellor, Rafael J. Wysocki,
	open list:DEVICE FREQUENCY DEVFREQ, Rob Clark, Konrad Dybcio,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Daniel Vetter,
	freedreno, Joel Fernandes (Google), open list,
	open list:DMA BUFFER SHARING FRAMEWORK, linux-arm-msm

On Mon, 20 Mar 2023 07:43:22 -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> Inspired by https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffwll.ch/
> it seemed like a good idea to get rid of memory allocation in job_run()
> fence signaling path, and use lockdep annotations to yell at us about
> anything that could deadlock against shrinker/reclaim.  Anything that
> can trigger reclaim, or block on any other thread that has triggered
> reclaim, can block the GPU shrinker from releasing memory if it is
> waiting the job to complete, causing deadlock.
> 
> [...]

Applied, thanks!

[20/23] soc: qcom: smd-rpm: Use GFP_ATOMIC in write path
        commit: 5808c532ca0a983d643319caca44f2bcb148298f

Best regards,
-- 
Bjorn Andersson <andersson@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-04-07 17:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-03-20 14:43 [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Rob Clark
2023-03-20 14:43 ` [PATCH v2 01/23] drm/msm: Pre-allocate hw_fence Rob Clark
2023-03-20 16:52   ` Christian König
2023-03-20 20:32     ` Rob Clark
2023-03-20 14:43 ` [PATCH v2 09/23] drm/msm/gem: Avoid obj lock in job_run() Rob Clark
2023-04-07 17:41 ` (subset) [PATCH v2 00/23] drm/msm+PM+icc: Make job_run() reclaim-safe Bjorn Andersson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox