Re: [PATCH v2 3/8] v4l: vsp1: Convert display lists to use new fragment pool

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Cc: linux-renesas-soc@vger.kernel.org, linux-media@vger.kernel.org
Subject: Re: [PATCH v2 3/8] v4l: vsp1: Convert display lists to use new fragment pool
Date: Thu, 17 Aug 2017 15:13:24 +0300	[thread overview]
Message-ID: <1922275.UObh22kbi7@avalon> (raw)
In-Reply-To: <b15a5776bf062f26bdc6ae580414cd9252d900e3.1502723341.git-series.kieran.bingham+renesas@ideasonboard.com>

Hi Kieran,

Thank you for the patch.

On Monday 14 Aug 2017 16:13:26 Kieran Bingham wrote:
> Adapt the dl->body0 object to use an object from the fragment pool.
> This greatly reduces the pressure on the TLB for IPMMU use cases, as
> all of the lists use a single allocation for the main body.
> 
> The CLU and LUT objects pre-allocate a pool containing two bodies,
> allowing a userspace update before the hardware has committed a previous
> set of tables.

I think you'll need three bodies, one for the DL queued to the hardware, one 
for the pending DL and one for the new DL needed when you update the LUT/CLU. 
Given that the VSP test suite hasn't caught this problem, we also need a new 
test :-)

> Fragments are no longer 'freed' in interrupt context, but instead
> released back to their respective pools.  This allows us to remove the
> garbage collector in the DLM.
> 
> Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
> 
> ---
> v2:
>  - Use dl->body0->max_entries to determine header offset, instead of the
>    global constant VSP1_DL_NUM_ENTRIES which is incorrect.
>  - squash updates for LUT, CLU, and fragment cleanup into single patch.
>    (Not fully bisectable when separated)
> ---
>  drivers/media/platform/vsp1/vsp1_clu.c |  22 ++-
>  drivers/media/platform/vsp1/vsp1_clu.h |   1 +-
>  drivers/media/platform/vsp1/vsp1_dl.c  | 223 +++++---------------------
>  drivers/media/platform/vsp1/vsp1_dl.h  |   3 +-
>  drivers/media/platform/vsp1/vsp1_lut.c |  23 ++-
>  drivers/media/platform/vsp1/vsp1_lut.h |   1 +-
>  6 files changed, 90 insertions(+), 183 deletions(-)

This is a nice diffstat, but only if you add kerneldoc for the new functions 
introduced in patch 2/8, otherwise the overall documentation diffstat looks 
bad :-)

> diff --git a/drivers/media/platform/vsp1/vsp1_clu.c
> b/drivers/media/platform/vsp1/vsp1_clu.c index f2fb26e5ab4e..52c523625e2f
> 100644
> --- a/drivers/media/platform/vsp1/vsp1_clu.c
> +++ b/drivers/media/platform/vsp1/vsp1_clu.c

[snip]

> @@ -288,6 +298,12 @@ struct vsp1_clu *vsp1_clu_create(struct vsp1_device
> *vsp1) if (ret < 0)
>  		return ERR_PTR(ret);
> 
> +	/* Allocate a fragment pool */

The comment would be more useful if you explained why you need to allocate a 
pool here. Same comment for the LUT.

> +	clu->pool = vsp1_dl_fragment_pool_alloc(clu->entity.vsp1, 2,
> +						CLU_SIZE + 1, 0);
> +	if (!clu->pool)
> +		return ERR_PTR(-ENOMEM);
> +
>  	/* Initialize the control handler. */
>  	v4l2_ctrl_handler_init(&clu->ctrls, 2);
>  	v4l2_ctrl_new_custom(&clu->ctrls, &clu_table_control, NULL);

[snip]

> diff --git a/drivers/media/platform/vsp1/vsp1_dl.c
> b/drivers/media/platform/vsp1/vsp1_dl.c index aab9dd6ec0eb..6ffdc3549283
> 100644
> --- a/drivers/media/platform/vsp1/vsp1_dl.c
> +++ b/drivers/media/platform/vsp1/vsp1_dl.c

[snip]


> @@ -379,41 +289,39 @@ static struct vsp1_dl_list *vsp1_dl_list_alloc(struct
> vsp1_dl_manager *dlm) INIT_LIST_HEAD(&dl->fragments);
>  	dl->dlm = dlm;
> 
> -	/*
> -	 * Initialize the display list body and allocate DMA memory for the 
body
> -	 * and the optional header. Both are allocated together to avoid 
memory
> -	 * fragmentation, with the header located right after the body in
> -	 * memory.
> -	 */
> -	header_size = dlm->mode == VSP1_DL_MODE_HEADER
> -		    ? ALIGN(sizeof(struct vsp1_dl_header), 8)
> -		    : 0;
> -
> -	ret = vsp1_dl_body_init(dlm->vsp1, &dl->body0, VSP1_DL_NUM_ENTRIES,
> -				header_size);
> -	if (ret < 0) {
> -		kfree(dl);
> +	/* Retrieve a body from our DLM body pool */
> +	dl->body0 = vsp1_dl_fragment_get(pool);
> +	if (!dl->body0)
>  		return NULL;
> -	}
> -
>  	if (dlm->mode == VSP1_DL_MODE_HEADER) {
> -		size_t header_offset = VSP1_DL_NUM_ENTRIES
> -				     * sizeof(*dl->body0.entries);
> +		size_t header_offset = dl->body0->max_entries
> +				     * sizeof(*dl->body0->entries);
> 
> -		dl->header = ((void *)dl->body0.entries) + header_offset;
> -		dl->dma = dl->body0.dma + header_offset;
> +		dl->header = ((void *)dl->body0->entries) + header_offset;
> +		dl->dma = dl->body0->dma + header_offset;
> 
>  		memset(dl->header, 0, sizeof(*dl->header));
> -		dl->header->lists[0].addr = dl->body0.dma;
> +		dl->header->lists[0].addr = dl->body0->dma;
>  	}
> 
>  	return dl;
>  }
> 
> +static void vsp1_dl_list_fragments_free(struct vsp1_dl_list *dl)

This function doesn't free fragments put puts them back to the free list. I'd 
call it vsp1_dl_list_fragments_put().

> +{
> +	struct vsp1_dl_body *dlb, *tmp;
> +
> +	list_for_each_entry_safe(dlb, tmp, &dl->fragments, list) {
> +		list_del(&dlb->list);
> +		vsp1_dl_fragment_put(dlb);
> +	}
> +}
> +
>  static void vsp1_dl_list_free(struct vsp1_dl_list *dl)
>  {
> -	vsp1_dl_body_cleanup(&dl->body0);
> -	list_splice_init(&dl->fragments, &dl->dlm->gc_fragments);
> +	vsp1_dl_fragment_put(dl->body0);
> +	vsp1_dl_list_fragments_free(dl);

I wonder whether the second line is actually needed. vsp1_dl_list_free() is 
called from vsp1_dlm_destroy() for every entry in the dlm->free list. A DL can 
only be put in that list by vsp1_dlm_create() or __vsp1_dl_list_put(). The 
former creates lists with no fragment, while the latter calls 
vsp1_dl_list_fragments_free() already.

If you're not entirely sure you could add a WARN_ON(!list_empty(&dl-
>fragments)) and run the test suite. A comment explaining why the fragments 
list should already be empty here would be useful too.

> +
>  	kfree(dl);
>  }
> 
> @@ -467,18 +375,10 @@ static void __vsp1_dl_list_put(struct vsp1_dl_list
> *dl)
> 
>  	dl->has_chain = false;
> 
> -	/*
> -	 * We can't free fragments here as DMA memory can only be freed in
> -	 * interruptible context. Move all fragments to the display list
> -	 * manager's list of fragments to be freed, they will be
> -	 * garbage-collected by the work queue.
> -	 */
> -	if (!list_empty(&dl->fragments)) {
> -		list_splice_init(&dl->fragments, &dl->dlm->gc_fragments);
> -		schedule_work(&dl->dlm->gc_work);
> -	}
> +	vsp1_dl_list_fragments_free(dl);
> 
> -	dl->body0.num_entries = 0;
> +	/* body0 is reused */

It would be useful to explain why. Maybe something like "body0 is reused as an 
optimization as every display list needs at least one body." ? And now I'm 
wondering it it's really a useful optimization :-)

> +	dl->body0->num_entries = 0;
> 
>  	list_add_tail(&dl->list, &dl->dlm->free);
>  }

[snip]

> @@ -898,13 +764,26 @@ struct vsp1_dl_manager *vsp1_dlm_create(struct
> vsp1_device *vsp1,
> 
>  	spin_lock_init(&dlm->lock);
>  	INIT_LIST_HEAD(&dlm->free);
> -	INIT_LIST_HEAD(&dlm->gc_fragments);
> -	INIT_WORK(&dlm->gc_work, vsp1_dlm_garbage_collect);
> +
> +	/*
> +	 * Initialize the display list body and allocate DMA memory for the 
body
> +	 * and the optional header. Both are allocated together to avoid 
memory
> +	 * fragmentation, with the header located right after the body in
> +	 * memory.
> +	 */

Nice to see you're keeping this comment, but maybe you want to update it 
according to the code changes ;-)

> +	header_size = dlm->mode == VSP1_DL_MODE_HEADER
> +		    ? ALIGN(sizeof(struct vsp1_dl_header), 8)
> +		    : 0;
> +
> +	dlm->pool = vsp1_dl_fragment_pool_alloc(vsp1, prealloc,
> +					VSP1_DL_NUM_ENTRIES, header_size);
> +	if (!dlm->pool)
> +		return NULL;
> 
>  	for (i = 0; i < prealloc; ++i) {
>  		struct vsp1_dl_list *dl;
> 
> -		dl = vsp1_dl_list_alloc(dlm);
> +		dl = vsp1_dl_list_alloc(dlm, dlm->pool);
>  		if (!dl)
>  			return NULL;
> 

[snip]

-- 
Regards,

Laurent Pinchart

next prev parent reply	other threads:[~2017-08-17 12:13 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-14 15:13 [PATCH v2 0/8] vsp1: TLB optimisation and DL caching Kieran Bingham
2017-08-14 15:13 ` [PATCH v2 1/8] v4l: vsp1: Protect fragments against overflow Kieran Bingham
2017-08-16 21:53   ` Laurent Pinchart
2017-08-17  8:16     ` Kieran Bingham
2017-08-14 15:13 ` [PATCH v2 2/8] v4l: vsp1: Provide a fragment pool Kieran Bingham
2017-08-17 12:13   ` Laurent Pinchart
2017-09-11 20:30     ` Kieran Bingham
2017-09-13  2:15       ` Laurent Pinchart
2017-08-14 15:13 ` [PATCH v2 3/8] v4l: vsp1: Convert display lists to use new " Kieran Bingham
2017-08-17 12:13   ` Laurent Pinchart [this message]
2017-09-11 20:27     ` Kieran Bingham
2017-09-13  2:26       ` Laurent Pinchart
2017-08-14 15:13 ` [PATCH v2 4/8] v4l: vsp1: Use reference counting for fragments Kieran Bingham
2017-08-17 12:53   ` Laurent Pinchart
2017-08-14 15:13 ` [PATCH v2 5/8] v4l: vsp1: Refactor display list configure operations Kieran Bingham
2017-08-17 18:13   ` Laurent Pinchart
2017-09-11 21:16     ` Kieran Bingham
2017-09-12 19:19       ` Laurent Pinchart
2017-11-17 15:07         ` Kieran Bingham
2018-02-28 16:41           ` Kieran Bingham
2018-02-28 21:04             ` Laurent Pinchart
2017-08-14 15:13 ` [PATCH v2 6/8] v4l: vsp1: Adapt entities to configure into a body Kieran Bingham
2017-08-17 17:58   ` Laurent Pinchart
2017-09-11 21:42     ` Kieran Bingham
2017-09-12 19:18       ` Laurent Pinchart
2017-11-17 13:40         ` Kieran Bingham
2017-08-14 15:13 ` [PATCH v2 7/8] v4l: vsp1: Move video configuration to a cached dlb Kieran Bingham
2017-08-17 18:10   ` Laurent Pinchart
2017-11-16 18:19     ` Kieran Bingham
2017-08-14 15:13 ` [PATCH v2 8/8] v4l: vsp1: Reduce display list body size Kieran Bingham
2017-08-17 16:11   ` Laurent Pinchart
2017-09-11 21:15     ` Kieran Bingham

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1922275.UObh22kbi7@avalon \
    --to=laurent.pinchart@ideasonboard.com \
    --cc=kieran.bingham+renesas@ideasonboard.com \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-renesas-soc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.