public inbox for linux-media@vger.kernel.org
 help / color / mirror / Atom feed
From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Kieran Bingham <kieran.bingham@ideasonboard.com>
Cc: linux-media@vger.kernel.org,
	Guennadi Liakhovetski <g.liakhovetski@gmx.de>,
	Olivier BRAUN <olivier.braun@stereolabs.com>,
	Troy Kisky <troy.kisky@boundarydevices.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Philipp Zabel <philipp.zabel@gmail.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 6/6] media: uvcvideo: Move decode processing to process context
Date: Tue, 31 Jul 2018 01:11:51 +0300	[thread overview]
Message-ID: <2606200.TATdTEd0YG@avalon> (raw)
In-Reply-To: <cae511f90085701e7093ce39dc8dabf8fc16b844.1522168131.git-series.kieran.bingham@ideasonboard.com>

Hi Kieran,

Thank you for the patch.

On Tuesday, 27 March 2018 19:46:03 EEST Kieran Bingham wrote:
> Newer high definition cameras, and cameras with multiple lenses such as
> the range of stereo-vision cameras now available have ever increasing
> data rates.
> 
> The inclusion of a variable length packet header in URB packets mean
> that we must memcpy the frame data out to our destination 'manually'.
> This can result in data rates of up to 2 gigabits per second being
> processed.
> 
> To improve efficiency, and maximise throughput, handle the URB decode
> processing through a work queue to move it from interrupt context, and
> allow multiple processors to work on URBs in parallel.
> 
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
> ---
> v2:
>  - Lock full critical section of usb_submit_urb()
> 
> v3:
>  - Fix race on submitting uvc_video_decode_data_work() to work queue.
>  - Rename uvc_decode_op -> uvc_copy_op (Generic to encode/decode)
>  - Rename decodes -> copy_operations
>  - Don't queue work if there is no async task
>  - obtain copy op structure directly in uvc_video_decode_data()
>  - uvc_video_decode_data_work() -> uvc_video_copy_data_work()
> 
> v4:
>  - Provide for_each_uvc_urb()
>  - Simplify fix for shutdown race to flush queue before freeing URBs

Indeed, v4 looks simpler, I like it.

>  - Rebase to v4.16-rc4 (linux-media/master) adjusting for metadata
>    conflicts.
> 
>  drivers/media/usb/uvc/uvc_video.c | 107 ++++++++++++++++++++++++-------
>  drivers/media/usb/uvc/uvcvideo.h  |  28 ++++++++-
>  2 files changed, 111 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/media/usb/uvc/uvc_video.c
> b/drivers/media/usb/uvc/uvc_video.c index 7dd0dcb457f3..a62e8caf367c 100644
> --- a/drivers/media/usb/uvc/uvc_video.c
> +++ b/drivers/media/usb/uvc/uvc_video.c
> @@ -1042,21 +1042,54 @@ static int uvc_video_decode_start(struct
> uvc_streaming *stream, return data[0];
>  }
> 
> -static void uvc_video_decode_data(struct uvc_streaming *stream,
> +/*
> + * uvc_video_decode_data_work: Asynchronous memcpy processing
> + *
> + * Perform memcpy tasks in process context, with completion handlers
> + * to return the URB, and buffer handles.

By "and buffer handles" do you mean you release the buffer reference ? I'd 
then write this as

"Copy URB data to video buffers in process context, releasing buffer 
references and requeuing the URB when done."

> + */
> +static void uvc_video_copy_data_work(struct work_struct *work)
> +{
> +	struct uvc_urb *uvc_urb = container_of(work, struct uvc_urb, work);
> +	unsigned int i;
> +	int ret;
> +
> +	for (i = 0; i < uvc_urb->async_operations; i++) {
> +		struct uvc_copy_op *op = &uvc_urb->copy_operations[i];
> +
> +		memcpy(op->dst, op->src, op->len);
> +
> +		/* Release reference taken on this buffer */

Missing period at the end of the sentence, here and in other locations in this 
patch.

> +		uvc_queue_buffer_release(op->buf);
> +	}
> +
> +	ret = usb_submit_urb(uvc_urb->urb, GFP_ATOMIC);
> +	if (ret < 0)
> +		uvc_printk(KERN_ERR, "Failed to resubmit video URB (%d).\n",
> +			   ret);
> +}
> +
> +static void uvc_video_decode_data(struct uvc_urb *uvc_urb,
>  		struct uvc_buffer *buf, const u8 *data, int len)
>  {
> -	unsigned int maxlen, nbytes;
> -	void *mem;
> +	unsigned int active_op = uvc_urb->async_operations;
> +	struct uvc_copy_op *decode = &uvc_urb->copy_operations[active_op];

How about calling the variable "op" as it's a copy operation ?

> +	unsigned int maxlen;
> 
>  	if (len <= 0)
>  		return;
> 
> -	/* Copy the video data to the buffer. */
>  	maxlen = buf->length - buf->bytesused;
> -	mem = buf->mem + buf->bytesused;
> -	nbytes = min((unsigned int)len, maxlen);
> -	memcpy(mem, data, nbytes);
> -	buf->bytesused += nbytes;
> +
> +	/* Take a buffer reference for async work */
> +	kref_get(&buf->ref);
> +
> +	decode->buf = buf;
> +	decode->src = data;
> +	decode->dst = buf->mem + buf->bytesused;
> +	decode->len = min_t(unsigned int, len, maxlen);
> +
> +	buf->bytesused += decode->len;
> 
>  	/* Complete the current frame if the buffer size was exceeded. */
>  	if (len > maxlen) {
> @@ -1064,6 +1097,8 @@ static void uvc_video_decode_data(struct uvc_streaming
> *stream, buf->error = 1;
>  		buf->state = UVC_BUF_STATE_READY;
>  	}
> +
> +	uvc_urb->async_operations++;
>  }
> 
>  static void uvc_video_decode_end(struct uvc_streaming *stream,
> @@ -1272,7 +1307,7 @@ static void uvc_video_decode_isoc(struct uvc_urb
> *uvc_urb, uvc_video_decode_meta(stream, meta_buf, mem, ret);
> 
>  		/* Decode the payload data. */
> -		uvc_video_decode_data(stream, buf, mem + ret,
> +		uvc_video_decode_data(uvc_urb, buf, mem + ret,
>  			urb->iso_frame_desc[i].actual_length - ret);
> 
>  		/* Process the header again. */
> @@ -1334,9 +1369,9 @@ static void uvc_video_decode_bulk(struct uvc_urb
> *uvc_urb, * sure buf is never dereferenced if NULL.
>  	 */
> 
> -	/* Process video data. */
> +	/* Prepare video data for processing. */
>  	if (!stream->bulk.skip_payload && buf != NULL)
> -		uvc_video_decode_data(stream, buf, mem, len);
> +		uvc_video_decode_data(uvc_urb, buf, mem, len);
> 
>  	/* Detect the payload end by a URB smaller than the maximum size (or
>  	 * a payload size equal to the maximum) and process the header again.
> @@ -1422,7 +1457,7 @@ static void uvc_video_complete(struct urb *urb)
>  		uvc_printk(KERN_WARNING, "Non-zero status (%d) in video "
>  			"completion handler.\n", urb->status);
>  		/* fall through */
> -	case -ENOENT:		/* usb_kill_urb() called. */
> +	case -ENOENT:		/* usb_poison_urb() called. */
>  		if (stream->frozen)
>  			return;
>  		/* fall through */
> @@ -1436,6 +1471,9 @@ static void uvc_video_complete(struct urb *urb)
> 
>  	buf = uvc_queue_get_current_buffer(queue);
> 
> +	/* Re-initialise the URB async work. */
> +	uvc_urb->async_operations = 0;
> +

I'd move this just after the vb2_qmeta block to keep the buf and buf_meta code 
together.

>  	if (vb2_qmeta) {
>  		spin_lock_irqsave(&qmeta->irqlock, flags);
>  		if (!list_empty(&qmeta->irqqueue))
> @@ -1444,12 +1482,24 @@ static void uvc_video_complete(struct urb *urb)
>  		spin_unlock_irqrestore(&qmeta->irqlock, flags);
>  	}
> 
> +	/*
> +	 * Process the URB headers, and optionally queue expensive memcpy tasks
> +	 * to be deferred to a work queue.
> +	 */
>  	stream->decode(uvc_urb, buf, buf_meta);
> 
> -	if ((ret = usb_submit_urb(urb, GFP_ATOMIC)) < 0) {
> -		uvc_printk(KERN_ERR, "Failed to resubmit video URB (%d).\n",
> -			ret);
> +	/* If no async work is needed, resubmit the URB immediately. */
> +	if (!uvc_urb->async_operations) {
> +		ret = usb_submit_urb(uvc_urb->urb, GFP_ATOMIC);
> +		if (ret < 0)
> +			uvc_printk(KERN_ERR,
> +				   "Failed to resubmit video URB (%d).\n",
> +				   ret);
> +		return;
>  	}
> +
> +	INIT_WORK(&uvc_urb->work, uvc_video_copy_data_work);

Do you need to reinit the work every time ? I thought that once after 
allocating it was enough. I could be wrong though.

> +	queue_work(stream->async_wq, &uvc_urb->work);
>  }
> 
>  /*
> @@ -1544,25 +1594,29 @@ static int uvc_alloc_urb_buffers(struct
> uvc_streaming *stream, */
>  static void uvc_uninit_video(struct uvc_streaming *stream, int
> free_buffers) {
> -	struct urb *urb;
> -	unsigned int i;
> +	struct uvc_urb *uvc_urb;
> 
>  	uvc_video_stats_stop(stream);
> 
> -	for (i = 0; i < UVC_URBS; ++i) {
> -		struct uvc_urb *uvc_urb = &stream->uvc_urb[i];
> +	/*
> +	 * We must poison the URBs rather than kill them to ensure that even
> +	 * after the completion handler returns, any asynchronous workqueues
> +	 * will be prevented from resubmitting the URBs

Missing period.

> +	 */
> +	for_each_uvc_urb(uvc_urb, stream)
> +		usb_poison_urb(uvc_urb->urb);
> 
> -		urb = uvc_urb->urb;
> -		if (urb == NULL)
> -			continue;
> +	flush_workqueue(stream->async_wq);
> 
> -		usb_kill_urb(urb);
> -		usb_free_urb(urb);
> +	for_each_uvc_urb(uvc_urb, stream) {
> +		usb_free_urb(uvc_urb->urb);
>  		uvc_urb->urb = NULL;
>  	}
> 
>  	if (free_buffers)
>  		uvc_free_urb_buffers(stream);
> +
> +	destroy_workqueue(stream->async_wq);
>  }
> 
>  /*
> @@ -1720,6 +1774,11 @@ static int uvc_init_video(struct uvc_streaming
> *stream, gfp_t gfp_flags)
> 
>  	uvc_video_stats_start(stream);
> 
> +	stream->async_wq = alloc_workqueue("uvcvideo", WQ_UNBOUND | WQ_HIGHPRI,
> +			0);
> +	if (!stream->async_wq)
> +		return -ENOMEM;
> +

Is there a reason to allocate the workqueue at stream on time and destroy it 
at stream off time ? Does it consume resources if we keep it allocated without 
work queued ?

>  	if (intf->num_altsetting > 1) {
>  		struct usb_host_endpoint *best_ep = NULL;
>  		unsigned int best_psize = UINT_MAX;
> diff --git a/drivers/media/usb/uvc/uvcvideo.h
> b/drivers/media/usb/uvc/uvcvideo.h index 112eed49bf50..27c230430eda 100644
> --- a/drivers/media/usb/uvc/uvcvideo.h
> +++ b/drivers/media/usb/uvc/uvcvideo.h
> @@ -485,12 +485,30 @@ struct uvc_stats_stream {
>  #define UVC_METATADA_BUF_SIZE 1024
> 
>  /**
> + * struct uvc_copy_op: Context structure to schedule asynchronous memcpy
> + *
> + * @buf: active buf object for this operation
> + * @dst: copy destination address
> + * @src: copy source address
> + * @len: copy length
> + */
> +struct uvc_copy_op {
> +	struct uvc_buffer *buf;
> +	void *dst;
> +	const __u8 *src;
> +	size_t len;
> +};
> +
> +/**
>   * struct uvc_urb - URB context management structure
>   *
>   * @urb: the URB described by this context structure
>   * @stream: UVC streaming context
>   * @buffer: memory storage for the URB
>   * @dma: DMA coherent addressing for the urb_buffer
> + * @async_operations: counter to indicate the number of copy operations
> + * @copy_operations: work descriptors for asynchronous copy operations
> + * @work: work queue entry for asynchronous decode
>   */
>  struct uvc_urb {
>  	struct urb *urb;
> @@ -498,6 +516,10 @@ struct uvc_urb {
> 
>  	char *buffer;
>  	dma_addr_t dma;
> +
> +	unsigned int async_operations;
> +	struct uvc_copy_op copy_operations[UVC_MAX_PACKETS];
> +	struct work_struct work;
>  };
> 
>  struct uvc_streaming {
> @@ -530,6 +552,7 @@ struct uvc_streaming {
>  	/* Buffers queue. */
>  	unsigned int frozen : 1;
>  	struct uvc_video_queue queue;
> +	struct workqueue_struct *async_wq;
>  	void (*decode)(struct uvc_urb *uvc_urb, struct uvc_buffer *buf,
>  		       struct uvc_buffer *meta_buf);
> 
> @@ -583,6 +606,11 @@ struct uvc_streaming {
>  	} clock;
>  };
> 
> +#define for_each_uvc_urb(uvc_urb, uvc_streaming) \
> +	for (uvc_urb = &uvc_streaming->uvc_urb[0]; \
> +	     uvc_urb < &uvc_streaming->uvc_urb[UVC_URBS]; \
> +	     ++uvc_urb)
> +
>  struct uvc_device {
>  	struct usb_device *udev;
>  	struct usb_interface *intf;

-- 
Regards,

Laurent Pinchart

  parent reply	other threads:[~2018-07-30 23:48 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-27 16:45 [PATCH v4 0/6] Asynchronous UVC Kieran Bingham
2018-03-27 16:45 ` [PATCH v4 1/6] media: uvcvideo: Refactor URB descriptors Kieran Bingham
2018-03-27 16:45 ` [PATCH v4 2/6] media: uvcvideo: Convert decode functions to use new context structure Kieran Bingham
2018-03-27 16:46 ` [PATCH v4 3/6] media: uvcvideo: Protect queue internals with helper Kieran Bingham
2018-03-27 16:46 ` [PATCH v4 4/6] media: uvcvideo: queue: Simplify spin-lock usage Kieran Bingham
2018-07-30 19:57   ` Laurent Pinchart
2018-11-06 15:10     ` Kieran Bingham
2018-03-27 16:46 ` [PATCH v4 5/6] media: uvcvideo: queue: Support asynchronous buffer handling Kieran Bingham
2018-03-27 16:46 ` [PATCH v4 6/6] media: uvcvideo: Move decode processing to process context Kieran Bingham
2018-06-04 12:09   ` Guennadi Liakhovetski
2018-06-04 12:34     ` Kieran Bingham
2018-07-30 22:11   ` Laurent Pinchart [this message]
2018-08-07  9:54   ` Tomasz Figa
2018-08-07 23:06     ` Laurent Pinchart
2018-08-07 23:13     ` Laurent Pinchart
2018-08-08  3:50       ` Tomasz Figa
2018-08-08 13:49     ` Kieran Bingham
2018-11-06 15:13     ` Kieran Bingham
2018-11-07  4:38       ` Tomasz Figa
2018-05-15 19:22 ` [PATCH v4 0/6] Asynchronous UVC Mauro Carvalho Chehab
2018-05-15 20:19   ` Kieran Bingham
2018-05-15 21:45     ` Mauro Carvalho Chehab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2606200.TATdTEd0YG@avalon \
    --to=laurent.pinchart@ideasonboard.com \
    --cc=g.liakhovetski@gmx.de \
    --cc=kieran.bingham@ideasonboard.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=olivier.braun@stereolabs.com \
    --cc=philipp.zabel@gmail.com \
    --cc=rdunlap@infradead.org \
    --cc=troy.kisky@boundarydevices.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox