From: Sakari Ailus <sakari.ailus@linux.intel.com>
To: Hans Verkuil <hverkuil@xs4all.nl>,
Guennadi Liakhovetski <g.liakhovetski@gmx.de>,
Linux Media Mailing List <linux-media@vger.kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
Mauro Carvalho Chehab <mchehab@infradead.org>,
Aviv Greenberg <avivgr@gmail.com>
Subject: Re: per-frame camera metadata (again)
Date: Sat, 19 Dec 2015 02:06:29 +0200 [thread overview]
Message-ID: <56749F85.8000502@linux.intel.com> (raw)
In-Reply-To: <567136C6.8090009@xs4all.nl>
Hi Guennadi and Hans,
Hans Verkuil wrote:
> On 12/16/15 10:37, Guennadi Liakhovetski wrote:
>> Hi all,
>>
>> A project I am currently working on requires acquiring per-frame
>> metadata from the camera and passing it to user space. This is not the
>> first time this comes up and I know such discussions have been held
>> before. A typical user is Android (also my case), where you have to
>> provide to the user the parameter values that were used to capture a
>> specific frame. I know Hans is working to handle one side of this process -
>> sending per-request controls,
>
> Actually, the request framework can do both sides of the equation: giving
> back meta data in read-only controls that are per-frame. While ideally the
> driver would extract the information from the binary blob and put it in
> nice controls, it is also possible to make a control that just contains the
> binary blob itself. Whether that's a good approach depends on many factors
> and that's another topic.
I think that could be possible in some cases. If you don't have a lot of
metadata, then, sure.
>
>> but I'm not aware whether he or anyone else
>> is actively working on this already or is planning to do so in the near
>> future? I also know that several proprietary solutions have been
>> developed and are in use in various projects.
>>
>> I think the general agreement has been that such data has to be passed via
>> a buffer queue. But there are a few possibilities there too. Below are
>> some:
>>
>> 1. Multiplanar. A separate plane is dedicated to metadata. Pros: (a)
>> metadata is already associated with the specific frames it corresponds
>> to. Cons: (a) a correct implementation would specify the image plane fourcc
>> separately from any metadata plane format description, but we currently
>> don't support per-plane format specification.
>
> This only makes sense if the data actually comes in via DMA and if it is
> large enough to make it worth the effort of implementing this. As you say,
> it will require figuring out how to do per-plane fourccs.
>
> It also only makes sense if the metadata comes in at the same time as the
> frame.
I agree. Much of the time the metadata indeed arrives earlier than the
rest of the frame. Neither the frame layout nor the use cases should be
assumed in the bridge (ISP) driver which implements the interface, as
that essentially forces a particular layout on the user. This is a major
drawback of the approach.
That said, if you combine this with the ability to pass buffer data to
the user before the entire buffer is ready, i.e. as soon as a plane is
ready, you could get around this quite neatly.
However, if the DMA engine writing the metadata is different from the
one writing the image data to memory, then you have a plain metadata
buffer --- as it comes through a different video node. But there's
really nothing special about that then.
Conceptually we should support multi-part frames rather than just
metadata; metadata is only one use case in which a single DMA engine
outputs multiple kinds of data. This could be statistics as well, or
multiple images, e.g. YUV and RAW format images of the same frame.
With CSI-2, as the virtual channels are independent, one could start and
stop them at different times and the frame rate in those channels could
as well be unrelated. This suggests that different virtual channels
should be conceptually separate streams also in V4L2 and thus the data
from different streams should not end up in the same buffer.
Metadata rarely (if ever?) arrives on a separate virtual channel,
though. So this isn't something that necessarily needs to be taken into
account right now, but it's good to be aware of it.
>
>> 2. Separate buffer queues. Pros: (a) no need to extend multiplanar buffer
>> implementation. Cons: (a) more difficult synchronisation with image
>> frames, (b) still need to work out a way to specify the metadata version.
Do you think you would have different versions of metadata from a
sensor, for instance? Based on what I've seen, the formats tend to be
either sensor specific or SMIA, which defines a metadata type for each
bit depth on compliant sensors. Each metadata format should have a 4cc
code: one per SMIA bit depth, or a sensor-specific code where the
metadata is sensor specific.
Kinds of metadata other than what you get from sensors are not covered
by the thoughts above.
<URL:http://www.retiisi.org.uk/v4l2/foil/v4l2-multi-format.pdf>
I think I'd still favour separate buffer queues.
>>
>> Any further options? Of the above my choice would go with (1) but with a
>> dedicated metadata plane in struct vb2_buffer.
>
> 3. Use the request framework and return the metadata as control(s). Since controls
> can be associated with events when they change you can subscribe to such events.
> Note: currently I haven't implemented such events for request controls since I am
> not certain how it would be used, but this would be a good test case.
>
> Pros: (a) no need to extend multiplanar buffer implementation, (b) syncing up
> with the image frames should be easy (both use the same request ID), (c) a lot
> of freedom on how to export the metadata. Cons: (a) request framework is still
> work in progress (currently worked on by Laurent), (b) probably too slow for
> really large amounts of metadata, you'll need proper DMA handling for that in
> which case I would go for 2.
Agreed. You could also consider it a drawback that the number of new
controls required for this could be large, but then, for other reasons
as well, the second option mentioned would be the better implementation
anyway.
>
>>
>> In either of the above options we also need a way to tell the user what is
>> in the metadata buffer, its format. We could create new FOURCC codes for
>> them, perhaps as V4L2_META_FMT_... or the user space could identify the
>> metadata format based on the camera model and an opaque type (metadata
>> version code) value. Since metadata formats seem to be extremely camera-
>> specific, I'd go with the latter option.
I think I'd use separate 4cc codes for the metadata formats when they
really are different. There are plenty of possible 4cc codes we can use. :-)
Documenting the formats might be painful though.
>>
>> Comments extremely welcome.
>
> What I like about the request framework is that the driver can pick apart
> the metadata and turn it into well-defined controls. So the knowledge how
> to do that is in the place where it belongs. In cases where the metadata
> is simply too large for that to be feasible, I don't have much of an
> opinion. Camera + version could be enough. Although the same can just as
> easily be encoded as a fourcc (V4L2_META_FMT_OVXXXX_V1, _V2, etc). A fourcc
> is more consistent with the current API.
--
Kind regards,
Sakari Ailus
sakari.ailus@linux.intel.com