* DRM device memory writeback (Mali-DP)
@ 2016-07-14 17:03 Brian Starkey
2016-07-15 7:33 ` Daniel Vetter
0 siblings, 1 reply; 9+ messages in thread
From: Brian Starkey @ 2016-07-14 17:03 UTC (permalink / raw)
To: dri-devel, linux-media; +Cc: david.brown, liviu.dudau
Hi,
The Mali-DP display processors have a memory-writeback engine which
can write the result of the composition (CRTC output) to a memory
buffer in a variety of formats.
We're looking for feedback/suggestions on how to expose this in the
mali-dp DRM kernel driver - possibly via V4L2.
We've got a few use cases where writeback is useful:
- testing, to check the displayed image
- screen recording
- wireless display (e.g. Miracast)
- dual-display clone mode
- memory-to-memory composition
Note that the HW is capable of writing one of the input planes instead
of the CRTC output, but we've no good use-case for wanting to expose
that.
In our Android ADF driver[1] we exposed the memory write engine as
part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
doesn't have any similar support for memory output from CRTCs, but we
want to expose the functionality in the mainline Mali-DP DRM driver.
A previous discussion on the topic went towards exposing the
memory-write engine via V4L2[2].
I'm thinking to userspace this would look like two distinct devices:
- A DRM KMS display controller.
- A V4L2 video source.
They'd both exist in the same kernel driver.
A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
the DRM driver would see if there's a buffer to dequeue every time a
new modeset is received via the DRM API - if so, it can configure the
HW to dump into it (one-shot operation).
An implication of this is that if the screen is actively displaying a
static scene and the V4L2 client queues up a buffer, it won't get
filled until the DRM scene changes. This seems best, otherwise the
V4L2 driver has to change the HW configuration out-of-band from the
DRM device which sounds horribly racy.
One further complication is scaling. Our HW has a scaler which can
tasked with either scaling an input plane or the buffer being written
to memory, but not both at the same time. This means we need to
arbitrate the scaler between the DRM device (scaling input planes) and
the V4L2 device (scaling output buffers).
I think the simplest approach here is to allow V4L2 to "claim" the
scaler by setting the image size (VIDIOC_S_FMT) to something other
than the CRTC's current resolution. After that, any attempt to use the
scaler on an input plane via DRM should fail atomic_check().
If the V4L2 client goes away or sets the image size to the CRTC's
native resolution, then the DRM device is allowed to use the scaler.
I don't know if/how the DRM device should communicate to userspace
that the scaler is or isn't available for use.
Any thoughts on this approach?
Is it acceptable to both V4L2 and DRM folks?
Thanks for your time,
-Brian
[1] http://malideveloper.arm.com/resources/drivers/open-source-mali-dp-adf-kernel-device-drivers/
[2] https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2016-05-04
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-14 17:03 DRM device memory writeback (Mali-DP) Brian Starkey
@ 2016-07-15 7:33 ` Daniel Vetter
2016-07-15 9:09 ` Brian Starkey
0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2016-07-15 7:33 UTC (permalink / raw)
To: Brian Starkey; +Cc: dri-devel, linux-media, liviu.dudau
On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
> Hi,
>
> The Mali-DP display processors have a memory-writeback engine which
> can write the result of the composition (CRTC output) to a memory
> buffer in a variety of formats.
>
> We're looking for feedback/suggestions on how to expose this in the
> mali-dp DRM kernel driver - possibly via V4L2.
>
> We've got a few use cases where writeback is useful:
> - testing, to check the displayed image
This might or might not need a separate interface. There are efforts to
make the intel kms validation tests in i-g-t generic (well under way
already), and part of that is creating a generic infrastructure to capture
display CRCs for functional tests (still in progress).
But it might be better if userspace abstracts between full readback and
display CRC, assuming we can make full writeback cross-vendor enough for
that use-case.
> - screen recording
> - wireless display (e.g. Miracast)
> - dual-display clone mode
> - memory-to-memory composition
> Note that the HW is capable of writing one of the input planes instead
> of the CRTC output, but we've no good use-case for wanting to expose
> that.
>
> In our Android ADF driver[1] we exposed the memory write engine as
> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
> doesn't have any similar support for memory output from CRTCs, but we
> want to expose the functionality in the mainline Mali-DP DRM driver.
>
> A previous discussion on the topic went towards exposing the
> memory-write engine via V4L2[2].
>
> I'm thinking to userspace this would look like two distinct devices:
> - A DRM KMS display controller.
> - A V4L2 video source.
> They'd both exist in the same kernel driver.
> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
> the DRM driver would see if there's a buffer to dequeue every time a
> new modeset is received via the DRM API - if so, it can configure the
> HW to dump into it (one-shot operation).
>
> An implication of this is that if the screen is actively displaying a
> static scene and the V4L2 client queues up a buffer, it won't get
> filled until the DRM scene changes. This seems best, otherwise the
> V4L2 driver has to change the HW configuration out-of-band from the
> DRM device which sounds horribly racy.
>
> One further complication is scaling. Our HW has a scaler which can
> tasked with either scaling an input plane or the buffer being written
> to memory, but not both at the same time. This means we need to
> arbitrate the scaler between the DRM device (scaling input planes) and
> the V4L2 device (scaling output buffers).
>
> I think the simplest approach here is to allow V4L2 to "claim" the
> scaler by setting the image size (VIDIOC_S_FMT) to something other
> than the CRTC's current resolution. After that, any attempt to use the
> scaler on an input plane via DRM should fail atomic_check().
That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
locking must integrate into the drm locking, but that should be doable.
Worst case you must shadow all v4l locks with a wait/wound
drm_modeset_lock to avoid deadlocks (since you could try to grab locks
from either end).
> If the V4L2 client goes away or sets the image size to the CRTC's
> native resolution, then the DRM device is allowed to use the scaler.
> I don't know if/how the DRM device should communicate to userspace
> that the scaler is or isn't available for use.
>
> Any thoughts on this approach?
> Is it acceptable to both V4L2 and DRM folks?
For streaming a V4L2 capture device seems like the right interface. But if
you want to use writeback in your compositor you must know which atomic
kms update results in which frame, since if you don't you can't use that
composited buffer for the next frame reliable.
For that case I think a drm-only solution would be better, to make sure
you can do an atomic update and writeback in one step. v4l seems to grow
an atomic api of its own, but I don't think we'll have one spanning
subsystems anytime soon.
For the kms-only interface the idea was to add a property on the crtc
where you can attach a writeback drm_framebuffer. Extending that idea to
the drm->v4l case we could create special drm_framebuffer objects
representing a v4l sink, and attach them to the same property. That would
also solve the problem of getting some agreement on buffer metadata
between v4l and drm side.
Laurent had some poc patches a while ago for this, he's definitely the one
to ping.
-Daniel
>
> Thanks for your time,
>
> -Brian
>
> [1] http://malideveloper.arm.com/resources/drivers/open-source-mali-dp-adf-kernel-device-drivers/
> [2] https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2016-05-04
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-15 7:33 ` Daniel Vetter
@ 2016-07-15 9:09 ` Brian Starkey
2016-07-15 10:57 ` Ville Syrjälä
0 siblings, 1 reply; 9+ messages in thread
From: Brian Starkey @ 2016-07-15 9:09 UTC (permalink / raw)
To: Daniel Vetter; +Cc: dri-devel, linux-media, liviu.dudau, laurent.pinchart
Hi Daniel,
Thanks for taking a look.
(+Cc Laurent)
On Fri, Jul 15, 2016 at 09:33:34AM +0200, Daniel Vetter wrote:
>On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
>> Hi,
>>
>> The Mali-DP display processors have a memory-writeback engine which
>> can write the result of the composition (CRTC output) to a memory
>> buffer in a variety of formats.
>>
>> We're looking for feedback/suggestions on how to expose this in the
>> mali-dp DRM kernel driver - possibly via V4L2.
>>
>> We've got a few use cases where writeback is useful:
>> - testing, to check the displayed image
>
>This might or might not need a separate interface. There are efforts to
>make the intel kms validation tests in i-g-t generic (well under way
>already), and part of that is creating a generic infrastructure to capture
>display CRCs for functional tests (still in progress).
>
>But it might be better if userspace abstracts between full readback and
>display CRC, assuming we can make full writeback cross-vendor enough for
>that use-case.
>
I'd lean towards the userspace abstraction.
Encumbering a simple CRC interface with all the complexity of
full-writeback (size, scaling, pixel format, multi-planar etc.) sounds
a bit unnecessary.
Of course, if v4l2 isn't going to be the cross-vendor full-writeback
implementation, then we need to be aiming to use whatever _is_ in
the mali-dp driver.
>> - screen recording
>> - wireless display (e.g. Miracast)
>> - dual-display clone mode
>> - memory-to-memory composition
>> Note that the HW is capable of writing one of the input planes instead
>> of the CRTC output, but we've no good use-case for wanting to expose
>> that.
>>
>> In our Android ADF driver[1] we exposed the memory write engine as
>> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
>> doesn't have any similar support for memory output from CRTCs, but we
>> want to expose the functionality in the mainline Mali-DP DRM driver.
>>
>> A previous discussion on the topic went towards exposing the
>> memory-write engine via V4L2[2].
>>
>> I'm thinking to userspace this would look like two distinct devices:
>> - A DRM KMS display controller.
>> - A V4L2 video source.
>> They'd both exist in the same kernel driver.
>> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
>> the DRM driver would see if there's a buffer to dequeue every time a
>> new modeset is received via the DRM API - if so, it can configure the
>> HW to dump into it (one-shot operation).
>>
>> An implication of this is that if the screen is actively displaying a
>> static scene and the V4L2 client queues up a buffer, it won't get
>> filled until the DRM scene changes. This seems best, otherwise the
>> V4L2 driver has to change the HW configuration out-of-band from the
>> DRM device which sounds horribly racy.
>>
>> One further complication is scaling. Our HW has a scaler which can
>> tasked with either scaling an input plane or the buffer being written
>> to memory, but not both at the same time. This means we need to
>> arbitrate the scaler between the DRM device (scaling input planes) and
>> the V4L2 device (scaling output buffers).
>>
>> I think the simplest approach here is to allow V4L2 to "claim" the
>> scaler by setting the image size (VIDIOC_S_FMT) to something other
>> than the CRTC's current resolution. After that, any attempt to use the
>> scaler on an input plane via DRM should fail atomic_check().
>
>That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
>locking must integrate into the drm locking, but that should be doable.
>Worst case you must shadow all v4l locks with a wait/wound
>drm_modeset_lock to avoid deadlocks (since you could try to grab locks
>from either end).
>
Yes, I haven't looked at the details of the locking but I'm hoping
it's manageable.
>> If the V4L2 client goes away or sets the image size to the CRTC's
>> native resolution, then the DRM device is allowed to use the scaler.
>> I don't know if/how the DRM device should communicate to userspace
>> that the scaler is or isn't available for use.
>>
>> Any thoughts on this approach?
>> Is it acceptable to both V4L2 and DRM folks?
>
>For streaming a V4L2 capture device seems like the right interface. But if
>you want to use writeback in your compositor you must know which atomic
>kms update results in which frame, since if you don't you can't use that
>composited buffer for the next frame reliable.
>
>For that case I think a drm-only solution would be better, to make sure
>you can do an atomic update and writeback in one step. v4l seems to grow
>an atomic api of its own, but I don't think we'll have one spanning
>subsystems anytime soon.
>
I've been thinking about this from the point of view of a HWComposer
implementation and I think the hybrid DRM-V4L2 device would work OK.
However it depends on the behaviour I mentioned above:
>> if the screen is actively displaying a
>> static scene and the V4L2 client queues up a buffer, it won't get
>> filled until the DRM scene changes.
V4L2 buffer queues are FIFO, so as long as the compositor queues only
one V4L2 buffer per atomic update, there's no ambiguity.
In the most simplistic case the compositor would alternate between:
- Queue V4L2 buffer
- DRM atomic update
... and dequeue either in the same thread or a different one. As long
as the compositor keeps track of how many buffers it has queued and
how many atomic updates it's made, it doesn't really matter.
We'd probably be looking to add in V4L2 asynchronous dequeue using
fences for synchronisation, but that's a separate issue.
>For the kms-only interface the idea was to add a property on the crtc
>where you can attach a writeback drm_framebuffer. Extending that idea to
>the drm->v4l case we could create special drm_framebuffer objects
>representing a v4l sink, and attach them to the same property. That would
>also solve the problem of getting some agreement on buffer metadata
>between v4l and drm side.
>
I think a drm_framebuffer on its own wouldn't be enough to handle our
scaling case - at that point it starts to look more like a plane.
However, if "special" CRTC sinks became a thing it could allow us to
chain our writeback output to another CRTC's input (via memory)
without a trip through userspace, which would be nice.
>Laurent had some poc patches a while ago for this, he's definitely the one
>to ping.
I've had a little look at: https://patchwork.kernel.org/patch/6026611/
It looks pretty similar to what I was hoping to do.
We have the advantage of being able to update the display scene and
writeback buffer together atomically in hardware, and to write-back in
a one-shot mode. This lets us make the DRM update queue and V4L2
buffer queues advance in lock-step.
>-Daniel
Thanks,
-Brian
>
>>
>> Thanks for your time,
>>
>> -Brian
>>
>> [1] http://malideveloper.arm.com/resources/drivers/open-source-mali-dp-adf-kernel-device-drivers/
>> [2] https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2016-05-04
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
>--
>Daniel Vetter
>Software Engineer, Intel Corporation
>http://blog.ffwll.ch
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-15 9:09 ` Brian Starkey
@ 2016-07-15 10:57 ` Ville Syrjälä
2016-07-15 17:42 ` Eric Anholt
0 siblings, 1 reply; 9+ messages in thread
From: Ville Syrjälä @ 2016-07-15 10:57 UTC (permalink / raw)
To: Brian Starkey
Cc: Daniel Vetter, liviu.dudau, laurent.pinchart, dri-devel,
linux-media
On Fri, Jul 15, 2016 at 10:09:19AM +0100, Brian Starkey wrote:
> Hi Daniel,
>
> Thanks for taking a look.
>
> (+Cc Laurent)
>
> On Fri, Jul 15, 2016 at 09:33:34AM +0200, Daniel Vetter wrote:
> >On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
> >> Hi,
> >>
> >> The Mali-DP display processors have a memory-writeback engine which
> >> can write the result of the composition (CRTC output) to a memory
> >> buffer in a variety of formats.
> >>
> >> We're looking for feedback/suggestions on how to expose this in the
> >> mali-dp DRM kernel driver - possibly via V4L2.
> >>
> >> We've got a few use cases where writeback is useful:
> >> - testing, to check the displayed image
> >
> >This might or might not need a separate interface. There are efforts to
> >make the intel kms validation tests in i-g-t generic (well under way
> >already), and part of that is creating a generic infrastructure to capture
> >display CRCs for functional tests (still in progress).
> >
> >But it might be better if userspace abstracts between full readback and
> >display CRC, assuming we can make full writeback cross-vendor enough for
> >that use-case.
> >
>
> I'd lean towards the userspace abstraction.
> Encumbering a simple CRC interface with all the complexity of
> full-writeback (size, scaling, pixel format, multi-planar etc.) sounds
> a bit unnecessary.
>
> Of course, if v4l2 isn't going to be the cross-vendor full-writeback
> implementation, then we need to be aiming to use whatever _is_ in
> the mali-dp driver.
>
> >> - screen recording
> >> - wireless display (e.g. Miracast)
> >> - dual-display clone mode
> >> - memory-to-memory composition
> >> Note that the HW is capable of writing one of the input planes instead
> >> of the CRTC output, but we've no good use-case for wanting to expose
> >> that.
> >>
> >> In our Android ADF driver[1] we exposed the memory write engine as
> >> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
> >> doesn't have any similar support for memory output from CRTCs, but we
> >> want to expose the functionality in the mainline Mali-DP DRM driver.
> >>
> >> A previous discussion on the topic went towards exposing the
> >> memory-write engine via V4L2[2].
> >>
> >> I'm thinking to userspace this would look like two distinct devices:
> >> - A DRM KMS display controller.
> >> - A V4L2 video source.
> >> They'd both exist in the same kernel driver.
> >> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
> >> the DRM driver would see if there's a buffer to dequeue every time a
> >> new modeset is received via the DRM API - if so, it can configure the
> >> HW to dump into it (one-shot operation).
> >>
> >> An implication of this is that if the screen is actively displaying a
> >> static scene and the V4L2 client queues up a buffer, it won't get
> >> filled until the DRM scene changes. This seems best, otherwise the
> >> V4L2 driver has to change the HW configuration out-of-band from the
> >> DRM device which sounds horribly racy.
> >>
> >> One further complication is scaling. Our HW has a scaler which can
> >> tasked with either scaling an input plane or the buffer being written
> >> to memory, but not both at the same time. This means we need to
> >> arbitrate the scaler between the DRM device (scaling input planes) and
> >> the V4L2 device (scaling output buffers).
> >>
> >> I think the simplest approach here is to allow V4L2 to "claim" the
> >> scaler by setting the image size (VIDIOC_S_FMT) to something other
> >> than the CRTC's current resolution. After that, any attempt to use the
> >> scaler on an input plane via DRM should fail atomic_check().
> >
> >That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
> >locking must integrate into the drm locking, but that should be doable.
> >Worst case you must shadow all v4l locks with a wait/wound
> >drm_modeset_lock to avoid deadlocks (since you could try to grab locks
> >from either end).
> >
>
> Yes, I haven't looked at the details of the locking but I'm hoping
> it's manageable.
>
> >> If the V4L2 client goes away or sets the image size to the CRTC's
> >> native resolution, then the DRM device is allowed to use the scaler.
> >> I don't know if/how the DRM device should communicate to userspace
> >> that the scaler is or isn't available for use.
> >>
> >> Any thoughts on this approach?
> >> Is it acceptable to both V4L2 and DRM folks?
> >
> >For streaming a V4L2 capture device seems like the right interface. But if
> >you want to use writeback in your compositor you must know which atomic
> >kms update results in which frame, since if you don't you can't use that
> >composited buffer for the next frame reliable.
> >
> >For that case I think a drm-only solution would be better, to make sure
> >you can do an atomic update and writeback in one step. v4l seems to grow
> >an atomic api of its own, but I don't think we'll have one spanning
> >subsystems anytime soon.
> >
>
> I've been thinking about this from the point of view of a HWComposer
> implementation and I think the hybrid DRM-V4L2 device would work OK.
> However it depends on the behaviour I mentioned above:
>
> >> if the screen is actively displaying a
> >> static scene and the V4L2 client queues up a buffer, it won't get
> >> filled until the DRM scene changes.
>
> V4L2 buffer queues are FIFO, so as long as the compositor queues only
> one V4L2 buffer per atomic update, there's no ambiguity.
> In the most simplistic case the compositor would alternate between:
> - Queue V4L2 buffer
> - DRM atomic update
> ... and dequeue either in the same thread or a different one. As long
> as the compositor keeps track of how many buffers it has queued and
> how many atomic updates it's made, it doesn't really matter.
>
> We'd probably be looking to add in V4L2 asynchronous dequeue using
> fences for synchronisation, but that's a separate issue.
>
> >For the kms-only interface the idea was to add a property on the crtc
> >where you can attach a writeback drm_framebuffer. Extending that idea to
> >the drm->v4l case we could create special drm_framebuffer objects
> >representing a v4l sink, and attach them to the same property. That would
> >also solve the problem of getting some agreement on buffer metadata
> >between v4l and drm side.
> >
>
> I think a drm_framebuffer on its own wouldn't be enough to handle our
> scaling case - at that point it starts to look more like a plane.
> However, if "special" CRTC sinks became a thing it could allow us to
> chain our writeback output to another CRTC's input (via memory)
> without a trip through userspace, which would be nice.
My thinking has been that we could represent the writeback sink
as a connector. Combine that with my other idea of exposing a
fixed mode property for each connector (to make output scaling
user configurable), and we should end up with a way to control
the scaling of the writeback path.
Also it would mean there's always a connector attached to the
crtc, even for the pure writeback only case, which I think should
result in less problems in general as I'm sure there are a lot of
assumptions in the code that there must be at least one connector
for an active crtc.
> >Laurent had some poc patches a while ago for this, he's definitely the one
> >to ping.
>
> I've had a little look at: https://patchwork.kernel.org/patch/6026611/
> It looks pretty similar to what I was hoping to do.
>
> We have the advantage of being able to update the display scene and
> writeback buffer together atomically in hardware, and to write-back in
> a one-shot mode. This lets us make the DRM update queue and V4L2
> buffer queues advance in lock-step.
>
> >-Daniel
>
> Thanks,
> -Brian
> >
> >>
> >> Thanks for your time,
> >>
> >> -Brian
> >>
> >> [1] http://malideveloper.arm.com/resources/drivers/open-source-mali-dp-adf-kernel-device-drivers/
> >> [2] https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2016-05-04
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> >--
> >Daniel Vetter
> >Software Engineer, Intel Corporation
> >http://blog.ffwll.ch
> >
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-15 10:57 ` Ville Syrjälä
@ 2016-07-15 17:42 ` Eric Anholt
2016-07-18 10:29 ` Brian Starkey
0 siblings, 1 reply; 9+ messages in thread
From: Eric Anholt @ 2016-07-15 17:42 UTC (permalink / raw)
To: Ville Syrjälä, Brian Starkey
Cc: dri-devel, liviu.dudau, laurent.pinchart, linux-media
[-- Attachment #1: Type: text/plain, Size: 8094 bytes --]
Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> On Fri, Jul 15, 2016 at 10:09:19AM +0100, Brian Starkey wrote:
>> Hi Daniel,
>>
>> Thanks for taking a look.
>>
>> (+Cc Laurent)
>>
>> On Fri, Jul 15, 2016 at 09:33:34AM +0200, Daniel Vetter wrote:
>> >On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
>> >> Hi,
>> >>
>> >> The Mali-DP display processors have a memory-writeback engine which
>> >> can write the result of the composition (CRTC output) to a memory
>> >> buffer in a variety of formats.
>> >>
>> >> We're looking for feedback/suggestions on how to expose this in the
>> >> mali-dp DRM kernel driver - possibly via V4L2.
>> >>
>> >> We've got a few use cases where writeback is useful:
>> >> - testing, to check the displayed image
>> >
>> >This might or might not need a separate interface. There are efforts to
>> >make the intel kms validation tests in i-g-t generic (well under way
>> >already), and part of that is creating a generic infrastructure to capture
>> >display CRCs for functional tests (still in progress).
>> >
>> >But it might be better if userspace abstracts between full readback and
>> >display CRC, assuming we can make full writeback cross-vendor enough for
>> >that use-case.
>> >
>>
>> I'd lean towards the userspace abstraction.
>> Encumbering a simple CRC interface with all the complexity of
>> full-writeback (size, scaling, pixel format, multi-planar etc.) sounds
>> a bit unnecessary.
>>
>> Of course, if v4l2 isn't going to be the cross-vendor full-writeback
>> implementation, then we need to be aiming to use whatever _is_ in
>> the mali-dp driver.
>>
>> >> - screen recording
>> >> - wireless display (e.g. Miracast)
>> >> - dual-display clone mode
>> >> - memory-to-memory composition
>> >> Note that the HW is capable of writing one of the input planes instead
>> >> of the CRTC output, but we've no good use-case for wanting to expose
>> >> that.
>> >>
>> >> In our Android ADF driver[1] we exposed the memory write engine as
>> >> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
>> >> doesn't have any similar support for memory output from CRTCs, but we
>> >> want to expose the functionality in the mainline Mali-DP DRM driver.
>> >>
>> >> A previous discussion on the topic went towards exposing the
>> >> memory-write engine via V4L2[2].
>> >>
>> >> I'm thinking to userspace this would look like two distinct devices:
>> >> - A DRM KMS display controller.
>> >> - A V4L2 video source.
>> >> They'd both exist in the same kernel driver.
>> >> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
>> >> the DRM driver would see if there's a buffer to dequeue every time a
>> >> new modeset is received via the DRM API - if so, it can configure the
>> >> HW to dump into it (one-shot operation).
>> >>
>> >> An implication of this is that if the screen is actively displaying a
>> >> static scene and the V4L2 client queues up a buffer, it won't get
>> >> filled until the DRM scene changes. This seems best, otherwise the
>> >> V4L2 driver has to change the HW configuration out-of-band from the
>> >> DRM device which sounds horribly racy.
>> >>
>> >> One further complication is scaling. Our HW has a scaler which can
>> >> tasked with either scaling an input plane or the buffer being written
>> >> to memory, but not both at the same time. This means we need to
>> >> arbitrate the scaler between the DRM device (scaling input planes) and
>> >> the V4L2 device (scaling output buffers).
>> >>
>> >> I think the simplest approach here is to allow V4L2 to "claim" the
>> >> scaler by setting the image size (VIDIOC_S_FMT) to something other
>> >> than the CRTC's current resolution. After that, any attempt to use the
>> >> scaler on an input plane via DRM should fail atomic_check().
>> >
>> >That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
>> >locking must integrate into the drm locking, but that should be doable.
>> >Worst case you must shadow all v4l locks with a wait/wound
>> >drm_modeset_lock to avoid deadlocks (since you could try to grab locks
>> >from either end).
>> >
>>
>> Yes, I haven't looked at the details of the locking but I'm hoping
>> it's manageable.
>>
>> >> If the V4L2 client goes away or sets the image size to the CRTC's
>> >> native resolution, then the DRM device is allowed to use the scaler.
>> >> I don't know if/how the DRM device should communicate to userspace
>> >> that the scaler is or isn't available for use.
>> >>
>> >> Any thoughts on this approach?
>> >> Is it acceptable to both V4L2 and DRM folks?
>> >
>> >For streaming a V4L2 capture device seems like the right interface. But if
>> >you want to use writeback in your compositor you must know which atomic
>> >kms update results in which frame, since if you don't you can't use that
>> >composited buffer for the next frame reliable.
>> >
>> >For that case I think a drm-only solution would be better, to make sure
>> >you can do an atomic update and writeback in one step. v4l seems to grow
>> >an atomic api of its own, but I don't think we'll have one spanning
>> >subsystems anytime soon.
>> >
>>
>> I've been thinking about this from the point of view of a HWComposer
>> implementation and I think the hybrid DRM-V4L2 device would work OK.
>> However it depends on the behaviour I mentioned above:
>>
>> >> if the screen is actively displaying a
>> >> static scene and the V4L2 client queues up a buffer, it won't get
>> >> filled until the DRM scene changes.
>>
>> V4L2 buffer queues are FIFO, so as long as the compositor queues only
>> one V4L2 buffer per atomic update, there's no ambiguity.
>> In the most simplistic case the compositor would alternate between:
>> - Queue V4L2 buffer
>> - DRM atomic update
>> ... and dequeue either in the same thread or a different one. As long
>> as the compositor keeps track of how many buffers it has queued and
>> how many atomic updates it's made, it doesn't really matter.
>>
>> We'd probably be looking to add in V4L2 asynchronous dequeue using
>> fences for synchronisation, but that's a separate issue.
>>
>> >For the kms-only interface the idea was to add a property on the crtc
>> >where you can attach a writeback drm_framebuffer. Extending that idea to
>> >the drm->v4l case we could create special drm_framebuffer objects
>> >representing a v4l sink, and attach them to the same property. That would
>> >also solve the problem of getting some agreement on buffer metadata
>> >between v4l and drm side.
>> >
>>
>> I think a drm_framebuffer on its own wouldn't be enough to handle our
>> scaling case - at that point it starts to look more like a plane.
>> However, if "special" CRTC sinks became a thing it could allow us to
>> chain our writeback output to another CRTC's input (via memory)
>> without a trip through userspace, which would be nice.
>
> My thinking has been that we could represent the writeback sink
> as a connector. Combine that with my other idea of exposing a
> fixed mode property for each connector (to make output scaling
> user configurable), and we should end up with a way to control
> the scaling of the writeback path.
>
> Also it would mean there's always a connector attached to the
> crtc, even for the pure writeback only case, which I think should
> result in less problems in general as I'm sure there are a lot of
> assumptions in the code that there must be at least one connector
> for an active crtc.
This is what I've been thinking for vc4's writeback as well. I
definitely want my writeback to be using the same drivers/gpu/drm/vc4
code for setting up the hardware. And, since I expect writeback to be
mostly used by compositors for the case where a frame exceeds bandwidth
constraints, I want userspace to be able to use the same DRM APIs for
setting up the frame to be written back as it was trying to use for
setting up the frame before.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-15 17:42 ` Eric Anholt
@ 2016-07-18 10:29 ` Brian Starkey
2016-07-18 10:47 ` Hans Verkuil
0 siblings, 1 reply; 9+ messages in thread
From: Brian Starkey @ 2016-07-18 10:29 UTC (permalink / raw)
To: Eric Anholt
Cc: Ville Syrjälä, dri-devel, liviu.dudau, laurent.pinchart,
linux-media, david.brown, daniel.vetter
Hi,
On Fri, Jul 15, 2016 at 10:42:01AM -0700, Eric Anholt wrote:
>Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
>
>> On Fri, Jul 15, 2016 at 10:09:19AM +0100, Brian Starkey wrote:
>>> Hi Daniel,
>>>
>>> Thanks for taking a look.
>>>
>>> (+Cc Laurent)
>>>
>>> On Fri, Jul 15, 2016 at 09:33:34AM +0200, Daniel Vetter wrote:
>>> >On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
>>> >> Hi,
>>> >>
>>> >> The Mali-DP display processors have a memory-writeback engine which
>>> >> can write the result of the composition (CRTC output) to a memory
>>> >> buffer in a variety of formats.
>>> >>
>>> >> We're looking for feedback/suggestions on how to expose this in the
>>> >> mali-dp DRM kernel driver - possibly via V4L2.
>>> >>
>>> >> We've got a few use cases where writeback is useful:
>>> >> - testing, to check the displayed image
>>> >
>>> >This might or might not need a separate interface. There are efforts to
>>> >make the intel kms validation tests in i-g-t generic (well under way
>>> >already), and part of that is creating a generic infrastructure to capture
>>> >display CRCs for functional tests (still in progress).
>>> >
>>> >But it might be better if userspace abstracts between full readback and
>>> >display CRC, assuming we can make full writeback cross-vendor enough for
>>> >that use-case.
>>> >
>>>
>>> I'd lean towards the userspace abstraction.
>>> Encumbering a simple CRC interface with all the complexity of
>>> full-writeback (size, scaling, pixel format, multi-planar etc.) sounds
>>> a bit unnecessary.
>>>
>>> Of course, if v4l2 isn't going to be the cross-vendor full-writeback
>>> implementation, then we need to be aiming to use whatever _is_ in
>>> the mali-dp driver.
>>>
>>> >> - screen recording
>>> >> - wireless display (e.g. Miracast)
>>> >> - dual-display clone mode
>>> >> - memory-to-memory composition
>>> >> Note that the HW is capable of writing one of the input planes instead
>>> >> of the CRTC output, but we've no good use-case for wanting to expose
>>> >> that.
>>> >>
>>> >> In our Android ADF driver[1] we exposed the memory write engine as
>>> >> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
>>> >> doesn't have any similar support for memory output from CRTCs, but we
>>> >> want to expose the functionality in the mainline Mali-DP DRM driver.
>>> >>
>>> >> A previous discussion on the topic went towards exposing the
>>> >> memory-write engine via V4L2[2].
>>> >>
>>> >> I'm thinking to userspace this would look like two distinct devices:
>>> >> - A DRM KMS display controller.
>>> >> - A V4L2 video source.
>>> >> They'd both exist in the same kernel driver.
>>> >> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
>>> >> the DRM driver would see if there's a buffer to dequeue every time a
>>> >> new modeset is received via the DRM API - if so, it can configure the
>>> >> HW to dump into it (one-shot operation).
>>> >>
>>> >> An implication of this is that if the screen is actively displaying a
>>> >> static scene and the V4L2 client queues up a buffer, it won't get
>>> >> filled until the DRM scene changes. This seems best, otherwise the
>>> >> V4L2 driver has to change the HW configuration out-of-band from the
>>> >> DRM device which sounds horribly racy.
>>> >>
>>> >> One further complication is scaling. Our HW has a scaler which can
>>> >> tasked with either scaling an input plane or the buffer being written
>>> >> to memory, but not both at the same time. This means we need to
>>> >> arbitrate the scaler between the DRM device (scaling input planes) and
>>> >> the V4L2 device (scaling output buffers).
>>> >>
>>> >> I think the simplest approach here is to allow V4L2 to "claim" the
>>> >> scaler by setting the image size (VIDIOC_S_FMT) to something other
>>> >> than the CRTC's current resolution. After that, any attempt to use the
>>> >> scaler on an input plane via DRM should fail atomic_check().
>>> >
>>> >That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
>>> >locking must integrate into the drm locking, but that should be doable.
>>> >Worst case you must shadow all v4l locks with a wait/wound
>>> >drm_modeset_lock to avoid deadlocks (since you could try to grab locks
>>> >from either end).
>>> >
>>>
>>> Yes, I haven't looked at the details of the locking but I'm hoping
>>> it's manageable.
>>>
>>> >> If the V4L2 client goes away or sets the image size to the CRTC's
>>> >> native resolution, then the DRM device is allowed to use the scaler.
>>> >> I don't know if/how the DRM device should communicate to userspace
>>> >> that the scaler is or isn't available for use.
>>> >>
>>> >> Any thoughts on this approach?
>>> >> Is it acceptable to both V4L2 and DRM folks?
>>> >
>>> >For streaming a V4L2 capture device seems like the right interface. But if
>>> >you want to use writeback in your compositor you must know which atomic
>>> >kms update results in which frame, since if you don't you can't use that
>>> >composited buffer for the next frame reliable.
>>> >
>>> >For that case I think a drm-only solution would be better, to make sure
>>> >you can do an atomic update and writeback in one step. v4l seems to grow
>>> >an atomic api of its own, but I don't think we'll have one spanning
>>> >subsystems anytime soon.
>>> >
>>>
>>> I've been thinking about this from the point of view of a HWComposer
>>> implementation and I think the hybrid DRM-V4L2 device would work OK.
>>> However it depends on the behaviour I mentioned above:
>>>
>>> >> if the screen is actively displaying a
>>> >> static scene and the V4L2 client queues up a buffer, it won't get
>>> >> filled until the DRM scene changes.
>>>
>>> V4L2 buffer queues are FIFO, so as long as the compositor queues only
>>> one V4L2 buffer per atomic update, there's no ambiguity.
>>> In the most simplistic case the compositor would alternate between:
>>> - Queue V4L2 buffer
>>> - DRM atomic update
>>> ... and dequeue either in the same thread or a different one. As long
>>> as the compositor keeps track of how many buffers it has queued and
>>> how many atomic updates it's made, it doesn't really matter.
>>>
>>> We'd probably be looking to add in V4L2 asynchronous dequeue using
>>> fences for synchronisation, but that's a separate issue.
>>>
>>> >For the kms-only interface the idea was to add a property on the crtc
>>> >where you can attach a writeback drm_framebuffer. Extending that idea to
>>> >the drm->v4l case we could create special drm_framebuffer objects
>>> >representing a v4l sink, and attach them to the same property. That would
>>> >also solve the problem of getting some agreement on buffer metadata
>>> >between v4l and drm side.
>>> >
>>>
>>> I think a drm_framebuffer on its own wouldn't be enough to handle our
>>> scaling case - at that point it starts to look more like a plane.
>>> However, if "special" CRTC sinks became a thing it could allow us to
>>> chain our writeback output to another CRTC's input (via memory)
>>> without a trip through userspace, which would be nice.
>>
>> My thinking has been that we could represent the writeback sink
>> as a connector. Combine that with my other idea of exposing a
>> fixed mode property for each connector (to make output scaling
>> user configurable), and we should end up with a way to control
>> the scaling of the writeback path.
>>
>> Also it would mean there's always a connector attached to the
>> crtc, even for the pure writeback only case, which I think should
>> result in less problems in general as I'm sure there are a lot of
>> assumptions in the code that there must be at least one connector
>> for an active crtc.
>
>This is what I've been thinking for vc4's writeback as well. I
>definitely want my writeback to be using the same drivers/gpu/drm/vc4
>code for setting up the hardware. And, since I expect writeback to be
>mostly used by compositors for the case where a frame exceeds bandwidth
>constraints, I want userspace to be able to use the same DRM APIs for
>setting up the frame to be written back as it was trying to use for
>setting up the frame before.
OK, so let's talk about using connectors instead then :-)
We can attach a generic fixed_mode property which can represent panel
fitters or Mali-DP's writeback scaling, that sounds good.
The DRM driver can create a new "virtual" encoder/connector pair, I
guess with a new connector type specifically for memory-writeback?
On this connector we allow the fb_id property to attach a framebuffer
for writing into.
We'd probably want an additional property to enable writeback, perhaps
an enum: "disabled", "streaming", "oneshot".
I think it makes sense to put the output buffer format, size etc.
validation in the virtual encoder's atomic_check. It has access to
both CRTC and connector state, and the encoder is supposed to
represent the format/stream conversion element anyway.
What I'm not so clear on is how to manage the API with userspace.
Changing connectors is a "full modeset required" operation. I expect
this means an application needs to know up-front if it will ever make
use of memory-writeback, and if so do its initial modeset on both the
writeback and the normal connector, even if it doesn't attach a
writeback framebuffer until much later. Otherwise, enabling writeback
will force a full disable-modeset-enable cycle.
The writeback connector would have to be hidden from legacy clients
which would expect it to work like a normal connector - I suppose
that's somewhat similar to universal planes.
For the case that Ville mentioned where only the writeback connector
is connected, would userspace be expected to "make up" a
drm_mode_modeinfo to set? I can't think of any meaningful list of
available modes for the writeback connector to report.
Thanks for all the input so far!
-Brian
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-18 10:29 ` Brian Starkey
@ 2016-07-18 10:47 ` Hans Verkuil
2016-07-18 15:01 ` Daniel Vetter
0 siblings, 1 reply; 9+ messages in thread
From: Hans Verkuil @ 2016-07-18 10:47 UTC (permalink / raw)
To: Brian Starkey, Eric Anholt
Cc: Ville Syrjälä, dri-devel, liviu.dudau, laurent.pinchart,
linux-media, david.brown, daniel.vetter
On 07/18/2016 12:29 PM, Brian Starkey wrote:
> Hi,
>
> On Fri, Jul 15, 2016 at 10:42:01AM -0700, Eric Anholt wrote:
>> Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
>>
>>> On Fri, Jul 15, 2016 at 10:09:19AM +0100, Brian Starkey wrote:
>>>> Hi Daniel,
>>>>
>>>> Thanks for taking a look.
>>>>
>>>> (+Cc Laurent)
>>>>
>>>> On Fri, Jul 15, 2016 at 09:33:34AM +0200, Daniel Vetter wrote:
>>>>> On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
>>>>>> Hi,
>>>>>>
>>>>>> The Mali-DP display processors have a memory-writeback engine which
>>>>>> can write the result of the composition (CRTC output) to a memory
>>>>>> buffer in a variety of formats.
>>>>>>
>>>>>> We're looking for feedback/suggestions on how to expose this in the
>>>>>> mali-dp DRM kernel driver - possibly via V4L2.
>>>>>>
>>>>>> We've got a few use cases where writeback is useful:
>>>>>> - testing, to check the displayed image
>>>>>
>>>>> This might or might not need a separate interface. There are efforts to
>>>>> make the intel kms validation tests in i-g-t generic (well under way
>>>>> already), and part of that is creating a generic infrastructure to capture
>>>>> display CRCs for functional tests (still in progress).
>>>>>
>>>>> But it might be better if userspace abstracts between full readback and
>>>>> display CRC, assuming we can make full writeback cross-vendor enough for
>>>>> that use-case.
>>>>>
>>>>
>>>> I'd lean towards the userspace abstraction.
>>>> Encumbering a simple CRC interface with all the complexity of
>>>> full-writeback (size, scaling, pixel format, multi-planar etc.) sounds
>>>> a bit unnecessary.
>>>>
>>>> Of course, if v4l2 isn't going to be the cross-vendor full-writeback
>>>> implementation, then we need to be aiming to use whatever _is_ in
>>>> the mali-dp driver.
>>>>
>>>>>> - screen recording
>>>>>> - wireless display (e.g. Miracast)
>>>>>> - dual-display clone mode
>>>>>> - memory-to-memory composition
>>>>>> Note that the HW is capable of writing one of the input planes instead
>>>>>> of the CRTC output, but we've no good use-case for wanting to expose
>>>>>> that.
>>>>>>
>>>>>> In our Android ADF driver[1] we exposed the memory write engine as
>>>>>> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
>>>>>> doesn't have any similar support for memory output from CRTCs, but we
>>>>>> want to expose the functionality in the mainline Mali-DP DRM driver.
>>>>>>
>>>>>> A previous discussion on the topic went towards exposing the
>>>>>> memory-write engine via V4L2[2].
>>>>>>
>>>>>> I'm thinking to userspace this would look like two distinct devices:
>>>>>> - A DRM KMS display controller.
>>>>>> - A V4L2 video source.
>>>>>> They'd both exist in the same kernel driver.
>>>>>> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
>>>>>> the DRM driver would see if there's a buffer to dequeue every time a
>>>>>> new modeset is received via the DRM API - if so, it can configure the
>>>>>> HW to dump into it (one-shot operation).
>>>>>>
>>>>>> An implication of this is that if the screen is actively displaying a
>>>>>> static scene and the V4L2 client queues up a buffer, it won't get
>>>>>> filled until the DRM scene changes. This seems best, otherwise the
>>>>>> V4L2 driver has to change the HW configuration out-of-band from the
>>>>>> DRM device which sounds horribly racy.
>>>>>>
>>>>>> One further complication is scaling. Our HW has a scaler which can
>>>>>> tasked with either scaling an input plane or the buffer being written
>>>>>> to memory, but not both at the same time. This means we need to
>>>>>> arbitrate the scaler between the DRM device (scaling input planes) and
>>>>>> the V4L2 device (scaling output buffers).
>>>>>>
>>>>>> I think the simplest approach here is to allow V4L2 to "claim" the
>>>>>> scaler by setting the image size (VIDIOC_S_FMT) to something other
>>>>>> than the CRTC's current resolution. After that, any attempt to use the
>>>>>> scaler on an input plane via DRM should fail atomic_check().
>>>>>
>>>>> That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
>>>>> locking must integrate into the drm locking, but that should be doable.
>>>>> Worst case you must shadow all v4l locks with a wait/wound
>>>>> drm_modeset_lock to avoid deadlocks (since you could try to grab locks
>>>> >from either end).
>>>>>
>>>>
>>>> Yes, I haven't looked at the details of the locking but I'm hoping
>>>> it's manageable.
>>>>
>>>>>> If the V4L2 client goes away or sets the image size to the CRTC's
>>>>>> native resolution, then the DRM device is allowed to use the scaler.
>>>>>> I don't know if/how the DRM device should communicate to userspace
>>>>>> that the scaler is or isn't available for use.
>>>>>>
>>>>>> Any thoughts on this approach?
>>>>>> Is it acceptable to both V4L2 and DRM folks?
>>>>>
>>>>> For streaming a V4L2 capture device seems like the right interface. But if
>>>>> you want to use writeback in your compositor you must know which atomic
>>>>> kms update results in which frame, since if you don't you can't use that
>>>>> composited buffer for the next frame reliable.
>>>>>
>>>>> For that case I think a drm-only solution would be better, to make sure
>>>>> you can do an atomic update and writeback in one step. v4l seems to grow
>>>>> an atomic api of its own, but I don't think we'll have one spanning
>>>>> subsystems anytime soon.
>>>>>
>>>>
>>>> I've been thinking about this from the point of view of a HWComposer
>>>> implementation and I think the hybrid DRM-V4L2 device would work OK.
>>>> However it depends on the behaviour I mentioned above:
>>>>
>>>>>> if the screen is actively displaying a
>>>>>> static scene and the V4L2 client queues up a buffer, it won't get
>>>>>> filled until the DRM scene changes.
>>>>
>>>> V4L2 buffer queues are FIFO, so as long as the compositor queues only
>>>> one V4L2 buffer per atomic update, there's no ambiguity.
>>>> In the most simplistic case the compositor would alternate between:
>>>> - Queue V4L2 buffer
>>>> - DRM atomic update
>>>> ... and dequeue either in the same thread or a different one. As long
>>>> as the compositor keeps track of how many buffers it has queued and
>>>> how many atomic updates it's made, it doesn't really matter.
>>>>
>>>> We'd probably be looking to add in V4L2 asynchronous dequeue using
>>>> fences for synchronisation, but that's a separate issue.
>>>>
>>>>> For the kms-only interface the idea was to add a property on the crtc
>>>>> where you can attach a writeback drm_framebuffer. Extending that idea to
>>>>> the drm->v4l case we could create special drm_framebuffer objects
>>>>> representing a v4l sink, and attach them to the same property. That would
>>>>> also solve the problem of getting some agreement on buffer metadata
>>>>> between v4l and drm side.
>>>>>
>>>>
>>>> I think a drm_framebuffer on its own wouldn't be enough to handle our
>>>> scaling case - at that point it starts to look more like a plane.
>>>> However, if "special" CRTC sinks became a thing it could allow us to
>>>> chain our writeback output to another CRTC's input (via memory)
>>>> without a trip through userspace, which would be nice.
>>>
>>> My thinking has been that we could represent the writeback sink
>>> as a connector. Combine that with my other idea of exposing a
>>> fixed mode property for each connector (to make output scaling
>>> user configurable), and we should end up with a way to control
>>> the scaling of the writeback path.
>>>
>>> Also it would mean there's always a connector attached to the
>>> crtc, even for the pure writeback only case, which I think should
>>> result in less problems in general as I'm sure there are a lot of
>>> assumptions in the code that there must be at least one connector
>>> for an active crtc.
>>
>> This is what I've been thinking for vc4's writeback as well. I
>> definitely want my writeback to be using the same drivers/gpu/drm/vc4
>> code for setting up the hardware. And, since I expect writeback to be
>> mostly used by compositors for the case where a frame exceeds bandwidth
>> constraints, I want userspace to be able to use the same DRM APIs for
>> setting up the frame to be written back as it was trying to use for
>> setting up the frame before.
>
> OK, so let's talk about using connectors instead then :-)
>
> We can attach a generic fixed_mode property which can represent panel
> fitters or Mali-DP's writeback scaling, that sounds good.
>
> The DRM driver can create a new "virtual" encoder/connector pair, I
> guess with a new connector type specifically for memory-writeback?
> On this connector we allow the fb_id property to attach a framebuffer
> for writing into.
> We'd probably want an additional property to enable writeback, perhaps
> an enum: "disabled", "streaming", "oneshot".
>
> I think it makes sense to put the output buffer format, size etc.
> validation in the virtual encoder's atomic_check. It has access to
> both CRTC and connector state, and the encoder is supposed to
> represent the format/stream conversion element anyway.
>
> What I'm not so clear on is how to manage the API with userspace.
I am not terribly enthusiastic (to say the least) if a new drm API is
created for video capture. That's what V4L2 is for and it comes with
kernel frameworks, documentation and utilities. Creating a new API will
not make userspace develoeprs happy.
I know it is a pain to work with different subsystems, but reinventing
the wheel is a much bigger pain. For you and for the poor sods who have
to use yet another API to do the same thing.
Of course this has to be hooked up to drm at the low level, and anything
that can be done to help out with that from the V4L2 side of things is
fine, but creating duplicate public APIs isn't the way IMHO.
Regards,
Hans
>
> Changing connectors is a "full modeset required" operation. I expect
> this means an application needs to know up-front if it will ever make
> use of memory-writeback, and if so do its initial modeset on both the
> writeback and the normal connector, even if it doesn't attach a
> writeback framebuffer until much later. Otherwise, enabling writeback
> will force a full disable-modeset-enable cycle.
>
> The writeback connector would have to be hidden from legacy clients
> which would expect it to work like a normal connector - I suppose
> that's somewhat similar to universal planes.
>
> For the case that Ville mentioned where only the writeback connector
> is connected, would userspace be expected to "make up" a
> drm_mode_modeinfo to set? I can't think of any meaningful list of
> available modes for the writeback connector to report.
>
> Thanks for all the input so far!
>
> -Brian
> --
> To unsubscribe from this list: send the line "unsubscribe linux-media" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-18 10:47 ` Hans Verkuil
@ 2016-07-18 15:01 ` Daniel Vetter
2016-07-18 15:16 ` Rob Clark
0 siblings, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2016-07-18 15:01 UTC (permalink / raw)
To: Hans Verkuil
Cc: Brian Starkey, Eric Anholt, Ville Syrjälä, dri-devel,
liviu.dudau, laurent.pinchart, linux-media, david.brown,
daniel.vetter
On Mon, Jul 18, 2016 at 12:47:38PM +0200, Hans Verkuil wrote:
> On 07/18/2016 12:29 PM, Brian Starkey wrote:
> > Hi,
> >
> > On Fri, Jul 15, 2016 at 10:42:01AM -0700, Eric Anholt wrote:
> >> Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> >>
> >>> On Fri, Jul 15, 2016 at 10:09:19AM +0100, Brian Starkey wrote:
> >>>> Hi Daniel,
> >>>>
> >>>> Thanks for taking a look.
> >>>>
> >>>> (+Cc Laurent)
> >>>>
> >>>> On Fri, Jul 15, 2016 at 09:33:34AM +0200, Daniel Vetter wrote:
> >>>>> On Thu, Jul 14, 2016 at 06:03:40PM +0100, Brian Starkey wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> The Mali-DP display processors have a memory-writeback engine which
> >>>>>> can write the result of the composition (CRTC output) to a memory
> >>>>>> buffer in a variety of formats.
> >>>>>>
> >>>>>> We're looking for feedback/suggestions on how to expose this in the
> >>>>>> mali-dp DRM kernel driver - possibly via V4L2.
> >>>>>>
> >>>>>> We've got a few use cases where writeback is useful:
> >>>>>> - testing, to check the displayed image
> >>>>>
> >>>>> This might or might not need a separate interface. There are efforts to
> >>>>> make the intel kms validation tests in i-g-t generic (well under way
> >>>>> already), and part of that is creating a generic infrastructure to capture
> >>>>> display CRCs for functional tests (still in progress).
> >>>>>
> >>>>> But it might be better if userspace abstracts between full readback and
> >>>>> display CRC, assuming we can make full writeback cross-vendor enough for
> >>>>> that use-case.
> >>>>>
> >>>>
> >>>> I'd lean towards the userspace abstraction.
> >>>> Encumbering a simple CRC interface with all the complexity of
> >>>> full-writeback (size, scaling, pixel format, multi-planar etc.) sounds
> >>>> a bit unnecessary.
> >>>>
> >>>> Of course, if v4l2 isn't going to be the cross-vendor full-writeback
> >>>> implementation, then we need to be aiming to use whatever _is_ in
> >>>> the mali-dp driver.
> >>>>
> >>>>>> - screen recording
> >>>>>> - wireless display (e.g. Miracast)
> >>>>>> - dual-display clone mode
> >>>>>> - memory-to-memory composition
> >>>>>> Note that the HW is capable of writing one of the input planes instead
> >>>>>> of the CRTC output, but we've no good use-case for wanting to expose
> >>>>>> that.
> >>>>>>
> >>>>>> In our Android ADF driver[1] we exposed the memory write engine as
> >>>>>> part of the ADF device using ADF's "MEMORY" interface type. DRM/KMS
> >>>>>> doesn't have any similar support for memory output from CRTCs, but we
> >>>>>> want to expose the functionality in the mainline Mali-DP DRM driver.
> >>>>>>
> >>>>>> A previous discussion on the topic went towards exposing the
> >>>>>> memory-write engine via V4L2[2].
> >>>>>>
> >>>>>> I'm thinking to userspace this would look like two distinct devices:
> >>>>>> - A DRM KMS display controller.
> >>>>>> - A V4L2 video source.
> >>>>>> They'd both exist in the same kernel driver.
> >>>>>> A V4L2 client can queue up (CAPTURE) buffers in the normal way, and
> >>>>>> the DRM driver would see if there's a buffer to dequeue every time a
> >>>>>> new modeset is received via the DRM API - if so, it can configure the
> >>>>>> HW to dump into it (one-shot operation).
> >>>>>>
> >>>>>> An implication of this is that if the screen is actively displaying a
> >>>>>> static scene and the V4L2 client queues up a buffer, it won't get
> >>>>>> filled until the DRM scene changes. This seems best, otherwise the
> >>>>>> V4L2 driver has to change the HW configuration out-of-band from the
> >>>>>> DRM device which sounds horribly racy.
> >>>>>>
> >>>>>> One further complication is scaling. Our HW has a scaler which can
> >>>>>> tasked with either scaling an input plane or the buffer being written
> >>>>>> to memory, but not both at the same time. This means we need to
> >>>>>> arbitrate the scaler between the DRM device (scaling input planes) and
> >>>>>> the V4L2 device (scaling output buffers).
> >>>>>>
> >>>>>> I think the simplest approach here is to allow V4L2 to "claim" the
> >>>>>> scaler by setting the image size (VIDIOC_S_FMT) to something other
> >>>>>> than the CRTC's current resolution. After that, any attempt to use the
> >>>>>> scaler on an input plane via DRM should fail atomic_check().
> >>>>>
> >>>>> That's perfectly fine atomic_check behaviour. Only trouble is that the v4l
> >>>>> locking must integrate into the drm locking, but that should be doable.
> >>>>> Worst case you must shadow all v4l locks with a wait/wound
> >>>>> drm_modeset_lock to avoid deadlocks (since you could try to grab locks
> >>>> >from either end).
> >>>>>
> >>>>
> >>>> Yes, I haven't looked at the details of the locking but I'm hoping
> >>>> it's manageable.
> >>>>
> >>>>>> If the V4L2 client goes away or sets the image size to the CRTC's
> >>>>>> native resolution, then the DRM device is allowed to use the scaler.
> >>>>>> I don't know if/how the DRM device should communicate to userspace
> >>>>>> that the scaler is or isn't available for use.
> >>>>>>
> >>>>>> Any thoughts on this approach?
> >>>>>> Is it acceptable to both V4L2 and DRM folks?
> >>>>>
> >>>>> For streaming a V4L2 capture device seems like the right interface. But if
> >>>>> you want to use writeback in your compositor you must know which atomic
> >>>>> kms update results in which frame, since if you don't you can't use that
> >>>>> composited buffer for the next frame reliable.
> >>>>>
> >>>>> For that case I think a drm-only solution would be better, to make sure
> >>>>> you can do an atomic update and writeback in one step. v4l seems to grow
> >>>>> an atomic api of its own, but I don't think we'll have one spanning
> >>>>> subsystems anytime soon.
> >>>>>
> >>>>
> >>>> I've been thinking about this from the point of view of a HWComposer
> >>>> implementation and I think the hybrid DRM-V4L2 device would work OK.
> >>>> However it depends on the behaviour I mentioned above:
> >>>>
> >>>>>> if the screen is actively displaying a
> >>>>>> static scene and the V4L2 client queues up a buffer, it won't get
> >>>>>> filled until the DRM scene changes.
> >>>>
> >>>> V4L2 buffer queues are FIFO, so as long as the compositor queues only
> >>>> one V4L2 buffer per atomic update, there's no ambiguity.
> >>>> In the most simplistic case the compositor would alternate between:
> >>>> - Queue V4L2 buffer
> >>>> - DRM atomic update
> >>>> ... and dequeue either in the same thread or a different one. As long
> >>>> as the compositor keeps track of how many buffers it has queued and
> >>>> how many atomic updates it's made, it doesn't really matter.
> >>>>
> >>>> We'd probably be looking to add in V4L2 asynchronous dequeue using
> >>>> fences for synchronisation, but that's a separate issue.
> >>>>
> >>>>> For the kms-only interface the idea was to add a property on the crtc
> >>>>> where you can attach a writeback drm_framebuffer. Extending that idea to
> >>>>> the drm->v4l case we could create special drm_framebuffer objects
> >>>>> representing a v4l sink, and attach them to the same property. That would
> >>>>> also solve the problem of getting some agreement on buffer metadata
> >>>>> between v4l and drm side.
> >>>>>
> >>>>
> >>>> I think a drm_framebuffer on its own wouldn't be enough to handle our
> >>>> scaling case - at that point it starts to look more like a plane.
> >>>> However, if "special" CRTC sinks became a thing it could allow us to
> >>>> chain our writeback output to another CRTC's input (via memory)
> >>>> without a trip through userspace, which would be nice.
> >>>
> >>> My thinking has been that we could represent the writeback sink
> >>> as a connector. Combine that with my other idea of exposing a
> >>> fixed mode property for each connector (to make output scaling
> >>> user configurable), and we should end up with a way to control
> >>> the scaling of the writeback path.
> >>>
> >>> Also it would mean there's always a connector attached to the
> >>> crtc, even for the pure writeback only case, which I think should
> >>> result in less problems in general as I'm sure there are a lot of
> >>> assumptions in the code that there must be at least one connector
> >>> for an active crtc.
> >>
> >> This is what I've been thinking for vc4's writeback as well. I
> >> definitely want my writeback to be using the same drivers/gpu/drm/vc4
> >> code for setting up the hardware. And, since I expect writeback to be
> >> mostly used by compositors for the case where a frame exceeds bandwidth
> >> constraints, I want userspace to be able to use the same DRM APIs for
> >> setting up the frame to be written back as it was trying to use for
> >> setting up the frame before.
> >
> > OK, so let's talk about using connectors instead then :-)
> >
> > We can attach a generic fixed_mode property which can represent panel
> > fitters or Mali-DP's writeback scaling, that sounds good.
> >
> > The DRM driver can create a new "virtual" encoder/connector pair, I
> > guess with a new connector type specifically for memory-writeback?
> > On this connector we allow the fb_id property to attach a framebuffer
> > for writing into.
> > We'd probably want an additional property to enable writeback, perhaps
> > an enum: "disabled", "streaming", "oneshot".
> >
> > I think it makes sense to put the output buffer format, size etc.
> > validation in the virtual encoder's atomic_check. It has access to
> > both CRTC and connector state, and the encoder is supposed to
> > represent the format/stream conversion element anyway.
> >
> > What I'm not so clear on is how to manage the API with userspace.
>
> I am not terribly enthusiastic (to say the least) if a new drm API is
> created for video capture. That's what V4L2 is for and it comes with
> kernel frameworks, documentation and utilities. Creating a new API will
> not make userspace develoeprs happy.
>
> I know it is a pain to work with different subsystems, but reinventing
> the wheel is a much bigger pain. For you and for the poor sods who have
> to use yet another API to do the same thing.
>
> Of course this has to be hooked up to drm at the low level, and anything
> that can be done to help out with that from the V4L2 side of things is
> fine, but creating duplicate public APIs isn't the way IMHO.
I agree for the streaming video use-case. But for compositors I do agree
with Eric that a simple frame capture interface integrated with the drm
atomic ioctl is what we want. At least my understanding of v4l is that
it's not that well suited to grabbing specific (single) frames.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: DRM device memory writeback (Mali-DP)
2016-07-18 15:01 ` Daniel Vetter
@ 2016-07-18 15:16 ` Rob Clark
0 siblings, 0 replies; 9+ messages in thread
From: Rob Clark @ 2016-07-18 15:16 UTC (permalink / raw)
To: Daniel Vetter
Cc: Hans Verkuil, Daniel Vetter, liviu.dudau,
dri-devel@lists.freedesktop.org, Laurent Pinchart,
linux-media@vger.kernel.org
On Mon, Jul 18, 2016 at 11:01 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Mon, Jul 18, 2016 at 12:47:38PM +0200, Hans Verkuil wrote:
>> On 07/18/2016 12:29 PM, Brian Starkey wrote:
>> > OK, so let's talk about using connectors instead then :-)
>> >
>> > We can attach a generic fixed_mode property which can represent panel
>> > fitters or Mali-DP's writeback scaling, that sounds good.
>> >
>> > The DRM driver can create a new "virtual" encoder/connector pair, I
>> > guess with a new connector type specifically for memory-writeback?
>> > On this connector we allow the fb_id property to attach a framebuffer
>> > for writing into.
>> > We'd probably want an additional property to enable writeback, perhaps
>> > an enum: "disabled", "streaming", "oneshot".
>> >
>> > I think it makes sense to put the output buffer format, size etc.
>> > validation in the virtual encoder's atomic_check. It has access to
>> > both CRTC and connector state, and the encoder is supposed to
>> > represent the format/stream conversion element anyway.
>> >
>> > What I'm not so clear on is how to manage the API with userspace.
>>
>> I am not terribly enthusiastic (to say the least) if a new drm API is
>> created for video capture. That's what V4L2 is for and it comes with
>> kernel frameworks, documentation and utilities. Creating a new API will
>> not make userspace develoeprs happy.
>>
>> I know it is a pain to work with different subsystems, but reinventing
>> the wheel is a much bigger pain. For you and for the poor sods who have
>> to use yet another API to do the same thing.
>>
>> Of course this has to be hooked up to drm at the low level, and anything
>> that can be done to help out with that from the V4L2 side of things is
>> fine, but creating duplicate public APIs isn't the way IMHO.
>
> I agree for the streaming video use-case. But for compositors I do agree
> with Eric that a simple frame capture interface integrated with the drm
> atomic ioctl is what we want. At least my understanding of v4l is that
> it's not that well suited to grabbing specific (single) frames.
same here, we defn want to use v4l for the streaming case (so we can
take advantage of existing userspace support for
capture/encode/stream, etc).. but for compositors, I think v4l would
be awkward. Ideal case is single set of driver APIs, so driver
doesn't have to care so much about the use case, and then some drm_v4l
helpers to expose a v4l device for the streaming case.
There are some older patches floating around that implemented v4l
inside drm/msm.. which looked simple enough, although I don't think we
should need to do that in each drm driver..
BR,
-R
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-07-18 15:16 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-07-14 17:03 DRM device memory writeback (Mali-DP) Brian Starkey
2016-07-15 7:33 ` Daniel Vetter
2016-07-15 9:09 ` Brian Starkey
2016-07-15 10:57 ` Ville Syrjälä
2016-07-15 17:42 ` Eric Anholt
2016-07-18 10:29 ` Brian Starkey
2016-07-18 10:47 ` Hans Verkuil
2016-07-18 15:01 ` Daniel Vetter
2016-07-18 15:16 ` Rob Clark
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).