* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
[not found] ` <ffe9c81db34b599f675ca5bbf02de360bf0a1608.camel@bootlin.com>
@ 2019-01-07 3:49 ` Randy Li
[not found] ` <776e63c9-d4a5-342a-e0f7-200ef144ffc4-TNX95d0MmH7DzftRWevZcw@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Randy Li @ 2019-01-07 3:49 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: devel, Alexandre Courbot, Maxime Ripard, Randy Li, linux-kernel,
Jernej Škrabec, Tomasz Figa, Hans Verkuil, linux-rockchip,
linux-sunxi, Thomas Petazzoni, Mauro Carvalho Chehab,
Ezequiel Garcia, linux-arm-kernel, linux-media
On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
> Hi,
>
> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>
>>> +
>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>> +
>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>> +
>>> +struct v4l2_hevc_dpb_entry {
>>> + __u32 buffer_tag;
>>> + __u8 rps;
>>> + __u8 field_pic;
>>> + __u16 pic_order_cnt[2];
>>> +};
Please add a field for the reference index; if the rps field is not meant
for that, some devices would request it (not the Rockchip one). And
Rockchip's VDPU1 and VDPU2 would request a similar field for AVC.
Also add another buffer_tag referring to the motion-vector memory of
each frame. Or, better, attach metadata to each picture buffer: since
the picture output is just the same as the original, the display won't
care whether the motion vectors are written at the bottom of the
picture or somewhere else.
>>> +
>>> +struct v4l2_hevc_pred_weight_table {
>>> + __u8 luma_log2_weight_denom;
>>> + __s8 delta_chroma_log2_weight_denom;
>>> +
>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>> +
>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>> +};
>>> +
I think those fields are not all necessary for the Rockchip device, and
may not apply to the others.
>>> +struct v4l2_ctrl_hevc_slice_params {
>>> + __u32 bit_size;
>>> + __u32 data_bit_offset;
>>> +
>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>> + __u8 nal_unit_type;
>>> + __u8 nuh_temporal_id_plus1;
>>> +
>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>> + __u8 slice_type;
>>> + __u8 colour_plane_id;
----------------------------------------------------------------------------
>>> + __u16 slice_pic_order_cnt;
>>> + __u8 slice_sao_luma_flag;
>>> + __u8 slice_sao_chroma_flag;
>>> + __u8 slice_temporal_mvp_enabled_flag;
>>> + __u8 num_ref_idx_l0_active_minus1;
>>> + __u8 num_ref_idx_l1_active_minus1;
Rockchip's decoder doesn't use this part.
>>> + __u8 mvd_l1_zero_flag;
>>> + __u8 cabac_init_flag;
>>> + __u8 collocated_from_l0_flag;
>>> + __u8 collocated_ref_idx;
>>> + __u8 five_minus_max_num_merge_cand;
>>> + __u8 use_integer_mv_flag;
>>> + __s8 slice_qp_delta;
>>> + __s8 slice_cb_qp_offset;
>>> + __s8 slice_cr_qp_offset;
>>> + __s8 slice_act_y_qp_offset;
>>> + __s8 slice_act_cb_qp_offset;
>>> + __s8 slice_act_cr_qp_offset;
>>> + __u8 slice_deblocking_filter_disabled_flag;
>>> + __s8 slice_beta_offset_div2;
>>> + __s8 slice_tc_offset_div2;
>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>> +
>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>> + __u8 pic_struct;
I think the decoder doesn't care about this; it is used for display.
>>> +
>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> + __u8 num_active_dpb_entries;
>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>> +
>>> + __u8 num_rps_poc_st_curr_before;
>>> + __u8 num_rps_poc_st_curr_after;
>>> + __u8 num_rps_poc_lt_curr;
>>> +
>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>> +};
>>> +
>>> #endif
>>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
[not found] ` <776e63c9-d4a5-342a-e0f7-200ef144ffc4-TNX95d0MmH7DzftRWevZcw@public.gmane.org>
@ 2019-01-07 9:57 ` Paul Kocialkowski
2019-01-08 1:16 ` [linux-sunxi] " Ayaka
0 siblings, 1 reply; 22+ messages in thread
From: Paul Kocialkowski @ 2019-01-07 9:57 UTC (permalink / raw)
To: Randy Li
Cc: Jernej Škrabec, linux-sunxi-/JYPxA39Uh5TLH3MbocFFw,
linux-media-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Mauro Carvalho Chehab, Maxime Ripard, Randy Li, Hans Verkuil,
Ezequiel Garcia, Tomasz Figa, Alexandre Courbot, Thomas Petazzoni,
linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
Hi,
On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
> > Hi,
> >
> > On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
> >
> > > > +
> > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
> > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
> > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
> > > > +
> > > > +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
> > > > +
> > > > +struct v4l2_hevc_dpb_entry {
> > > > + __u32 buffer_tag;
> > > > + __u8 rps;
> > > > + __u8 field_pic;
> > > > + __u16 pic_order_cnt[2];
> > > > +};
>
> Please add a field for the reference index; if the rps field is not meant
> for that, some devices would request it (not the Rockchip one). And
> Rockchip's VDPU1 and VDPU2 would request a similar field for AVC.
What exactly is that reference index? Is it a bitstream element or
something deduced from the bitstream?
> Also add another buffer_tag referring to the motion-vector memory of
> each frame. Or, better, attach metadata to each picture buffer: since
> the picture output is just the same as the original, the display won't
> care whether the motion vectors are written at the bottom of the
> picture or somewhere else.
The motion vectors are passed as part of the raw bitstream data, in the
slices. Is there a case where the motion vectors are coded differently?
> > > > +
> > > > +struct v4l2_hevc_pred_weight_table {
> > > > + __u8 luma_log2_weight_denom;
> > > > + __s8 delta_chroma_log2_weight_denom;
> > > > +
> > > > + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > +
> > > > + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > +};
> > > > +
> I think those fields are not all necessary for the Rockchip device, and
> may not apply to the others.
Yes, it's possible that some of the elements are not necessary for some
decoders. What we want is to cover all the elements that might be
required for a decoder.
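As a side note, the effective weights are reconstructed from these fields
as described in ITU-T H.265 section 7.4.7.3. A minimal C sketch (the
helper names are illustrative only, not part of the proposed uAPI):

```c
#include <stdint.h>

/*
 * Per H.265 7.4.7.3:
 *   LumaWeightL0[i]      = (1 << luma_log2_weight_denom)
 *                          + delta_luma_weight_l0[i]
 *   ChromaLog2WeightDenom = luma_log2_weight_denom
 *                          + delta_chroma_log2_weight_denom
 *   ChromaWeightL0[i][j]  = (1 << ChromaLog2WeightDenom)
 *                          + delta_chroma_weight_l0[i][j]
 * Field names follow the patch; the helpers themselves are hypothetical.
 */
static int hevc_luma_weight_l0(uint8_t luma_log2_weight_denom,
			       int8_t delta_luma_weight_l0_i)
{
	return (1 << luma_log2_weight_denom) + delta_luma_weight_l0_i;
}

static int hevc_chroma_weight_l0(uint8_t luma_log2_weight_denom,
				 int8_t delta_chroma_log2_weight_denom,
				 int8_t delta_chroma_weight_l0_ij)
{
	int chroma_log2_weight_denom =
		luma_log2_weight_denom + delta_chroma_log2_weight_denom;

	return (1 << chroma_log2_weight_denom) + delta_chroma_weight_l0_ij;
}
```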
> > > > +struct v4l2_ctrl_hevc_slice_params {
> > > > + __u32 bit_size;
> > > > + __u32 data_bit_offset;
> > > > +
> > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
> > > > + __u8 nal_unit_type;
> > > > + __u8 nuh_temporal_id_plus1;
> > > > +
> > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > + __u8 slice_type;
> > > > + __u8 colour_plane_id;
> ----------------------------------------------------------------------------
> > > > + __u16 slice_pic_order_cnt;
> > > > + __u8 slice_sao_luma_flag;
> > > > + __u8 slice_sao_chroma_flag;
> > > > + __u8 slice_temporal_mvp_enabled_flag;
> > > > + __u8 num_ref_idx_l0_active_minus1;
> > > > + __u8 num_ref_idx_l1_active_minus1;
> Rockchip's decoder doesn't use this part.
> > > > + __u8 mvd_l1_zero_flag;
> > > > + __u8 cabac_init_flag;
> > > > + __u8 collocated_from_l0_flag;
> > > > + __u8 collocated_ref_idx;
> > > > + __u8 five_minus_max_num_merge_cand;
> > > > + __u8 use_integer_mv_flag;
> > > > + __s8 slice_qp_delta;
> > > > + __s8 slice_cb_qp_offset;
> > > > + __s8 slice_cr_qp_offset;
> > > > + __s8 slice_act_y_qp_offset;
> > > > + __s8 slice_act_cb_qp_offset;
> > > > + __s8 slice_act_cr_qp_offset;
> > > > + __u8 slice_deblocking_filter_disabled_flag;
> > > > + __s8 slice_beta_offset_div2;
> > > > + __s8 slice_tc_offset_div2;
> > > > + __u8 slice_loop_filter_across_slices_enabled_flag;
> > > > +
> > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
> > > > + __u8 pic_struct;
> I think the decoder doesn't care about this; it is used for display.
The purpose of this field is to indicate whether the current picture is
a progressive frame or an interlaced field picture, which is useful for
decoding.
At least our decoder has a register field to indicate frame/top
field/bottom field, so we certainly need to keep the info around.
Looking at the spec and the ffmpeg implementation, it looks like this
flag of the bitstream is the usual way to report field coding.
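A hedged sketch of such a mapping, following the pic_struct codes of
Table D.2 in the H.265 spec (1 = top field, 2 = bottom field, 9..12 are
fields with display-pairing hints, 0 and the rest describe frame pictures
or frame repetition); the enum and helper are illustrative, not part of
the patch:

```c
#include <stdint.h>

/* Hypothetical helper, not part of the proposed uAPI: reduce the
 * picture timing SEI pic_struct value (H.265 Table D.2) to the kind of
 * frame/top-field/bottom-field indication a decoder register may want. */
enum pic_kind { PIC_FRAME, PIC_TOP_FIELD, PIC_BOTTOM_FIELD };

static enum pic_kind pic_struct_to_kind(uint8_t pic_struct)
{
	switch (pic_struct) {
	case 1:  /* top field */
	case 9:  /* top field, paired with previous bottom field */
	case 11: /* top field, paired with next bottom field */
		return PIC_TOP_FIELD;
	case 2:  /* bottom field */
	case 10: /* bottom field, paired with previous top field */
	case 12: /* bottom field, paired with next top field */
		return PIC_BOTTOM_FIELD;
	default: /* 0, 3..8: frame pictures or frame doubling/tripling */
		return PIC_FRAME;
	}
}
```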
Cheers,
Paul
> > > > +
> > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > + __u8 num_active_dpb_entries;
> > > > + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > +
> > > > + __u8 num_rps_poc_st_curr_before;
> > > > + __u8 num_rps_poc_st_curr_after;
> > > > + __u8 num_rps_poc_lt_curr;
> > > > +
> > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
> > > > + struct v4l2_hevc_pred_weight_table pred_weight_table;
> > > > +};
> > > > +
> > > > #endif
--
Paul Kocialkowski, Bootlin (formerly Free Electrons)
Embedded Linux and kernel engineering
https://bootlin.com
--
You received this message because you are subscribed to the Google Groups "linux-sunxi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to linux-sunxi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit https://groups.google.com/d/optout.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-07 9:57 ` Paul Kocialkowski
@ 2019-01-08 1:16 ` Ayaka
[not found] ` <D8005130-F7FD-4CBD-8396-1BB08BB08E81-xPW3/0Ywev/iB9QmIjCX8w@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Ayaka @ 2019-01-08 1:16 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: Randy Li, Jernej Škrabec, linux-sunxi, linux-media,
linux-kernel, devel, linux-arm-kernel, Mauro Carvalho Chehab,
Maxime Ripard, Hans Verkuil, Ezequiel Garcia, Tomasz Figa,
Alexandre Courbot, Thomas Petazzoni, linux-rockchip
Sent from my iPad
> On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>
> Hi,
>
>> On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
>>> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
>>> Hi,
>>>
>>> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>>>
>>>>> +
>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>>>> +
>>>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>>>> +
>>>>> +struct v4l2_hevc_dpb_entry {
>>>>> + __u32 buffer_tag;
>>>>> + __u8 rps;
>>>>> + __u8 field_pic;
>>>>> + __u16 pic_order_cnt[2];
>>>>> +};
>>
>> Please add a field for the reference index; if the rps field is not meant
>> for that, some devices would request it (not the Rockchip one). And
>> Rockchip's VDPU1 and VDPU2 would request a similar field for AVC.
>
> What exactly is that reference index? Is it a bitstream element or
> something deduced from the bitstream?
>
The picture order count (POC) for HEVC, and frame_num for AVC. I think it is the number used in list0 (P and B slices) and list1 (B slices).
>> Also add another buffer_tag referring to the motion-vector memory of
>> each frame. Or, better, attach metadata to each picture buffer: since
>> the picture output is just the same as the original, the display won't
>> care whether the motion vectors are written at the bottom of the
>> picture or somewhere else.
>
> The motion vectors are passed as part of the raw bitstream data, in the
> slices. Is there a case where the motion vectors are coded differently?
No, it is an additional cache for the decoder; even FFmpeg keeps such data. I think the Allwinner decoder must output it somewhere too.
>
>>>>> +
>>>>> +struct v4l2_hevc_pred_weight_table {
>>>>> + __u8 luma_log2_weight_denom;
>>>>> + __s8 delta_chroma_log2_weight_denom;
>>>>> +
>>>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>> +
>>>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>> +};
>>>>> +
>> I think those fields are not all necessary for the Rockchip device, and
>> may not apply to the others.
>
> Yes, it's possible that some of the elements are not necessary for some
> decoders. What we want is to cover all the elements that might be
> required for a decoder.
I wonder whether Allwinner needs that; those SAO flags are usually ignored by the decoder design. But more is better than less: it is hard to extend a V4L2 structure in the future, and a new HEVC profile may bring a new field. It is still too early for HEVC.
>
>>>>> +struct v4l2_ctrl_hevc_slice_params {
>>>>> + __u32 bit_size;
>>>>> + __u32 data_bit_offset;
>>>>> +
>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>>>> + __u8 nal_unit_type;
>>>>> + __u8 nuh_temporal_id_plus1;
>>>>> +
>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>> + __u8 slice_type;
>>>>> + __u8 colour_plane_id;
>> ----------------------------------------------------------------------------
>>>>> + __u16 slice_pic_order_cnt;
>>>>> + __u8 slice_sao_luma_flag;
>>>>> + __u8 slice_sao_chroma_flag;
>>>>> + __u8 slice_temporal_mvp_enabled_flag;
>>>>> + __u8 num_ref_idx_l0_active_minus1;
>>>>> + __u8 num_ref_idx_l1_active_minus1;
>> Rockchip's decoder doesn't use this part.
>>>>> + __u8 mvd_l1_zero_flag;
>>>>> + __u8 cabac_init_flag;
>>>>> + __u8 collocated_from_l0_flag;
>>>>> + __u8 collocated_ref_idx;
>>>>> + __u8 five_minus_max_num_merge_cand;
>>>>> + __u8 use_integer_mv_flag;
>>>>> + __s8 slice_qp_delta;
>>>>> + __s8 slice_cb_qp_offset;
>>>>> + __s8 slice_cr_qp_offset;
>>>>> + __s8 slice_act_y_qp_offset;
>>>>> + __s8 slice_act_cb_qp_offset;
>>>>> + __s8 slice_act_cr_qp_offset;
>>>>> + __u8 slice_deblocking_filter_disabled_flag;
>>>>> + __s8 slice_beta_offset_div2;
>>>>> + __s8 slice_tc_offset_div2;
>>>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>>>> +
>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>>>> + __u8 pic_struct;
>> I think the decoder doesn't care about this; it is used for display.
>
> The purpose of this field is to indicate whether the current picture is
> a progressive frame or an interlaced field picture, which is useful for
> decoding.
>
> At least our decoder has a register field to indicate frame/top
> field/bottom field, so we certainly need to keep the info around.
> Looking at the spec and the ffmpeg implementation, it looks like this
> flag of the bitstream is the usual way to report field coding.
It depends on whether the decoder cares about the scan type or more. I would rather prefer general_interlaced_source_flag for just the scan type; it would be better than parsing another SEI.
>
> Cheers,
>
> Paul
Randy “ayaka” LI
>
>>>>> +
>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> + __u8 num_active_dpb_entries;
>>>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>> +
>>>>> + __u8 num_rps_poc_st_curr_before;
>>>>> + __u8 num_rps_poc_st_curr_after;
>>>>> + __u8 num_rps_poc_lt_curr;
>>>>> +
>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>>>> +};
>>>>> +
>>>>> #endif
> --
> Paul Kocialkowski, Bootlin (formerly Free Electrons)
> Embedded Linux and kernel engineering
> https://bootlin.com
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
[not found] ` <D8005130-F7FD-4CBD-8396-1BB08BB08E81-xPW3/0Ywev/iB9QmIjCX8w@public.gmane.org>
@ 2019-01-08 8:38 ` Paul Kocialkowski
2019-01-08 10:00 ` [linux-sunxi] " Ayaka
0 siblings, 1 reply; 22+ messages in thread
From: Paul Kocialkowski @ 2019-01-08 8:38 UTC (permalink / raw)
To: Ayaka
Cc: Randy Li, Jernej Škrabec, linux-sunxi-/JYPxA39Uh5TLH3MbocFFw,
linux-media-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Mauro Carvalho Chehab, Maxime Ripard, Hans Verkuil,
Ezequiel Garcia, Tomasz Figa, Alexandre Courbot, Thomas Petazzoni,
linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
Hi,
On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
>
> Sent from my iPad
>
> > On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> >
> > Hi,
> >
> > > On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
> > > > On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
> > > > Hi,
> > > >
> > > > On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
> > > >
> > > > > > +
> > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
> > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
> > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
> > > > > > +
> > > > > > +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
> > > > > > +
> > > > > > +struct v4l2_hevc_dpb_entry {
> > > > > > + __u32 buffer_tag;
> > > > > > + __u8 rps;
> > > > > > + __u8 field_pic;
> > > > > > + __u16 pic_order_cnt[2];
> > > > > > +};
> > >
> > > Please add a field for the reference index; if the rps field is not meant
> > > for that, some devices would request it (not the Rockchip one). And
> > > Rockchip's VDPU1 and VDPU2 would request a similar field for AVC.
> >
> > What exactly is that reference index? Is it a bitstream element or
> > something deduced from the bitstream?
> >
> The picture order count (POC) for HEVC, and frame_num for AVC. I think
> it is the number used in list0 (P and B slices) and list1 (B slices).
The picture order count is already the last field of the DPB entry
structure. There is one for each field picture.
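As a sketch of how that field can be used, a driver could resolve a POC
back to a DPB slot like this (the struct replicates the entry layout
proposed in the patch; the lookup helper itself is hypothetical):

```c
#include <stdint.h>

/* Replica of the proposed v4l2_hevc_dpb_entry layout, for illustration
 * only; field names follow the patch under discussion. */
struct hevc_dpb_entry {
	uint32_t buffer_tag;
	uint8_t rps;
	uint8_t field_pic;
	uint16_t pic_order_cnt[2]; /* one POC per field picture */
};

/* Return the index of the DPB entry whose top- or bottom-field POC
 * matches, or -1 if the POC refers to no entry. */
static int dpb_find_by_poc(const struct hevc_dpb_entry *dpb, int num,
			   uint16_t poc)
{
	int i;

	for (i = 0; i < num; i++)
		if (dpb[i].pic_order_cnt[0] == poc ||
		    dpb[i].pic_order_cnt[1] == poc)
			return i;
	return -1;
}
```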
> > > Also add another buffer_tag referring to the motion-vector memory of
> > > each frame. Or, better, attach metadata to each picture buffer: since
> > > the picture output is just the same as the original, the display won't
> > > care whether the motion vectors are written at the bottom of the
> > > picture or somewhere else.
> >
> > The motion vectors are passed as part of the raw bitstream data, in the
> > slices. Is there a case where the motion vectors are coded differently?
> No, it is an additional cache for the decoder; even FFmpeg keeps such
> data. I think the Allwinner decoder must output it somewhere too.
Ah yes I see what you mean! This is handled internally by our driver
and not exposed to userspace. I don't think it would be a good idea to
expose this cache or request that userspace allocates it like a video
buffer.
> > > > > > +
> > > > > > +struct v4l2_hevc_pred_weight_table {
> > > > > > + __u8 luma_log2_weight_denom;
> > > > > > + __s8 delta_chroma_log2_weight_denom;
> > > > > > +
> > > > > > + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > +
> > > > > > + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > +};
> > > > > > +
> > > I think those fields are not all necessary for the Rockchip device, and
> > > may not apply to the others.
> >
> > Yes, it's possible that some of the elements are not necessary for some
> > decoders. What we want is to cover all the elements that might be
> > required for a decoder.
> I wonder whether Allwinner needs that; those SAO flags are usually
> ignored by the decoder design. But more is better than less: it is
> hard to extend a V4L2 structure in the future, and a new HEVC profile
> may bring a new field. It is still too early for HEVC.
Yes this is used by our decoder. The idea is to have all the basic
bitstream elements in the structures (even if some decoders don't use
them all) and add others for extension as separate controls later.
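As an example of how these elements fit together, the num_rps_poc_*
counters of the slice parameters can be derived from the per-entry rps
tags of the DPB array. A hedged C sketch (the defines mirror the
V4L2_HEVC_DPB_ENTRY_RPS_* values from the patch; the helper is
illustrative, not part of the uAPI):

```c
#include <stdint.h>

/* Mirrors of the RPS tag values proposed in the patch. */
#define RPS_ST_CURR_BEFORE 0x01
#define RPS_ST_CURR_AFTER  0x02
#define RPS_LT_CURR        0x03

struct rps_counts {
	uint8_t st_curr_before; /* -> num_rps_poc_st_curr_before */
	uint8_t st_curr_after;  /* -> num_rps_poc_st_curr_after */
	uint8_t lt_curr;        /* -> num_rps_poc_lt_curr */
};

/* Count how many active DPB entries fall into each reference picture
 * set, given the rps tag of each entry. */
static struct rps_counts count_rps(const uint8_t *rps, int num_entries)
{
	struct rps_counts c = { 0, 0, 0 };
	int i;

	for (i = 0; i < num_entries; i++) {
		switch (rps[i]) {
		case RPS_ST_CURR_BEFORE: c.st_curr_before++; break;
		case RPS_ST_CURR_AFTER:  c.st_curr_after++;  break;
		case RPS_LT_CURR:        c.lt_curr++;        break;
		}
	}
	return c;
}
```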
> > > > > > +struct v4l2_ctrl_hevc_slice_params {
> > > > > > + __u32 bit_size;
> > > > > > + __u32 data_bit_offset;
> > > > > > +
> > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
> > > > > > + __u8 nal_unit_type;
> > > > > > + __u8 nuh_temporal_id_plus1;
> > > > > > +
> > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > + __u8 slice_type;
> > > > > > + __u8 colour_plane_id;
> > > ----------------------------------------------------------------------------
> > > > > > + __u16 slice_pic_order_cnt;
> > > > > > + __u8 slice_sao_luma_flag;
> > > > > > + __u8 slice_sao_chroma_flag;
> > > > > > + __u8 slice_temporal_mvp_enabled_flag;
> > > > > > + __u8 num_ref_idx_l0_active_minus1;
> > > > > > + __u8 num_ref_idx_l1_active_minus1;
> > > Rockchip's decoder doesn't use this part.
> > > > > > + __u8 mvd_l1_zero_flag;
> > > > > > + __u8 cabac_init_flag;
> > > > > > + __u8 collocated_from_l0_flag;
> > > > > > + __u8 collocated_ref_idx;
> > > > > > + __u8 five_minus_max_num_merge_cand;
> > > > > > + __u8 use_integer_mv_flag;
> > > > > > + __s8 slice_qp_delta;
> > > > > > + __s8 slice_cb_qp_offset;
> > > > > > + __s8 slice_cr_qp_offset;
> > > > > > + __s8 slice_act_y_qp_offset;
> > > > > > + __s8 slice_act_cb_qp_offset;
> > > > > > + __s8 slice_act_cr_qp_offset;
> > > > > > + __u8 slice_deblocking_filter_disabled_flag;
> > > > > > + __s8 slice_beta_offset_div2;
> > > > > > + __s8 slice_tc_offset_div2;
> > > > > > + __u8 slice_loop_filter_across_slices_enabled_flag;
> > > > > > +
> > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
> > > > > > + __u8 pic_struct;
> > > I think the decoder doesn't care about this; it is used for display.
> >
> > The purpose of this field is to indicate whether the current picture is
> > a progressive frame or an interlaced field picture, which is useful for
> > decoding.
> >
> > At least our decoder has a register field to indicate frame/top
> > field/bottom field, so we certainly need to keep the info around.
> > Looking at the spec and the ffmpeg implementation, it looks like this
> > flag of the bitstream is the usual way to report field coding.
> It depends on whether the decoder cares about the scan type or more.
> I would rather prefer general_interlaced_source_flag for just the scan
> type; it would be better than parsing another SEI.
Well we still need a way to indicate if the current data is top or
bottom field for interlaced. I don't think that knowing that the whole
video is interlaced would be precise enough.
Cheers,
Paul
> > > > > > +
> > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > + __u8 num_active_dpb_entries;
> > > > > > + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > +
> > > > > > + __u8 num_rps_poc_st_curr_before;
> > > > > > + __u8 num_rps_poc_st_curr_after;
> > > > > > + __u8 num_rps_poc_lt_curr;
> > > > > > +
> > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
> > > > > > + struct v4l2_hevc_pred_weight_table pred_weight_table;
> > > > > > +};
> > > > > > +
> > > > > > #endif
> > --
> > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > Embedded Linux and kernel engineering
> > https://bootlin.com
> >
--
Paul Kocialkowski, Bootlin (formerly Free Electrons)
Embedded Linux and kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-08 8:38 ` Paul Kocialkowski
@ 2019-01-08 10:00 ` Ayaka
2019-01-10 13:32 ` ayaka
2019-01-24 10:36 ` Paul Kocialkowski
0 siblings, 2 replies; 22+ messages in thread
From: Ayaka @ 2019-01-08 10:00 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: Randy Li, Jernej Škrabec, linux-media, linux-kernel, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Alexandre Courbot,
Thomas Petazzoni, linux-rockchip
Sent from my iPad
> On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>
> Hi,
>
>> On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
>>
>> Sent from my iPad
>>
>>> On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>
>>> Hi,
>>>
>>>>> On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
>>>>> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
>>>>> Hi,
>>>>>
>>>>> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>>>>>
>>>>>>> +
>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>>>>>> +
>>>>>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>>>>>> +
>>>>>>> +struct v4l2_hevc_dpb_entry {
>>>>>>> + __u32 buffer_tag;
>>>>>>> + __u8 rps;
>>>>>>> + __u8 field_pic;
>>>>>>> + __u16 pic_order_cnt[2];
>>>>>>> +};
>>>>
>>>> Please add a field for the reference index; if the rps field is not meant
>>>> for that, some devices would request it (not the Rockchip one). And
>>>> Rockchip's VDPU1 and VDPU2 would request a similar field for AVC.
>>>
>>> What exactly is that reference index? Is it a bitstream element or
>>> something deduced from the bitstream?
>>>
>> The picture order count (POC) for HEVC, and frame_num for AVC. I think
>> it is the number used in list0 (P and B slices) and list1 (B slices).
>
> The picture order count is already the last field of the DPB entry
> structure. There is one for each field picture.
As we are not sure whether there can be a field-coded slice or CTU, I would hold off on this part and everything else related to fields.
>
>>>> Also add another buffer_tag referring to the motion-vector memory of
>>>> each frame. Or, better, attach metadata to each picture buffer: since
>>>> the picture output is just the same as the original, the display won't
>>>> care whether the motion vectors are written at the bottom of the
>>>> picture or somewhere else.
>>>
>>> The motion vectors are passed as part of the raw bitstream data, in the
>>> slices. Is there a case where the motion vectors are coded differently?
>> No, it is an additional cache for the decoder; even FFmpeg keeps such
>> data. I think the Allwinner decoder must output it somewhere too.
>
> Ah yes I see what you mean! This is handled internally by our driver
> and not exposed to userspace. I don't think it would be a good idea to
> expose this cache or request that userspace allocates it like a video
> buffer.
>
No, usually the driver should allocate it, as userspace has no idea of the size each device needs.
But an advanced user application could fix a broken picture with proper data, or analyze object motion from it.
So I would suggest attaching this information to the picture buffer as metadata.
>>>>>>> +
>>>>>>> +struct v4l2_hevc_pred_weight_table {
>>>>>>> + __u8 luma_log2_weight_denom;
>>>>>>> + __s8 delta_chroma_log2_weight_denom;
>>>>>>> +
>>>>>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>> +
>>>>>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>> +};
>>>>>>> +
>>>> I think those fields are not all necessary for the Rockchip device, and
>>>> may not apply to the others.
>>>
>>> Yes, it's possible that some of the elements are not necessary for some
>>> decoders. What we want is to cover all the elements that might be
>>> required for a decoder.
>> I wonder whether Allwinner needs that; those SAO flags are usually
>> ignored by the decoder design. But more is better than less: it is
>> hard to extend a V4L2 structure in the future, and a new HEVC profile
>> may bring a new field. It is still too early for HEVC.
>
> Yes this is used by our decoder. The idea is to have all the basic
> bitstream elements in the structures (even if some decoders don't use
> them all) and add others for extension as separate controls later.
>
>>>>>>> +struct v4l2_ctrl_hevc_slice_params {
>>>>>>> + __u32 bit_size;
>>>>>>> + __u32 data_bit_offset;
>>>>>>> +
>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>>>>>> + __u8 nal_unit_type;
>>>>>>> + __u8 nuh_temporal_id_plus1;
>>>>>>> +
>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>> + __u8 slice_type;
>>>>>>> + __u8 colour_plane_id;
>>>> ----------------------------------------------------------------------------
>>>>>>> + __u16 slice_pic_order_cnt;
>>>>>>> + __u8 slice_sao_luma_flag;
>>>>>>> + __u8 slice_sao_chroma_flag;
>>>>>>> + __u8 slice_temporal_mvp_enabled_flag;
>>>>>>> + __u8 num_ref_idx_l0_active_minus1;
>>>>>>> + __u8 num_ref_idx_l1_active_minus1;
>>>> Rockchip's decoder doesn't use this part.
>>>>>>> + __u8 mvd_l1_zero_flag;
>>>>>>> + __u8 cabac_init_flag;
>>>>>>> + __u8 collocated_from_l0_flag;
>>>>>>> + __u8 collocated_ref_idx;
>>>>>>> + __u8 five_minus_max_num_merge_cand;
>>>>>>> + __u8 use_integer_mv_flag;
>>>>>>> + __s8 slice_qp_delta;
>>>>>>> + __s8 slice_cb_qp_offset;
>>>>>>> + __s8 slice_cr_qp_offset;
>>>>>>> + __s8 slice_act_y_qp_offset;
>>>>>>> + __s8 slice_act_cb_qp_offset;
>>>>>>> + __s8 slice_act_cr_qp_offset;
>>>>>>> + __u8 slice_deblocking_filter_disabled_flag;
>>>>>>> + __s8 slice_beta_offset_div2;
>>>>>>> + __s8 slice_tc_offset_div2;
>>>>>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>>>>>> +
>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>>>>>> + __u8 pic_struct;
>>>> I think the decoder doesn't care about this, it is used for display.
>>>
>>> The purpose of this field is to indicate whether the current picture is
>>> a progressive frame or an interlaced field picture, which is useful for
>>> decoding.
>>>
>>> At least our decoder has a register field to indicate frame/top
>>> field/bottom field, so we certainly need to keep the info around.
>>> Looking at the spec and the ffmpeg implementation, it looks like this
>>> flag of the bitstream is the usual way to report field coding.
>> It depends on whether the decoder cares about the scan type or more.
>> I would prefer general_interlaced_source_flag for just the scan type;
>> it would be better than reading another SEI.
>
> Well we still need a way to indicate if the current data is top or
> bottom field for interlaced. I don't think that knowing that the whole
> video is interlaced would be precise enough.
>
> Cheers,
>
> Paul
>
>>>>>>> +
>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> + __u8 num_active_dpb_entries;
>>>>>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>> +
>>>>>>> + __u8 num_rps_poc_st_curr_before;
>>>>>>> + __u8 num_rps_poc_st_curr_after;
>>>>>>> + __u8 num_rps_poc_lt_curr;
>>>>>>> +
>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>>>>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>>>>>> +};
>>>>>>> +
>>>>>>> #endif
>>> --
>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>> Embedded Linux and kernel engineering
>>> https://bootlin.com
>>>
> --
> Paul Kocialkowski, Bootlin (formerly Free Electrons)
> Embedded Linux and kernel engineering
> https://bootlin.com
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-08 10:00 ` [linux-sunxi] " Ayaka
@ 2019-01-10 13:32 ` ayaka
2019-01-24 10:27 ` Paul Kocialkowski
2019-01-24 10:36 ` Paul Kocialkowski
1 sibling, 1 reply; 22+ messages in thread
From: ayaka @ 2019-01-10 13:32 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: Randy Li, Jernej Škrabec, linux-media, linux-kernel, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Alexandre Courbot,
Thomas Petazzoni, linux-rockchip
I forgot an important thing: the rkvdec and the Rockchip HEVC decoder
request that the CABAC table, scaling list, picture parameter set and
reference picture set be stored in one or several DMA buffers. I am not
talking about the parsed data; the decoder requests the raw data.
For the PPS and RPS, it is possible to reuse the slice header: just let
the decoder know the offset from the bitstream buffer. I would suggest
adding three properties (together with the SPS) for them. But I think we
need a method to mark an OUTPUT-side buffer for that auxiliary data.
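As a concrete illustration of the suggestion above, the three offset properties could be shaped roughly like this; the struct name and all field names here are hypothetical, not part of the proposed uAPI:

```c
#include <stdint.h>

/*
 * Hypothetical control payload: bit offsets of the raw SPS, PPS and RPS
 * inside the OUTPUT bitstream buffer, so a decoder that consumes the
 * unparsed bits can locate them without re-parsing the stream.
 * All names are illustrative only.
 */
struct v4l2_ctrl_hevc_raw_bit_offsets {
	uint32_t sps_bit_offset;
	uint32_t pps_bit_offset;
	uint32_t rps_bit_offset;
};
```

Whether these would live in one control or three separate ones is an open design question in the thread.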
On 1/8/19 6:00 PM, Ayaka wrote:
>
> Sent from my iPad
>
>> On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>
>> Hi,
>>
>>> On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
>>>
>>> Sent from my iPad
>>>
>>>> On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>>> On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
>>>>>> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>>>>>>
>>>>>>>> +
>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>>>>>>> +
>>>>>>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>>>>>>> +
>>>>>>>> +struct v4l2_hevc_dpb_entry {
>>>>>>>> + __u32 buffer_tag;
>>>>>>>> + __u8 rps;
>>>>>>>> + __u8 field_pic;
>>>>>>>> + __u16 pic_order_cnt[2];
>>>>>>>> +};
>>>>> Please add a property for the reference index; if that rps field is
>>>>> not used for it, some devices would request it (not the Rockchip
>>>>> one). And Rockchip's VDPU1 and VDPU2 for AVC would request a similar property.
>>>> What exactly is that reference index? Is it a bitstream element or
>>>> something deduced from the bitstream?
>>>>
>>> picture order count(POC) for HEVC and frame_num in AVC. I think it is
>>> the number used in list0(P slice and B slice) and list1(B slice).
>> The picture order count is already the last field of the DPB entry
>> structure. There is one for each field picture.
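To sketch how the per-entry POC can already serve as a reference index, a driver could resolve a reference picture by scanning the DPB array. The entry layout mirrors the patch above (with plain stdint types); the lookup helper itself is only illustrative, not part of any driver:

```c
#include <stdint.h>

#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16

/* Mirror of the proposed DPB entry, using plain stdint types. */
struct v4l2_hevc_dpb_entry {
	uint32_t buffer_tag;
	uint8_t rps;
	uint8_t field_pic;
	uint16_t pic_order_cnt[2];
};

/*
 * Illustrative helper: return the index of the DPB entry whose first
 * field POC matches, or -1 when the reference is not in the DPB.
 */
static int hevc_dpb_find_poc(const struct v4l2_hevc_dpb_entry *dpb,
			     uint8_t num_entries, uint16_t poc)
{
	unsigned int i;

	for (i = 0; i < num_entries && i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++)
		if (dpb[i].pic_order_cnt[0] == poc)
			return (int)i;
	return -1;
}
```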
> As we are not sure whether there will be field-coded slices or CTUs, I would hold off on this part and everything else about fields.
>>>>> Add another buffer_tag for referring to the memory of the motion
>>>>> vectors for each frame. Or a better method would be to add metadata
>>>>> to each picture buffer: since the picture output is just the same as
>>>>> the original, the display won't care whether the motion vectors are
>>>>> written at the bottom of the picture or somewhere else.
>>>> The motion vectors are passed as part of the raw bitstream data, in the
>>>> slices. Is there a case where the motion vectors are coded differently?
>>> No, it is an additional cache for the decoder; even FFmpeg keeps
>>> such data. I think Allwinner must output it somewhere as well.
>> Ah yes I see what you mean! This is handled internally by our driver
>> and not exposed to userspace. I don't think it would be a good idea to
>> expose this cache or request that userspace allocates it like a video
>> buffer.
>>
> No, usually the driver should allocate it, as user space has no idea of the buffer size for each device.
> But for advanced users, an application could fix a broken picture with proper data or analyse object motion from it.
> So I would suggest attaching this information to a picture buffer as metadata.
>>>>>>>> +
>>>>>>>> +struct v4l2_hevc_pred_weight_table {
>>>>>>>> + __u8 luma_log2_weight_denom;
>>>>>>>> + __s8 delta_chroma_log2_weight_denom;
>>>>>>>> +
>>>>>>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>> +
>>>>>>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>> +};
>>>>>>>> +
>>>>> I think those properties are not necessary for the Rockchip device,
>>>>> and may not work for the others.
>>>> Yes, it's possible that some of the elements are not necessary for some
>>>> decoders. What we want is to cover all the elements that might be
>>>> required for a decoder.
>>> I wonder whether Allwinner needs that; those SAO flags are usually
>>> ignored by decoders by design. But more is better than less: it is
>>> hard to extend a v4l2 structure in the future, and a new HEVC profile
>>> may bring a new property. It is still too early for HEVC.
>> Yes this is used by our decoder. The idea is to have all the basic
>> bitstream elements in the structures (even if some decoders don't use
>> them all) and add others for extension as separate controls later.
>>
>>>>>>>> +struct v4l2_ctrl_hevc_slice_params {
>>>>>>>> + __u32 bit_size;
>>>>>>>> + __u32 data_bit_offset;
>>>>>>>> +
>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>>>>>>> + __u8 nal_unit_type;
>>>>>>>> + __u8 nuh_temporal_id_plus1;
>>>>>>>> +
>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>> + __u8 slice_type;
>>>>>>>> + __u8 colour_plane_id;
>>>>> ----------------------------------------------------------------------------
>>>>>>>> + __u16 slice_pic_order_cnt;
>>>>>>>> + __u8 slice_sao_luma_flag;
>>>>>>>> + __u8 slice_sao_chroma_flag;
>>>>>>>> + __u8 slice_temporal_mvp_enabled_flag;
>>>>>>>> + __u8 num_ref_idx_l0_active_minus1;
>>>>>>>> + __u8 num_ref_idx_l1_active_minus1;
>>>>> Rockchip's decoder doesn't use this part.
>>>>>>>> + __u8 mvd_l1_zero_flag;
>>>>>>>> + __u8 cabac_init_flag;
>>>>>>>> + __u8 collocated_from_l0_flag;
>>>>>>>> + __u8 collocated_ref_idx;
>>>>>>>> + __u8 five_minus_max_num_merge_cand;
>>>>>>>> + __u8 use_integer_mv_flag;
>>>>>>>> + __s8 slice_qp_delta;
>>>>>>>> + __s8 slice_cb_qp_offset;
>>>>>>>> + __s8 slice_cr_qp_offset;
>>>>>>>> + __s8 slice_act_y_qp_offset;
>>>>>>>> + __s8 slice_act_cb_qp_offset;
>>>>>>>> + __s8 slice_act_cr_qp_offset;
>>>>>>>> + __u8 slice_deblocking_filter_disabled_flag;
>>>>>>>> + __s8 slice_beta_offset_div2;
>>>>>>>> + __s8 slice_tc_offset_div2;
>>>>>>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>>>>>>> +
>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>>>>>>> + __u8 pic_struct;
>>>>> I think the decoder doesn't care about this, it is used for display.
>>>> The purpose of this field is to indicate whether the current picture is
>>>> a progressive frame or an interlaced field picture, which is useful for
>>>> decoding.
>>>>
>>>> At least our decoder has a register field to indicate frame/top
>>>> field/bottom field, so we certainly need to keep the info around.
>>>> Looking at the spec and the ffmpeg implementation, it looks like this
>>>> flag of the bitstream is the usual way to report field coding.
>>> It depends on whether the decoder cares about the scan type or more.
>>> I would prefer general_interlaced_source_flag for just the scan type;
>>> it would be better than reading another SEI.
>> Well we still need a way to indicate if the current data is top or
>> bottom field for interlaced. I don't think that knowing that the whole
>> video is interlaced would be precise enough.
>>
>> Cheers,
>>
>> Paul
>>
>>>>>>>> +
>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> + __u8 num_active_dpb_entries;
>>>>>>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>> +
>>>>>>>> + __u8 num_rps_poc_st_curr_before;
>>>>>>>> + __u8 num_rps_poc_st_curr_after;
>>>>>>>> + __u8 num_rps_poc_lt_curr;
>>>>>>>> +
>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>>>>>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> #endif
>>>> --
>>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>>> Embedded Linux and kernel engineering
>>>> https://bootlin.com
>>>>
>> --
>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>> Embedded Linux and kernel engineering
>> https://bootlin.com
>>
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-10 13:32 ` ayaka
@ 2019-01-24 10:27 ` Paul Kocialkowski
2019-01-24 12:23 ` Ayaka
0 siblings, 1 reply; 22+ messages in thread
From: Paul Kocialkowski @ 2019-01-24 10:27 UTC (permalink / raw)
To: ayaka
Cc: Randy Li, Jernej Škrabec, linux-media, linux-kernel, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Alexandre Courbot,
Thomas Petazzoni, linux-rockchip
Hi,
On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> I forgot an important thing: the rkvdec and the Rockchip HEVC decoder
> request that the CABAC table, scaling list, picture parameter set and
> reference picture set be stored in one or several DMA buffers. I am not
> talking about the parsed data; the decoder requests the raw data.
>
> For the PPS and RPS, it is possible to reuse the slice header: just let
> the decoder know the offset from the bitstream buffer. I would suggest
> adding three properties (together with the SPS) for them. But I think we
> need a method to mark an OUTPUT-side buffer for that auxiliary data.
I'm quite confused about the hardware implementation then. From what
you're saying, it seems that it takes the raw bitstream elements rather
than parsed elements. Is it really a stateless implementation?
The stateless implementation was designed with the idea that only the
raw slice data should be passed in bitstream form to the decoder. For
H.264, it seems that some decoders also need the slice header in raw
bitstream form (because they take the full slice NAL unit), see the
discussions in this thread:
media: docs-rst: Document m2m stateless video decoder interface
Can you detail exactly what the rockchip decoder absolutely needs in
raw bitstream format?
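For context, here is a sketch of the userspace side of the existing contract: the whole slice goes into the OUTPUT buffer, and bit_size plus data_bit_offset tell the driver where the raw slice data starts. The header-length value is assumed to come from a real slice segment header parser; it is a parameter here, not computed:

```c
#include <stddef.h>
#include <stdint.h>

/* The two bitstream-location fields of v4l2_ctrl_hevc_slice_params. */
struct slice_location {
	uint32_t bit_size;        /* size of the whole slice, in bits */
	uint32_t data_bit_offset; /* first bit of slice data past the header */
};

/*
 * Sketch: locate the slice data for one slice NAL unit placed at the
 * start of the OUTPUT buffer. header_bits must come from an actual
 * slice segment header parser.
 */
static struct slice_location locate_slice(size_t slice_bytes,
					  uint32_t header_bits)
{
	struct slice_location loc;

	loc.bit_size = (uint32_t)(slice_bytes * 8);
	loc.data_bit_offset = header_bits;
	return loc;
}
```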
Cheers,
Paul
> On 1/8/19 6:00 PM, Ayaka wrote:
> > Sent from my iPad
> >
> > > On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > >
> > > Hi,
> > >
> > > > On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
> > > >
> > > > Sent from my iPad
> > > >
> > > > > On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > > On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
> > > > > > > On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
> > > > > > >
> > > > > > > > > +
> > > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
> > > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
> > > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
> > > > > > > > > +
> > > > > > > > > +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
> > > > > > > > > +
> > > > > > > > > +struct v4l2_hevc_dpb_entry {
> > > > > > > > > + __u32 buffer_tag;
> > > > > > > > > + __u8 rps;
> > > > > > > > > + __u8 field_pic;
> > > > > > > > > + __u16 pic_order_cnt[2];
> > > > > > > > > +};
> > > > > > Please add a property for the reference index; if that rps field is
> > > > > > not used for it, some devices would request it (not the Rockchip
> > > > > > one). And Rockchip's VDPU1 and VDPU2 for AVC would request a similar property.
> > > > > What exactly is that reference index? Is it a bitstream element or
> > > > > something deduced from the bitstream?
> > > > >
> > > > picture order count(POC) for HEVC and frame_num in AVC. I think it is
> > > > the number used in list0(P slice and B slice) and list1(B slice).
> > > The picture order count is already the last field of the DPB entry
> > > structure. There is one for each field picture.
> > As we are not sure whether there will be field-coded slices or CTUs, I would hold off on this part and everything else about fields.
> > > > > > Add another buffer_tag for referring to the memory of the motion
> > > > > > vectors for each frame. Or a better method would be to add metadata
> > > > > > to each picture buffer: since the picture output is just the same as
> > > > > > the original, the display won't care whether the motion vectors are
> > > > > > written at the bottom of the picture or somewhere else.
> > > > > The motion vectors are passed as part of the raw bitstream data, in the
> > > > > slices. Is there a case where the motion vectors are coded differently?
> > > > No, it is an additional cache for the decoder; even FFmpeg keeps
> > > > such data. I think Allwinner must output it somewhere as well.
> > > Ah yes I see what you mean! This is handled internally by our driver
> > > and not exposed to userspace. I don't think it would be a good idea to
> > > expose this cache or request that userspace allocates it like a video
> > > buffer.
> > >
> > No, usually the driver should allocate it, as user space has no idea of the buffer size for each device.
> > But for advanced users, an application could fix a broken picture with proper data or analyse object motion from it.
> > So I would suggest attaching this information to a picture buffer as metadata.
> > > > > > > > > +
> > > > > > > > > +struct v4l2_hevc_pred_weight_table {
> > > > > > > > > + __u8 luma_log2_weight_denom;
> > > > > > > > > + __s8 delta_chroma_log2_weight_denom;
> > > > > > > > > +
> > > > > > > > > + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > +
> > > > > > > > > + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > +};
> > > > > > > > > +
> > > > > > I think those properties are not necessary for the Rockchip device,
> > > > > > and may not work for the others.
> > > > > Yes, it's possible that some of the elements are not necessary for some
> > > > > decoders. What we want is to cover all the elements that might be
> > > > > required for a decoder.
> > > > I wonder whether Allwinner needs that; those SAO flags are usually
> > > > ignored by decoders by design. But more is better than less: it is
> > > > hard to extend a v4l2 structure in the future, and a new HEVC profile
> > > > may bring a new property. It is still too early for HEVC.
> > > Yes this is used by our decoder. The idea is to have all the basic
> > > bitstream elements in the structures (even if some decoders don't use
> > > them all) and add others for extension as separate controls later.
> > >
> > > > > > > > > +struct v4l2_ctrl_hevc_slice_params {
> > > > > > > > > + __u32 bit_size;
> > > > > > > > > + __u32 data_bit_offset;
> > > > > > > > > +
> > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
> > > > > > > > > + __u8 nal_unit_type;
> > > > > > > > > + __u8 nuh_temporal_id_plus1;
> > > > > > > > > +
> > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > > > > + __u8 slice_type;
> > > > > > > > > + __u8 colour_plane_id;
> > > > > > ----------------------------------------------------------------------------
> > > > > > > > > + __u16 slice_pic_order_cnt;
> > > > > > > > > + __u8 slice_sao_luma_flag;
> > > > > > > > > + __u8 slice_sao_chroma_flag;
> > > > > > > > > + __u8 slice_temporal_mvp_enabled_flag;
> > > > > > > > > + __u8 num_ref_idx_l0_active_minus1;
> > > > > > > > > + __u8 num_ref_idx_l1_active_minus1;
> > > > > > Rockchip's decoder doesn't use this part.
> > > > > > > > > + __u8 mvd_l1_zero_flag;
> > > > > > > > > + __u8 cabac_init_flag;
> > > > > > > > > + __u8 collocated_from_l0_flag;
> > > > > > > > > + __u8 collocated_ref_idx;
> > > > > > > > > + __u8 five_minus_max_num_merge_cand;
> > > > > > > > > + __u8 use_integer_mv_flag;
> > > > > > > > > + __s8 slice_qp_delta;
> > > > > > > > > + __s8 slice_cb_qp_offset;
> > > > > > > > > + __s8 slice_cr_qp_offset;
> > > > > > > > > + __s8 slice_act_y_qp_offset;
> > > > > > > > > + __s8 slice_act_cb_qp_offset;
> > > > > > > > > + __s8 slice_act_cr_qp_offset;
> > > > > > > > > + __u8 slice_deblocking_filter_disabled_flag;
> > > > > > > > > + __s8 slice_beta_offset_div2;
> > > > > > > > > + __s8 slice_tc_offset_div2;
> > > > > > > > > + __u8 slice_loop_filter_across_slices_enabled_flag;
> > > > > > > > > +
> > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
> > > > > > > > > + __u8 pic_struct;
> > > > > > I think the decoder doesn't care about this, it is used for display.
> > > > > The purpose of this field is to indicate whether the current picture is
> > > > > a progressive frame or an interlaced field picture, which is useful for
> > > > > decoding.
> > > > >
> > > > > At least our decoder has a register field to indicate frame/top
> > > > > field/bottom field, so we certainly need to keep the info around.
> > > > > Looking at the spec and the ffmpeg implementation, it looks like this
> > > > > flag of the bitstream is the usual way to report field coding.
> > > > It depends on whether the decoder cares about the scan type or more.
> > > > I would prefer general_interlaced_source_flag for just the scan type;
> > > > it would be better than reading another SEI.
> > > Well we still need a way to indicate if the current data is top or
> > > bottom field for interlaced. I don't think that knowing that the whole
> > > video is interlaced would be precise enough.
> > >
> > > Cheers,
> > >
> > > Paul
> > >
> > > > > > > > > +
> > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > > > > + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > + __u8 num_active_dpb_entries;
> > > > > > > > > + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > +
> > > > > > > > > + __u8 num_rps_poc_st_curr_before;
> > > > > > > > > + __u8 num_rps_poc_st_curr_after;
> > > > > > > > > + __u8 num_rps_poc_lt_curr;
> > > > > > > > > +
> > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
> > > > > > > > > + struct v4l2_hevc_pred_weight_table pred_weight_table;
> > > > > > > > > +};
> > > > > > > > > +
> > > > > > > > > #endif
> > > > > --
> > > > > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > > > > Embedded Linux and kernel engineering
> > > > > https://bootlin.com
> > > > >
> > > --
> > > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > > Embedded Linux and kernel engineering
> > > https://bootlin.com
> > >
--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-08 10:00 ` [linux-sunxi] " Ayaka
2019-01-10 13:32 ` ayaka
@ 2019-01-24 10:36 ` Paul Kocialkowski
2019-01-24 12:19 ` Ayaka
1 sibling, 1 reply; 22+ messages in thread
From: Paul Kocialkowski @ 2019-01-24 10:36 UTC (permalink / raw)
To: Ayaka
Cc: devel, Alexandre Courbot, Maxime Ripard, Randy Li, linux-kernel,
Jernej Škrabec, Tomasz Figa, Hans Verkuil, linux-rockchip,
Thomas Petazzoni, Mauro Carvalho Chehab, Ezequiel Garcia,
linux-arm-kernel, linux-media
Hi,
On Tue, 2019-01-08 at 18:00 +0800, Ayaka wrote:
>
> Sent from my iPad
>
> > On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> >
> > Hi,
> >
> > > On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
> > >
> > > Sent from my iPad
> > >
> > > > On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > > On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
> > > > > > On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
> > > > > >
> > > > > > > > +
> > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
> > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
> > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
> > > > > > > > +
> > > > > > > > +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
> > > > > > > > +
> > > > > > > > +struct v4l2_hevc_dpb_entry {
> > > > > > > > + __u32 buffer_tag;
> > > > > > > > + __u8 rps;
> > > > > > > > + __u8 field_pic;
> > > > > > > > + __u16 pic_order_cnt[2];
> > > > > > > > +};
> > > > >
> > > > > Please add a property for the reference index; if that rps field is
> > > > > not used for it, some devices would request it (not the Rockchip
> > > > > one). And Rockchip's VDPU1 and VDPU2 for AVC would request a similar property.
> > > >
> > > > What exactly is that reference index? Is it a bitstream element or
> > > > something deduced from the bitstream?
> > > >
> > > picture order count(POC) for HEVC and frame_num in AVC. I think it is
> > > the number used in list0(P slice and B slice) and list1(B slice).
> >
> > The picture order count is already the last field of the DPB entry
> > structure. There is one for each field picture.
> As we are not sure whether there will be field-coded slices or CTUs,
> I would hold off on this part and everything else about fields.
I'm not sure what you meant here, sorry.
> > > > > Add another buffer_tag for referring to the memory of the motion
> > > > > vectors for each frame. Or a better method would be to add metadata
> > > > > to each picture buffer: since the picture output is just the same as
> > > > > the original, the display won't care whether the motion vectors are
> > > > > written at the bottom of the picture or somewhere else.
> > > >
> > > > The motion vectors are passed as part of the raw bitstream data, in the
> > > > slices. Is there a case where the motion vectors are coded differently?
> > > No, it is an additional cache for the decoder; even FFmpeg keeps
> > > such data. I think Allwinner must output it somewhere as well.
> >
> > Ah yes I see what you mean! This is handled internally by our driver
> > and not exposed to userspace. I don't think it would be a good idea to
> > expose this cache or request that userspace allocates it like a video
> > buffer.
> >
> No, usually the driver should allocate it, as user space has no idea
> of the buffer size for each device.
> But for advanced users, an application could fix a broken picture with
> proper data or analyse object motion from it.
> So I would suggest attaching this information to a picture buffer as
> metadata.
Right, the driver will allocate chunks of memory for the decoding
metadata used by the hardware decoder.
Well, I don't think V4L2 has any mechanism to expose this data for now
and since it's very specific to the hardware implementation, I guess
the interest in having that is generally pretty low.
That's maybe something that could be added later if someone wants to
work on it, but I think we are better off keeping this metadata hidden
by the driver for now.
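As an aside, the kind of driver-internal allocation described above might be sized along these lines; the per-CTU byte count is a made-up placeholder, since the real figure is specific to each decoder IP:

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes of motion-vector storage per 64x64 CTU: illustrative only. */
#define MV_BYTES_PER_CTU 416

/*
 * Size of a per-picture motion-vector cache, rounding the picture
 * dimensions up to whole 64x64 CTUs, as a driver might do internally.
 */
static size_t hevc_mv_buffer_size(uint32_t width, uint32_t height)
{
	size_t ctus_x = (width + 63) / 64;
	size_t ctus_y = (height + 63) / 64;

	return ctus_x * ctus_y * MV_BYTES_PER_CTU;
}
```

Keeping this sizing inside the driver avoids exposing a hardware-specific detail through the uAPI.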
> > > > > > > > +
> > > > > > > > +struct v4l2_hevc_pred_weight_table {
> > > > > > > > + __u8 luma_log2_weight_denom;
> > > > > > > > + __s8 delta_chroma_log2_weight_denom;
> > > > > > > > +
> > > > > > > > + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > +
> > > > > > > > + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > +};
> > > > > > > > +
> > > > > I think those properties are not necessary for the Rockchip device,
> > > > > and may not work for the others.
> > > >
> > > > Yes, it's possible that some of the elements are not necessary for some
> > > > decoders. What we want is to cover all the elements that might be
> > > > required for a decoder.
> > > I wonder whether Allwinner needs that; those SAO flags are usually
> > > ignored by decoders by design. But more is better than less: it is
> > > hard to extend a v4l2 structure in the future, and a new HEVC profile
> > > may bring a new property. It is still too early for HEVC.
> >
> > Yes this is used by our decoder. The idea is to have all the basic
> > bitstream elements in the structures (even if some decoders don't use
> > them all) and add others for extension as separate controls later.
> >
> > > > > > > > +struct v4l2_ctrl_hevc_slice_params {
> > > > > > > > + __u32 bit_size;
> > > > > > > > + __u32 data_bit_offset;
> > > > > > > > +
> > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
> > > > > > > > + __u8 nal_unit_type;
> > > > > > > > + __u8 nuh_temporal_id_plus1;
> > > > > > > > +
> > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > > > + __u8 slice_type;
> > > > > > > > + __u8 colour_plane_id;
> > > > > ----------------------------------------------------------------------------
> > > > > > > > + __u16 slice_pic_order_cnt;
> > > > > > > > + __u8 slice_sao_luma_flag;
> > > > > > > > + __u8 slice_sao_chroma_flag;
> > > > > > > > + __u8 slice_temporal_mvp_enabled_flag;
> > > > > > > > + __u8 num_ref_idx_l0_active_minus1;
> > > > > > > > + __u8 num_ref_idx_l1_active_minus1;
> > > > > Rockchip's decoder doesn't use this part.
> > > > > > > > + __u8 mvd_l1_zero_flag;
> > > > > > > > + __u8 cabac_init_flag;
> > > > > > > > + __u8 collocated_from_l0_flag;
> > > > > > > > + __u8 collocated_ref_idx;
> > > > > > > > + __u8 five_minus_max_num_merge_cand;
> > > > > > > > + __u8 use_integer_mv_flag;
> > > > > > > > + __s8 slice_qp_delta;
> > > > > > > > + __s8 slice_cb_qp_offset;
> > > > > > > > + __s8 slice_cr_qp_offset;
> > > > > > > > + __s8 slice_act_y_qp_offset;
> > > > > > > > + __s8 slice_act_cb_qp_offset;
> > > > > > > > + __s8 slice_act_cr_qp_offset;
> > > > > > > > + __u8 slice_deblocking_filter_disabled_flag;
> > > > > > > > + __s8 slice_beta_offset_div2;
> > > > > > > > + __s8 slice_tc_offset_div2;
> > > > > > > > + __u8 slice_loop_filter_across_slices_enabled_flag;
> > > > > > > > +
> > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
> > > > > > > > + __u8 pic_struct;
> > > > > I think the decoder doesn't care about this, it is used for display.
> > > >
> > > > The purpose of this field is to indicate whether the current picture is
> > > > a progressive frame or an interlaced field picture, which is useful for
> > > > decoding.
> > > >
> > > > At least our decoder has a register field to indicate frame/top
> > > > field/bottom field, so we certainly need to keep the info around.
> > > > Looking at the spec and the ffmpeg implementation, it looks like this
> > > > flag of the bitstream is the usual way to report field coding.
> > > It depends on whether the decoder cares about the scan type or more.
> > > I would prefer general_interlaced_source_flag for just the scan type;
> > > it would be better than reading another SEI.
> >
> > Well we still need a way to indicate if the current data is top or
> > bottom field for interlaced. I don't think that knowing that the whole
> > video is interlaced would be precise enough.
> >
> > Cheers,
> >
> > Paul
> >
> > > > > > > > +
> > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > > > + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > + __u8 num_active_dpb_entries;
> > > > > > > > + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > +
> > > > > > > > + __u8 num_rps_poc_st_curr_before;
> > > > > > > > + __u8 num_rps_poc_st_curr_after;
> > > > > > > > + __u8 num_rps_poc_lt_curr;
> > > > > > > > +
> > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
> > > > > > > > + struct v4l2_hevc_pred_weight_table pred_weight_table;
> > > > > > > > +};
> > > > > > > > +
> > > > > > > > #endif
> > > > --
> > > > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > > > Embedded Linux and kernel engineering
> > > > https://bootlin.com
> > > >
> > --
> > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > Embedded Linux and kernel engineering
> > https://bootlin.com
> >
--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-24 10:36 ` Paul Kocialkowski
@ 2019-01-24 12:19 ` Ayaka
0 siblings, 0 replies; 22+ messages in thread
From: Ayaka @ 2019-01-24 12:19 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: Randy Li, Jernej Škrabec, linux-media, linux-kernel, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Alexandre Courbot,
Thomas Petazzoni, linux-rockchip
Sent from my iPad
> On Jan 24, 2019, at 6:36 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>
> Hi,
>
>> On Tue, 2019-01-08 at 18:00 +0800, Ayaka wrote:
>>
>> Sent from my iPad
>>
>>> On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>
>>> Hi,
>>>
>>>> On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
>>>>
>>>> Sent from my iPad
>>>>
>>>>> On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>>> On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
>>>>>>> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>>>>>>>
>>>>>>>>> +
>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>>>>>>>> +
>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>>>>>>>> +
>>>>>>>>> +struct v4l2_hevc_dpb_entry {
>>>>>>>>> + __u32 buffer_tag;
>>>>>>>>> + __u8 rps;
>>>>>>>>> + __u8 field_pic;
>>>>>>>>> + __u16 pic_order_cnt[2];
>>>>>>>>> +};
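
For reference, the rps field above selects one of the three RPS categories defined earlier, following the RefPicSetStCurrBefore/After and RefPicSetLtCurr sets of H.265 section 8.3.2. A minimal sketch (my illustration, not code from the patch; it assumes the entry is used by the current picture) of how a userspace parser might pick the value:

```c
#include <stdbool.h>
#include <stdint.h>

#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER  0x02
#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR        0x03

/* Classify a reference picture used by the current picture into one of
 * the three RPS categories: long-term references go to LT_CURR, and
 * short-term references are split by POC relative to the current one. */
static uint8_t classify_rps(int32_t ref_poc, int32_t cur_poc, bool long_term)
{
	if (long_term)
		return V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR;
	if (ref_poc < cur_poc)
		return V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE;
	return V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER;
}
```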
>>>>>>
>>>>>> Please add a property for the reference index; if that rps field is not
>>>>>> used for it, some devices would require one (not the Rockchip one). And
>>>>>> Rockchip's VDPU1 and VDPU2 would require a similar property for AVC.
>>>>>
>>>>> What exactly is that reference index? Is it a bitstream element or
>>>>> something deduced from the bitstream?
>>>>>
>>>> The picture order count (POC) for HEVC and frame_num for AVC. I think
>>>> it is the number used in list0 (P and B slices) and list1 (B slices).
>>>
>>> The picture order count is already the last field of the DPB entry
>>> structure. There is one for each field picture.
>> As we are not sure whether field-coded slices or CTUs exist, I would
>> hold off on this part and everything else related to fields.
>
> I'm not sure what you meant here, sorry.
As we discussed on IRC, I am not sure field-coded pictures are supported in HEVC.
And I don't see why there would be two pic_order_cnt values; a picture can only be used as either a short-term or a long-term reference while decoding a given picture.
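
As a hedged illustration of the DPB entry struct quoted above (the progressive-frame convention here is an assumption on my part, not something the quoted patch spells out): the two pic_order_cnt slots would carry the top/bottom field POCs for interlaced content, so for a frame-coded picture both slots can simply carry the frame POC.

```c
#include <stdint.h>

/* Kernel-style fixed-width aliases, so the struct can be pasted as-is. */
typedef uint8_t __u8;
typedef uint16_t __u16;
typedef uint32_t __u32;

/* Copied from the proposed UAPI header quoted in this thread. */
struct v4l2_hevc_dpb_entry {
	__u32 buffer_tag;
	__u8 rps;
	__u8 field_pic;
	__u16 pic_order_cnt[2];
};

/* Fill one entry for a progressive (frame-coded) reference picture:
 * both POC slots carry the frame POC and field_pic stays 0. */
static void fill_progressive_entry(struct v4l2_hevc_dpb_entry *e,
				   __u32 tag, __u16 poc, __u8 rps)
{
	e->buffer_tag = tag;
	e->rps = rps;
	e->field_pic = 0;	/* frame, not top/bottom field */
	e->pic_order_cnt[0] = poc;
	e->pic_order_cnt[1] = poc;
}
```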
>
>>>>>> Please also add another buffer_tag for referring to the motion-vector
>>>>>> memory of each frame. A better method would be to add metadata to each
>>>>>> picture buffer: since the picture output is just the same as the
>>>>>> original, the display won't care whether the motion vectors are written
>>>>>> at the bottom of the picture or somewhere else.
>>>>>
>>>>> The motion vectors are passed as part of the raw bitstream data, in the
>>>>> slices. Is there a case where the motion vectors are coded differently?
>>>> No, it is an additional cache for the decoder; even FFmpeg keeps such
>>>> data, so I think Allwinner must output it somewhere as well.
>>>
>>> Ah yes I see what you mean! This is handled internally by our driver
>>> and not exposed to userspace. I don't think it would be a good idea to
>>> expose this cache or request that userspace allocates it like a video
>>> buffer.
>>>
>> No, usually the driver should allocate it, as userspace has no idea
>> of the size required by each device.
>> But for advanced users, an application could fix a broken picture
>> with proper data or analyze object motion from it.
>> So I would suggest attaching this information to a picture buffer as
>> metadata.
>
> Right, the driver will allocate chunks of memory for the decoding
> metadata used by the hardware decoder.
>
> Well, I don't think V4L2 has any mechanism to expose this data for now
> and since it's very specific to the hardware implementation, I guess
> the interest in having that is generally pretty low.
>
> That's maybe something that could be added later if someone wants to
> work on it, but I think we are better off keeping this metadata hidden
> by the driver for now.
I am writing a V4L2 driver for Rockchip based on the previous vendor driver I sent to the mailing list. I think I will be able to offer a better way to describe the metadata after that. But since it needs to work in both the drivers and userspace, it will take some time.
>
>>>>>>>>> +
>>>>>>>>> +struct v4l2_hevc_pred_weight_table {
>>>>>>>>> + __u8 luma_log2_weight_denom;
>>>>>>>>> + __s8 delta_chroma_log2_weight_denom;
>>>>>>>>> +
>>>>>>>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>> +
>>>>>>>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>> +};
>>>>>>>>> +
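
For context, the *_weight fields in the table above are deltas; per H.265 section 7.4.7.3 the effective weights are reconstructed from them as sketched below (my reading of the spec, not code from this series):

```c
#include <stdint.h>

/* Reconstruct the effective luma weight for list-0 entry i, per H.265
 * (7.4.7.3):
 *   LumaWeightL0[i] = (1 << luma_log2_weight_denom)
 *                     + delta_luma_weight_l0[i]
 * A delta of 0 therefore means the default weight, i.e. the denominator
 * itself, so weighted prediction becomes a no-op for that reference. */
static int luma_weight_l0(uint8_t luma_log2_weight_denom,
			  int8_t delta_luma_weight)
{
	return (1 << luma_log2_weight_denom) + delta_luma_weight;
}
```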
>>>>>> I think those properties are not all necessary; they apply to the
>>>>>> Rockchip device but may not work for the others.
>>>>>
>>>>> Yes, it's possible that some of the elements are not necessary for some
>>>>> decoders. What we want is to cover all the elements that might be
>>>>> required for a decoder.
>>>> I wonder whether Allwinner needs that; those SAO flags are usually
>>>> ignored by decoders by design. But more is better than less: it is
>>>> hard to extend a V4L2 structure in the future, and a new HEVC profile
>>>> may bring new properties. It is still too early for HEVC.
>>>
>>> Yes this is used by our decoder. The idea is to have all the basic
>>> bitstream elements in the structures (even if some decoders don't use
>>> them all) and add others for extension as separate controls later.
>>>
>>>>>>>>> +struct v4l2_ctrl_hevc_slice_params {
>>>>>>>>> + __u32 bit_size;
>>>>>>>>> + __u32 data_bit_offset;
>>>>>>>>> +
>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>>>>>>>> + __u8 nal_unit_type;
>>>>>>>>> + __u8 nuh_temporal_id_plus1;
>>>>>>>>> +
>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>>> + __u8 slice_type;
>>>>>>>>> + __u8 colour_plane_id;
>>>>>> ----------------------------------------------------------------------------
>>>>>>>>> + __u16 slice_pic_order_cnt;
>>>>>>>>> + __u8 slice_sao_luma_flag;
>>>>>>>>> + __u8 slice_sao_chroma_flag;
>>>>>>>>> + __u8 slice_temporal_mvp_enabled_flag;
>>>>>>>>> + __u8 num_ref_idx_l0_active_minus1;
>>>>>>>>> + __u8 num_ref_idx_l1_active_minus1;
>>>>>> Rockchip's decoder doesn't use this part.
>>>>>>>>> + __u8 mvd_l1_zero_flag;
>>>>>>>>> + __u8 cabac_init_flag;
>>>>>>>>> + __u8 collocated_from_l0_flag;
>>>>>>>>> + __u8 collocated_ref_idx;
>>>>>>>>> + __u8 five_minus_max_num_merge_cand;
>>>>>>>>> + __u8 use_integer_mv_flag;
>>>>>>>>> + __s8 slice_qp_delta;
>>>>>>>>> + __s8 slice_cb_qp_offset;
>>>>>>>>> + __s8 slice_cr_qp_offset;
>>>>>>>>> + __s8 slice_act_y_qp_offset;
>>>>>>>>> + __s8 slice_act_cb_qp_offset;
>>>>>>>>> + __s8 slice_act_cr_qp_offset;
>>>>>>>>> + __u8 slice_deblocking_filter_disabled_flag;
>>>>>>>>> + __s8 slice_beta_offset_div2;
>>>>>>>>> + __s8 slice_tc_offset_div2;
>>>>>>>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>>>>>>>> +
>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>>>>>>>> + __u8 pic_struct;
>>>>>> I think the decoder doesn't care about this, it is used for display.
>>>>>
>>>>> The purpose of this field is to indicate whether the current picture is
>>>>> a progressive frame or an interlaced field picture, which is useful for
>>>>> decoding.
>>>>>
>>>>> At least our decoder has a register field to indicate frame/top
>>>>> field/bottom field, so we certainly need to keep the info around.
>>>>> Looking at the spec and the ffmpeg implementation, it looks like this
>>>>> flag of the bitstream is the usual way to report field coding.
>>>> It depends on whether the decoder cares about the scan type or more; I
>>>> would prefer general_interlaced_source_flag for just the scan type, as
>>>> it would be better than reading another SEI.
>>>
>>> Well we still need a way to indicate if the current data is top or
>>> bottom field for interlaced. I don't think that knowing that the whole
>>> video is interlaced would be precise enough.
>>>
>>> Cheers,
>>>
>>> Paul
>>>
>>>>>>>>> +
>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> + __u8 num_active_dpb_entries;
>>>>>>>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>> +
>>>>>>>>> + __u8 num_rps_poc_st_curr_before;
>>>>>>>>> + __u8 num_rps_poc_st_curr_after;
>>>>>>>>> + __u8 num_rps_poc_lt_curr;
>>>>>>>>> +
>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>>>>>>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>> #endif
>>>>> --
>>>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>>>> Embedded Linux and kernel engineering
>>>>> https://bootlin.com
>>>>>
>>> --
>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>> Embedded Linux and kernel engineering
>>> https://bootlin.com
>>>
> --
> Paul Kocialkowski, Bootlin
> Embedded Linux and kernel engineering
> https://bootlin.com
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-24 10:27 ` Paul Kocialkowski
@ 2019-01-24 12:23 ` Ayaka
2019-01-25 13:04 ` Paul Kocialkowski
0 siblings, 1 reply; 22+ messages in thread
From: Ayaka @ 2019-01-24 12:23 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: Randy Li, Jernej Škrabec, linux-media, linux-kernel, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Alexandre Courbot,
Thomas Petazzoni, linux-rockchip
Sent from my iPad
> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>
> Hi,
>
>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
>> I forgot an important thing: the rkvdec and RK HEVC decoders request
>> the CABAC table, scaling list, picture parameter set and reference
>> pictures to be stored in one or several DMA buffers. I am not talking
>> about the parsed data; the decoder requests the raw data.
>>
>> For the PPS and RPS, it is possible to reuse the slice header and just
>> let the decoder know the offset into the bitstream buffer, so I would
>> suggest adding three properties (together with the SPS) for them. But I
>> think we need a method to mark an OUTPUT-side buffer for that auxiliary
>> data.
>
> I'm quite confused about the hardware implementation then. From what
> you're saying, it seems that it takes the raw bitstream elements rather
> than parsed elements. Is it really a stateless implementation?
>
> The stateless implementation was designed with the idea that only the
> raw slice data should be passed in bitstream form to the decoder. For
> H.264, it seems that some decoders also need the slice header in raw
> bitstream form (because they take the full slice NAL unit), see the
> discussions in this thread:
> media: docs-rst: Document m2m stateless video decoder interface
Stateless just means it won't track previous results, but I don't think you can define exactly what data the hardware would need. Even if you just build a DPB for the decoder, it is still stateless; parsing less or more data from the bitstream doesn't stop a decoder from being a stateless decoder.
>
> Can you detail exactly what the rockchip decoder absolutely needs in
> raw bitstream format?
>
> Cheers,
>
> Paul
>
>>> On 1/8/19 6:00 PM, Ayaka wrote:
>>> Sent from my iPad
>>>
>>>> On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>>> On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>>> On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
>>>>>>>> On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
>>>>>>>>>> +
>>>>>>>>>> +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
>>>>>>>>>> +
>>>>>>>>>> +struct v4l2_hevc_dpb_entry {
>>>>>>>>>> + __u32 buffer_tag;
>>>>>>>>>> + __u8 rps;
>>>>>>>>>> + __u8 field_pic;
>>>>>>>>>> + __u16 pic_order_cnt[2];
>>>>>>>>>> +};
>>>>>>> Please add a property for the reference index; if that rps field is not
>>>>>>> used for it, some devices would require one (not the Rockchip one). And
>>>>>>> Rockchip's VDPU1 and VDPU2 would require a similar property for AVC.
>>>>>> What exactly is that reference index? Is it a bitstream element or
>>>>>> something deduced from the bitstream?
>>>>>>
>>>>> The picture order count (POC) for HEVC and frame_num for AVC. I think
>>>>> it is the number used in list0 (P and B slices) and list1 (B slices).
>>>> The picture order count is already the last field of the DPB entry
>>>> structure. There is one for each field picture.
>>> As we are not sure whether field-coded slices or CTUs exist, I would hold off on this part and everything else related to fields.
>>>>>>> Please also add another buffer_tag for referring to the motion-vector
>>>>>>> memory of each frame. A better method would be to add metadata to each
>>>>>>> picture buffer: since the picture output is just the same as the
>>>>>>> original, the display won't care whether the motion vectors are written
>>>>>>> at the bottom of the picture or somewhere else.
>>>>>> The motion vectors are passed as part of the raw bitstream data, in the
>>>>>> slices. Is there a case where the motion vectors are coded differently?
>>>>> No, it is an additional cache for the decoder; even FFmpeg keeps such
>>>>> data, so I think Allwinner must output it somewhere as well.
>>>> Ah yes I see what you mean! This is handled internally by our driver
>>>> and not exposed to userspace. I don't think it would be a good idea to
>>>> expose this cache or request that userspace allocates it like a video
>>>> buffer.
>>>>
>>> No, usually the driver should allocate it, as userspace has no idea of the size required by each device.
>>> But for advanced users, an application could fix a broken picture with proper data or analyze object motion from it.
>>> So I would suggest attaching this information to a picture buffer as metadata.
>>>>>>>>>> +
>>>>>>>>>> +struct v4l2_hevc_pred_weight_table {
>>>>>>>>>> + __u8 luma_log2_weight_denom;
>>>>>>>>>> + __s8 delta_chroma_log2_weight_denom;
>>>>>>>>>> +
>>>>>>>>>> + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> +
>>>>>>>>>> + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>> I think those properties are not all necessary; they apply to the
>>>>>>> Rockchip device but may not work for the others.
>>>>>> Yes, it's possible that some of the elements are not necessary for some
>>>>>> decoders. What we want is to cover all the elements that might be
>>>>>> required for a decoder.
>>>>> I wonder whether Allwinner needs that; those SAO flags are usually
>>>>> ignored by decoders by design. But more is better than less: it is
>>>>> hard to extend a V4L2 structure in the future, and a new HEVC profile
>>>>> may bring new properties. It is still too early for HEVC.
>>>> Yes this is used by our decoder. The idea is to have all the basic
>>>> bitstream elements in the structures (even if some decoders don't use
>>>> them all) and add others for extension as separate controls later.
>>>>
>>>>>>>>>> +struct v4l2_ctrl_hevc_slice_params {
>>>>>>>>>> + __u32 bit_size;
>>>>>>>>>> + __u32 data_bit_offset;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
>>>>>>>>>> + __u8 nal_unit_type;
>>>>>>>>>> + __u8 nuh_temporal_id_plus1;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>>>> + __u8 slice_type;
>>>>>>>>>> + __u8 colour_plane_id;
>>>>>>> ----------------------------------------------------------------------------
>>>>>>>>>> + __u16 slice_pic_order_cnt;
>>>>>>>>>> + __u8 slice_sao_luma_flag;
>>>>>>>>>> + __u8 slice_sao_chroma_flag;
>>>>>>>>>> + __u8 slice_temporal_mvp_enabled_flag;
>>>>>>>>>> + __u8 num_ref_idx_l0_active_minus1;
>>>>>>>>>> + __u8 num_ref_idx_l1_active_minus1;
>>>>>>> Rockchip's decoder doesn't use this part.
>>>>>>>>>> + __u8 mvd_l1_zero_flag;
>>>>>>>>>> + __u8 cabac_init_flag;
>>>>>>>>>> + __u8 collocated_from_l0_flag;
>>>>>>>>>> + __u8 collocated_ref_idx;
>>>>>>>>>> + __u8 five_minus_max_num_merge_cand;
>>>>>>>>>> + __u8 use_integer_mv_flag;
>>>>>>>>>> + __s8 slice_qp_delta;
>>>>>>>>>> + __s8 slice_cb_qp_offset;
>>>>>>>>>> + __s8 slice_cr_qp_offset;
>>>>>>>>>> + __s8 slice_act_y_qp_offset;
>>>>>>>>>> + __s8 slice_act_cb_qp_offset;
>>>>>>>>>> + __s8 slice_act_cr_qp_offset;
>>>>>>>>>> + __u8 slice_deblocking_filter_disabled_flag;
>>>>>>>>>> + __s8 slice_beta_offset_div2;
>>>>>>>>>> + __s8 slice_tc_offset_div2;
>>>>>>>>>> + __u8 slice_loop_filter_across_slices_enabled_flag;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
>>>>>>>>>> + __u8 pic_struct;
>>>>>>> I think the decoder doesn't care about this, it is used for display.
>>>>>> The purpose of this field is to indicate whether the current picture is
>>>>>> a progressive frame or an interlaced field picture, which is useful for
>>>>>> decoding.
>>>>>>
>>>>>> At least our decoder has a register field to indicate frame/top
>>>>>> field/bottom field, so we certainly need to keep the info around.
>>>>>> Looking at the spec and the ffmpeg implementation, it looks like this
>>>>>> flag of the bitstream is the usual way to report field coding.
>>>>> It depends on whether the decoder cares about the scan type or more; I
>>>>> would prefer general_interlaced_source_flag for just the scan type, as
>>>>> it would be better than reading another SEI.
>>>> Well we still need a way to indicate if the current data is top or
>>>> bottom field for interlaced. I don't think that knowing that the whole
>>>> video is interlaced would be precise enough.
>>>>
>>>> Cheers,
>>>>
>>>> Paul
>>>>
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
>>>>>>>>>> + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __u8 num_active_dpb_entries;
>>>>>>>>>> + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
>>>>>>>>>> +
>>>>>>>>>> + __u8 num_rps_poc_st_curr_before;
>>>>>>>>>> + __u8 num_rps_poc_st_curr_after;
>>>>>>>>>> + __u8 num_rps_poc_lt_curr;
>>>>>>>>>> +
>>>>>>>>>> + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
>>>>>>>>>> + struct v4l2_hevc_pred_weight_table pred_weight_table;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> #endif
>>>>>> --
>>>>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>>>>> Embedded Linux and kernel engineering
>>>>>> https://bootlin.com
>>>>>>
>>>> --
>>>> Paul Kocialkowski, Bootlin (formerly Free Electrons)
>>>> Embedded Linux and kernel engineering
>>>> https://bootlin.com
>>>>
> --
> Paul Kocialkowski, Bootlin
> Embedded Linux and kernel engineering
> https://bootlin.com
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-24 12:23 ` Ayaka
@ 2019-01-25 13:04 ` Paul Kocialkowski
[not found] ` <ab8ca098ad60f209fe97f79bb93b2d1e898da524.camel-LDxbnhwyfcJBDgjK7y7TUQ@public.gmane.org>
0 siblings, 1 reply; 22+ messages in thread
From: Paul Kocialkowski @ 2019-01-25 13:04 UTC (permalink / raw)
To: Ayaka
Cc: Randy Li, Jernej Škrabec, linux-media, linux-kernel, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Alexandre Courbot,
Thomas Petazzoni, linux-rockchip
Hi,
On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
>
> Sent from my iPad
>
> > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> >
> > Hi,
> >
> > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > I forgot an important thing: the rkvdec and RK HEVC decoders request
> > > the CABAC table, scaling list, picture parameter set and reference
> > > pictures to be stored in one or several DMA buffers. I am not talking
> > > about the parsed data; the decoder requests the raw data.
> > >
> > > For the PPS and RPS, it is possible to reuse the slice header and just
> > > let the decoder know the offset into the bitstream buffer, so I would
> > > suggest adding three properties (together with the SPS) for them. But I
> > > think we need a method to mark an OUTPUT-side buffer for that auxiliary
> > > data.
> >
> > I'm quite confused about the hardware implementation then. From what
> > you're saying, it seems that it takes the raw bitstream elements rather
> > than parsed elements. Is it really a stateless implementation?
> >
> > The stateless implementation was designed with the idea that only the
> > raw slice data should be passed in bitstream form to the decoder. For
> > H.264, it seems that some decoders also need the slice header in raw
> > bitstream form (because they take the full slice NAL unit), see the
> > discussions in this thread:
> > media: docs-rst: Document m2m stateless video decoder interface
>
> Stateless just means it won't track previous results, but I don't
> think you can define exactly what data the hardware would need. Even if
> you just build a DPB for the decoder, it is still stateless; parsing
> less or more data from the bitstream doesn't stop a decoder from being
> a stateless decoder.
Yes, fair enough: the format in which the hardware decoder takes the
bitstream parameters does not make it stateless or stateful per se.
It's just that stateless decoders should have no particular reason for
parsing the bitstream on their own, since the hardware can be designed
with registers for each relevant bitstream element to configure the
decoding pipeline. That's how GPU-based decoders are implemented
(VAAPI/VDPAU/NVDEC, etc.).
So the format we have agreed on so far for the stateless interface is
to pass parsed elements via v4l2 control structures.
If the hardware can only work by parsing the bitstream itself, I'm not
sure what the best solution would be. Reconstructing the bitstream in
the kernel is a pretty bad option, but so is parsing in the kernel or
having the data both in parsed and raw forms. Do you see another
possibility?
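
To illustrate one middle ground: with the bit_size/data_bit_offset fields of the proposed slice_params control, a driver that wants the raw slice header as well as the slice data can still locate both inside the single OUTPUT buffer. This sketch assumes data_bit_offset counts bits from the start of the slice in the buffer, which the quoted excerpt does not spell out:

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset at which the raw slice data payload starts: a decoder
 * that wants the raw header reads bytes [0, data_bit_offset / 8), and
 * one that wants only the payload starts at the returned offset. */
static size_t slice_data_byte_offset(uint32_t data_bit_offset)
{
	return data_bit_offset / 8;
}

/* Total slice size in whole bytes, rounding the bit count up. */
static size_t slice_size_bytes(uint32_t bit_size)
{
	return (bit_size + 7) / 8;
}
```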
Cheers,
Paul
> > Can you detail exactly what the rockchip decoder absolutely needs in
> > raw bitstream format?
> >
> > Cheers,
> >
> > Paul
> >
> > > > On 1/8/19 6:00 PM, Ayaka wrote:
> > > > Sent from my iPad
> > > >
> > > > > On Jan 8, 2019, at 4:38 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > On Tue, 2019-01-08 at 09:16 +0800, Ayaka wrote:
> > > > > >
> > > > > > Sent from my iPad
> > > > > >
> > > > > > > On Jan 7, 2019, at 5:57 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > > > On Mon, 2019-01-07 at 11:49 +0800, Randy Li wrote:
> > > > > > > > > On 12/12/18 8:51 PM, Paul Kocialkowski wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On Wed, 2018-12-05 at 21:59 +0100, Jernej Škrabec wrote:
> > > > > > > > >
> > > > > > > > > > > +
> > > > > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_BEFORE 0x01
> > > > > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_ST_CURR_AFTER 0x02
> > > > > > > > > > > +#define V4L2_HEVC_DPB_ENTRY_RPS_LT_CURR 0x03
> > > > > > > > > > > +
> > > > > > > > > > > +#define V4L2_HEVC_DPB_ENTRIES_NUM_MAX 16
> > > > > > > > > > > +
> > > > > > > > > > > +struct v4l2_hevc_dpb_entry {
> > > > > > > > > > > + __u32 buffer_tag;
> > > > > > > > > > > + __u8 rps;
> > > > > > > > > > > + __u8 field_pic;
> > > > > > > > > > > + __u16 pic_order_cnt[2];
> > > > > > > > > > > +};
> > > > > > > > Please add a property for the reference index; if that rps field is not
> > > > > > > > used for it, some devices would require one (not the Rockchip one). And
> > > > > > > > Rockchip's VDPU1 and VDPU2 would require a similar property for AVC.
> > > > > > > What exactly is that reference index? Is it a bitstream element or
> > > > > > > something deduced from the bitstream?
> > > > > > >
> > > > > > The picture order count (POC) for HEVC and frame_num for AVC. I think
> > > > > > it is the number used in list0 (P and B slices) and list1 (B slices).
> > > > > The picture order count is already the last field of the DPB entry
> > > > > structure. There is one for each field picture.
> > > > As we are not sure whether field-coded slices or CTUs exist, I would hold off on this part and everything else related to fields.
> > > > > > > > Please also add another buffer_tag for referring to the motion-vector
> > > > > > > > memory of each frame. A better method would be to add metadata to each
> > > > > > > > picture buffer: since the picture output is just the same as the
> > > > > > > > original, the display won't care whether the motion vectors are written
> > > > > > > > at the bottom of the picture or somewhere else.
> > > > > > > The motion vectors are passed as part of the raw bitstream data, in the
> > > > > > > slices. Is there a case where the motion vectors are coded differently?
> > > > > > No, it is an additional cache for the decoder; even FFmpeg keeps such
> > > > > > data, so I think Allwinner must output it somewhere as well.
> > > > > Ah yes I see what you mean! This is handled internally by our driver
> > > > > and not exposed to userspace. I don't think it would be a good idea to
> > > > > expose this cache or request that userspace allocates it like a video
> > > > > buffer.
> > > > >
> > > > No, usually the driver should allocate it, as userspace has no idea of the size required by each device.
> > > > But for advanced users, an application could fix a broken picture with proper data or analyze object motion from it.
> > > > So I would suggest attaching this information to a picture buffer as metadata.
> > > > > > > > > > > +
> > > > > > > > > > > +struct v4l2_hevc_pred_weight_table {
> > > > > > > > > > > + __u8 luma_log2_weight_denom;
> > > > > > > > > > > + __s8 delta_chroma_log2_weight_denom;
> > > > > > > > > > > +
> > > > > > > > > > > + __s8 delta_luma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > + __s8 luma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > + __s8 delta_chroma_weight_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > > > + __s8 chroma_offset_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > > > +
> > > > > > > > > > > + __s8 delta_luma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > + __s8 luma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > + __s8 delta_chroma_weight_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > > > + __s8 chroma_offset_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX][2];
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > I think those properties are not all necessary; they apply to the
> > > > > > > > Rockchip device but may not work for the others.
> > > > > > > Yes, it's possible that some of the elements are not necessary for some
> > > > > > > decoders. What we want is to cover all the elements that might be
> > > > > > > required for a decoder.
> > > > > > I wonder whether Allwinner needs that; those SAO flags are usually
> > > > > > ignored by decoders by design. But more is better than less: it is
> > > > > > hard to extend a V4L2 structure in the future, and a new HEVC profile
> > > > > > may bring new properties. It is still too early for HEVC.
> > > > > Yes this is used by our decoder. The idea is to have all the basic
> > > > > bitstream elements in the structures (even if some decoders don't use
> > > > > them all) and add others for extension as separate controls later.
> > > > >
> > > > > > > > > > > +struct v4l2_ctrl_hevc_slice_params {
> > > > > > > > > > > + __u32 bit_size;
> > > > > > > > > > > + __u32 data_bit_offset;
> > > > > > > > > > > +
> > > > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: NAL unit header */
> > > > > > > > > > > + __u8 nal_unit_type;
> > > > > > > > > > > + __u8 nuh_temporal_id_plus1;
> > > > > > > > > > > +
> > > > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > > > > > > + __u8 slice_type;
> > > > > > > > > > > + __u8 colour_plane_id;
> > > > > > > > ----------------------------------------------------------------------------
> > > > > > > > > > > + __u16 slice_pic_order_cnt;
> > > > > > > > > > > + __u8 slice_sao_luma_flag;
> > > > > > > > > > > + __u8 slice_sao_chroma_flag;
> > > > > > > > > > > + __u8 slice_temporal_mvp_enabled_flag;
> > > > > > > > > > > + __u8 num_ref_idx_l0_active_minus1;
> > > > > > > > > > > + __u8 num_ref_idx_l1_active_minus1;
> > > > > > > > Rockchip's decoder doesn't use this part.
> > > > > > > > > > > + __u8 mvd_l1_zero_flag;
> > > > > > > > > > > + __u8 cabac_init_flag;
> > > > > > > > > > > + __u8 collocated_from_l0_flag;
> > > > > > > > > > > + __u8 collocated_ref_idx;
> > > > > > > > > > > + __u8 five_minus_max_num_merge_cand;
> > > > > > > > > > > + __u8 use_integer_mv_flag;
> > > > > > > > > > > + __s8 slice_qp_delta;
> > > > > > > > > > > + __s8 slice_cb_qp_offset;
> > > > > > > > > > > + __s8 slice_cr_qp_offset;
> > > > > > > > > > > + __s8 slice_act_y_qp_offset;
> > > > > > > > > > > + __s8 slice_act_cb_qp_offset;
> > > > > > > > > > > + __s8 slice_act_cr_qp_offset;
> > > > > > > > > > > + __u8 slice_deblocking_filter_disabled_flag;
> > > > > > > > > > > + __s8 slice_beta_offset_div2;
> > > > > > > > > > > + __s8 slice_tc_offset_div2;
> > > > > > > > > > > + __u8 slice_loop_filter_across_slices_enabled_flag;
> > > > > > > > > > > +
> > > > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Picture timing SEI message */
> > > > > > > > > > > + __u8 pic_struct;
> > > > > > > > I think the decoder doesn't care about this, it is used for display.
> > > > > > > The purpose of this field is to indicate whether the current picture is
> > > > > > > a progressive frame or an interlaced field picture, which is useful for
> > > > > > > decoding.
> > > > > > >
> > > > > > > At least our decoder has a register field to indicate frame/top
> > > > > > > field/bottom field, so we certainly need to keep the info around.
> > > > > > > Looking at the spec and the ffmpeg implementation, it looks like this
> > > > > > > flag of the bitstream is the usual way to report field coding.
> > > > > > It depends on whether the decoder cares about the scan type or more; I
> > > > > > would prefer general_interlaced_source_flag for just the scan type, as
> > > > > > it would be better than reading another SEI.
> > > > > Well we still need a way to indicate whether the current data is a top
> > > > > or bottom field for interlaced content. I don't think that knowing that
> > > > > the whole video is interlaced would be precise enough.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Paul
> > > > >
> > > > > > > > > > > +
> > > > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
> > > > > > > > > > > + struct v4l2_hevc_dpb_entry dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > + __u8 num_active_dpb_entries;
> > > > > > > > > > > + __u8 ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > + __u8 ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX];
> > > > > > > > > > > +
> > > > > > > > > > > + __u8 num_rps_poc_st_curr_before;
> > > > > > > > > > > + __u8 num_rps_poc_st_curr_after;
> > > > > > > > > > > + __u8 num_rps_poc_lt_curr;
> > > > > > > > > > > +
> > > > > > > > > > > + /* ISO/IEC 23008-2, ITU-T Rec. H.265: Weighted prediction parameter */
> > > > > > > > > > > + struct v4l2_hevc_pred_weight_table pred_weight_table;
> > > > > > > > > > > +};
> > > > > > > > > > > +
> > > > > > > > > > > #endif
> > > > > > > --
> > > > > > > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > > > > > > Embedded Linux and kernel engineering
> > > > > > > https://bootlin.com
> > > > > > >
> > > > > --
> > > > > Paul Kocialkowski, Bootlin (formerly Free Electrons)
> > > > > Embedded Linux and kernel engineering
> > > > > https://bootlin.com
> > > > >
> > --
> > Paul Kocialkowski, Bootlin
> > Embedded Linux and kernel engineering
> > https://bootlin.com
> >
--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
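To make the field-coding point above concrete, here is a hedged sketch of how a driver could derive a frame/top-field/bottom-field selector from pic_struct. The first three pic_struct values come from the H.265 picture timing SEI (Table D.2); the register encoding is invented for illustration and matches no particular decoder.

```c
#include <stdint.h>

/* First three values of pic_struct from the H.265 picture timing SEI
 * (ITU-T H.265, Table D.2). */
enum hevc_pic_struct {
	HEVC_PIC_STRUCT_FRAME        = 0,
	HEVC_PIC_STRUCT_TOP_FIELD    = 1,
	HEVC_PIC_STRUCT_BOTTOM_FIELD = 2,
};

/* Hypothetical register encoding: 0 = progressive frame, 1 = top field,
 * 2 = bottom field. Real decoders use their own encodings; values 3+
 * (field pairs, frame doubling, ...) are simplified to "frame" here. */
static uint32_t field_mode_from_pic_struct(uint8_t pic_struct)
{
	switch (pic_struct) {
	case HEVC_PIC_STRUCT_TOP_FIELD:
		return 1;
	case HEVC_PIC_STRUCT_BOTTOM_FIELD:
		return 2;
	default:
		return 0;
	}
}
```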
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
[not found] ` <ab8ca098ad60f209fe97f79bb93b2d1e898da524.camel-LDxbnhwyfcJBDgjK7y7TUQ@public.gmane.org>
@ 2019-01-29 7:44 ` Alexandre Courbot
2019-01-29 8:09 ` Maxime Ripard
2019-01-29 21:41 ` Nicolas Dufresne
0 siblings, 2 replies; 22+ messages in thread
From: Alexandre Courbot @ 2019-01-29 7:44 UTC (permalink / raw)
To: Paul Kocialkowski
Cc: devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b, Maxime Ripard, Ayaka,
Randy Li, LKML, Jernej Škrabec, Tomasz Figa, Hans Verkuil,
linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Thomas Petazzoni,
Mauro Carvalho Chehab, Ezequiel Garcia,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
Linux Media Mailing List
On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
<paul.kocialkowski@bootlin.com> wrote:
>
> Hi,
>
> On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> >
> > Sent from my iPad
> >
> > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > >
> > > Hi,
> > >
> > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > I forgot an important thing: the rkvdec and rk hevc decoders
> > > > request the cabac table, scaling list, picture parameter set and
> > > > reference pictures stored in one or more DMA buffers. I am not
> > > > talking about the parsed data; the decoder requests the raw data.
> > > >
> > > > For the pps and rps, it is possible to reuse the slice header; just
> > > > let the decoder know the offset into the bitstream buffer. I would
> > > > suggest adding three properties (along with the sps) for them. But I
> > > > think we need a method to mark an OUTPUT-side buffer for those aux data.
> > >
> > > I'm quite confused about the hardware implementation then. From what
> > > you're saying, it seems that it takes the raw bitstream elements rather
> > > than parsed elements. Is it really a stateless implementation?
> > >
> > > The stateless implementation was designed with the idea that only the
> > > raw slice data should be passed in bitstream form to the decoder. For
> > > H.264, it seems that some decoders also need the slice header in raw
> > > bitstream form (because they take the full slice NAL unit), see the
> > > discussions in this thread:
> > > media: docs-rst: Document m2m stateless video decoder interface
> >
> > Stateless just means it won’t track the previous result, but I don’t
> > think you can define what data the hardware would need. Even if you
> > just build a dpb for the decoder, it is still stateless; parsing less
> > or more data from the bitstream doesn’t stop a decoder from being a
> > stateless decoder.
>
> Yes fair enough, the format in which the hardware decoder takes the
> bitstream parameters does not make it stateless or stateful per-se.
> It's just that stateless decoders should have no particular reason for
> parsing the bitstream on their own since the hardware can be designed
> with registers for each relevant bitstream element to configure the
> decoding pipeline. That's how GPU-based decoder implementations are
> implemented (VAAPI/VDPAU/NVDEC, etc).
>
> So the format we have agreed on so far for the stateless interface is
> to pass parsed elements via v4l2 control structures.
>
> If the hardware can only work by parsing the bitstream itself, I'm not
> sure what the best solution would be. Reconstructing the bitstream in
> the kernel is a pretty bad option, but so is parsing in the kernel or
> having the data both in parsed and raw forms. Do you see another
> possibility?
Is reconstructing the bitstream so bad? The v4l2 controls provide a
generic interface to an encoded format which the driver needs to
convert into a sequence that the hardware can understand. Typically
this is done by populating hardware-specific structures. Can't we
consider that in this specific instance, the hardware-specific
structure just happens to be identical to the original bitstream
format?
I agree that this is not strictly optimal for that particular
hardware, but such is the cost of abstractions, and in this specific
case I don't believe the cost would be particularly high?
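As a rough illustration of what "reconstructing" a parsed header involves, a minimal bit writer plus Exp-Golomb coder is sketched below. This is not code from any driver; it only shows that re-serializing a handful of parsed fields is cheap bit packing (ue(v) as in ITU-T H.265, clause 9.2).

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal MSB-first bit writer; buf must be zero-initialized by the
 * caller. A driver could use something like this to re-serialize
 * parsed header fields back into raw bitstream form. */
struct bit_writer {
	uint8_t *buf;
	size_t bit_pos;
};

static void put_bits(struct bit_writer *bw, uint32_t val, unsigned int n)
{
	while (n--) {
		if (val & (1u << n))
			bw->buf[bw->bit_pos / 8] |=
				(uint8_t)(0x80 >> (bw->bit_pos % 8));
		bw->bit_pos++;
	}
}

/* Unsigned Exp-Golomb, ue(v): the coding used for most HEVC header
 * syntax elements. Emits len zero bits, then val+1 in len+1 bits. */
static void put_ue(struct bit_writer *bw, uint32_t val)
{
	unsigned int len = 0;

	while (((val + 1) >> len) > 1)
		len++;
	put_bits(bw, 0, len);           /* leading zero bits */
	put_bits(bw, val + 1, len + 1); /* "1" marker plus info bits */
}
```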
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-29 7:44 ` Alexandre Courbot
@ 2019-01-29 8:09 ` Maxime Ripard
2019-01-29 9:39 ` Tomasz Figa
2019-01-29 21:41 ` Nicolas Dufresne
1 sibling, 1 reply; 22+ messages in thread
From: Maxime Ripard @ 2019-01-29 8:09 UTC (permalink / raw)
To: Alexandre Courbot
Cc: Paul Kocialkowski, Ayaka, Randy Li, Jernej Škrabec,
Linux Media Mailing List, LKML, devel, linux-arm-kernel,
Mauro Carvalho Chehab, Hans Verkuil, Ezequiel Garcia, Tomasz Figa,
Thomas Petazzoni, linux-rockchip
On Tue, Jan 29, 2019 at 04:44:35PM +0900, Alexandre Courbot wrote:
> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > >
> > > Sent from my iPad
> > >
> > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > I forgot an important thing: the rkvdec and rk hevc decoders
> > > > > request the cabac table, scaling list, picture parameter set and
> > > > > reference pictures stored in one or more DMA buffers. I am not
> > > > > talking about the parsed data; the decoder requests the raw data.
> > > > >
> > > > > For the pps and rps, it is possible to reuse the slice header; just
> > > > > let the decoder know the offset into the bitstream buffer. I would
> > > > > suggest adding three properties (along with the sps) for them. But I
> > > > > think we need a method to mark an OUTPUT-side buffer for those aux data.
> > > >
> > > > I'm quite confused about the hardware implementation then. From what
> > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > than parsed elements. Is it really a stateless implementation?
> > > >
> > > > The stateless implementation was designed with the idea that only the
> > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > H.264, it seems that some decoders also need the slice header in raw
> > > > bitstream form (because they take the full slice NAL unit), see the
> > > > discussions in this thread:
> > > > media: docs-rst: Document m2m stateless video decoder interface
> > >
> > > Stateless just means it won’t track the previous result, but I don’t
> > > think you can define what data the hardware would need. Even if you
> > > just build a dpb for the decoder, it is still stateless; parsing less
> > > or more data from the bitstream doesn’t stop a decoder from being a
> > > stateless decoder.
> >
> > Yes fair enough, the format in which the hardware decoder takes the
> > bitstream parameters does not make it stateless or stateful per-se.
> > It's just that stateless decoders should have no particular reason for
> > parsing the bitstream on their own since the hardware can be designed
> > with registers for each relevant bitstream element to configure the
> > decoding pipeline. That's how GPU-based decoder implementations are
> > implemented (VAAPI/VDPAU/NVDEC, etc).
> >
> > So the format we have agreed on so far for the stateless interface is
> > to pass parsed elements via v4l2 control structures.
> >
> > If the hardware can only work by parsing the bitstream itself, I'm not
> > sure what the best solution would be. Reconstructing the bitstream in
> > the kernel is a pretty bad option, but so is parsing in the kernel or
> > having the data both in parsed and raw forms. Do you see another
> > possibility?
>
> Is reconstructing the bitstream so bad? The v4l2 controls provide a
> generic interface to an encoded format which the driver needs to
> convert into a sequence that the hardware can understand. Typically
> this is done by populating hardware-specific structures. Can't we
> consider that in this specific instance, the hardware-specific
> structure just happens to be identical to the original bitstream
> format?
>
> I agree that this is not strictly optimal for that particular
> hardware, but such is the cost of abstractions, and in this specific
> case I don't believe the cost would be particularly high?
I mean, that argument can be made for the rockchip driver as well. If
reconstructing the bitstream is something we can do, and if we don't
care about being suboptimal for one particular piece of hardware, then
why doesn't the rockchip driver just recreate the bitstream from that
API?
After all, this is just a hardware-specific header that happens to be
identical to the original bitstream format.
Maxime
--
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-29 8:09 ` Maxime Ripard
@ 2019-01-29 9:39 ` Tomasz Figa
0 siblings, 0 replies; 22+ messages in thread
From: Tomasz Figa @ 2019-01-29 9:39 UTC (permalink / raw)
To: Maxime Ripard
Cc: Alexandre Courbot, Paul Kocialkowski, Ayaka, Randy Li,
Jernej Škrabec, Linux Media Mailing List, LKML, devel,
list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,
Mauro Carvalho Chehab, Hans Verkuil, Ezequiel Garcia,
Thomas Petazzoni, open list:ARM/Rockchip SoC...
On Tue, Jan 29, 2019 at 5:09 PM Maxime Ripard <maxime.ripard@bootlin.com> wrote:
>
> On Tue, Jan 29, 2019 at 04:44:35PM +0900, Alexandre Courbot wrote:
> > On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> > > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > > >
> > > > Sent from my iPad
> > > >
> > > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > > I forgot an important thing: the rkvdec and rk hevc decoders
> > > > > > request the cabac table, scaling list, picture parameter set and
> > > > > > reference pictures stored in one or more DMA buffers. I am not
> > > > > > talking about the parsed data; the decoder requests the raw data.
> > > > > >
> > > > > > For the pps and rps, it is possible to reuse the slice header; just
> > > > > > let the decoder know the offset into the bitstream buffer. I would
> > > > > > suggest adding three properties (along with the sps) for them. But I
> > > > > > think we need a method to mark an OUTPUT-side buffer for those aux data.
> > > > >
> > > > > I'm quite confused about the hardware implementation then. From what
> > > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > > than parsed elements. Is it really a stateless implementation?
> > > > >
> > > > > The stateless implementation was designed with the idea that only the
> > > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > > H.264, it seems that some decoders also need the slice header in raw
> > > > > bitstream form (because they take the full slice NAL unit), see the
> > > > > discussions in this thread:
> > > > > media: docs-rst: Document m2m stateless video decoder interface
> > > >
> > > > Stateless just means it won’t track the previous result, but I don’t
> > > > think you can define what data the hardware would need. Even if you
> > > > just build a dpb for the decoder, it is still stateless; parsing less
> > > > or more data from the bitstream doesn’t stop a decoder from being a
> > > > stateless decoder.
> > >
> > > Yes fair enough, the format in which the hardware decoder takes the
> > > bitstream parameters does not make it stateless or stateful per-se.
> > > It's just that stateless decoders should have no particular reason for
> > > parsing the bitstream on their own since the hardware can be designed
> > > with registers for each relevant bitstream element to configure the
> > > decoding pipeline. That's how GPU-based decoder implementations are
> > > implemented (VAAPI/VDPAU/NVDEC, etc).
> > >
> > > So the format we have agreed on so far for the stateless interface is
> > > to pass parsed elements via v4l2 control structures.
> > >
> > > If the hardware can only work by parsing the bitstream itself, I'm not
> > > sure what the best solution would be. Reconstructing the bitstream in
> > > the kernel is a pretty bad option, but so is parsing in the kernel or
> > > having the data both in parsed and raw forms. Do you see another
> > > possibility?
> >
> > Is reconstructing the bitstream so bad? The v4l2 controls provide a
> > generic interface to an encoded format which the driver needs to
> > convert into a sequence that the hardware can understand. Typically
> > this is done by populating hardware-specific structures. Can't we
> > consider that in this specific instance, the hardware-specific
> > structure just happens to be identical to the original bitstream
> > format?
> >
> > I agree that this is not strictly optimal for that particular
> > hardware, but such is the cost of abstractions, and in this specific
> > case I don't believe the cost would be particularly high?
>
> I mean, that argument can be made for the rockchip driver as well. If
> reconstructing the bitstream is something we can do, and if we don't
> care about being suboptimal for one particular hardware, then why the
> rockchip driver doesn't just recreate the bitstream from that API?
>
> After all, this is just a hardware specific header that happens to be
> identical to the original bitstream format
I think in another thread (about H.264 I believe), we realized that it
could be a good idea to just include the Slice NAL units in the
Annex.B format in the buffers and that should work for all the
hardware we could think of (given offsets to particular parts inside
of the buffer). Wouldn't something similar work here for HEVC?
I don't really get the meaning of "raw" for "cabac table, scaling
list, picture parameter set and reference picture", since those are
parts of the bitstream, which needs to be parsed to obtain them.
Best regards,
Tomasz
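For illustration, the "offsets to particular parts inside of the buffer" mentioned above could be produced by a trivial Annex.B start-code scan in userspace. This sketch is not from any existing driver:

```c
#include <stdint.h>
#include <stddef.h>

/* Return the byte offset of the first NAL unit payload at or after
 * `pos`, i.e. the byte following the next 00 00 01 start code, or
 * (size_t)-1 if none is found. A 4-byte start code (00 00 00 01)
 * contains 00 00 01 and is therefore found as well. */
static size_t next_nal_offset(const uint8_t *buf, size_t len, size_t pos)
{
	size_t i;

	for (i = pos; i + 2 < len; i++) {
		if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
			return i + 3;
	}
	return (size_t)-1;
}
```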
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-29 7:44 ` Alexandre Courbot
2019-01-29 8:09 ` Maxime Ripard
@ 2019-01-29 21:41 ` Nicolas Dufresne
2019-01-30 2:28 ` Alexandre Courbot
2019-01-30 7:03 ` Ayaka
1 sibling, 2 replies; 22+ messages in thread
From: Nicolas Dufresne @ 2019-01-29 21:41 UTC (permalink / raw)
To: Alexandre Courbot, Paul Kocialkowski
Cc: Ayaka, Randy Li, Jernej Škrabec, Linux Media Mailing List,
LKML, devel, linux-arm-kernel, Mauro Carvalho Chehab,
Maxime Ripard, Hans Verkuil, Ezequiel Garcia, Tomasz Figa,
Thomas Petazzoni, linux-rockchip
Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> <paul.kocialkowski@bootlin.com> wrote:
> > Hi,
> >
> > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > > Sent from my iPad
> > >
> > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > I forgot an important thing: the rkvdec and rk hevc decoders
> > > > > request the cabac table, scaling list, picture parameter set and
> > > > > reference pictures stored in one or more DMA buffers. I am not
> > > > > talking about the parsed data; the decoder requests the raw data.
> > > > >
> > > > > For the pps and rps, it is possible to reuse the slice header; just
> > > > > let the decoder know the offset into the bitstream buffer. I would
> > > > > suggest adding three properties (along with the sps) for them. But I
> > > > > think we need a method to mark an OUTPUT-side buffer for those aux data.
> > > >
> > > > I'm quite confused about the hardware implementation then. From what
> > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > than parsed elements. Is it really a stateless implementation?
> > > >
> > > > The stateless implementation was designed with the idea that only the
> > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > H.264, it seems that some decoders also need the slice header in raw
> > > > bitstream form (because they take the full slice NAL unit), see the
> > > > discussions in this thread:
> > > > media: docs-rst: Document m2m stateless video decoder interface
> > >
> > > Stateless just means it won’t track the previous result, but I don’t
> > > think you can define what data the hardware would need. Even if you
> > > just build a dpb for the decoder, it is still stateless; parsing less
> > > or more data from the bitstream doesn’t stop a decoder from being a
> > > stateless decoder.
> >
> > Yes fair enough, the format in which the hardware decoder takes the
> > bitstream parameters does not make it stateless or stateful per-se.
> > It's just that stateless decoders should have no particular reason for
> > parsing the bitstream on their own since the hardware can be designed
> > with registers for each relevant bitstream element to configure the
> > decoding pipeline. That's how GPU-based decoder implementations are
> > implemented (VAAPI/VDPAU/NVDEC, etc).
> >
> > So the format we have agreed on so far for the stateless interface is
> > to pass parsed elements via v4l2 control structures.
> >
> > If the hardware can only work by parsing the bitstream itself, I'm not
> > sure what the best solution would be. Reconstructing the bitstream in
> > the kernel is a pretty bad option, but so is parsing in the kernel or
> > having the data both in parsed and raw forms. Do you see another
> > possibility?
>
> Is reconstructing the bitstream so bad? The v4l2 controls provide a
> generic interface to an encoded format which the driver needs to
> convert into a sequence that the hardware can understand. Typically
> this is done by populating hardware-specific structures. Can't we
> consider that in this specific instance, the hardware-specific
> structure just happens to be identical to the original bitstream
> format?
At the maximum allowed bitrate for, let's say, HEVC (940MB/s iirc), yes,
it would be really bad. In the GStreamer project we have discussed for a
while (but have never done anything about it) adding the ability,
through a bitmask, to select which parts of the stream need to be
parsed, as parsing itself was causing some overhead. Maybe a similar
thing applies here, though as per our new design it's the fourcc that
dictates the driver behaviour; we'd need yet another fourcc for drivers
that want the full bitstream (which seems odd if you have already parsed
everything; I think this needs some clarification).
>
> I agree that this is not strictly optimal for that particular
> hardware, but such is the cost of abstractions, and in this specific
> case I don't believe the cost would be particularly high?
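A possible shape for the bitmask idea floated above, sketched with invented names; none of these flags exist in the V4L2 API, they only illustrate the concept of a decoder advertising which bitstream units it parses itself and therefore wants in raw form.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical parsing-capability flags: each bit marks a bitstream
 * unit the hardware parses on its own. A fully stateless decoder that
 * relies on parsed controls would report none of them. */
#define DEC_NEEDS_RAW_SPS          (1u << 0)
#define DEC_NEEDS_RAW_PPS          (1u << 1)
#define DEC_NEEDS_RAW_SLICE_HEADER (1u << 2)
#define DEC_NEEDS_RAW_SCALING_LIST (1u << 3)

/* Userspace would test each flag to decide whether to submit the unit
 * in raw form or only as a parsed control structure. */
static bool needs_raw_unit(uint32_t caps, uint32_t unit_flag)
{
	return (caps & unit_flag) != 0;
}
```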
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-29 21:41 ` Nicolas Dufresne
@ 2019-01-30 2:28 ` Alexandre Courbot
2019-01-30 3:35 ` Tomasz Figa
2019-01-30 7:03 ` Ayaka
1 sibling, 1 reply; 22+ messages in thread
From: Alexandre Courbot @ 2019-01-30 2:28 UTC (permalink / raw)
To: Nicolas Dufresne
Cc: Paul Kocialkowski, Ayaka, Randy Li, Jernej Škrabec,
Linux Media Mailing List, LKML, devel, linux-arm-kernel,
Mauro Carvalho Chehab, Maxime Ripard, Hans Verkuil,
Ezequiel Garcia, Tomasz Figa, Thomas Petazzoni, linux-rockchip
On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
> Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
> > On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> > <paul.kocialkowski@bootlin.com> wrote:
> > > Hi,
> > >
> > > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > > > Sent from my iPad
> > > >
> > > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > > I forgot an important thing: the rkvdec and rk hevc decoders
> > > > > > request the cabac table, scaling list, picture parameter set and
> > > > > > reference pictures stored in one or more DMA buffers. I am not
> > > > > > talking about the parsed data; the decoder requests the raw data.
> > > > > >
> > > > > > For the pps and rps, it is possible to reuse the slice header; just
> > > > > > let the decoder know the offset into the bitstream buffer. I would
> > > > > > suggest adding three properties (along with the sps) for them. But I
> > > > > > think we need a method to mark an OUTPUT-side buffer for those aux data.
> > > > >
> > > > > I'm quite confused about the hardware implementation then. From what
> > > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > > than parsed elements. Is it really a stateless implementation?
> > > > >
> > > > > The stateless implementation was designed with the idea that only the
> > > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > > H.264, it seems that some decoders also need the slice header in raw
> > > > > bitstream form (because they take the full slice NAL unit), see the
> > > > > discussions in this thread:
> > > > > media: docs-rst: Document m2m stateless video decoder interface
> > > >
> > > > Stateless just means it won’t track the previous result, but I don’t
> > > > think you can define what data the hardware would need. Even if you
> > > > just build a dpb for the decoder, it is still stateless; parsing less
> > > > or more data from the bitstream doesn’t stop a decoder from being a
> > > > stateless decoder.
> > >
> > > Yes fair enough, the format in which the hardware decoder takes the
> > > bitstream parameters does not make it stateless or stateful per-se.
> > > It's just that stateless decoders should have no particular reason for
> > > parsing the bitstream on their own since the hardware can be designed
> > > with registers for each relevant bitstream element to configure the
> > > decoding pipeline. That's how GPU-based decoder implementations are
> > > implemented (VAAPI/VDPAU/NVDEC, etc).
> > >
> > > So the format we have agreed on so far for the stateless interface is
> > > to pass parsed elements via v4l2 control structures.
> > >
> > > If the hardware can only work by parsing the bitstream itself, I'm not
> > > sure what the best solution would be. Reconstructing the bitstream in
> > > the kernel is a pretty bad option, but so is parsing in the kernel or
> > > having the data both in parsed and raw forms. Do you see another
> > > possibility?
> >
> > Is reconstructing the bitstream so bad? The v4l2 controls provide a
> > generic interface to an encoded format which the driver needs to
> > convert into a sequence that the hardware can understand. Typically
> > this is done by populating hardware-specific structures. Can't we
> > consider that in this specific instance, the hardware-specific
> > structure just happens to be identical to the original bitstream
> > format?
>
> At the maximum allowed bitrate for, let's say, HEVC (940MB/s iirc), yes,
> it would be really bad. In the GStreamer project we have discussed for a
> while (but have never done anything about it) adding the ability,
> through a bitmask, to select which parts of the stream need to be
> parsed, as parsing itself was causing some overhead. Maybe a similar
> thing applies here, though as per our new design it's the fourcc that
> dictates the driver behaviour; we'd need yet another fourcc for drivers
> that want the full bitstream (which seems odd if you have already parsed
> everything; I think this needs some clarification).
Note that I am not proposing to rebuild the *entire* bitstream
in-kernel. What I am saying is that if the hardware interprets some
structures (like SPS/PPS) in their raw format, this raw format could
be reconstructed from the structures passed by userspace at negligible
cost. Such manipulation would only happen on a small amount of data.
Exposing finer-grained driver requirements through a bitmask may
deserve more exploration. Maybe we could end up with a spectrum of
capabilities that would allow us to cover the range from fully
stateless to fully stateful IPs more smoothly. Right now we have two
specifications that only consider the extremes of that range.
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-30 2:28 ` Alexandre Courbot
@ 2019-01-30 3:35 ` Tomasz Figa
2019-01-30 6:27 ` Ayaka
2019-01-30 7:57 ` Maxime Ripard
0 siblings, 2 replies; 22+ messages in thread
From: Tomasz Figa @ 2019-01-30 3:35 UTC (permalink / raw)
To: Alexandre Courbot
Cc: devel, Hans Verkuil, Maxime Ripard, Ayaka, Randy Li, LKML,
Jernej Škrabec, Nicolas Dufresne, Paul Kocialkowski,
open list:ARM/Rockchip SoC..., Thomas Petazzoni,
Mauro Carvalho Chehab, Ezequiel Garcia,
list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,
Linux Media Mailing List
On Wed, Jan 30, 2019 at 11:29 AM Alexandre Courbot
<acourbot@chromium.org> wrote:
>
> On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> >
> > Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
> > > On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> > > <paul.kocialkowski@bootlin.com> wrote:
> > > > Hi,
> > > >
> > > > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > > > > Sent from my iPad
> > > > >
> > > > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > > > I forgot an important thing: the rkvdec and rk hevc decoders
> > > > > > > request the cabac table, scaling list, picture parameter set and
> > > > > > > reference pictures stored in one or more DMA buffers. I am not
> > > > > > > talking about the parsed data; the decoder requests the raw data.
> > > > > > >
> > > > > > > For the pps and rps, it is possible to reuse the slice header; just
> > > > > > > let the decoder know the offset into the bitstream buffer. I would
> > > > > > > suggest adding three properties (along with the sps) for them. But I
> > > > > > > think we need a method to mark an OUTPUT-side buffer for those aux data.
> > > > > >
> > > > > > I'm quite confused about the hardware implementation then. From what
> > > > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > > > than parsed elements. Is it really a stateless implementation?
> > > > > >
> > > > > > The stateless implementation was designed with the idea that only the
> > > > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > > > H.264, it seems that some decoders also need the slice header in raw
> > > > > > bitstream form (because they take the full slice NAL unit), see the
> > > > > > discussions in this thread:
> > > > > > media: docs-rst: Document m2m stateless video decoder interface
> > > > >
> > > > > Stateless just means it won’t track the previous result, but I don’t
> > > > > think you can define what data the hardware would need. Even if you
> > > > > just build a dpb for the decoder, it is still stateless; parsing less
> > > > > or more data from the bitstream doesn’t stop a decoder from being a
> > > > > stateless decoder.
> > > >
> > > > Yes fair enough, the format in which the hardware decoder takes the
> > > > bitstream parameters does not make it stateless or stateful per-se.
> > > > It's just that stateless decoders should have no particular reason for
> > > > parsing the bitstream on their own since the hardware can be designed
> > > > with registers for each relevant bitstream element to configure the
> > > > decoding pipeline. That's how GPU-based decoder implementations are
> > > > implemented (VAAPI/VDPAU/NVDEC, etc).
> > > >
> > > > So the format we have agreed on so far for the stateless interface is
> > > > to pass parsed elements via v4l2 control structures.
> > > >
> > > > If the hardware can only work by parsing the bitstream itself, I'm not
> > > > sure what the best solution would be. Reconstructing the bitstream in
> > > > the kernel is a pretty bad option, but so is parsing in the kernel or
> > > > having the data both in parsed and raw forms. Do you see another
> > > > possibility?
> > >
> > > Is reconstructing the bitstream so bad? The v4l2 controls provide a
> > > generic interface to an encoded format which the driver needs to
> > > convert into a sequence that the hardware can understand. Typically
> > > this is done by populating hardware-specific structures. Can't we
> > > consider that in this specific instance, the hardware-specific
> > > structure just happens to be identical to the original bitstream
> > > format?
> >
> > At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
> > would be really really bad. In GStreamer project we have discussed for
> > a while (but have never done anything about) adding the ability through
> > a bitmask to select which part of the stream need to be parsed, as
> > parsing itself was causing some overhead. Maybe similar thing applies,
> > though as per our new design, it's the fourcc that dictate the driver
> > behaviour, we'd need yet another fourcc for drivers that wants the full
> > bitstream (which seems odd if you have already parsed everything, I
> > think this need some clarification).
>
> Note that I am not proposing to rebuild the *entire* bitstream
> in-kernel. What I am saying is that if the hardware interprets some
> structures (like SPS/PPS) in their raw format, this raw format could
> be reconstructed from the structures passed by userspace at negligible
> cost. Such manipulation would only happen on a small amount of data.
>
> Exposing finer-grained driver requirements through a bitmask may
> deserve more exploring. Maybe we could end with a spectrum of
> capabilities that would allow us to cover the range from fully
> stateless to fully stateful IPs more smoothly. Right now we have two
> specifications that only consider the extremes of that range.
I gave it a bit more thought and if we combine what Nicolas suggested
about the bitmask control with the userspace providing the full
bitstream in the OUTPUT buffers, split into some logical units and
"tagged" with their type (e.g. SPS, PPS, slice, etc.), we could
potentially get an interface that would work for any kind of decoder I
can think of, actually eliminating the boundary between stateful and
stateless decoders.
For example, a fully stateful decoder would have the bitmask control
set to 0 and accept data from all the OUTPUT buffers as they come. A
decoder that doesn't do any parsing on its own would have all the
valid bits in the bitmask set and ignore the data in OUTPUT buffers
tagged as any kind of metadata. And then, we could have any cases in
between, including stateful decoders which just can't parse the stream
on their own, but still manage anything else themselves, or stateless
ones which can parse parts of the stream, like the rk3399 vdec can
parse the H.264 slice headers on its own.
That could potentially let us completely eliminate the distinction
between the stateful and stateless interfaces and just have one that
covers both.
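To make the idea concrete, here is a minimal sketch in C of how such a bitmask plus tagged units could look. All names here (the HYPO_* constants, struct hypo_unit, unit_is_ignored) are purely hypothetical illustrations, not existing or proposed V4L2 definitions:

```c
#include <stdint.h>

/* Hypothetical unit types for tagging payloads in OUTPUT buffers.
 * Names are illustrative only. */
enum hypo_unit_type {
	HYPO_UNIT_SPS   = 0,
	HYPO_UNIT_PPS   = 1,
	HYPO_UNIT_SLICE = 2,
};

/* Bitmask advertised by the driver: a set bit means "this unit type is
 * consumed in parsed (control) form, so its raw payload is ignored". */
#define HYPO_PARSED(type) (1u << (type))

/* One tagged logical unit inside an OUTPUT buffer. */
struct hypo_unit {
	uint32_t type;   /* enum hypo_unit_type */
	uint32_t offset; /* byte offset of the unit in the buffer */
	uint32_t size;   /* length of the unit in bytes */
};

/* A fully stateful decoder would report driver_mask == 0 and consume
 * every unit raw; a decoder that does no parsing at all would set all
 * valid bits and skip the raw metadata units. */
static int unit_is_ignored(uint32_t driver_mask, const struct hypo_unit *u)
{
	return (driver_mask & HYPO_PARSED(u->type)) != 0;
}
```

With this shape, the rk3399 case above would set only the bits for the header types it cannot parse itself, and userspace would still queue the full bitstream either way.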
Thoughts?
Best regards,
Tomasz
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-30 3:35 ` Tomasz Figa
@ 2019-01-30 6:27 ` Ayaka
[not found] ` <5F14507E-1BE2-4A74-A59C-1DB3C3E07DBA-xPW3/0Ywev/iB9QmIjCX8w@public.gmane.org>
2019-01-30 7:57 ` Maxime Ripard
1 sibling, 1 reply; 22+ messages in thread
From: Ayaka @ 2019-01-30 6:27 UTC (permalink / raw)
To: Tomasz Figa
Cc: Alexandre Courbot, Nicolas Dufresne, Paul Kocialkowski, Randy Li,
Jernej Škrabec, Linux Media Mailing List, LKML, devel,
list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,,
Mauro Carvalho Chehab, Maxime Ripard, Hans Verkuil,
Ezequiel Garcia, Thomas Petazzoni, open list:ARM/Rockchip SoC...
Sent from my iPad
> On Jan 30, 2019, at 11:35 AM, Tomasz Figa <tfiga@chromium.org> wrote:
>
> On Wed, Jan 30, 2019 at 11:29 AM Alexandre Courbot
> <acourbot@chromium.org> wrote:
>>
>>> On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>>
>>>> Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
>>>> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
>>>> <paul.kocialkowski@bootlin.com> wrote:
>>>>> Hi,
>>>>>
>>>>>> On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
>>>>>> Sent from my iPad
>>>>>>
>>>>>>> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
>>>>>>>> I forget a important thing, for the rkvdec and rk hevc decoder, it would
>>>>>>>> requests cabac table, scaling list, picture parameter set and reference
>>>>>>>> picture storing in one or various of DMA buffers. I am not talking about
>>>>>>>> the data been parsed, the decoder would requests a raw data.
>>>>>>>>
>>>>>>>> For the pps and rps, it is possible to reuse the slice header, just let
>>>>>>>> the decoder know the offset from the bitstream bufer, I would suggest to
>>>>>>>> add three properties(with sps) for them. But I think we need a method to
>>>>>>>> mark a OUTPUT side buffer for those aux data.
>>>>>>>
>>>>>>> I'm quite confused about the hardware implementation then. From what
>>>>>>> you're saying, it seems that it takes the raw bitstream elements rather
>>>>>>> than parsed elements. Is it really a stateless implementation?
>>>>>>>
>>>>>>> The stateless implementation was designed with the idea that only the
>>>>>>> raw slice data should be passed in bitstream form to the decoder. For
>>>>>>> H.264, it seems that some decoders also need the slice header in raw
>>>>>>> bitstream form (because they take the full slice NAL unit), see the
>>>>>>> discussions in this thread:
>>>>>>> media: docs-rst: Document m2m stateless video decoder interface
>>>>>>
>>>>>> Stateless just mean it won’t track the previous result, but I don’t
>>>>>> think you can define what a date the hardware would need. Even you
>>>>>> just build a dpb for the decoder, it is still stateless, but parsing
>>>>>> less or more data from the bitstream doesn’t stop a decoder become a
>>>>>> stateless decoder.
>>>>>
>>>>> Yes fair enough, the format in which the hardware decoder takes the
>>>>> bitstream parameters does not make it stateless or stateful per-se.
>>>>> It's just that stateless decoders should have no particular reason for
>>>>> parsing the bitstream on their own since the hardware can be designed
>>>>> with registers for each relevant bitstream element to configure the
>>>>> decoding pipeline. That's how GPU-based decoder implementations are
>>>>> implemented (VAAPI/VDPAU/NVDEC, etc).
>>>>>
>>>>> So the format we have agreed on so far for the stateless interface is
>>>>> to pass parsed elements via v4l2 control structures.
>>>>>
>>>>> If the hardware can only work by parsing the bitstream itself, I'm not
>>>>> sure what the best solution would be. Reconstructing the bitstream in
>>>>> the kernel is a pretty bad option, but so is parsing in the kernel or
>>>>> having the data both in parsed and raw forms. Do you see another
>>>>> possibility?
>>>>
>>>> Is reconstructing the bitstream so bad? The v4l2 controls provide a
>>>> generic interface to an encoded format which the driver needs to
>>>> convert into a sequence that the hardware can understand. Typically
>>>> this is done by populating hardware-specific structures. Can't we
>>>> consider that in this specific instance, the hardware-specific
>>>> structure just happens to be identical to the original bitstream
>>>> format?
>>>
>>> At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
>>> would be really really bad. In GStreamer project we have discussed for
>>> a while (but have never done anything about) adding the ability through
>>> a bitmask to select which part of the stream need to be parsed, as
>>> parsing itself was causing some overhead. Maybe similar thing applies,
>>> though as per our new design, it's the fourcc that dictate the driver
>>> behaviour, we'd need yet another fourcc for drivers that wants the full
>>> bitstream (which seems odd if you have already parsed everything, I
>>> think this need some clarification).
>>
>> Note that I am not proposing to rebuild the *entire* bitstream
>> in-kernel. What I am saying is that if the hardware interprets some
>> structures (like SPS/PPS) in their raw format, this raw format could
>> be reconstructed from the structures passed by userspace at negligible
>> cost. Such manipulation would only happen on a small amount of data.
>>
>> Exposing finer-grained driver requirements through a bitmask may
>> deserve more exploring. Maybe we could end with a spectrum of
>> capabilities that would allow us to cover the range from fully
>> stateless to fully stateful IPs more smoothly. Right now we have two
>> specifications that only consider the extremes of that range.
>
> I gave it a bit more thought and if we combine what Nicolas suggested
> about the bitmask control with the userspace providing the full
> bitstream in the OUTPUT buffers, split into some logical units and
> "tagged" with their type (e.g. SPS, PPS, slice, etc.), we could
> potentially get an interface that would work for any kind of decoder I
> can think of, actually eliminating the boundary between stateful and
> stateless decoders.
I agree with this idea; it is what I have been calling a memory region description, while I am still struggling with the userspace side before I can post my driver demo.
>
> For example, a fully stateful decoder would have the bitmask control
> set to 0 and accept data from all the OUTPUT buffers as they come. A
> decoder that doesn't do any parsing on its own would have all the
> valid bits in the bitmask set and ignore the data in OUTPUT buffers
> tagged as any kind of metadata. And then, we could have any cases in
> between, including stateful decoders which just can't parse the stream
> on their own, but still manage anything else themselves, or stateless
> ones which can parse parts of the stream, like the rk3399 vdec can
> parse the H.264 slice headers on its own.
>
Actually not: the rkvdec and rk hevc decoders can parse most, but not all, syntax sections.
Besides, the VP9 decoder in rkvdec won't parse most of the syntax.
I talked to some Rockchip staff about the performance cost of reconstructing the bitstream after arguing with tfiga on IRC yesterday. Although 1 ms looks small for a decoder that can decode a UHD 4K HEVC picture in 9 ms, which is enough for 60 fps, what about higher frame rates like 120 fps or 240 fps, or 8K, which is used in Japanese broadcasting?
I will bring more details at FOSDEM 2019; I may be in the graphics devroom on Saturday.
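To put that 1 ms in perspective, a quick frame-budget calculation (plain arithmetic, no hardware assumptions):

```c
/* Frame-time budget arithmetic: what fraction of one frame period a
 * fixed per-frame overhead would consume at a given frame rate. */
static double overhead_percent(double fps, double overhead_ms)
{
	double frame_budget_ms = 1000.0 / fps; /* ms available per frame */

	return 100.0 * overhead_ms / frame_budget_ms;
}
```

A 1 ms overhead is 6% of the 16.7 ms budget at 60 fps, but 24% of the 4.2 ms budget at 240 fps.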
> That could potentially let us completely eliminate the distinction
> between the stateful and stateless interfaces and just have one that
> covers both.
>
> Thoughts?
>
> Best regards,
> Tomasz
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-29 21:41 ` Nicolas Dufresne
2019-01-30 2:28 ` Alexandre Courbot
@ 2019-01-30 7:03 ` Ayaka
1 sibling, 0 replies; 22+ messages in thread
From: Ayaka @ 2019-01-30 7:03 UTC (permalink / raw)
To: Nicolas Dufresne
Cc: Alexandre Courbot, Paul Kocialkowski, Randy Li,
Jernej Škrabec, Linux Media Mailing List, LKML, devel,
linux-arm-kernel, Mauro Carvalho Chehab, Maxime Ripard,
Hans Verkuil, Ezequiel Garcia, Tomasz Figa, Thomas Petazzoni,
linux-rockchip
Sent from my iPad
> On Jan 30, 2019, at 5:41 AM, Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>
>> Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
>> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
>> <paul.kocialkowski@bootlin.com> wrote:
>>> Hi,
>>>
>>>> On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
>>>> Sent from my iPad
>>>>
>>>>> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
>>>>>> I forget a important thing, for the rkvdec and rk hevc decoder, it would
>>>>>> requests cabac table, scaling list, picture parameter set and reference
>>>>>> picture storing in one or various of DMA buffers. I am not talking about
>>>>>> the data been parsed, the decoder would requests a raw data.
>>>>>>
>>>>>> For the pps and rps, it is possible to reuse the slice header, just let
>>>>>> the decoder know the offset from the bitstream bufer, I would suggest to
>>>>>> add three properties(with sps) for them. But I think we need a method to
>>>>>> mark a OUTPUT side buffer for those aux data.
>>>>>
>>>>> I'm quite confused about the hardware implementation then. From what
>>>>> you're saying, it seems that it takes the raw bitstream elements rather
>>>>> than parsed elements. Is it really a stateless implementation?
>>>>>
>>>>> The stateless implementation was designed with the idea that only the
>>>>> raw slice data should be passed in bitstream form to the decoder. For
>>>>> H.264, it seems that some decoders also need the slice header in raw
>>>>> bitstream form (because they take the full slice NAL unit), see the
>>>>> discussions in this thread:
>>>>> media: docs-rst: Document m2m stateless video decoder interface
>>>>
>>>> Stateless just mean it won’t track the previous result, but I don’t
>>>> think you can define what a date the hardware would need. Even you
>>>> just build a dpb for the decoder, it is still stateless, but parsing
>>>> less or more data from the bitstream doesn’t stop a decoder become a
>>>> stateless decoder.
>>>
>>> Yes fair enough, the format in which the hardware decoder takes the
>>> bitstream parameters does not make it stateless or stateful per-se.
>>> It's just that stateless decoders should have no particular reason for
>>> parsing the bitstream on their own since the hardware can be designed
>>> with registers for each relevant bitstream element to configure the
>>> decoding pipeline. That's how GPU-based decoder implementations are
>>> implemented (VAAPI/VDPAU/NVDEC, etc).
>>>
>>> So the format we have agreed on so far for the stateless interface is
>>> to pass parsed elements via v4l2 control structures.
>>>
>>> If the hardware can only work by parsing the bitstream itself, I'm not
>>> sure what the best solution would be. Reconstructing the bitstream in
>>> the kernel is a pretty bad option, but so is parsing in the kernel or
>>> having the data both in parsed and raw forms. Do you see another
>>> possibility?
>>
>> Is reconstructing the bitstream so bad? The v4l2 controls provide a
>> generic interface to an encoded format which the driver needs to
>> convert into a sequence that the hardware can understand. Typically
>> this is done by populating hardware-specific structures. Can't we
>> consider that in this specific instance, the hardware-specific
>> structure just happens to be identical to the original bitstream
>> format?
>
> At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
Luckily, most hardware can't process such a big buffer anyway.
Generally speaking, the stream-length register is 24 bits, counting bytes.
> would be really really bad. In GStreamer project we have discussed for
> a while (but have never done anything about) adding the ability through
> a bitmask to select which part of the stream need to be parsed, as
> parsing itself was causing some overhead. Maybe similar thing applies,
> though as per our new design, it's the fourcc that dictate the driver
> behaviour, we'd need yet another fourcc for drivers that wants the full
> bitstream (which seems odd if you have already parsed everything, I
> think this need some clarification).
>
>>
>> I agree that this is not strictly optimal for that particular
>> hardware, but such is the cost of abstractions, and in this specific
>> case I don't believe the cost would be particularly high?
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
[not found] ` <5F14507E-1BE2-4A74-A59C-1DB3C3E07DBA-xPW3/0Ywev/iB9QmIjCX8w@public.gmane.org>
@ 2019-01-30 7:17 ` Tomasz Figa
2019-01-30 9:54 ` Ayaka
0 siblings, 1 reply; 22+ messages in thread
From: Tomasz Figa @ 2019-01-30 7:17 UTC (permalink / raw)
To: Ayaka
Cc: devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b, Hans Verkuil,
Alexandre Courbot, Maxime Ripard, Randy Li, LKML,
Jernej Škrabec, Nicolas Dufresne, Paul Kocialkowski,
open list:ARM/Rockchip SoC..., Thomas Petazzoni,
Mauro Carvalho Chehab, Ezequiel Garcia,
list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, ,
Linux Media Mailing List
On Wed, Jan 30, 2019 at 3:28 PM Ayaka <ayaka@soulik.info> wrote:
>
>
>
> Sent from my iPad
>
> > On Jan 30, 2019, at 11:35 AM, Tomasz Figa <tfiga@chromium.org> wrote:
> >
> > On Wed, Jan 30, 2019 at 11:29 AM Alexandre Courbot
> > <acourbot@chromium.org> wrote:
> >>
> >>> On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> >>>
> >>>> Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
> >>>> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> >>>> <paul.kocialkowski@bootlin.com> wrote:
> >>>>> Hi,
> >>>>>
> >>>>>> On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> >>>>>> Sent from my iPad
> >>>>>>
> >>>>>>> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> >>>>>>>> I forget a important thing, for the rkvdec and rk hevc decoder, it would
> >>>>>>>> requests cabac table, scaling list, picture parameter set and reference
> >>>>>>>> picture storing in one or various of DMA buffers. I am not talking about
> >>>>>>>> the data been parsed, the decoder would requests a raw data.
> >>>>>>>>
> >>>>>>>> For the pps and rps, it is possible to reuse the slice header, just let
> >>>>>>>> the decoder know the offset from the bitstream bufer, I would suggest to
> >>>>>>>> add three properties(with sps) for them. But I think we need a method to
> >>>>>>>> mark a OUTPUT side buffer for those aux data.
> >>>>>>>
> >>>>>>> I'm quite confused about the hardware implementation then. From what
> >>>>>>> you're saying, it seems that it takes the raw bitstream elements rather
> >>>>>>> than parsed elements. Is it really a stateless implementation?
> >>>>>>>
> >>>>>>> The stateless implementation was designed with the idea that only the
> >>>>>>> raw slice data should be passed in bitstream form to the decoder. For
> >>>>>>> H.264, it seems that some decoders also need the slice header in raw
> >>>>>>> bitstream form (because they take the full slice NAL unit), see the
> >>>>>>> discussions in this thread:
> >>>>>>> media: docs-rst: Document m2m stateless video decoder interface
> >>>>>>
> >>>>>> Stateless just mean it won’t track the previous result, but I don’t
> >>>>>> think you can define what a date the hardware would need. Even you
> >>>>>> just build a dpb for the decoder, it is still stateless, but parsing
> >>>>>> less or more data from the bitstream doesn’t stop a decoder become a
> >>>>>> stateless decoder.
> >>>>>
> >>>>> Yes fair enough, the format in which the hardware decoder takes the
> >>>>> bitstream parameters does not make it stateless or stateful per-se.
> >>>>> It's just that stateless decoders should have no particular reason for
> >>>>> parsing the bitstream on their own since the hardware can be designed
> >>>>> with registers for each relevant bitstream element to configure the
> >>>>> decoding pipeline. That's how GPU-based decoder implementations are
> >>>>> implemented (VAAPI/VDPAU/NVDEC, etc).
> >>>>>
> >>>>> So the format we have agreed on so far for the stateless interface is
> >>>>> to pass parsed elements via v4l2 control structures.
> >>>>>
> >>>>> If the hardware can only work by parsing the bitstream itself, I'm not
> >>>>> sure what the best solution would be. Reconstructing the bitstream in
> >>>>> the kernel is a pretty bad option, but so is parsing in the kernel or
> >>>>> having the data both in parsed and raw forms. Do you see another
> >>>>> possibility?
> >>>>
> >>>> Is reconstructing the bitstream so bad? The v4l2 controls provide a
> >>>> generic interface to an encoded format which the driver needs to
> >>>> convert into a sequence that the hardware can understand. Typically
> >>>> this is done by populating hardware-specific structures. Can't we
> >>>> consider that in this specific instance, the hardware-specific
> >>>> structure just happens to be identical to the original bitstream
> >>>> format?
> >>>
> >>> At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
> >>> would be really really bad. In GStreamer project we have discussed for
> >>> a while (but have never done anything about) adding the ability through
> >>> a bitmask to select which part of the stream need to be parsed, as
> >>> parsing itself was causing some overhead. Maybe similar thing applies,
> >>> though as per our new design, it's the fourcc that dictate the driver
> >>> behaviour, we'd need yet another fourcc for drivers that wants the full
> >>> bitstream (which seems odd if you have already parsed everything, I
> >>> think this need some clarification).
> >>
> >> Note that I am not proposing to rebuild the *entire* bitstream
> >> in-kernel. What I am saying is that if the hardware interprets some
> >> structures (like SPS/PPS) in their raw format, this raw format could
> >> be reconstructed from the structures passed by userspace at negligible
> >> cost. Such manipulation would only happen on a small amount of data.
> >>
> >> Exposing finer-grained driver requirements through a bitmask may
> >> deserve more exploring. Maybe we could end with a spectrum of
> >> capabilities that would allow us to cover the range from fully
> >> stateless to fully stateful IPs more smoothly. Right now we have two
> >> specifications that only consider the extremes of that range.
> >
> > I gave it a bit more thought and if we combine what Nicolas suggested
> > about the bitmask control with the userspace providing the full
> > bitstream in the OUTPUT buffers, split into some logical units and
> > "tagged" with their type (e.g. SPS, PPS, slice, etc.), we could
> > potentially get an interface that would work for any kind of decoder I
> > can think of, actually eliminating the boundary between stateful and
> > stateless decoders.
> I agree with this idea; it is what I have been calling a memory region description, while I am still struggling with the userspace side before I can post my driver demo.
> >
> > For example, a fully stateful decoder would have the bitmask control
> > set to 0 and accept data from all the OUTPUT buffers as they come. A
> > decoder that doesn't do any parsing on its own would have all the
> > valid bits in the bitmask set and ignore the data in OUTPUT buffers
> > tagged as any kind of metadata. And then, we could have any cases in
> > between, including stateful decoders which just can't parse the stream
> > on their own, but still manage anything else themselves, or stateless
> > ones which can parse parts of the stream, like the rk3399 vdec can
> > parse the H.264 slice headers on its own.
> >
> Actually not: the rkvdec and rk hevc decoders can parse most, but not all, syntax sections.
> Besides, the VP9 decoder in rkvdec won't parse most of the syntax.
>
> I talked to some Rockchip staff about the performance cost of reconstructing the bitstream after arguing with tfiga on IRC yesterday. Although 1 ms looks small for a decoder that can decode a UHD 4K HEVC picture in 9 ms, which is enough for 60 fps, what about higher frame rates like 120 fps or 240 fps, or 8K, which is used in Japanese broadcasting?
1 ms for a 500 MHz CPU (which is quite slow these days) is 500k
cycles. We don't have to reconstruct the whole bitstream, just the
parsed metadata and also we don't get a new PPS or SPS every frame.
Not sure where you got this 1 ms from. Most of the difference between
our structures and the bitstream is that the latter is packed and
can be variable length.
We actually have some sample bitstream assembly code for the rockchip encoder:
https://chromium.googlesource.com/chromiumos/third_party/libv4lplugins/+/5e6034258146af6be973fb6a5bb6b9d6e7489437/libv4l-rockchip_v2/libvepu/h264e/h264e.c#148
https://chromium.googlesource.com/chromiumos/third_party/libv4lplugins/+/5e6034258146af6be973fb6a5bb6b9d6e7489437/libv4l-rockchip_v2/libvepu/streams.c#36
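For illustration, here is a minimal MSB-first bit writer in the same spirit as the linked code. This is an independent sketch, not the Chromium implementation itself:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal MSB-first bitstream writer. */
struct bitstream {
	uint8_t *buf;
	size_t byte_pos;  /* next byte to touch */
	unsigned bit_pos; /* bits already used in the current byte (0..7) */
};

/* Append the low `nbits` bits of `val`, most significant bit first. */
static void put_bits(struct bitstream *s, uint32_t val, unsigned nbits)
{
	while (nbits--) {
		unsigned bit = (val >> nbits) & 1;

		if (s->bit_pos == 0)
			s->buf[s->byte_pos] = 0;
		s->buf[s->byte_pos] |= bit << (7 - s->bit_pos);
		if (++s->bit_pos == 8) {
			s->bit_pos = 0;
			s->byte_pos++;
		}
	}
}

/* Exp-Golomb ue(v), used for many H.264/HEVC header fields: writes
 * `len` leading zero bits followed by (val + 1) in len+1 binary bits. */
static void put_ue(struct bitstream *s, uint32_t val)
{
	unsigned len = 0;
	uint32_t v = val + 1;

	while ((v >> len) > 1)
		len++;
	put_bits(s, 0, len);     /* leading zeros */
	put_bits(s, v, len + 1); /* the value itself */
}
```

For example, ue(0) emits the single bit "1" and ue(1) emits "010", matching the Exp-Golomb coding used by the headers discussed here.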
Disassembling the stream_put_bits() gives 36 thumb2 instructions,
including 23 for the loop for each byte that is written.
stream_write_ue() is a bit more complicated, but in the worst case it
ends up with 4 calls to stream_put_bits(), each at most spanning 4
bytes for simplicity.
Let's count the operations for SPS then:
(1) stream_put_bits() spanning 1 byte: 33 times
(2) stream_put_bits() spanning up to 3 bytes: 4 times
(3) stream_write_ue() up to 31 bits: 19 times
Adding it together:
(1) 33 * 36 +
(2) 4 * (36 + 2 * 23) +
(3) 19 * (4 * (36 + 3 * 23)) =
1188 + 328 + 7980 = 9496 ~= 10k instructions
The code above doesn't seem to contain any expensive instructions,
like division, so for a modern pipelined core (e.g. a Cortex-A53,
which is dual-issue in-order), it could be safe to assume at least
1 instruction per cycle. At 500 MHz that gives you around 20 usecs.
SPS is the most complex header and for H.264 we just do PPS and some
slice headers. Let's round it up a bit and we could have around 100
usecs for the complete frame metadata.
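The arithmetic above can be checked mechanically; here is a small helper reproducing the same figures (the per-call instruction counts are taken from the text, not measured here):

```c
/* Back-of-envelope check of the estimate above, using the per-call
 * figures from the disassembly: 36 instructions per stream_put_bits()
 * call, plus 23 per extra byte touched in its loop. */
static int sps_instruction_estimate(void)
{
	int one_byte   = 33 * 36;                  /* (1) 1-byte put_bits calls */
	int multi_byte = 4 * (36 + 2 * 23);        /* (2) up-to-3-byte calls */
	int ue_worst   = 19 * (4 * (36 + 3 * 23)); /* (3) worst-case write_ue */

	return one_byte + multi_byte + ue_worst;   /* 9496 ~= 10k */
}
```

Dividing by 500 cycles per microsecond (1 instruction per cycle at 500 MHz) gives roughly 19 us, matching the ~20 us figure above.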
>
> I will bring more details at FOSDEM 2019; I may be in the graphics devroom on Saturday.
> > That could potentially let us completely eliminate the distinction
> > between the stateful and stateless interfaces and just have one that
> > covers both.
> >
> > Thoughts?
Any thoughts on my proposal to make the interface more flexible? Any
specific examples of issues that we could encounter that would prevent
it from working efficiently with Rockchip (or other) hardware?
Best regards,
Tomasz
_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-30 3:35 ` Tomasz Figa
2019-01-30 6:27 ` Ayaka
@ 2019-01-30 7:57 ` Maxime Ripard
1 sibling, 0 replies; 22+ messages in thread
From: Maxime Ripard @ 2019-01-30 7:57 UTC (permalink / raw)
To: Tomasz Figa
Cc: devel, Hans Verkuil, Alexandre Courbot, Ayaka, Randy Li, LKML,
Jernej Škrabec, Nicolas Dufresne, Paul Kocialkowski,
open list:ARM/Rockchip SoC..., Thomas Petazzoni,
Mauro Carvalho Chehab, Ezequiel Garcia,
list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>, ,
Linux Media Mailing List
On Wed, Jan 30, 2019 at 12:35:41PM +0900, Tomasz Figa wrote:
> On Wed, Jan 30, 2019 at 11:29 AM Alexandre Courbot
> <acourbot@chromium.org> wrote:
> >
> > On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > >
> > > Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
> > > > On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
> > > > <paul.kocialkowski@bootlin.com> wrote:
> > > > > Hi,
> > > > >
> > > > > On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
> > > > > > Sent from my iPad
> > > > > >
> > > > > > > On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > > On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
> > > > > > > > I forget a important thing, for the rkvdec and rk hevc decoder, it would
> > > > > > > > requests cabac table, scaling list, picture parameter set and reference
> > > > > > > > picture storing in one or various of DMA buffers. I am not talking about
> > > > > > > > the data been parsed, the decoder would requests a raw data.
> > > > > > > >
> > > > > > > > For the pps and rps, it is possible to reuse the slice header, just let
> > > > > > > > the decoder know the offset from the bitstream bufer, I would suggest to
> > > > > > > > add three properties(with sps) for them. But I think we need a method to
> > > > > > > > mark a OUTPUT side buffer for those aux data.
> > > > > > >
> > > > > > > I'm quite confused about the hardware implementation then. From what
> > > > > > > you're saying, it seems that it takes the raw bitstream elements rather
> > > > > > > than parsed elements. Is it really a stateless implementation?
> > > > > > >
> > > > > > > The stateless implementation was designed with the idea that only the
> > > > > > > raw slice data should be passed in bitstream form to the decoder. For
> > > > > > > H.264, it seems that some decoders also need the slice header in raw
> > > > > > > bitstream form (because they take the full slice NAL unit), see the
> > > > > > > discussions in this thread:
> > > > > > > media: docs-rst: Document m2m stateless video decoder interface
> > > > > >
> > > > > > Stateless just mean it won’t track the previous result, but I don’t
> > > > > > think you can define what a date the hardware would need. Even you
> > > > > > just build a dpb for the decoder, it is still stateless, but parsing
> > > > > > less or more data from the bitstream doesn’t stop a decoder become a
> > > > > > stateless decoder.
> > > > >
> > > > > Yes fair enough, the format in which the hardware decoder takes the
> > > > > bitstream parameters does not make it stateless or stateful per-se.
> > > > > It's just that stateless decoders should have no particular reason for
> > > > > parsing the bitstream on their own since the hardware can be designed
> > > > > with registers for each relevant bitstream element to configure the
> > > > > decoding pipeline. That's how GPU-based decoder implementations are
> > > > > implemented (VAAPI/VDPAU/NVDEC, etc).
> > > > >
> > > > > So the format we have agreed on so far for the stateless interface is
> > > > > to pass parsed elements via v4l2 control structures.
> > > > >
> > > > > If the hardware can only work by parsing the bitstream itself, I'm not
> > > > > sure what the best solution would be. Reconstructing the bitstream in
> > > > > the kernel is a pretty bad option, but so is parsing in the kernel or
> > > > > having the data both in parsed and raw forms. Do you see another
> > > > > possibility?
> > > >
> > > > Is reconstructing the bitstream so bad? The v4l2 controls provide a
> > > > generic interface to an encoded format which the driver needs to
> > > > convert into a sequence that the hardware can understand. Typically
> > > > this is done by populating hardware-specific structures. Can't we
> > > > consider that in this specific instance, the hardware-specific
> > > > structure just happens to be identical to the original bitstream
> > > > format?
> > >
> > > At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
> > > would be really really bad. In GStreamer project we have discussed for
> > > a while (but have never done anything about) adding the ability through
> > > a bitmask to select which part of the stream need to be parsed, as
> > > parsing itself was causing some overhead. Maybe similar thing applies,
> > > though as per our new design, it's the fourcc that dictate the driver
> > > behaviour, we'd need yet another fourcc for drivers that wants the full
> > > bitstream (which seems odd if you have already parsed everything, I
> > > think this need some clarification).
> >
> > Note that I am not proposing to rebuild the *entire* bitstream
> > in-kernel. What I am saying is that if the hardware interprets some
> > structures (like SPS/PPS) in their raw format, this raw format could
> > be reconstructed from the structures passed by userspace at negligible
> > cost. Such manipulation would only happen on a small amount of data.
> >
> > Exposing finer-grained driver requirements through a bitmask may
> > deserve more exploring. Maybe we could end with a spectrum of
> > capabilities that would allow us to cover the range from fully
> > stateless to fully stateful IPs more smoothly. Right now we have two
> > specifications that only consider the extremes of that range.
>
> I gave it a bit more thought and if we combine what Nicolas suggested
> about the bitmask control with the userspace providing the full
> bitstream in the OUTPUT buffers, split into some logical units and
> "tagged" with their type (e.g. SPS, PPS, slice, etc.), we could
> potentially get an interface that would work for any kind of decoder I
> can think of, actually eliminating the boundary between stateful and
> stateless decoders.
>
> For example, a fully stateful decoder would have the bitmask control
> set to 0 and accept data from all the OUTPUT buffers as they come. A
> decoder that doesn't do any parsing on its own would have all the
> valid bits in the bitmask set and ignore the data in OUTPUT buffers
> tagged as any kind of metadata. And then, we could have any cases in
> between, including stateful decoders which just can't parse the stream
> on their own, but still manage anything else themselves, or stateless
> ones which can parse parts of the stream, like the rk3399 vdec can
> parse the H.264 slice headers on its own.
>
> That could potentially let us completely eliminate the distinction
> between the stateful and stateless interfaces and just have one that
> covers both.
>
> Thoughts?
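As a rough sketch of what the tagged units and the parsing bitmask described above could look like, consider the following. All of the names here are hypothetical; nothing below exists in the V4L2 API, it only illustrates the idea of tagging logical units in OUTPUT buffers and letting a driver advertise which unit types it parses on its own:

```c
/*
 * Hypothetical sketch of the tagged-unit proposal. None of these names
 * exist in the V4L2 API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-driver capability bitmask: a set bit means "the driver parses
 * this unit type itself and ignores the raw form in OUTPUT buffers". */
#define PARSES_SPS       (1u << 0)
#define PARSES_PPS       (1u << 1)
#define PARSES_SLICE_HDR (1u << 2)

enum unit_type { UNIT_SPS, UNIT_PPS, UNIT_SLICE_HDR, UNIT_SLICE_DATA };

/* Tag prepended to each logical unit queued in an OUTPUT buffer. */
struct bitstream_unit {
	enum unit_type type;
	uint32_t offset;	/* byte offset of the unit in the buffer */
	uint32_t size;		/* byte length of the unit */
};

/*
 * A fully stateful decoder (parse_mask == 0) consumes every raw unit;
 * a driver that parses a given unit type on its own relies on the
 * parsed control instead and skips the raw form.
 */
static bool driver_consumes_raw(uint32_t parse_mask, enum unit_type t)
{
	switch (t) {
	case UNIT_SPS:       return !(parse_mask & PARSES_SPS);
	case UNIT_PPS:       return !(parse_mask & PARSES_PPS);
	case UNIT_SLICE_HDR: return !(parse_mask & PARSES_SLICE_HDR);
	default:             return true; /* slice data is always needed */
	}
}
```

A decoder like the rk3399 vdec, which parses H.264 slice headers itself, would then set only PARSES_SLICE_HDR and still take raw SPS/PPS units.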
If we have to provide the whole bitstream in the buffers, then it
entirely breaks the sole software stack we have running and working
currently, for the sake of a use case and a driver that have not seen a
single line of code.

Seriously, this is a *private* API that we made that way precisely so
that we can change it, and only then make it public. Why not do just
that?
Maxime
--
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
* Re: [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls
2019-01-30 7:17 ` Tomasz Figa
@ 2019-01-30 9:54 ` Ayaka
0 siblings, 0 replies; 22+ messages in thread
From: Ayaka @ 2019-01-30 9:54 UTC (permalink / raw)
To: Tomasz Figa
Cc: Alexandre Courbot, Nicolas Dufresne, Paul Kocialkowski, Randy Li,
Jernej Škrabec, Linux Media Mailing List, LKML, devel,
IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,
Mauro Carvalho Chehab, Maxime Ripard, Hans Verkuil,
Ezequiel Garcia, Thomas Petazzoni, open list:ARM/Rockchip SoC...
Sent from my iPad
> On Jan 30, 2019, at 3:17 PM, Tomasz Figa <tfiga@chromium.org> wrote:
>
>> On Wed, Jan 30, 2019 at 3:28 PM Ayaka <ayaka@soulik.info> wrote:
>>
>>
>>
>>
>>> On Jan 30, 2019, at 11:35 AM, Tomasz Figa <tfiga@chromium.org> wrote:
>>>
>>> On Wed, Jan 30, 2019 at 11:29 AM Alexandre Courbot
>>> <acourbot@chromium.org> wrote:
>>>>
>>>>>> On Wed, Jan 30, 2019 at 6:41 AM Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>>>>>
>>>>>> Le mardi 29 janvier 2019 à 16:44 +0900, Alexandre Courbot a écrit :
>>>>>> On Fri, Jan 25, 2019 at 10:04 PM Paul Kocialkowski
>>>>>> <paul.kocialkowski@bootlin.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> On Thu, 2019-01-24 at 20:23 +0800, Ayaka wrote:
>>>>>>>>
>>>>>>>>> On Jan 24, 2019, at 6:27 PM, Paul Kocialkowski <paul.kocialkowski@bootlin.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>> On Thu, 2019-01-10 at 21:32 +0800, ayaka wrote:
>>>>>>>>>> I forgot an important thing: for the rkvdec and rk hevc decoder, it
>>>>>>>>>> requests the cabac table, scaling list, picture parameter set and
>>>>>>>>>> reference pictures stored in one or several DMA buffers. I am not
>>>>>>>>>> talking about the parsed data; the decoder requests the raw data.
>>>>>>>>>>
>>>>>>>>>> For the PPS and RPS, it is possible to reuse the slice header: just
>>>>>>>>>> let the decoder know the offset from the bitstream buffer. I would
>>>>>>>>>> suggest adding three properties (with SPS) for them. But I think we
>>>>>>>>>> need a method to mark an OUTPUT-side buffer for that aux data.
>>>>>>>>>
>>>>>>>>> I'm quite confused about the hardware implementation then. From what
>>>>>>>>> you're saying, it seems that it takes the raw bitstream elements rather
>>>>>>>>> than parsed elements. Is it really a stateless implementation?
>>>>>>>>>
>>>>>>>>> The stateless implementation was designed with the idea that only the
>>>>>>>>> raw slice data should be passed in bitstream form to the decoder. For
>>>>>>>>> H.264, it seems that some decoders also need the slice header in raw
>>>>>>>>> bitstream form (because they take the full slice NAL unit), see the
>>>>>>>>> discussions in this thread:
>>>>>>>>> media: docs-rst: Document m2m stateless video decoder interface
>>>>>>>>
>>>>>>>> Stateless just means it won't track the previous result, but I don't
>>>>>>>> think you can define what data the hardware would need. Even if you
>>>>>>>> just build a DPB for the decoder, it is still stateless; parsing
>>>>>>>> less or more data from the bitstream doesn't stop a decoder from
>>>>>>>> being a stateless decoder.
>>>>>>>
>>>>>>> Yes fair enough, the format in which the hardware decoder takes the
>>>>>>> bitstream parameters does not make it stateless or stateful per-se.
>>>>>>> It's just that stateless decoders should have no particular reason for
>>>>>>> parsing the bitstream on their own since the hardware can be designed
>>>>>>> with registers for each relevant bitstream element to configure the
>>>>>>> decoding pipeline. That's how GPU-based decoder implementations are
>>>>>>> implemented (VAAPI/VDPAU/NVDEC, etc).
>>>>>>>
>>>>>>> So the format we have agreed on so far for the stateless interface is
>>>>>>> to pass parsed elements via v4l2 control structures.
>>>>>>>
>>>>>>> If the hardware can only work by parsing the bitstream itself, I'm not
>>>>>>> sure what the best solution would be. Reconstructing the bitstream in
>>>>>>> the kernel is a pretty bad option, but so is parsing in the kernel or
>>>>>>> having the data both in parsed and raw forms. Do you see another
>>>>>>> possibility?
>>>>>>
>>>>>> Is reconstructing the bitstream so bad? The v4l2 controls provide a
>>>>>> generic interface to an encoded format which the driver needs to
>>>>>> convert into a sequence that the hardware can understand. Typically
>>>>>> this is done by populating hardware-specific structures. Can't we
>>>>>> consider that in this specific instance, the hardware-specific
>>>>>> structure just happens to be identical to the original bitstream
>>>>>> format?
>>>>>
>>>>> At maximum allowed bitrate for let's say HEVC (940MB/s iirc), yes, it
>>>>> would be really really bad. In GStreamer project we have discussed for
>>>>> a while (but have never done anything about) adding the ability through
>>>>> a bitmask to select which part of the stream need to be parsed, as
>>>>> parsing itself was causing some overhead. Maybe something similar applies,
>>>>> though as per our new design, it's the fourcc that dictates the driver
>>>>> behaviour, so we'd need yet another fourcc for drivers that want the full
>>>>> bitstream (which seems odd if you have already parsed everything; I
>>>>> think this needs some clarification).
>>>>
>>>> Note that I am not proposing to rebuild the *entire* bitstream
>>>> in-kernel. What I am saying is that if the hardware interprets some
>>>> structures (like SPS/PPS) in their raw format, this raw format could
>>>> be reconstructed from the structures passed by userspace at negligible
>>>> cost. Such manipulation would only happen on a small amount of data.
>>>>
>>>> Exposing finer-grained driver requirements through a bitmask may
>>>> deserve more exploring. Maybe we could end with a spectrum of
>>>> capabilities that would allow us to cover the range from fully
>>>> stateless to fully stateful IPs more smoothly. Right now we have two
>>>> specifications that only consider the extremes of that range.
>>>
>>> I gave it a bit more thought and if we combine what Nicolas suggested
>>> about the bitmask control with the userspace providing the full
>>> bitstream in the OUTPUT buffers, split into some logical units and
>>> "tagged" with their type (e.g. SPS, PPS, slice, etc.), we could
>>> potentially get an interface that would work for any kind of decoder I
>>> can think of, actually eliminating the boundary between stateful and
>>> stateless decoders.
>> I agree with this idea; it is what I wanted to call a "memory region
>> description", though I am still struggling with the userspace side
>> before I can post my driver demo.
>>>
>>> For example, a fully stateful decoder would have the bitmask control
>>> set to 0 and accept data from all the OUTPUT buffers as they come. A
>>> decoder that doesn't do any parsing on its own would have all the
>>> valid bits in the bitmask set and ignore the data in OUTPUT buffers
>>> tagged as any kind of metadata. And then, we could have any cases in
>>> between, including stateful decoders which just can't parse the stream
>>> on their own, but still manage anything else themselves, or stateless
>>> ones which can parse parts of the stream, like the rk3399 vdec can
>>> parse the H.264 slice headers on its own.
>>>
>> Actually not: the rkvdec and rk hevc decoders can parse most, but not
>> all, of the syntax sections. Besides, the VP9 decoder of rkvdec won't
>> parse most of the syntax.
>>
>> I talked to some Rockchip staff about the performance problem of
>> reconstructing the bitstream after arguing with tfiga on IRC yesterday.
>> Although 1 ms looks small to a decoder which can decode a UHD 4K HEVC
>> picture in 9 ms, and that is still enough for 60 fps, how about higher
>> frame rates like 120 fps or 240 fps, or 8K, which is used in Japanese
>> broadcasting?
>
> 1 ms for a 500 MHz CPU (which is quite slow these days) is 500k
> cycles. We don't have to reconstruct the whole bitstream, just the
> parsed metadata and also we don't get a new PPS or SPS every frame.
> Not sure where you have this 1 ms from. Most of the difference between
> our structures and the bitstream is that the latter is packed and
> could be variable length.
You told me that number yesterday.
>
> We actually have some sample bitstream assembly code for the rockchip encoder:
>
> https://chromium.googlesource.com/chromiumos/third_party/libv4lplugins/+/5e6034258146af6be973fb6a5bb6b9d6e7489437/libv4l-rockchip_v2/libvepu/h264e/h264e.c#148
> https://chromium.googlesource.com/chromiumos/third_party/libv4lplugins/+/5e6034258146af6be973fb6a5bb6b9d6e7489437/libv4l-rockchip_v2/libvepu/streams.c#36
>
> Disassembling the stream_put_bits() gives 36 thumb2 instructions,
> including 23 for the loop for each byte that is written.
> stream_write_ue() is a bit more complicated, but in the worst case it
> ends up with 4 calls to stream_put_bits(), each at most spanning 4
> bytes for simplicity.
>
> Let's count the operations for SPS then:
> (1) stream_put_bits() spanning 1 byte: 33 times
> (2) stream_put_bits() spanning up to 3 bytes: 4 times
> (3) stream_write_ue() up to 31 bits: 19 times
>
> Adding it together:
> (1) 33 * 36 +
> (2) 4 * (36 + 2 * 23) +
> (3) 19 * (4 * (36 + 3 * 23)) =
>
> 1188 + 328 + 7980 = 9496 ~= 10k instructions
>
> The code above doesn't seem to contain any expensive instructions,
> like division, so for a modern pipelined out of order core (e.g. A53),
> it could be safe to assume 1 instruction per cycle. At 500 MHz that
> gives you 20 usecs.
>
> SPS is the most complex header and for H.264 we just do PPS and some
> slice headers. Let's round it up a bit and we could have around 100
> usecs for the complete frame metadata.
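For reference, the kind of bit writer being costed above can be sketched in a few lines. This is a simplified stand-in for illustration, not the actual libvepu code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct stream {
	uint8_t buf[64];
	size_t bit_pos;		/* number of bits written so far */
};

/*
 * Append the low 'n' bits of 'val', MSB first. This is roughly the
 * shape of stream_put_bits() in the libv4l-rockchip code; the per-bit
 * loop is where most of the instruction count above comes from.
 */
static void put_bits(struct stream *s, uint32_t val, int n)
{
	while (n-- > 0) {
		if (val & (1u << n))
			s->buf[s->bit_pos >> 3] |= 0x80 >> (s->bit_pos & 7);
		s->bit_pos++;
	}
}

/* Unsigned Exp-Golomb ue(v): 'len' zero bits, then 'val + 1' in len+1 bits. */
static void write_ue(struct stream *s, uint32_t val)
{
	uint32_t v = val + 1;
	int len = 0;

	while ((v >> len) > 1)
		len++;
	put_bits(s, 0, len);
	put_bits(s, v, len + 1);
}
```

In the worst case write_ue() expands into a handful of put_bits() calls, which is where the roughly 10k-instruction estimate for a complete SPS comes from.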
The scaling list (cabac table) address needs to be filled into the PPS
header, which would require mapping and unmapping a 4K memory region
each time.
And have you thought about multiple sessions at the same time? Besides,
have you tried the Android CTS test? It has a test that changes
resolution every 5 frames, which would require constructing a new SPS
and PPS for H.264. It is the Android CTS that made me think of this.
Anyway, the problem here is that the v4l2 driver needs to wait until the
previous picture is done before generating the registers for the next
picture, which leads to more idle time on the device.
>>
>> I will bring more details at FOSDEM 2019; I may be in the graphics
>> devroom on Saturday.
>>> That could potentially let us completely eliminate the distinction
>>> between the stateful and stateless interfaces and just have one that
>>> covers both.
>>>
>>> Thoughts?
>
> Any thoughts on my proposal to make the interface more flexible? Any
> specific examples of issues that we could encounter that would prevent
> it from working efficiently with Rockchip (or other) hardware?
>
> Best regards,
> Tomasz
end of thread, other threads:[~2019-01-30 9:54 UTC | newest]
Thread overview: 22+ messages
[not found] <20181123130209.11696-1-paul.kocialkowski@bootlin.com>
[not found] ` <20181123130209.11696-2-paul.kocialkowski@bootlin.com>
[not found] ` <5515174.7lFZcYkk85@jernej-laptop>
[not found] ` <ffe9c81db34b599f675ca5bbf02de360bf0a1608.camel@bootlin.com>
2019-01-07 3:49 ` [linux-sunxi] [PATCH v2 1/2] media: v4l: Add definitions for the HEVC slice format and controls Randy Li
[not found] ` <776e63c9-d4a5-342a-e0f7-200ef144ffc4-TNX95d0MmH7DzftRWevZcw@public.gmane.org>
2019-01-07 9:57 ` Paul Kocialkowski
2019-01-08 1:16 ` [linux-sunxi] " Ayaka
[not found] ` <D8005130-F7FD-4CBD-8396-1BB08BB08E81-xPW3/0Ywev/iB9QmIjCX8w@public.gmane.org>
2019-01-08 8:38 ` Paul Kocialkowski
2019-01-08 10:00 ` [linux-sunxi] " Ayaka
2019-01-10 13:32 ` ayaka
2019-01-24 10:27 ` Paul Kocialkowski
2019-01-24 12:23 ` Ayaka
2019-01-25 13:04 ` Paul Kocialkowski
[not found] ` <ab8ca098ad60f209fe97f79bb93b2d1e898da524.camel-LDxbnhwyfcJBDgjK7y7TUQ@public.gmane.org>
2019-01-29 7:44 ` Alexandre Courbot
2019-01-29 8:09 ` Maxime Ripard
2019-01-29 9:39 ` Tomasz Figa
2019-01-29 21:41 ` Nicolas Dufresne
2019-01-30 2:28 ` Alexandre Courbot
2019-01-30 3:35 ` Tomasz Figa
2019-01-30 6:27 ` Ayaka
[not found] ` <5F14507E-1BE2-4A74-A59C-1DB3C3E07DBA-xPW3/0Ywev/iB9QmIjCX8w@public.gmane.org>
2019-01-30 7:17 ` Tomasz Figa
2019-01-30 9:54 ` Ayaka
2019-01-30 7:57 ` Maxime Ripard
2019-01-30 7:03 ` Ayaka
2019-01-24 10:36 ` Paul Kocialkowski
2019-01-24 12:19 ` Ayaka