All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jernej Škrabec" <jernej.skrabec@siol.net>
To: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: devel@driverdev.osuosl.org, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, mripard@kernel.org, wens@csie.org,
	hverkuil-cisco@xs4all.nl, mchehab@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-media@vger.kernel.org
Subject: Re: [PATCH v2 1/3] media: cedrus: Fix decoding for some H264 videos
Date: Tue, 15 Oct 2019 19:16:32 +0200	[thread overview]
Message-ID: <1916783.jTTlJIDQL9@jernej-laptop> (raw)
In-Reply-To: <20191002215442.GA24151@aptenodytes>

Hi!

Sorry for late reponse, technical issues...

Dne sreda, 02. oktober 2019 ob 23:54:47 CEST je Paul Kocialkowski napisal(a):
> Hi,
> 
> On Wed 02 Oct 19, 21:35, Jernej Skrabec wrote:
> > It seems that for some H264 videos at least one bitstream parsing
> > trigger must be called in order to be decoded correctly. There is no
> > explanation why this helps, but it was observed that two sample videos
> > with this fix are now decoded correctly and there is no regression with
> > others.
> 
> I understand there might be some magic going on under the hood here, but I
> would be interested in trying to understand what's really going on.

Me too.

> 
> For instance, comparing register dumps of the whole H264 block before/after
> calling the hardware parser could help, and comparing that to using the
> previous code (without hardware parsing).

Please understand that I was working on this on and off for almost half a year 
and checked many times all register values. At one point I tried libvdpau-
sunxi which has no problem with sample video.  Still, all relevant register 
values were the same. In a desperate attempt, I tried with HW header parsing 
which magically solved the issue. After that, I reused values provided in 
controls and then finally I made minimal solution as suggested in this patch. 

> 
> I could try and have a look if you have an available sample for testing the
> erroneous case!

Of course: http://jernej.libreelec.tv/videos/h264/test.mkv

> 
> Another minor thing: do you have some idea of whether the udelay call adds
> significant delay in the process?

I didn't notice any issue with it. Do you have any better idea? I just didn't 
want to make empty loop and udelay is the shortest delay that is provided by 
the kernel API.

Best regards,
Jernej

> 
> Cheers and thanks for the patch!
> 
> Paul
> 
> > Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
> > ---
> > 
> >  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 30 +++++++++++++++++--
> >  .../staging/media/sunxi/cedrus/cedrus_regs.h  |  3 ++
> >  2 files changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c index
> > d6a782703c9b..bd848146eada 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > @@ -6,6 +6,7 @@
> > 
> >   * Copyright (c) 2018 Bootlin
> >   */
> > 
> > +#include <linux/delay.h>
> > 
> >  #include <linux/types.h>
> >  
> >  #include <media/videobuf2-dma-contig.h>
> > 
> > @@ -289,6 +290,28 @@ static void cedrus_write_pred_weight_table(struct
> > cedrus_ctx *ctx,> 
> >  	}
> >  
> >  }
> > 
> > +/*
> > + * It turns out that using VE_H264_VLD_OFFSET to skip bits is not
> > reliable. In + * rare cases frame is not decoded correctly. However,
> > setting offset to 0 and + * skipping appropriate amount of bits with
> > flush bits trigger always works. + */
> > +static void cedrus_skip_bits(struct cedrus_dev *dev, int num)
> > +{
> > +	int count = 0;
> > +
> > +	while (count < num) {
> > +		int tmp = min(num - count, 32);
> > 
> > +
> > +		cedrus_write(dev, VE_H264_TRIGGER_TYPE,
> > +			     VE_H264_TRIGGER_TYPE_FLUSH_BITS |
> > +			     VE_H264_TRIGGER_TYPE_N_BITS(tmp));
> > +		while (cedrus_read(dev, VE_H264_STATUS) & 
VE_H264_STATUS_VLD_BUSY)
> > +			udelay(1);
> > +
> > +		count += tmp;
> > +	}
> > +}
> > +
> > 
> >  static void cedrus_set_params(struct cedrus_ctx *ctx,
> >  
> >  			      struct cedrus_run *run)
> >  
> >  {
> > 
> > @@ -299,12 +322,11 @@ static void cedrus_set_params(struct cedrus_ctx
> > *ctx,
> > 
> >  	struct vb2_buffer *src_buf = &run->src->vb2_buf;
> >  	struct cedrus_dev *dev = ctx->dev;
> >  	dma_addr_t src_buf_addr;
> > 
> > -	u32 offset = slice->header_bit_size;
> > -	u32 len = (slice->size * 8) - offset;
> > +	u32 len = slice->size * 8;
> > 
> >  	u32 reg;
> >  	
> >  	cedrus_write(dev, VE_H264_VLD_LEN, len);
> > 
> > -	cedrus_write(dev, VE_H264_VLD_OFFSET, offset);
> > +	cedrus_write(dev, VE_H264_VLD_OFFSET, 0);
> > 
> >  	src_buf_addr = vb2_dma_contig_plane_dma_addr(src_buf, 0);
> >  	cedrus_write(dev, VE_H264_VLD_END,
> > 
> > @@ -323,6 +345,8 @@ static void cedrus_set_params(struct cedrus_ctx *ctx,
> > 
> >  	cedrus_write(dev, VE_H264_TRIGGER_TYPE,
> >  	
> >  		     VE_H264_TRIGGER_TYPE_INIT_SWDEC);
> > 
> > +	cedrus_skip_bits(dev, slice->header_bit_size);
> > +
> > 
> >  	if (((pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED) &&
> >  	
> >  	     (slice->slice_type == V4L2_H264_SLICE_TYPE_P ||
> >  	     
> >  	      slice->slice_type == V4L2_H264_SLICE_TYPE_SP)) ||
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
> > b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h index
> > 3329f9aaf975..b52926a54025 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
> > @@ -538,13 +538,16 @@
> > 
> >  					 
VE_H264_CTRL_SLICE_DECODE_INT)
> >  
> >  #define VE_H264_TRIGGER_TYPE		0x224
> > 
> > +#define VE_H264_TRIGGER_TYPE_N_BITS(x)		(((x) & 0x3f) << 8)
> > 
> >  #define VE_H264_TRIGGER_TYPE_AVC_SLICE_DECODE	(8 << 0)
> >  #define VE_H264_TRIGGER_TYPE_INIT_SWDEC		(7 << 0)
> > 
> > +#define VE_H264_TRIGGER_TYPE_FLUSH_BITS		(3 << 0)
> > 
> >  #define VE_H264_STATUS			0x228
> >  #define VE_H264_STATUS_VLD_DATA_REQ_INT		
VE_H264_CTRL_VLD_DATA_REQ_INT
> >  #define VE_H264_STATUS_DECODE_ERR_INT		
VE_H264_CTRL_DECODE_ERR_INT
> >  #define VE_H264_STATUS_SLICE_DECODE_INT		
VE_H264_CTRL_SLICE_DECODE_INT
> > 
> > +#define VE_H264_STATUS_VLD_BUSY			BIT(8)
> > 
> >  #define VE_H264_STATUS_INT_MASK			
VE_H264_CTRL_INT_MASK





_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: "Jernej Škrabec" <jernej.skrabec@siol.net>
To: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: mripard@kernel.org, mchehab@kernel.org, hverkuil-cisco@xs4all.nl,
	gregkh@linuxfoundation.org, wens@csie.org,
	linux-media@vger.kernel.org, devel@driverdev.osuosl.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] media: cedrus: Fix decoding for some H264 videos
Date: Tue, 15 Oct 2019 19:16:32 +0200	[thread overview]
Message-ID: <1916783.jTTlJIDQL9@jernej-laptop> (raw)
In-Reply-To: <20191002215442.GA24151@aptenodytes>

Hi!

Sorry for late reponse, technical issues...

Dne sreda, 02. oktober 2019 ob 23:54:47 CEST je Paul Kocialkowski napisal(a):
> Hi,
> 
> On Wed 02 Oct 19, 21:35, Jernej Skrabec wrote:
> > It seems that for some H264 videos at least one bitstream parsing
> > trigger must be called in order to be decoded correctly. There is no
> > explanation why this helps, but it was observed that two sample videos
> > with this fix are now decoded correctly and there is no regression with
> > others.
> 
> I understand there might be some magic going on under the hood here, but I
> would be interested in trying to understand what's really going on.

Me too.

> 
> For instance, comparing register dumps of the whole H264 block before/after
> calling the hardware parser could help, and comparing that to using the
> previous code (without hardware parsing).

Please understand that I was working on this on and off for almost half a year 
and checked many times all register values. At one point I tried libvdpau-
sunxi which has no problem with sample video.  Still, all relevant register 
values were the same. In a desperate attempt, I tried with HW header parsing 
which magically solved the issue. After that, I reused values provided in 
controls and then finally I made minimal solution as suggested in this patch. 

> 
> I could try and have a look if you have an available sample for testing the
> erroneous case!

Of course: http://jernej.libreelec.tv/videos/h264/test.mkv

> 
> Another minor thing: do you have some idea of whether the udelay call adds
> significant delay in the process?

I didn't notice any issue with it. Do you have any better idea? I just didn't 
want to make empty loop and udelay is the shortest delay that is provided by 
the kernel API.

Best regards,
Jernej

> 
> Cheers and thanks for the patch!
> 
> Paul
> 
> > Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
> > ---
> > 
> >  .../staging/media/sunxi/cedrus/cedrus_h264.c  | 30 +++++++++++++++++--
> >  .../staging/media/sunxi/cedrus/cedrus_regs.h  |  3 ++
> >  2 files changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c index
> > d6a782703c9b..bd848146eada 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h264.c
> > @@ -6,6 +6,7 @@
> > 
> >   * Copyright (c) 2018 Bootlin
> >   */
> > 
> > +#include <linux/delay.h>
> > 
> >  #include <linux/types.h>
> >  
> >  #include <media/videobuf2-dma-contig.h>
> > 
> > @@ -289,6 +290,28 @@ static void cedrus_write_pred_weight_table(struct
> > cedrus_ctx *ctx,> 
> >  	}
> >  
> >  }
> > 
> > +/*
> > + * It turns out that using VE_H264_VLD_OFFSET to skip bits is not
> > reliable. In + * rare cases frame is not decoded correctly. However,
> > setting offset to 0 and + * skipping appropriate amount of bits with
> > flush bits trigger always works. + */
> > +static void cedrus_skip_bits(struct cedrus_dev *dev, int num)
> > +{
> > +	int count = 0;
> > +
> > +	while (count < num) {
> > +		int tmp = min(num - count, 32);
> > 
> > +
> > +		cedrus_write(dev, VE_H264_TRIGGER_TYPE,
> > +			     VE_H264_TRIGGER_TYPE_FLUSH_BITS |
> > +			     VE_H264_TRIGGER_TYPE_N_BITS(tmp));
> > +		while (cedrus_read(dev, VE_H264_STATUS) & 
VE_H264_STATUS_VLD_BUSY)
> > +			udelay(1);
> > +
> > +		count += tmp;
> > +	}
> > +}
> > +
> > 
> >  static void cedrus_set_params(struct cedrus_ctx *ctx,
> >  
> >  			      struct cedrus_run *run)
> >  
> >  {
> > 
> > @@ -299,12 +322,11 @@ static void cedrus_set_params(struct cedrus_ctx
> > *ctx,
> > 
> >  	struct vb2_buffer *src_buf = &run->src->vb2_buf;
> >  	struct cedrus_dev *dev = ctx->dev;
> >  	dma_addr_t src_buf_addr;
> > 
> > -	u32 offset = slice->header_bit_size;
> > -	u32 len = (slice->size * 8) - offset;
> > +	u32 len = slice->size * 8;
> > 
> >  	u32 reg;
> >  	
> >  	cedrus_write(dev, VE_H264_VLD_LEN, len);
> > 
> > -	cedrus_write(dev, VE_H264_VLD_OFFSET, offset);
> > +	cedrus_write(dev, VE_H264_VLD_OFFSET, 0);
> > 
> >  	src_buf_addr = vb2_dma_contig_plane_dma_addr(src_buf, 0);
> >  	cedrus_write(dev, VE_H264_VLD_END,
> > 
> > @@ -323,6 +345,8 @@ static void cedrus_set_params(struct cedrus_ctx *ctx,
> > 
> >  	cedrus_write(dev, VE_H264_TRIGGER_TYPE,
> >  	
> >  		     VE_H264_TRIGGER_TYPE_INIT_SWDEC);
> > 
> > +	cedrus_skip_bits(dev, slice->header_bit_size);
> > +
> > 
> >  	if (((pps->flags & V4L2_H264_PPS_FLAG_WEIGHTED_PRED) &&
> >  	
> >  	     (slice->slice_type == V4L2_H264_SLICE_TYPE_P ||
> >  	     
> >  	      slice->slice_type == V4L2_H264_SLICE_TYPE_SP)) ||
> > 
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
> > b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h index
> > 3329f9aaf975..b52926a54025 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
> > @@ -538,13 +538,16 @@
> > 
> >  					 
VE_H264_CTRL_SLICE_DECODE_INT)
> >  
> >  #define VE_H264_TRIGGER_TYPE		0x224
> > 
> > +#define VE_H264_TRIGGER_TYPE_N_BITS(x)		(((x) & 0x3f) << 8)
> > 
> >  #define VE_H264_TRIGGER_TYPE_AVC_SLICE_DECODE	(8 << 0)
> >  #define VE_H264_TRIGGER_TYPE_INIT_SWDEC		(7 << 0)
> > 
> > +#define VE_H264_TRIGGER_TYPE_FLUSH_BITS		(3 << 0)
> > 
> >  #define VE_H264_STATUS			0x228
> >  #define VE_H264_STATUS_VLD_DATA_REQ_INT		
VE_H264_CTRL_VLD_DATA_REQ_INT
> >  #define VE_H264_STATUS_DECODE_ERR_INT		
VE_H264_CTRL_DECODE_ERR_INT
> >  #define VE_H264_STATUS_SLICE_DECODE_INT		
VE_H264_CTRL_SLICE_DECODE_INT
> > 
> > +#define VE_H264_STATUS_VLD_BUSY			BIT(8)
> > 
> >  #define VE_H264_STATUS_INT_MASK			
VE_H264_CTRL_INT_MASK





  reply	other threads:[~2019-10-15 17:17 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-02 19:35 [PATCH v2 0/3] media: cedrus: improvements Jernej Skrabec
2019-10-02 19:35 ` Jernej Skrabec
2019-10-02 19:35 ` [PATCH v2 1/3] media: cedrus: Fix decoding for some H264 videos Jernej Skrabec
2019-10-02 19:35   ` Jernej Skrabec
2019-10-02 21:54   ` Paul Kocialkowski
2019-10-02 21:54     ` Paul Kocialkowski
2019-10-15 17:16     ` Jernej Škrabec [this message]
2019-10-15 17:16       ` Jernej Škrabec
2019-10-22  9:10       ` Paul Kocialkowski
2019-10-22  9:10         ` Paul Kocialkowski
2019-10-02 19:35 ` [PATCH v2 2/3] media: cedrus: Fix H264 default reference index count Jernej Skrabec
2019-10-02 19:35   ` Jernej Skrabec
2019-10-02 22:06   ` Paul Kocialkowski
2019-10-02 22:06     ` Paul Kocialkowski
2019-10-03  5:16     ` Jernej Škrabec
2019-10-03  5:16       ` Jernej Škrabec
2019-10-03 20:28       ` Paul Kocialkowski
2019-10-03 20:28         ` Paul Kocialkowski
2019-10-03 20:44         ` Jernej Škrabec
2019-10-03 20:44           ` Jernej Škrabec
2019-10-03 20:58           ` Paul Kocialkowski
2019-10-03 20:58             ` Paul Kocialkowski
2019-10-03 21:19             ` Jernej Škrabec
2019-10-03 21:19               ` Jernej Škrabec
2019-10-03 21:32               ` Paul Kocialkowski
2019-10-03 21:32                 ` Paul Kocialkowski
2019-10-02 19:35 ` [PATCH v2 3/3] media: cedrus: Use helpers to access capture queue Jernej Skrabec
2019-10-02 19:35   ` Jernej Skrabec
2019-10-02 22:14   ` Paul Kocialkowski
2019-10-02 22:14     ` Paul Kocialkowski
2019-10-02 22:23 ` [PATCH v2 0/3] media: cedrus: improvements Paul Kocialkowski
2019-10-02 22:23   ` Paul Kocialkowski
2019-10-03  5:21   ` Jernej Škrabec
2019-10-03  5:21     ` Jernej Škrabec

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1916783.jTTlJIDQL9@jernej-laptop \
    --to=jernej.skrabec@siol.net \
    --cc=devel@driverdev.osuosl.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=hverkuil-cisco@xs4all.nl \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=mripard@kernel.org \
    --cc=paul.kocialkowski@bootlin.com \
    --cc=wens@csie.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.