All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pekka Paalanen <pekka.paalanen@haloniitty.fi>
To: Arthur Grillo <arthurgrillo@riseup.net>
Cc: "Miquel Raynal" <miquel.raynal@bootlin.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Louis Chauvet" <louis.chauvet@bootlin.com>,
	"Rodrigo Siqueira" <rodrigosiqueiramelo@gmail.com>,
	"Melissa Wen" <melissa.srw@gmail.com>,
	"Maíra Canal" <mairacanal@riseup.net>,
	"Haneen Mohammed" <hamohammed.sa@gmail.com>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"David Airlie" <airlied@gmail.com>,
	marcheu@google.com, seanpaul@google.com,
	nicolejadeyee@google.com, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, thomas.petazzoni@bootlin.com
Subject: Re: [PATCH 2/2] drm/vkms: Use a simpler composition function
Date: Mon, 5 Feb 2024 12:19:13 +0200	[thread overview]
Message-ID: <20240205121913.379e28a4@eldfell> (raw)
In-Reply-To: <20240205121204.20878e28@eldfell>

[-- Attachment #1: Type: text/plain, Size: 9741 bytes --]

On Mon, 5 Feb 2024 12:12:04 +0200
Pekka Paalanen <pekka.paalanen@haloniitty.fi> wrote:

> On Fri, 2 Feb 2024 17:02:20 -0300
> Arthur Grillo <arthurgrillo@riseup.net> wrote:
> 
> > On 02/02/24 12:49, Pekka Paalanen wrote:  
> > > On Fri, 2 Feb 2024 13:13:22 +0100
> > > Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> > >     
> > >> Hello Maxime,
> > >>
> > >> + Arthur
> > >>
> > >> mripard@kernel.org wrote on Fri, 2 Feb 2024 10:53:37 +0100:
> > >>    
> > >>> Hi Miquel,
> > >>>
> > >>> On Fri, Feb 02, 2024 at 10:26:01AM +0100, Miquel Raynal wrote:      
> > >>>> pekka.paalanen@haloniitty.fi wrote on Fri, 2 Feb 2024 10:55:22 +0200:
> > >>>>         
> > >>>>> On Thu, 01 Feb 2024 18:31:32 +0100
> > >>>>> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > >>>>>         
> > >>>>>> Change the composition algorithm to iterate over pixels instead of lines.
> > >>>>>> It allows a simpler management of rotation and pixel access for complex formats.
> > >>>>>>
> > >>>>>> This new algorithm allows read_pixel function to have access to x/y
> > >>>>>> coordinates and make it possible to read the correct thing in a block
> > >>>>>> when block_w and block_h are not 1.
> > >>>>>> The iteration pixel-by-pixel in the same method also allows a simpler
> > >>>>>> management of rotation with drm_rect_* helpers. This way it's not needed
> > >>>>>> anymore to have misterious switch-case distributed in multiple places.          
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> there was a very good reason to write this code using lines:
> > >>>>> performance. Before lines, it was indeed operating on individual pixels.
> > >>>>>
> > >>>>> Please, include performance measurements before and after this series
> > >>>>> to quantify the impact on the previously already supported pixel
> > >>>>> formats, particularly the 32-bit-per-pixel RGB variants.
> > >>>>>
> > >>>>> VKMS will be used more and more in CI for userspace projects, and
> > >>>>> performance actually matters there.
> > >>>>>
> > >>>>> I'm worrying that this performance degradation here is significant. I
> > >>>>> believe it is possible to keep blending with lines, if you add new line
> > >>>>> getters for reading from rotated, sub-sampled etc. images. That way you
> > >>>>> don't have to regress the most common formats' performance.        
> > >>>>
> > >>>> While I understand performance is important and should be taken into
> > >>>> account seriously, I cannot understand how broken testing could be
> > >>>> considered better. Fast but inaccurate will always be significantly
> > >>>> less attractive to my eyes.        
> > >>>
> > >>> AFAIK, neither the cover letter nor the commit log claimed it was fixing
> > >>> something broken, just that it was "better" (according to what
> > >>> criteria?).      
> > >>
> > >> Better is probably too vague and I agree the "fixing" part is not
> > >> clearly explained in the commit log. The cover-letter however states:
> > >>    
> > >>> Patch 2/2: This patch is more complex. My main target was to solve issues
> > >>> I found in [1], but as it was very complex to do it "in place", I choose
> > >>> to rework the composition function.      
> > >> ...    
> > >>> [1]: https://lore.kernel.org/dri-devel/20240110-vkms-yuv-v2-0-952fcaa5a193@riseup.net/      
> > >>
> > >> If you follow this link you will find all the feedback and especially
> > >> the "broken" parts. Just to be clear, writing bugs is totally expected
> > >> and review/testing is supposed to help on this regard. I am not blaming
> > >> the author in any way, just focusing on getting this code in a more
> > >> readable shape and hopefully reinforce the testing procedure.
> > >>    
> > >>> If something is truly broken, it must be stated what exactly is so we
> > >>> can all come up with a solution that will satisfy everyone.      
> > >>
> > >> Maybe going through the series pointed above will give more context
> > >> but AFAIU: the YUV composition is not totally right (and the tests used
> > >> to validate it need to be more complex as well in order to fail).
> > >>
> > >> Here is a proposal.
> > >>
> > >> Today's RGB implementation is only optimized in the line-by-line case
> > >> when there is no rotation. The logic is bit convoluted and may possibly
> > >> be slightly clarified with a per-format read_line() implementation,
> > >> at a very light performance cost. Such an improvement would definitely
> > >> benefit to the clarity of the code, especially when transformations
> > >> (especially the rotations) come into play because they would be clearly
> > >> handled differently instead of being "hidden" in the optimized logic.
> > >> Performances would not change much as this path is not optimized today
> > >> anyway (the pixel-oriented logic is already used in the rotation case).
> > >>
> > >> Arthur's YUV implementation is indeed well optimized but the added
> > >> complexity probably lead to small mistakes in the logic. The
> > >> per-format read_line() implementation mentioned above could be
> > >> extended to the YUV format as well, which would leverage Arthur's
> > >> proposal by re-using his optimized version. Louis will help on this
> > >> regard. However, for more complex cases such as when there is a
> > >> rotation, it will be easier (and not sub-optimized compared to the RGB
> > >> case) to also fallback to a pixel-oriented processing.
> > >>
> > >> Would this approach make sense?    
> > > 
> > > Hi,
> > > 
> > > I think it would, if I understand what you mean. Ever since I proposed
> > > a line-by-line algorithm to improve the performance, I was thinking of
> > > per-format read_line() functions that would be selected outside of any
> > > loops. Extending that to support YUV is only natural. I can imagine
> > > rotation complicates things, and I won't oppose that resulting in a
> > > much heavier read_line() implementation used in those cases. They might
> > > perhaps call the original read_line() implementations pixel-by-pixel or
> > > plane-by-plane (i.e. YUV planes) per pixel. Chroma-siting complicates
> > > things even further. That way one could compose any
> > > rotation-format-siting combination by chaining function pointers.
> > > 
> > > I haven't looked at VKMS in a long time, and I am disappointed to find
> > > that vkms_compose_row() is calling plane->pixel_read() pixel-by-pixel.
> > > The reading vfunc should be called with many pixels at a time when the
> > > source FB layout allows it. The whole point of the line-based functions
> > > was that they repeat the innermost loop in every function body to make
> > > the per-pixel overhead as small as possible. The VKMS implementations
> > > benchmarked before and after the original line-based algorithm showed
> > > that calling a function pointer per-pixel is relatively very expensive.
> > > Or maybe it was a switch-case.    
> > 
> > Hi,
> > 
> > I think I'm the culprit for that, as stated on [1]. My intention with
> > the suggestion was to remove some code repetition and too facilitate the
> > rotation support implementation. Going back, I think I was to high on
> > DRY at the time and didn't worry about optimization, which was a
> > mistake.  
> 
> Hi Arthur,
> 
> no worries. We can also blame reviewers for not minding benchmarks. ;-)
> 
> > But, I agree with Miquel that the rotation logic is easier to implement
> > in a pixel-based way. So going pixel-by-pixel only when rotation occurs
> > would be great.  
> 
> Yes, and I think that can very well be done in the line-based framework
> still that existed in the old days before any rotation support was
> added. Essentially a plug-in line-getter function that then calls a
> format-specific line-getter pixel-by-pixel while applying the rotation.
> It would be simple, it would leave unrotated performance unharmed (use
> format-specific line-getter directly with lines), but it might be
> somewhat less performant for rotated KMS planes. I suspect that might
> be a good compromise.
> 
> Format-specific line-getters could also be parameterized by
> pixel-to-pixel offset in bytes. Then they could directly traverse FB
> rows forward and backward, and even FB columns. It may or may not have
> a penalty compared to the original line-getters, so it would have to
> be benchmarked.

Oh, actually, since the byte offset depends on format, it might be
better to parametrize by direction and compute the offset in the
format-specific line-getter before the loop.


Thanks,
pq

> Line-getters working on planar YUV FBs might delegate Y, U, V, or UV/VU
> reading to R8 and/or RG88 line or pixel reader functions. They might
> also return block-height lines instead of one line at a time. However,
> I wouldn't commit to any approach without benchmarking alternatives.
> The performance comparison must justify the code complexity, like it
> was seen with the initial line-based implementation.
> 
> A good benchmark is key, IMO.
> 
> 
> Thanks,
> pq
> 
> > 
> > Best Regards,
> > ~Arthur Grillo
> > 
> > [1]: https://lore.kernel.org/dri-devel/20230418130525.128733-2-mcanal@igalia.com/
> >   
> > > 
> > > Sorry, I didn't realize the optimization had already been lost.
> > > 
> > > Btw. I'd suggest renaming vkms_compose_row() to vkms_fetch_row() since
> > > it's not composing anything and the name mislead me.
> > > 
> > > I think if you inspect the compositing code as of revision
> > > 8356b97906503a02125c8d03c9b88a61ea46a05a you'll get a better feeling of
> > > what it was supposed to be.
> > > 
> > > 
> > > Thanks,
> > > pq    
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-02-05 10:19 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-01 17:31 [PATCH 0/2] Better support for complex pixel formats Louis Chauvet
2024-02-01 17:31 ` [PATCH 1/2] drm/vkms: Create a type to check a function pointer validity Louis Chauvet
2024-02-02 20:59   ` Arthur Grillo
2024-02-01 17:31 ` [PATCH 2/2] drm/vkms: Use a simpler composition function Louis Chauvet
2024-02-02  8:55   ` Pekka Paalanen
2024-02-02  9:26     ` Miquel Raynal
2024-02-02  9:47       ` Maíra Canal
2024-02-02  9:53       ` Maxime Ripard
2024-02-02 12:13         ` Miquel Raynal
2024-02-02 15:49           ` Pekka Paalanen
2024-02-02 16:07             ` Miquel Raynal
2024-02-02 19:45               ` Pekka Paalanen
2024-02-06 17:57                 ` Arthur Grillo
2024-02-07  8:44                   ` Pekka Paalanen
2024-02-07 16:03                     ` Louis Chauvet
2024-02-07 20:21                       ` Arthur Grillo
2024-02-02 20:02             ` Arthur Grillo
2024-02-05 10:12               ` Pekka Paalanen
2024-02-05 10:19                 ` Pekka Paalanen [this message]
2024-02-07 15:49                   ` Louis Chauvet
2024-02-08  9:39                     ` Pekka Paalanen
2024-02-15 17:43                       ` Arthur Grillo
2024-02-02 10:24       ` Pekka Paalanen
2024-02-01 22:07 ` [PATCH 0/2] Better support for complex pixel formats Maira Canal
2024-02-02  8:15   ` Louis Chauvet
2024-02-02  9:40     ` Maíra Canal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240205121913.379e28a4@eldfell \
    --to=pekka.paalanen@haloniitty.fi \
    --cc=airlied@gmail.com \
    --cc=arthurgrillo@riseup.net \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hamohammed.sa@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=louis.chauvet@bootlin.com \
    --cc=maarten.lankhorst@linux.intel.com \
    --cc=mairacanal@riseup.net \
    --cc=marcheu@google.com \
    --cc=melissa.srw@gmail.com \
    --cc=miquel.raynal@bootlin.com \
    --cc=mripard@kernel.org \
    --cc=nicolejadeyee@google.com \
    --cc=rodrigosiqueiramelo@gmail.com \
    --cc=seanpaul@google.com \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.