From: Jani Nikula <jani.nikula@linux.intel.com>
To: Thomas Zimmermann <tzimmermann@suse.de>,
	jfalempe@redhat.com, simona@ffwll.ch, airlied@gmail.com,
	mripard@kernel.org, maarten.lankhorst@linux.intel.com
Cc: dri-devel@lists.freedesktop.org, Thomas Zimmermann <tzimmermann@suse.de>
Subject: Re: [PATCH 7/8] drm/format-helper: Optimize 32-to-16-bpp conversion
Date: Wed, 26 Mar 2025 12:53:33 +0200
Message-ID: <87sen06p1u.fsf@intel.com>
In-Reply-To: <20250325110407.81107-8-tzimmermann@suse.de>

On Tue, 25 Mar 2025, Thomas Zimmermann <tzimmermann@suse.de> wrote:
> For ease of implementation, existing line-conversion functions
> for 16-bit formats write each pixel individually. Optimize the
> performance by writing multiple pixels in single 64-bit and 32-bit
> stores.
>
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> ---
>  drivers/gpu/drm/drm_format_helper.c | 40 ++++++++++++++++++++++++++++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c
> index b9c9c712aa9c..66137df85725 100644
> --- a/drivers/gpu/drm/drm_format_helper.c
> +++ b/drivers/gpu/drm/drm_format_helper.c
> @@ -262,10 +262,48 @@ static __always_inline void drm_fb_xfrm_line_32to16(void *dbuf, const void *sbuf
>  						    unsigned int pixels,
>  						    u32 (*xfrm_pixel)(u32))
>  {
> -	__le16 *dbuf16 = dbuf;
> +	__le64 *dbuf64 = dbuf;
> +	__le32 *dbuf32;
> +	__le16 *dbuf16;
>  	const __le32 *sbuf32 = sbuf;
>  	const __le32 *send32 = sbuf32 + pixels;
>  
> +#if defined(CONFIG_64BIT)
> +	/* write 4 pixels at once */
> +	send32 -= pixels & GENMASK(1, 0);
> +	while (sbuf32 < send32) {

I find adjusting send32 before and after each loop with different
masks a bit confusing. Would it not suffice to use something like:

	while (sbuf32 < ALIGN_DOWN(send32, 4))

and leave send32 untouched? The 2-pixels-at-a-time loop would use a
correspondingly different alignment.
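
To illustrate the idea, here is a rough userspace sketch, not kernel
code. It assumes the loop bounds are derived by rounding the pixel count
down (to a multiple of 4, then 2) rather than aligning the pointer
address itself, and that source pixels are already in CPU byte order.
All names here (line_32to16, put_le, send4, send2, xrgb8888_to_rgb565,
matches_reference) are made up for the example; put_le only mimics what
cpu_to_le16/32/64 plus a store would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for xfrm_pixel(): XRGB8888 to RGB565. */
static uint32_t xrgb8888_to_rgb565(uint32_t pix)
{
	return ((pix & 0x00f80000) >> 8) |
	       ((pix & 0x0000fc00) >> 5) |
	       ((pix & 0x000000f8) >> 3);
}

/* Write the low n bytes of val in little-endian order; returns dst + n. */
static uint8_t *put_le(uint8_t *dst, uint64_t val, size_t n)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = (uint8_t)(val >> (8 * i));
	return dst + n;
}

/* Assumption: src holds pixels in CPU byte order (no le32_to_cpup here). */
static void line_32to16(uint8_t *dst, const uint32_t *src, size_t pixels,
			uint32_t (*xfrm_pixel)(uint32_t))
{
	const uint32_t *send = src + pixels;			/* never adjusted */
	const uint32_t *send4 = src + (pixels & ~(size_t)3);	/* whole groups of 4 */
	const uint32_t *send2 = src + (pixels & ~(size_t)1);	/* whole groups of 2 */

	/* write 4 pixels at once; first pixel goes into the lowest 16 bits */
	while (src < send4) {
		uint64_t val64 = ((uint64_t)xfrm_pixel(src[0])) |
				 ((uint64_t)xfrm_pixel(src[1]) << 16) |
				 ((uint64_t)xfrm_pixel(src[2]) << 32) |
				 ((uint64_t)xfrm_pixel(src[3]) << 48);
		dst = put_le(dst, val64, 8);
		src += 4;
	}

	/* write 2 pixels at once; runs at most once after the loop above */
	while (src < send2) {
		uint32_t val32 = xfrm_pixel(src[0]) | (xfrm_pixel(src[1]) << 16);
		dst = put_le(dst, val32, 4);
		src += 2;
	}

	/* write trailing pixel */
	while (src < send)
		dst = put_le(dst, xfrm_pixel(*src++), 2);

	assert(src == send);
}

/* Reference: one 16-bit store per pixel. */
static void line_32to16_ref(uint8_t *dst, const uint32_t *src, size_t pixels,
			    uint32_t (*xfrm_pixel)(uint32_t))
{
	for (size_t i = 0; i < pixels; i++)
		dst = put_le(dst, xfrm_pixel(src[i]), 2);
}

/* Returns 1 if both variants write identical bytes (pixels <= 16). */
static int matches_reference(size_t pixels)
{
	uint32_t src[16];
	uint8_t a[32], b[32];

	if (pixels > 16)
		return 0;
	for (size_t i = 0; i < 16; i++)
		src[i] = 0x00123456u * (uint32_t)(i + 1);
	memset(a, 0xaa, sizeof(a));
	memset(b, 0xaa, sizeof(b));
	line_32to16(a, src, pixels, xrgb8888_to_rgb565);
	line_32to16_ref(b, src, pixels, xrgb8888_to_rgb565);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling matches_reference() for every length from 0 to 16 exercises all
tail combinations (0 to 3 leftover pixels). The point is only that the
three loop bounds can be computed once up front, so send32 itself never
needs to be decremented and restored.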


BR,
Jani.


> +		u32 pix[4] = {
> +			le32_to_cpup(sbuf32++),
> +			le32_to_cpup(sbuf32++),
> +			le32_to_cpup(sbuf32++),
> +			le32_to_cpup(sbuf32++),
> +		};
> +		/* write output bytes in reverse order for little endianness */
> +		u64 val64 = ((u64)xfrm_pixel(pix[0])) |
> +			    ((u64)xfrm_pixel(pix[1]) << 16) |
> +			    ((u64)xfrm_pixel(pix[2]) << 32) |
> +			    ((u64)xfrm_pixel(pix[3]) << 48);
> +		*dbuf64++ = cpu_to_le64(val64);
> +	}
> +	send32 += pixels & GENMASK(1, 1);
> +#endif
> +
> +	/* write 2 pixels at once */
> +	dbuf32 = (__le32 __force *)dbuf64;
> +	while (sbuf32 < send32) {
> +		u32 pix[2] = {
> +			le32_to_cpup(sbuf32++),
> +			le32_to_cpup(sbuf32++),
> +		};
> +		/* write output bytes in reverse order for little endianness */
> +		u32 val32 = xfrm_pixel(pix[0]) |
> +			   (xfrm_pixel(pix[1]) << 16);
> +		*dbuf32++ = cpu_to_le32(val32);
> +	}
> +	send32 += pixels & GENMASK(0, 0);
> +
> +	/* write trailing pixel */
> +	dbuf16 = (__le16 __force *)dbuf32;
>  	while (sbuf32 < send32)
>  		*dbuf16++ = cpu_to_le16(xfrm_pixel(le32_to_cpup(sbuf32++)));
>  }

-- 
Jani Nikula, Intel
