patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Thomas Zimmermann <tzimmermann@suse.de>,
	Javier Martinez Canillas <javierm@redhat.com>,
	Sam Ravnborg <sam@ravnborg.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15 14/89] fbdev: Improve performance of sys_imageblit()
Date: Mon, 28 Aug 2023 12:13:15 +0200	[thread overview]
Message-ID: <20230828101150.644350280@linuxfoundation.org> (raw)
In-Reply-To: <20230828101150.163430842@linuxfoundation.org>

5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Zimmermann <tzimmermann@suse.de>

[ Upstream commit 6f29e04938bf509fccfad490a74284cf158891ce ]

Improve the performance of sys_imageblit() by manually unrolling
the inner blitting loop and moving some invariants out. The compiler
failed to do this automatically. The resulting binary code was even
slower than the cfb_imageblit() helper, which uses the same algorithm,
but operates on I/O memory.

A microbenchmark measures the average number of CPU cycles
for sys_imageblit() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging). The value
for CFB is given as a reference.

  sys_imageblit(), new: 25934 cycles
  sys_imageblit(), old: 35944 cycles
  cfb_imageblit():      30566 cycles

In the optimized case, sys_imageblit() is now ~30% faster than before
and ~20% faster than cfb_imageblit().

v2:
	* move switch out of inner loop (Gerd)
	* remove test for alignment of dst1 (Sam)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220223193804.18636-3-tzimmermann@suse.de
Stable-dep-of: c2d22806aecb ("fbdev: fix potential OOB read in fast_imageblit()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/video/fbdev/core/sysimgblt.c | 49 +++++++++++++++++++++-------
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
index a4d05b1b17d7d..722c327a381bd 100644
--- a/drivers/video/fbdev/core/sysimgblt.c
+++ b/drivers/video/fbdev/core/sysimgblt.c
@@ -188,23 +188,29 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
 {
 	u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
 	u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
-	u32 bit_mask, end_mask, eorx, shift;
+	u32 bit_mask, eorx;
 	const char *s = image->data, *src;
 	u32 *dst;
-	const u32 *tab = NULL;
+	const u32 *tab;
+	size_t tablen;
+	u32 colortab[16];
 	int i, j, k;
 
 	switch (bpp) {
 	case 8:
 		tab = fb_be_math(p) ? cfb_tab8_be : cfb_tab8_le;
+		tablen = 16;
 		break;
 	case 16:
 		tab = fb_be_math(p) ? cfb_tab16_be : cfb_tab16_le;
+		tablen = 4;
 		break;
 	case 32:
-	default:
 		tab = cfb_tab32;
+		tablen = 2;
 		break;
+	default:
+		return;
 	}
 
 	for (i = ppw-1; i--; ) {
@@ -218,19 +224,40 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
 	eorx = fgx ^ bgx;
 	k = image->width/ppw;
 
+	for (i = 0; i < tablen; ++i)
+		colortab[i] = (tab[i] & eorx) ^ bgx;
+
 	for (i = image->height; i--; ) {
 		dst = dst1;
-		shift = 8;
 		src = s;
 
-		for (j = k; j--; ) {
-			shift -= ppw;
-			end_mask = tab[(*src >> shift) & bit_mask];
-			*dst++ = (end_mask & eorx) ^ bgx;
-			if (!shift) {
-				shift = 8;
-				src++;
+		switch (ppw) {
+		case 4: /* 8 bpp */
+			for (j = k; j; j -= 2, ++src) {
+				*dst++ = colortab[(*src >> 4) & bit_mask];
+				*dst++ = colortab[(*src >> 0) & bit_mask];
+			}
+			break;
+		case 2: /* 16 bpp */
+			for (j = k; j; j -= 4, ++src) {
+				*dst++ = colortab[(*src >> 6) & bit_mask];
+				*dst++ = colortab[(*src >> 4) & bit_mask];
+				*dst++ = colortab[(*src >> 2) & bit_mask];
+				*dst++ = colortab[(*src >> 0) & bit_mask];
+			}
+			break;
+		case 1: /* 32 bpp */
+			for (j = k; j; j -= 8, ++src) {
+				*dst++ = colortab[(*src >> 7) & bit_mask];
+				*dst++ = colortab[(*src >> 6) & bit_mask];
+				*dst++ = colortab[(*src >> 5) & bit_mask];
+				*dst++ = colortab[(*src >> 4) & bit_mask];
+				*dst++ = colortab[(*src >> 3) & bit_mask];
+				*dst++ = colortab[(*src >> 2) & bit_mask];
+				*dst++ = colortab[(*src >> 1) & bit_mask];
+				*dst++ = colortab[(*src >> 0) & bit_mask];
 			}
+			break;
 		}
 		dst1 += p->fix.line_length;
 		s += spitch;
-- 
2.40.1




  parent reply	other threads:[~2023-08-28 10:42 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-28 10:13 [PATCH 5.15 00/89] 5.15.129-rc1 review Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 01/89] objtool/x86: Fix SRSO mess Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 02/89] NFSv4.2: fix error handling in nfs42_proc_getxattr Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 03/89] NFSv4: fix out path in __nfs4_get_acl_uncached Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 04/89] xprtrdma: Remap Receive buffers after a reconnect Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 05/89] PCI: acpiphp: Reassign resources on bridge if necessary Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 06/89] dlm: improve plock logging if interrupted Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 07/89] dlm: replace usage of found with dedicated list iterator variable Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 08/89] fs: dlm: add pid to debug log Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 09/89] fs: dlm: change plock interrupted message to debug again Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 10/89] fs: dlm: use dlm_plock_info for do_unlock_close Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 11/89] fs: dlm: fix mismatch of plock results from userspace Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 12/89] MIPS: cpu-features: Enable octeon_cache by cpu_type Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 13/89] MIPS: cpu-features: Use boot_cpu_type for CPU type based features Greg Kroah-Hartman
2023-08-28 10:13 ` Greg Kroah-Hartman [this message]
2023-08-28 10:13 ` [PATCH 5.15 15/89] fbdev: Fix sys_imageblit() for arbitrary image widths Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 16/89] fbdev: fix potential OOB read in fast_imageblit() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 17/89] ALSA: pcm: Fix potential data race at PCM memory allocation helpers Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 18/89] jbd2: remove t_checkpoint_io_list Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 19/89] jbd2: remove journal_clean_one_cp_list() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 20/89] jbd2: fix a race when checking checkpoint buffer busy Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 21/89] can: raw: fix receiver memory leak Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 22/89] drm/amd/display: do not wait for mpc idle if tg is disabled Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 23/89] drm/amd/display: check TG is non-null before checking if enabled Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 24/89] can: raw: fix lockdep issue in raw_release() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 25/89] tracing: Fix cpu buffers unavailable due to record_disabled missed Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 26/89] tracing: Fix memleak due to race between current_tracer and trace Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 27/89] octeontx2-af: SDP: fix receive link config Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 28/89] sock: annotate data-races around prot->memory_pressure Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 29/89] dccp: annotate data-races in dccp_poll() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 30/89] ipvlan: Fix a reference count leak warning in ipvlan_ns_exit() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 31/89] net: bgmac: Fix return value check for fixed_phy_register() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 32/89] net: bcmgenet: " Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 33/89] net: validate veth and vxcan peer ifindexes Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 34/89] ice: fix receive buffer size miscalculation Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 35/89] igb: Avoid starting unnecessary workqueues Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 36/89] igc: Fix the typo in the PTM Control macro Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 37/89] net/sched: fix a qdisc modification with ambiguous command request Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 38/89] netfilter: nf_tables: flush pending destroy work before netlink notifier Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 39/89] netfilter: nf_tables: fix out of memory error handling Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 40/89] rtnetlink: return ENODEV when ifname does not exist and group is given Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 41/89] rtnetlink: Reject negative ifindexes in RTM_NEWLINK Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 42/89] net: remove bond_slave_has_mac_rcu() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 43/89] bonding: fix macvlan over alb bond support Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 44/89] net/ncsi: make one oem_gma function for all mfr id Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 45/89] net/ncsi: change from ndo_set_mac_address to dev_set_mac_address Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 46/89] Revert "KVM: x86: enable TDP MMU by default" Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 47/89] ibmveth: Use dcbf rather than dcbfl Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 48/89] NFSv4: Fix dropped lock for racing OPEN and delegation return Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 49/89] clk: Fix slab-out-of-bounds error in devm_clk_release() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 50/89] ALSA: ymfpci: Fix the missing snd_card_free() call at probe error Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 51/89] mm: add a call to flush_cache_vmap() in vmap_pfn() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 52/89] NFS: Fix a use after free in nfs_direct_join_group() Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 53/89] nfsd: Fix race to FREE_STATEID and cl_revoked Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 54/89] selinux: set next pointer before attaching to list Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 55/89] batman-adv: Trigger events for auto adjusted MTU Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 56/89] batman-adv: Dont increase MTU when set by user Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 57/89] batman-adv: Do not get eth header before batadv_check_management_packet Greg Kroah-Hartman
2023-08-28 10:13 ` [PATCH 5.15 58/89] batman-adv: Fix TT global entry leak when client roamed back Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 59/89] batman-adv: Fix batadv_v_ogm_aggr_send memory leak Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 60/89] batman-adv: Hold rtnl lock during MTU update via netlink Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 61/89] lib/clz_ctz.c: Fix __clzdi2() and __ctzdi2() for 32-bit kernels Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 62/89] radix tree: remove unused variable Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 63/89] of: unittest: Fix EXPECT for parse_phandle_with_args_map() test Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 64/89] of: dynamic: Refactor action prints to not use "%pOF" inside devtree_lock Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 65/89] media: vcodec: Fix potential array out-of-bounds in encoder queue_setup Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 66/89] PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 67/89] drm/vmwgfx: Fix shader stage validation Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 68/89] drm/display/dp: Fix the DP DSC Receiver cap size Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 69/89] x86/fpu: Invalidate FPU state correctly on exec() Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 70/89] x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4 Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 71/89] nfs: use vfs setgid helper Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 72/89] nfsd: " Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 73/89] torture: Fix hang during kthread shutdown phase Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 74/89] cgroup/cpuset: Rename functions dealing with DEADLINE accounting Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 75/89] sched/cpuset: Bring back cpuset_mutex Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 76/89] sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 77/89] cgroup/cpuset: Iterate only if DEADLINE tasks are present Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 78/89] sched/deadline: Create DL BW alloc, free & check overflow interface Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 79/89] cgroup/cpuset: Free DL BW in case can_attach() fails Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 80/89] drm/i915: Fix premature release of requests reusable memory Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 81/89] can: raw: add missing refcount for memory leak fix Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 82/89] scsi: snic: Fix double free in snic_tgt_create() Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 83/89] scsi: core: raid_class: Remove raid_component_add() Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 84/89] clk: Fix undefined reference to `clk_rate_exclusive_{get,put} Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 85/89] pinctrl: renesas: rza2: Add lock around pinctrl_generic{{add,remove}_group,{add,remove}_function} Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 86/89] dma-buf/sw_sync: Avoid recursive lock during fence signal Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 87/89] mm: memory-failure: kill soft_offline_free_page() Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 88/89] mm: memory-failure: fix unexpected return value in soft_offline_page() Greg Kroah-Hartman
2023-08-28 10:14 ` [PATCH 5.15 89/89] mm,ima,kexec,of: use memblock_free_late from ima_free_kexec_buffer Greg Kroah-Hartman
2023-08-29  1:57 ` [PATCH 5.15 00/89] 5.15.129-rc1 review SeongJae Park
2023-08-29  9:05 ` Naresh Kamboju
2023-08-29  9:36 ` Naresh Kamboju
2023-08-29 10:00 ` Harshit Mogalapalli
2023-08-29 11:50 ` Sudip Mukherjee (Codethink)
2023-08-29 14:13 ` Shuah Khan
2023-08-29 18:52 ` Florian Fainelli
2023-08-29 23:26 ` Ron Economos
2023-08-30  2:24 ` Guenter Roeck
2023-08-30 10:24 ` Jon Hunter
2023-08-30 13:19 ` Joel Fernandes
2023-08-30 16:32 ` Allen Pais

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230828101150.644350280@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=javierm@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=sam@ravnborg.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).