[merged mm-nonmm-stable] sparc-move-the-xor-code-to-lib-raid.patch removed from -mm tree

public inbox for mm-commits@vger.kernel.org
 help / color / mirror / Atom feed

From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org,will@kernel.org,tytso@mit.edu,svens@linux.ibm.com,song@kernel.org,richard@nod.at,richard.henderson@linaro.org,palmer@dabbelt.com,npiggin@gmail.com,mpe@ellerman.id.au,mingo@redhat.com,mattst88@gmail.com,maddy@linux.ibm.com,linux@armlinux.org.uk,linmag7@gmail.com,linan122@huawei.com,kernel@xen0n.name,johannes@sipsolutions.net,jason@zx2c4.com,hpa@zytor.com,herbert@gondor.apana.org.au,hca@linux.ibm.com,gor@linux.ibm.com,ebiggers@kernel.org,dsterba@suse.com,davem@davemloft.net,dan.j.williams@intel.com,clm@fb.com,chenhuacai@kernel.org,catalin.marinas@arm.com,bp@alien8.de,borntraeger@linux.ibm.com,arnd@arndb.de,ardb@kernel.org,aou@eecs.berkeley.edu,anton.ivanov@cambridgegreys.com,andreas@gaisler.com,alex@ghiti.fr,agordeev@linux.ibm.com,hch@lst.de,akpm@linux-foundation.org
Subject: [merged mm-nonmm-stable] sparc-move-the-xor-code-to-lib-raid.patch removed from -mm tree
Date: Thu, 02 Apr 2026 23:41:58 -0700	[thread overview]
Message-ID: <20260403064158.AB8A1C4CEF7@smtp.kernel.org> (raw)


The quilt patch titled
     Subject: sparc: move the XOR code to lib/raid/
has been removed from the -mm tree.  Its filename was
     sparc-move-the-xor-code-to-lib-raid.patch

This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Christoph Hellwig <hch@lst.de>
Subject: sparc: move the XOR code to lib/raid/
Date: Fri, 27 Mar 2026 07:16:49 +0100

Move the optimized XOR into lib/raid and include it it in xor.ko instead
of always building it into the main kernel image.

The code should probably be split into separate files for the two
implementations, but for now this just does the trivial move.

Link: https://lkml.kernel.org/r/20260327061704.3707577-18-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Sterba <dsterba@suse.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Li Nan <linan122@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/sparc/include/asm/asm-prototypes.h |    1 
 arch/sparc/include/asm/xor.h            |   45 +
 arch/sparc/include/asm/xor_32.h         |  268 ---------
 arch/sparc/include/asm/xor_64.h         |   76 --
 arch/sparc/lib/Makefile                 |    2 
 arch/sparc/lib/xor.S                    |  646 ----------------------
 lib/raid/xor/Makefile                   |    2 
 lib/raid/xor/sparc/xor-sparc32.c        |  253 ++++++++
 lib/raid/xor/sparc/xor-sparc64-glue.c   |   60 ++
 lib/raid/xor/sparc/xor-sparc64.S        |  636 +++++++++++++++++++++
 10 files changed, 992 insertions(+), 997 deletions(-)

--- a/arch/sparc/include/asm/asm-prototypes.h~sparc-move-the-xor-code-to-lib-raid
+++ a/arch/sparc/include/asm/asm-prototypes.h
@@ -14,7 +14,6 @@
 #include <asm/oplib.h>
 #include <asm/pgtable.h>
 #include <asm/trap_block.h>
-#include <asm/xor.h>
 
 void *__memscan_zero(void *, size_t);
 void *__memscan_generic(void *, int, size_t);
diff --git a/arch/sparc/include/asm/xor_32.h a/arch/sparc/include/asm/xor_32.h
deleted file mode 100644
--- a/arch/sparc/include/asm/xor_32.h
+++ /dev/null
@@ -1,268 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-or-later */
-/*
- * include/asm/xor.h
- *
- * Optimized RAID-5 checksumming functions for 32-bit Sparc.
- */
-
-/*
- * High speed xor_block operation for RAID4/5 utilizing the
- * ldd/std SPARC instructions.
- *
- * Copyright (C) 1999 Jakub Jelinek (jj@ultra.linux.cz)
- */
-
-static void
-sparc_2(unsigned long bytes, unsigned long * __restrict p1,
-	const unsigned long * __restrict p2)
-{
-	int lines = bytes / (sizeof (long)) / 8;
-
-	do {
-		__asm__ __volatile__(
-		  "ldd [%0 + 0x00], %%g2\n\t"
-		  "ldd [%0 + 0x08], %%g4\n\t"
-		  "ldd [%0 + 0x10], %%o0\n\t"
-		  "ldd [%0 + 0x18], %%o2\n\t"
-		  "ldd [%1 + 0x00], %%o4\n\t"
-		  "ldd [%1 + 0x08], %%l0\n\t"
-		  "ldd [%1 + 0x10], %%l2\n\t"
-		  "ldd [%1 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "std %%g2, [%0 + 0x00]\n\t"
-		  "std %%g4, [%0 + 0x08]\n\t"
-		  "std %%o0, [%0 + 0x10]\n\t"
-		  "std %%o2, [%0 + 0x18]\n"
-		:
-		: "r" (p1), "r" (p2)
-		: "g2", "g3", "g4", "g5",
-		  "o0", "o1", "o2", "o3", "o4", "o5",
-		  "l0", "l1", "l2", "l3", "l4", "l5");
-		p1 += 8;
-		p2 += 8;
-	} while (--lines > 0);
-}
-
-static void
-sparc_3(unsigned long bytes, unsigned long * __restrict p1,
-	const unsigned long * __restrict p2,
-	const unsigned long * __restrict p3)
-{
-	int lines = bytes / (sizeof (long)) / 8;
-
-	do {
-		__asm__ __volatile__(
-		  "ldd [%0 + 0x00], %%g2\n\t"
-		  "ldd [%0 + 0x08], %%g4\n\t"
-		  "ldd [%0 + 0x10], %%o0\n\t"
-		  "ldd [%0 + 0x18], %%o2\n\t"
-		  "ldd [%1 + 0x00], %%o4\n\t"
-		  "ldd [%1 + 0x08], %%l0\n\t"
-		  "ldd [%1 + 0x10], %%l2\n\t"
-		  "ldd [%1 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "ldd [%2 + 0x00], %%o4\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "ldd [%2 + 0x08], %%l0\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "ldd [%2 + 0x10], %%l2\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "ldd [%2 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "std %%g2, [%0 + 0x00]\n\t"
-		  "std %%g4, [%0 + 0x08]\n\t"
-		  "std %%o0, [%0 + 0x10]\n\t"
-		  "std %%o2, [%0 + 0x18]\n"
-		:
-		: "r" (p1), "r" (p2), "r" (p3)
-		: "g2", "g3", "g4", "g5",
-		  "o0", "o1", "o2", "o3", "o4", "o5",
-		  "l0", "l1", "l2", "l3", "l4", "l5");
-		p1 += 8;
-		p2 += 8;
-		p3 += 8;
-	} while (--lines > 0);
-}
-
-static void
-sparc_4(unsigned long bytes, unsigned long * __restrict p1,
-	const unsigned long * __restrict p2,
-	const unsigned long * __restrict p3,
-	const unsigned long * __restrict p4)
-{
-	int lines = bytes / (sizeof (long)) / 8;
-
-	do {
-		__asm__ __volatile__(
-		  "ldd [%0 + 0x00], %%g2\n\t"
-		  "ldd [%0 + 0x08], %%g4\n\t"
-		  "ldd [%0 + 0x10], %%o0\n\t"
-		  "ldd [%0 + 0x18], %%o2\n\t"
-		  "ldd [%1 + 0x00], %%o4\n\t"
-		  "ldd [%1 + 0x08], %%l0\n\t"
-		  "ldd [%1 + 0x10], %%l2\n\t"
-		  "ldd [%1 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "ldd [%2 + 0x00], %%o4\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "ldd [%2 + 0x08], %%l0\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "ldd [%2 + 0x10], %%l2\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "ldd [%2 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "ldd [%3 + 0x00], %%o4\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "ldd [%3 + 0x08], %%l0\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "ldd [%3 + 0x10], %%l2\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "ldd [%3 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "std %%g2, [%0 + 0x00]\n\t"
-		  "std %%g4, [%0 + 0x08]\n\t"
-		  "std %%o0, [%0 + 0x10]\n\t"
-		  "std %%o2, [%0 + 0x18]\n"
-		:
-		: "r" (p1), "r" (p2), "r" (p3), "r" (p4)
-		: "g2", "g3", "g4", "g5",
-		  "o0", "o1", "o2", "o3", "o4", "o5",
-		  "l0", "l1", "l2", "l3", "l4", "l5");
-		p1 += 8;
-		p2 += 8;
-		p3 += 8;
-		p4 += 8;
-	} while (--lines > 0);
-}
-
-static void
-sparc_5(unsigned long bytes, unsigned long * __restrict p1,
-	const unsigned long * __restrict p2,
-	const unsigned long * __restrict p3,
-	const unsigned long * __restrict p4,
-	const unsigned long * __restrict p5)
-{
-	int lines = bytes / (sizeof (long)) / 8;
-
-	do {
-		__asm__ __volatile__(
-		  "ldd [%0 + 0x00], %%g2\n\t"
-		  "ldd [%0 + 0x08], %%g4\n\t"
-		  "ldd [%0 + 0x10], %%o0\n\t"
-		  "ldd [%0 + 0x18], %%o2\n\t"
-		  "ldd [%1 + 0x00], %%o4\n\t"
-		  "ldd [%1 + 0x08], %%l0\n\t"
-		  "ldd [%1 + 0x10], %%l2\n\t"
-		  "ldd [%1 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "ldd [%2 + 0x00], %%o4\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "ldd [%2 + 0x08], %%l0\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "ldd [%2 + 0x10], %%l2\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "ldd [%2 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "ldd [%3 + 0x00], %%o4\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "ldd [%3 + 0x08], %%l0\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "ldd [%3 + 0x10], %%l2\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "ldd [%3 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "ldd [%4 + 0x00], %%o4\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "ldd [%4 + 0x08], %%l0\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "ldd [%4 + 0x10], %%l2\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "ldd [%4 + 0x18], %%l4\n\t"
-		  "xor %%g2, %%o4, %%g2\n\t"
-		  "xor %%g3, %%o5, %%g3\n\t"
-		  "xor %%g4, %%l0, %%g4\n\t"
-		  "xor %%g5, %%l1, %%g5\n\t"
-		  "xor %%o0, %%l2, %%o0\n\t"
-		  "xor %%o1, %%l3, %%o1\n\t"
-		  "xor %%o2, %%l4, %%o2\n\t"
-		  "xor %%o3, %%l5, %%o3\n\t"
-		  "std %%g2, [%0 + 0x00]\n\t"
-		  "std %%g4, [%0 + 0x08]\n\t"
-		  "std %%o0, [%0 + 0x10]\n\t"
-		  "std %%o2, [%0 + 0x18]\n"
-		:
-		: "r" (p1), "r" (p2), "r" (p3), "r" (p4), "r" (p5)
-		: "g2", "g3", "g4", "g5",
-		  "o0", "o1", "o2", "o3", "o4", "o5",
-		  "l0", "l1", "l2", "l3", "l4", "l5");
-		p1 += 8;
-		p2 += 8;
-		p3 += 8;
-		p4 += 8;
-		p5 += 8;
-	} while (--lines > 0);
-}
-
-static struct xor_block_template xor_block_SPARC = {
-	.name	= "SPARC",
-	.do_2	= sparc_2,
-	.do_3	= sparc_3,
-	.do_4	= sparc_4,
-	.do_5	= sparc_5,
-};
-
-/* For grins, also test the generic routines.  */
-#include <asm-generic/xor.h>
-
-#define arch_xor_init arch_xor_init
-static __always_inline void __init arch_xor_init(void)
-{
-	xor_register(&xor_block_8regs);
-	xor_register(&xor_block_32regs);
-	xor_register(&xor_block_SPARC);
-}
diff --git a/arch/sparc/include/asm/xor_64.h a/arch/sparc/include/asm/xor_64.h
deleted file mode 100644
--- a/arch/sparc/include/asm/xor_64.h
+++ /dev/null
@@ -1,76 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-or-later */
-/*
- * include/asm/xor.h
- *
- * High speed xor_block operation for RAID4/5 utilizing the
- * UltraSparc Visual Instruction Set and Niagara block-init
- * twin-load instructions.
- *
- * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
- * Copyright (C) 2006 David S. Miller <davem@davemloft.net>
- */
-
-#include <asm/spitfire.h>
-
-void xor_vis_2(unsigned long bytes, unsigned long * __restrict p1,
-	       const unsigned long * __restrict p2);
-void xor_vis_3(unsigned long bytes, unsigned long * __restrict p1,
-	       const unsigned long * __restrict p2,
-	       const unsigned long * __restrict p3);
-void xor_vis_4(unsigned long bytes, unsigned long * __restrict p1,
-	       const unsigned long * __restrict p2,
-	       const unsigned long * __restrict p3,
-	       const unsigned long * __restrict p4);
-void xor_vis_5(unsigned long bytes, unsigned long * __restrict p1,
-	       const unsigned long * __restrict p2,
-	       const unsigned long * __restrict p3,
-	       const unsigned long * __restrict p4,
-	       const unsigned long * __restrict p5);
-
-/* XXX Ugh, write cheetah versions... -DaveM */
-
-static struct xor_block_template xor_block_VIS = {
-        .name	= "VIS",
-        .do_2	= xor_vis_2,
-        .do_3	= xor_vis_3,
-        .do_4	= xor_vis_4,
-        .do_5	= xor_vis_5,
-};
-
-void xor_niagara_2(unsigned long bytes, unsigned long * __restrict p1,
-		   const unsigned long * __restrict p2);
-void xor_niagara_3(unsigned long bytes, unsigned long * __restrict p1,
-		   const unsigned long * __restrict p2,
-		   const unsigned long * __restrict p3);
-void xor_niagara_4(unsigned long bytes, unsigned long * __restrict p1,
-		   const unsigned long * __restrict p2,
-		   const unsigned long * __restrict p3,
-		   const unsigned long * __restrict p4);
-void xor_niagara_5(unsigned long bytes, unsigned long * __restrict p1,
-		   const unsigned long * __restrict p2,
-		   const unsigned long * __restrict p3,
-		   const unsigned long * __restrict p4,
-		   const unsigned long * __restrict p5);
-
-static struct xor_block_template xor_block_niagara = {
-        .name	= "Niagara",
-        .do_2	= xor_niagara_2,
-        .do_3	= xor_niagara_3,
-        .do_4	= xor_niagara_4,
-        .do_5	= xor_niagara_5,
-};
-
-#define arch_xor_init arch_xor_init
-static __always_inline void __init arch_xor_init(void)
-{
-	/* Force VIS for everything except Niagara.  */
-	if (tlb_type == hypervisor &&
-	    (sun4v_chip_type == SUN4V_CHIP_NIAGARA1 ||
-	     sun4v_chip_type == SUN4V_CHIP_NIAGARA2 ||
-	     sun4v_chip_type == SUN4V_CHIP_NIAGARA3 ||
-	     sun4v_chip_type == SUN4V_CHIP_NIAGARA4 ||
-	     sun4v_chip_type == SUN4V_CHIP_NIAGARA5))
-		xor_force(&xor_block_niagara);
-	else
-		xor_force(&xor_block_VIS);
-}
--- a/arch/sparc/include/asm/xor.h~sparc-move-the-xor-code-to-lib-raid
+++ a/arch/sparc/include/asm/xor.h
@@ -1,9 +1,44 @@
 /* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
+ * Copyright (C) 2006 David S. Miller <davem@davemloft.net>
+ */
 #ifndef ___ASM_SPARC_XOR_H
 #define ___ASM_SPARC_XOR_H
+
 #if defined(__sparc__) && defined(__arch64__)
-#include <asm/xor_64.h>
-#else
-#include <asm/xor_32.h>
-#endif
-#endif
+#include <asm/spitfire.h>
+
+extern struct xor_block_template xor_block_VIS;
+extern struct xor_block_template xor_block_niagara;
+
+#define arch_xor_init arch_xor_init
+static __always_inline void __init arch_xor_init(void)
+{
+	/* Force VIS for everything except Niagara.  */
+	if (tlb_type == hypervisor &&
+	    (sun4v_chip_type == SUN4V_CHIP_NIAGARA1 ||
+	     sun4v_chip_type == SUN4V_CHIP_NIAGARA2 ||
+	     sun4v_chip_type == SUN4V_CHIP_NIAGARA3 ||
+	     sun4v_chip_type == SUN4V_CHIP_NIAGARA4 ||
+	     sun4v_chip_type == SUN4V_CHIP_NIAGARA5))
+		xor_force(&xor_block_niagara);
+	else
+		xor_force(&xor_block_VIS);
+}
+#else /* sparc64 */
+
+/* For grins, also test the generic routines.  */
+#include <asm-generic/xor.h>
+
+extern struct xor_block_template xor_block_SPARC;
+
+#define arch_xor_init arch_xor_init
+static __always_inline void __init arch_xor_init(void)
+{
+	xor_register(&xor_block_8regs);
+	xor_register(&xor_block_32regs);
+	xor_register(&xor_block_SPARC);
+}
+#endif /* !sparc64 */
+#endif /* ___ASM_SPARC_XOR_H */
--- a/arch/sparc/lib/Makefile~sparc-move-the-xor-code-to-lib-raid
+++ a/arch/sparc/lib/Makefile
@@ -48,7 +48,7 @@ lib-$(CONFIG_SPARC64) += GENmemcpy.o GEN
 lib-$(CONFIG_SPARC64) += GENpatch.o GENpage.o GENbzero.o
 
 lib-$(CONFIG_SPARC64) += copy_in_user.o memmove.o
-lib-$(CONFIG_SPARC64) += mcount.o ipcsum.o xor.o hweight.o ffs.o
+lib-$(CONFIG_SPARC64) += mcount.o ipcsum.o hweight.o ffs.o
 
 obj-$(CONFIG_SPARC64) += iomap.o
 obj-$(CONFIG_SPARC32) += atomic32.o
diff --git a/arch/sparc/lib/xor.S a/arch/sparc/lib/xor.S
deleted file mode 100644
--- a/arch/sparc/lib/xor.S
+++ /dev/null
@@ -1,646 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * arch/sparc64/lib/xor.S
- *
- * High speed xor_block operation for RAID4/5 utilizing the
- * UltraSparc Visual Instruction Set and Niagara store-init/twin-load.
- *
- * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
- * Copyright (C) 2006 David S. Miller <davem@davemloft.net>
- */
-
-#include <linux/export.h>
-#include <linux/linkage.h>
-#include <asm/visasm.h>
-#include <asm/asi.h>
-#include <asm/dcu.h>
-#include <asm/spitfire.h>
-
-/*
- *	Requirements:
- *	!(((long)dest | (long)sourceN) & (64 - 1)) &&
- *	!(len & 127) && len >= 256
- */
-	.text
-
-	/* VIS versions. */
-ENTRY(xor_vis_2)
-	rd	%fprs, %o5
-	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
-	be,pt	%icc, 0f
-	 sethi	%hi(VISenter), %g1
-	jmpl	%g1 + %lo(VISenter), %g7
-	 add	%g7, 8, %g7
-0:	wr	%g0, FPRS_FEF, %fprs
-	rd	%asi, %g1
-	wr	%g0, ASI_BLK_P, %asi
-	membar	#LoadStore|#StoreLoad|#StoreStore
-	sub	%o0, 128, %o0
-	ldda	[%o1] %asi, %f0
-	ldda	[%o2] %asi, %f16
-
-2:	ldda	[%o1 + 64] %asi, %f32
-	fxor	%f0, %f16, %f16
-	fxor	%f2, %f18, %f18
-	fxor	%f4, %f20, %f20
-	fxor	%f6, %f22, %f22
-	fxor	%f8, %f24, %f24
-	fxor	%f10, %f26, %f26
-	fxor	%f12, %f28, %f28
-	fxor	%f14, %f30, %f30
-	stda	%f16, [%o1] %asi
-	ldda	[%o2 + 64] %asi, %f48
-	ldda	[%o1 + 128] %asi, %f0
-	fxor	%f32, %f48, %f48
-	fxor	%f34, %f50, %f50
-	add	%o1, 128, %o1
-	fxor	%f36, %f52, %f52
-	add	%o2, 128, %o2
-	fxor	%f38, %f54, %f54
-	subcc	%o0, 128, %o0
-	fxor	%f40, %f56, %f56
-	fxor	%f42, %f58, %f58
-	fxor	%f44, %f60, %f60
-	fxor	%f46, %f62, %f62
-	stda	%f48, [%o1 - 64] %asi
-	bne,pt	%xcc, 2b
-	 ldda	[%o2] %asi, %f16
-
-	ldda	[%o1 + 64] %asi, %f32
-	fxor	%f0, %f16, %f16
-	fxor	%f2, %f18, %f18
-	fxor	%f4, %f20, %f20
-	fxor	%f6, %f22, %f22
-	fxor	%f8, %f24, %f24
-	fxor	%f10, %f26, %f26
-	fxor	%f12, %f28, %f28
-	fxor	%f14, %f30, %f30
-	stda	%f16, [%o1] %asi
-	ldda	[%o2 + 64] %asi, %f48
-	membar	#Sync
-	fxor	%f32, %f48, %f48
-	fxor	%f34, %f50, %f50
-	fxor	%f36, %f52, %f52
-	fxor	%f38, %f54, %f54
-	fxor	%f40, %f56, %f56
-	fxor	%f42, %f58, %f58
-	fxor	%f44, %f60, %f60
-	fxor	%f46, %f62, %f62
-	stda	%f48, [%o1 + 64] %asi
-	membar	#Sync|#StoreStore|#StoreLoad
-	wr	%g1, %g0, %asi
-	retl
-	  wr	%g0, 0, %fprs
-ENDPROC(xor_vis_2)
-EXPORT_SYMBOL(xor_vis_2)
-
-ENTRY(xor_vis_3)
-	rd	%fprs, %o5
-	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
-	be,pt	%icc, 0f
-	 sethi	%hi(VISenter), %g1
-	jmpl	%g1 + %lo(VISenter), %g7
-	 add	%g7, 8, %g7
-0:	wr	%g0, FPRS_FEF, %fprs
-	rd	%asi, %g1
-	wr	%g0, ASI_BLK_P, %asi
-	membar	#LoadStore|#StoreLoad|#StoreStore
-	sub	%o0, 64, %o0
-	ldda	[%o1] %asi, %f0
-	ldda	[%o2] %asi, %f16
-
-3:	ldda	[%o3] %asi, %f32
-	fxor	%f0, %f16, %f48
-	fxor	%f2, %f18, %f50
-	add	%o1, 64, %o1
-	fxor	%f4, %f20, %f52
-	fxor	%f6, %f22, %f54
-	add	%o2, 64, %o2
-	fxor	%f8, %f24, %f56
-	fxor	%f10, %f26, %f58
-	fxor	%f12, %f28, %f60
-	fxor	%f14, %f30, %f62
-	ldda	[%o1] %asi, %f0
-	fxor	%f48, %f32, %f48
-	fxor	%f50, %f34, %f50
-	fxor	%f52, %f36, %f52
-	fxor	%f54, %f38, %f54
-	add	%o3, 64, %o3
-	fxor	%f56, %f40, %f56
-	fxor	%f58, %f42, %f58
-	subcc	%o0, 64, %o0
-	fxor	%f60, %f44, %f60
-	fxor	%f62, %f46, %f62
-	stda	%f48, [%o1 - 64] %asi
-	bne,pt	%xcc, 3b
-	 ldda	[%o2] %asi, %f16
-
-	ldda	[%o3] %asi, %f32
-	fxor	%f0, %f16, %f48
-	fxor	%f2, %f18, %f50
-	fxor	%f4, %f20, %f52
-	fxor	%f6, %f22, %f54
-	fxor	%f8, %f24, %f56
-	fxor	%f10, %f26, %f58
-	fxor	%f12, %f28, %f60
-	fxor	%f14, %f30, %f62
-	membar	#Sync
-	fxor	%f48, %f32, %f48
-	fxor	%f50, %f34, %f50
-	fxor	%f52, %f36, %f52
-	fxor	%f54, %f38, %f54
-	fxor	%f56, %f40, %f56
-	fxor	%f58, %f42, %f58
-	fxor	%f60, %f44, %f60
-	fxor	%f62, %f46, %f62
-	stda	%f48, [%o1] %asi
-	membar	#Sync|#StoreStore|#StoreLoad
-	wr	%g1, %g0, %asi
-	retl
-	 wr	%g0, 0, %fprs
-ENDPROC(xor_vis_3)
-EXPORT_SYMBOL(xor_vis_3)
-
-ENTRY(xor_vis_4)
-	rd	%fprs, %o5
-	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
-	be,pt	%icc, 0f
-	 sethi	%hi(VISenter), %g1
-	jmpl	%g1 + %lo(VISenter), %g7
-	 add	%g7, 8, %g7
-0:	wr	%g0, FPRS_FEF, %fprs
-	rd	%asi, %g1
-	wr	%g0, ASI_BLK_P, %asi
-	membar	#LoadStore|#StoreLoad|#StoreStore
-	sub	%o0, 64, %o0
-	ldda	[%o1] %asi, %f0
-	ldda	[%o2] %asi, %f16
-
-4:	ldda	[%o3] %asi, %f32
-	fxor	%f0, %f16, %f16
-	fxor	%f2, %f18, %f18
-	add	%o1, 64, %o1
-	fxor	%f4, %f20, %f20
-	fxor	%f6, %f22, %f22
-	add	%o2, 64, %o2
-	fxor	%f8, %f24, %f24
-	fxor	%f10, %f26, %f26
-	fxor	%f12, %f28, %f28
-	fxor	%f14, %f30, %f30
-	ldda	[%o4] %asi, %f48
-	fxor	%f16, %f32, %f32
-	fxor	%f18, %f34, %f34
-	fxor	%f20, %f36, %f36
-	fxor	%f22, %f38, %f38
-	add	%o3, 64, %o3
-	fxor	%f24, %f40, %f40
-	fxor	%f26, %f42, %f42
-	fxor	%f28, %f44, %f44
-	fxor	%f30, %f46, %f46
-	ldda	[%o1] %asi, %f0
-	fxor	%f32, %f48, %f48
-	fxor	%f34, %f50, %f50
-	fxor	%f36, %f52, %f52
-	add	%o4, 64, %o4
-	fxor	%f38, %f54, %f54
-	fxor	%f40, %f56, %f56
-	fxor	%f42, %f58, %f58
-	subcc	%o0, 64, %o0
-	fxor	%f44, %f60, %f60
-	fxor	%f46, %f62, %f62
-	stda	%f48, [%o1 - 64] %asi
-	bne,pt	%xcc, 4b
-	 ldda	[%o2] %asi, %f16
-
-	ldda	[%o3] %asi, %f32
-	fxor	%f0, %f16, %f16
-	fxor	%f2, %f18, %f18
-	fxor	%f4, %f20, %f20
-	fxor	%f6, %f22, %f22
-	fxor	%f8, %f24, %f24
-	fxor	%f10, %f26, %f26
-	fxor	%f12, %f28, %f28
-	fxor	%f14, %f30, %f30
-	ldda	[%o4] %asi, %f48
-	fxor	%f16, %f32, %f32
-	fxor	%f18, %f34, %f34
-	fxor	%f20, %f36, %f36
-	fxor	%f22, %f38, %f38
-	fxor	%f24, %f40, %f40
-	fxor	%f26, %f42, %f42
-	fxor	%f28, %f44, %f44
-	fxor	%f30, %f46, %f46
-	membar	#Sync
-	fxor	%f32, %f48, %f48
-	fxor	%f34, %f50, %f50
-	fxor	%f36, %f52, %f52
-	fxor	%f38, %f54, %f54
-	fxor	%f40, %f56, %f56
-	fxor	%f42, %f58, %f58
-	fxor	%f44, %f60, %f60
-	fxor	%f46, %f62, %f62
-	stda	%f48, [%o1] %asi
-	membar	#Sync|#StoreStore|#StoreLoad
-	wr	%g1, %g0, %asi
-	retl
-	 wr	%g0, 0, %fprs
-ENDPROC(xor_vis_4)
-EXPORT_SYMBOL(xor_vis_4)
-
-ENTRY(xor_vis_5)
-	save	%sp, -192, %sp
-	rd	%fprs, %o5
-	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
-	be,pt	%icc, 0f
-	 sethi	%hi(VISenter), %g1
-	jmpl	%g1 + %lo(VISenter), %g7
-	 add	%g7, 8, %g7
-0:	wr	%g0, FPRS_FEF, %fprs
-	rd	%asi, %g1
-	wr	%g0, ASI_BLK_P, %asi
-	membar	#LoadStore|#StoreLoad|#StoreStore
-	sub	%i0, 64, %i0
-	ldda	[%i1] %asi, %f0
-	ldda	[%i2] %asi, %f16
-
-5:	ldda	[%i3] %asi, %f32
-	fxor	%f0, %f16, %f48
-	fxor	%f2, %f18, %f50
-	add	%i1, 64, %i1
-	fxor	%f4, %f20, %f52
-	fxor	%f6, %f22, %f54
-	add	%i2, 64, %i2
-	fxor	%f8, %f24, %f56
-	fxor	%f10, %f26, %f58
-	fxor	%f12, %f28, %f60
-	fxor	%f14, %f30, %f62
-	ldda	[%i4] %asi, %f16
-	fxor	%f48, %f32, %f48
-	fxor	%f50, %f34, %f50
-	fxor	%f52, %f36, %f52
-	fxor	%f54, %f38, %f54
-	add	%i3, 64, %i3
-	fxor	%f56, %f40, %f56
-	fxor	%f58, %f42, %f58
-	fxor	%f60, %f44, %f60
-	fxor	%f62, %f46, %f62
-	ldda	[%i5] %asi, %f32
-	fxor	%f48, %f16, %f48
-	fxor	%f50, %f18, %f50
-	add	%i4, 64, %i4
-	fxor	%f52, %f20, %f52
-	fxor	%f54, %f22, %f54
-	add	%i5, 64, %i5
-	fxor	%f56, %f24, %f56
-	fxor	%f58, %f26, %f58
-	fxor	%f60, %f28, %f60
-	fxor	%f62, %f30, %f62
-	ldda	[%i1] %asi, %f0
-	fxor	%f48, %f32, %f48
-	fxor	%f50, %f34, %f50
-	fxor	%f52, %f36, %f52
-	fxor	%f54, %f38, %f54
-	fxor	%f56, %f40, %f56
-	fxor	%f58, %f42, %f58
-	subcc	%i0, 64, %i0
-	fxor	%f60, %f44, %f60
-	fxor	%f62, %f46, %f62
-	stda	%f48, [%i1 - 64] %asi
-	bne,pt	%xcc, 5b
-	 ldda	[%i2] %asi, %f16
-
-	ldda	[%i3] %asi, %f32
-	fxor	%f0, %f16, %f48
-	fxor	%f2, %f18, %f50
-	fxor	%f4, %f20, %f52
-	fxor	%f6, %f22, %f54
-	fxor	%f8, %f24, %f56
-	fxor	%f10, %f26, %f58
-	fxor	%f12, %f28, %f60
-	fxor	%f14, %f30, %f62
-	ldda	[%i4] %asi, %f16
-	fxor	%f48, %f32, %f48
-	fxor	%f50, %f34, %f50
-	fxor	%f52, %f36, %f52
-	fxor	%f54, %f38, %f54
-	fxor	%f56, %f40, %f56
-	fxor	%f58, %f42, %f58
-	fxor	%f60, %f44, %f60
-	fxor	%f62, %f46, %f62
-	ldda	[%i5] %asi, %f32
-	fxor	%f48, %f16, %f48
-	fxor	%f50, %f18, %f50
-	fxor	%f52, %f20, %f52
-	fxor	%f54, %f22, %f54
-	fxor	%f56, %f24, %f56
-	fxor	%f58, %f26, %f58
-	fxor	%f60, %f28, %f60
-	fxor	%f62, %f30, %f62
-	membar	#Sync
-	fxor	%f48, %f32, %f48
-	fxor	%f50, %f34, %f50
-	fxor	%f52, %f36, %f52
-	fxor	%f54, %f38, %f54
-	fxor	%f56, %f40, %f56
-	fxor	%f58, %f42, %f58
-	fxor	%f60, %f44, %f60
-	fxor	%f62, %f46, %f62
-	stda	%f48, [%i1] %asi
-	membar	#Sync|#StoreStore|#StoreLoad
-	wr	%g1, %g0, %asi
-	wr	%g0, 0, %fprs
-	ret
-	 restore
-ENDPROC(xor_vis_5)
-EXPORT_SYMBOL(xor_vis_5)
-
-	/* Niagara versions. */
-ENTRY(xor_niagara_2) /* %o0=bytes, %o1=dest, %o2=src */
-	save		%sp, -192, %sp
-	prefetch	[%i1], #n_writes
-	prefetch	[%i2], #one_read
-	rd		%asi, %g7
-	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
-	srlx		%i0, 6, %g1
-	mov		%i1, %i0
-	mov		%i2, %i1
-1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src  + 0x00 */
-	ldda		[%i1 + 0x10] %asi, %i4	/* %i4/%i5 = src  + 0x10 */
-	ldda		[%i1 + 0x20] %asi, %g2	/* %g2/%g3 = src  + 0x20 */
-	ldda		[%i1 + 0x30] %asi, %l0	/* %l0/%l1 = src  + 0x30 */
-	prefetch	[%i1 + 0x40], #one_read
-	ldda		[%i0 + 0x00] %asi, %o0  /* %o0/%o1 = dest + 0x00 */
-	ldda		[%i0 + 0x10] %asi, %o2  /* %o2/%o3 = dest + 0x10 */
-	ldda		[%i0 + 0x20] %asi, %o4  /* %o4/%o5 = dest + 0x20 */
-	ldda		[%i0 + 0x30] %asi, %l2  /* %l2/%l3 = dest + 0x30 */
-	prefetch	[%i0 + 0x40], #n_writes
-	xor		%o0, %i2, %o0
-	xor		%o1, %i3, %o1
-	stxa		%o0, [%i0 + 0x00] %asi
-	stxa		%o1, [%i0 + 0x08] %asi
-	xor		%o2, %i4, %o2
-	xor		%o3, %i5, %o3
-	stxa		%o2, [%i0 + 0x10] %asi
-	stxa		%o3, [%i0 + 0x18] %asi
-	xor		%o4, %g2, %o4
-	xor		%o5, %g3, %o5
-	stxa		%o4, [%i0 + 0x20] %asi
-	stxa		%o5, [%i0 + 0x28] %asi
-	xor		%l2, %l0, %l2
-	xor		%l3, %l1, %l3
-	stxa		%l2, [%i0 + 0x30] %asi
-	stxa		%l3, [%i0 + 0x38] %asi
-	add		%i0, 0x40, %i0
-	subcc		%g1, 1, %g1
-	bne,pt		%xcc, 1b
-	 add		%i1, 0x40, %i1
-	membar		#Sync
-	wr		%g7, 0x0, %asi
-	ret
-	 restore
-ENDPROC(xor_niagara_2)
-EXPORT_SYMBOL(xor_niagara_2)
-
-ENTRY(xor_niagara_3) /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2 */
-	save		%sp, -192, %sp
-	prefetch	[%i1], #n_writes
-	prefetch	[%i2], #one_read
-	prefetch	[%i3], #one_read
-	rd		%asi, %g7
-	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
-	srlx		%i0, 6, %g1
-	mov		%i1, %i0
-	mov		%i2, %i1
-	mov		%i3, %l7
-1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src1 + 0x00 */
-	ldda		[%i1 + 0x10] %asi, %i4	/* %i4/%i5 = src1 + 0x10 */
-	ldda		[%l7 + 0x00] %asi, %g2	/* %g2/%g3 = src2 + 0x00 */
-	ldda		[%l7 + 0x10] %asi, %l0	/* %l0/%l1 = src2 + 0x10 */
-	ldda		[%i0 + 0x00] %asi, %o0  /* %o0/%o1 = dest + 0x00 */
-	ldda		[%i0 + 0x10] %asi, %o2  /* %o2/%o3 = dest + 0x10 */
-	xor		%g2, %i2, %g2
-	xor		%g3, %i3, %g3
-	xor		%o0, %g2, %o0
-	xor		%o1, %g3, %o1
-	stxa		%o0, [%i0 + 0x00] %asi
-	stxa		%o1, [%i0 + 0x08] %asi
-	ldda		[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
-	ldda		[%l7 + 0x20] %asi, %g2	/* %g2/%g3 = src2 + 0x20 */
-	ldda		[%i0 + 0x20] %asi, %o0	/* %o0/%o1 = dest + 0x20 */
-	xor		%l0, %i4, %l0
-	xor		%l1, %i5, %l1
-	xor		%o2, %l0, %o2
-	xor		%o3, %l1, %o3
-	stxa		%o2, [%i0 + 0x10] %asi
-	stxa		%o3, [%i0 + 0x18] %asi
-	ldda		[%i1 + 0x30] %asi, %i4	/* %i4/%i5 = src1 + 0x30 */
-	ldda		[%l7 + 0x30] %asi, %l0	/* %l0/%l1 = src2 + 0x30 */
-	ldda		[%i0 + 0x30] %asi, %o2	/* %o2/%o3 = dest + 0x30 */
-	prefetch	[%i1 + 0x40], #one_read
-	prefetch	[%l7 + 0x40], #one_read
-	prefetch	[%i0 + 0x40], #n_writes
-	xor		%g2, %i2, %g2
-	xor		%g3, %i3, %g3
-	xor		%o0, %g2, %o0
-	xor		%o1, %g3, %o1
-	stxa		%o0, [%i0 + 0x20] %asi
-	stxa		%o1, [%i0 + 0x28] %asi
-	xor		%l0, %i4, %l0
-	xor		%l1, %i5, %l1
-	xor		%o2, %l0, %o2
-	xor		%o3, %l1, %o3
-	stxa		%o2, [%i0 + 0x30] %asi
-	stxa		%o3, [%i0 + 0x38] %asi
-	add		%i0, 0x40, %i0
-	add		%i1, 0x40, %i1
-	subcc		%g1, 1, %g1
-	bne,pt		%xcc, 1b
-	 add		%l7, 0x40, %l7
-	membar		#Sync
-	wr		%g7, 0x0, %asi
-	ret
-	 restore
-ENDPROC(xor_niagara_3)
-EXPORT_SYMBOL(xor_niagara_3)
-
-ENTRY(xor_niagara_4) /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3 */
-	save		%sp, -192, %sp
-	prefetch	[%i1], #n_writes
-	prefetch	[%i2], #one_read
-	prefetch	[%i3], #one_read
-	prefetch	[%i4], #one_read
-	rd		%asi, %g7
-	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
-	srlx		%i0, 6, %g1
-	mov		%i1, %i0
-	mov		%i2, %i1
-	mov		%i3, %l7
-	mov		%i4, %l6
-1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src1 + 0x00 */
-	ldda		[%l7 + 0x00] %asi, %i4	/* %i4/%i5 = src2 + 0x00 */
-	ldda		[%l6 + 0x00] %asi, %g2	/* %g2/%g3 = src3 + 0x00 */
-	ldda		[%i0 + 0x00] %asi, %l0	/* %l0/%l1 = dest + 0x00 */
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	ldda		[%i1 + 0x10] %asi, %i2	/* %i2/%i3 = src1 + 0x10 */
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	ldda		[%l7 + 0x10] %asi, %i4	/* %i4/%i5 = src2 + 0x10 */
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	stxa		%l0, [%i0 + 0x00] %asi
-	stxa		%l1, [%i0 + 0x08] %asi
-	ldda		[%l6 + 0x10] %asi, %g2	/* %g2/%g3 = src3 + 0x10 */
-	ldda		[%i0 + 0x10] %asi, %l0	/* %l0/%l1 = dest + 0x10 */
-
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	ldda		[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	ldda		[%l7 + 0x20] %asi, %i4	/* %i4/%i5 = src2 + 0x20 */
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	stxa		%l0, [%i0 + 0x10] %asi
-	stxa		%l1, [%i0 + 0x18] %asi
-	ldda		[%l6 + 0x20] %asi, %g2	/* %g2/%g3 = src3 + 0x20 */
-	ldda		[%i0 + 0x20] %asi, %l0	/* %l0/%l1 = dest + 0x20 */
-
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	ldda		[%i1 + 0x30] %asi, %i2	/* %i2/%i3 = src1 + 0x30 */
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	ldda		[%l7 + 0x30] %asi, %i4	/* %i4/%i5 = src2 + 0x30 */
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	stxa		%l0, [%i0 + 0x20] %asi
-	stxa		%l1, [%i0 + 0x28] %asi
-	ldda		[%l6 + 0x30] %asi, %g2	/* %g2/%g3 = src3 + 0x30 */
-	ldda		[%i0 + 0x30] %asi, %l0	/* %l0/%l1 = dest + 0x30 */
-
-	prefetch	[%i1 + 0x40], #one_read
-	prefetch	[%l7 + 0x40], #one_read
-	prefetch	[%l6 + 0x40], #one_read
-	prefetch	[%i0 + 0x40], #n_writes
-
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	stxa		%l0, [%i0 + 0x30] %asi
-	stxa		%l1, [%i0 + 0x38] %asi
-
-	add		%i0, 0x40, %i0
-	add		%i1, 0x40, %i1
-	add		%l7, 0x40, %l7
-	subcc		%g1, 1, %g1
-	bne,pt		%xcc, 1b
-	 add		%l6, 0x40, %l6
-	membar		#Sync
-	wr		%g7, 0x0, %asi
-	ret
-	 restore
-ENDPROC(xor_niagara_4)
-EXPORT_SYMBOL(xor_niagara_4)
-
-ENTRY(xor_niagara_5) /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3, %o5=src4 */
-	save		%sp, -192, %sp
-	prefetch	[%i1], #n_writes
-	prefetch	[%i2], #one_read
-	prefetch	[%i3], #one_read
-	prefetch	[%i4], #one_read
-	prefetch	[%i5], #one_read
-	rd		%asi, %g7
-	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
-	srlx		%i0, 6, %g1
-	mov		%i1, %i0
-	mov		%i2, %i1
-	mov		%i3, %l7
-	mov		%i4, %l6
-	mov		%i5, %l5
-1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src1 + 0x00 */
-	ldda		[%l7 + 0x00] %asi, %i4	/* %i4/%i5 = src2 + 0x00 */
-	ldda		[%l6 + 0x00] %asi, %g2	/* %g2/%g3 = src3 + 0x00 */
-	ldda		[%l5 + 0x00] %asi, %l0	/* %l0/%l1 = src4 + 0x00 */
-	ldda		[%i0 + 0x00] %asi, %l2	/* %l2/%l3 = dest + 0x00 */
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	ldda		[%i1 + 0x10] %asi, %i2	/* %i2/%i3 = src1 + 0x10 */
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	ldda		[%l7 + 0x10] %asi, %i4	/* %i4/%i5 = src2 + 0x10 */
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	ldda		[%l6 + 0x10] %asi, %g2	/* %g2/%g3 = src3 + 0x10 */
-	xor		%l2, %l0, %l2
-	xor		%l3, %l1, %l3
-	stxa		%l2, [%i0 + 0x00] %asi
-	stxa		%l3, [%i0 + 0x08] %asi
-	ldda		[%l5 + 0x10] %asi, %l0	/* %l0/%l1 = src4 + 0x10 */
-	ldda		[%i0 + 0x10] %asi, %l2	/* %l2/%l3 = dest + 0x10 */
-
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	ldda		[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	ldda		[%l7 + 0x20] %asi, %i4	/* %i4/%i5 = src2 + 0x20 */
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	ldda		[%l6 + 0x20] %asi, %g2	/* %g2/%g3 = src3 + 0x20 */
-	xor		%l2, %l0, %l2
-	xor		%l3, %l1, %l3
-	stxa		%l2, [%i0 + 0x10] %asi
-	stxa		%l3, [%i0 + 0x18] %asi
-	ldda		[%l5 + 0x20] %asi, %l0	/* %l0/%l1 = src4 + 0x20 */
-	ldda		[%i0 + 0x20] %asi, %l2	/* %l2/%l3 = dest + 0x20 */
-
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	ldda		[%i1 + 0x30] %asi, %i2	/* %i2/%i3 = src1 + 0x30 */
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	ldda		[%l7 + 0x30] %asi, %i4	/* %i4/%i5 = src2 + 0x30 */
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	ldda		[%l6 + 0x30] %asi, %g2	/* %g2/%g3 = src3 + 0x30 */
-	xor		%l2, %l0, %l2
-	xor		%l3, %l1, %l3
-	stxa		%l2, [%i0 + 0x20] %asi
-	stxa		%l3, [%i0 + 0x28] %asi
-	ldda		[%l5 + 0x30] %asi, %l0	/* %l0/%l1 = src4 + 0x30 */
-	ldda		[%i0 + 0x30] %asi, %l2	/* %l2/%l3 = dest + 0x30 */
-
-	prefetch	[%i1 + 0x40], #one_read
-	prefetch	[%l7 + 0x40], #one_read
-	prefetch	[%l6 + 0x40], #one_read
-	prefetch	[%l5 + 0x40], #one_read
-	prefetch	[%i0 + 0x40], #n_writes
-
-	xor		%i4, %i2, %i4
-	xor		%i5, %i3, %i5
-	xor		%g2, %i4, %g2
-	xor		%g3, %i5, %g3
-	xor		%l0, %g2, %l0
-	xor		%l1, %g3, %l1
-	xor		%l2, %l0, %l2
-	xor		%l3, %l1, %l3
-	stxa		%l2, [%i0 + 0x30] %asi
-	stxa		%l3, [%i0 + 0x38] %asi
-
-	add		%i0, 0x40, %i0
-	add		%i1, 0x40, %i1
-	add		%l7, 0x40, %l7
-	add		%l6, 0x40, %l6
-	subcc		%g1, 1, %g1
-	bne,pt		%xcc, 1b
-	 add		%l5, 0x40, %l5
-	membar		#Sync
-	wr		%g7, 0x0, %asi
-	ret
-	 restore
-ENDPROC(xor_niagara_5)
-EXPORT_SYMBOL(xor_niagara_5)
--- a/lib/raid/xor/Makefile~sparc-move-the-xor-code-to-lib-raid
+++ a/lib/raid/xor/Makefile
@@ -18,6 +18,8 @@ xor-$(CONFIG_CPU_HAS_LSX)	+= loongarch/x
 xor-$(CONFIG_CPU_HAS_LSX)	+= loongarch/xor_simd_glue.o
 xor-$(CONFIG_ALTIVEC)		+= powerpc/xor_vmx.o powerpc/xor_vmx_glue.o
 xor-$(CONFIG_RISCV_ISA_V)	+= riscv/xor.o riscv/xor-glue.o
+xor-$(CONFIG_SPARC32)		+= sparc/xor-sparc32.o
+xor-$(CONFIG_SPARC64)		+= sparc/xor-sparc64.o sparc/xor-sparc64-glue.o
 
 
 CFLAGS_arm/xor-neon.o		+= $(CC_FLAGS_FPU)
diff --git a/lib/raid/xor/sparc/xor-sparc32.c a/lib/raid/xor/sparc/xor-sparc32.c
new file mode 100664
--- /dev/null
+++ a/lib/raid/xor/sparc/xor-sparc32.c
@@ -0,0 +1,253 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * High speed xor_block operation for RAID4/5 utilizing the
+ * ldd/std SPARC instructions.
+ *
+ * Copyright (C) 1999 Jakub Jelinek (jj@ultra.linux.cz)
+ */
+#include <linux/raid/xor_impl.h>
+#include <asm/xor.h>
+
+static void
+sparc_2(unsigned long bytes, unsigned long * __restrict p1,
+	const unsigned long * __restrict p2)
+{
+	int lines = bytes / (sizeof (long)) / 8;
+
+	do {
+		__asm__ __volatile__(
+		  "ldd [%0 + 0x00], %%g2\n\t"
+		  "ldd [%0 + 0x08], %%g4\n\t"
+		  "ldd [%0 + 0x10], %%o0\n\t"
+		  "ldd [%0 + 0x18], %%o2\n\t"
+		  "ldd [%1 + 0x00], %%o4\n\t"
+		  "ldd [%1 + 0x08], %%l0\n\t"
+		  "ldd [%1 + 0x10], %%l2\n\t"
+		  "ldd [%1 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "std %%g2, [%0 + 0x00]\n\t"
+		  "std %%g4, [%0 + 0x08]\n\t"
+		  "std %%o0, [%0 + 0x10]\n\t"
+		  "std %%o2, [%0 + 0x18]\n"
+		:
+		: "r" (p1), "r" (p2)
+		: "g2", "g3", "g4", "g5",
+		  "o0", "o1", "o2", "o3", "o4", "o5",
+		  "l0", "l1", "l2", "l3", "l4", "l5");
+		p1 += 8;
+		p2 += 8;
+	} while (--lines > 0);
+}
+
+static void
+sparc_3(unsigned long bytes, unsigned long * __restrict p1,
+	const unsigned long * __restrict p2,
+	const unsigned long * __restrict p3)
+{
+	int lines = bytes / (sizeof (long)) / 8;
+
+	do {
+		__asm__ __volatile__(
+		  "ldd [%0 + 0x00], %%g2\n\t"
+		  "ldd [%0 + 0x08], %%g4\n\t"
+		  "ldd [%0 + 0x10], %%o0\n\t"
+		  "ldd [%0 + 0x18], %%o2\n\t"
+		  "ldd [%1 + 0x00], %%o4\n\t"
+		  "ldd [%1 + 0x08], %%l0\n\t"
+		  "ldd [%1 + 0x10], %%l2\n\t"
+		  "ldd [%1 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "ldd [%2 + 0x00], %%o4\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "ldd [%2 + 0x08], %%l0\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "ldd [%2 + 0x10], %%l2\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "ldd [%2 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "std %%g2, [%0 + 0x00]\n\t"
+		  "std %%g4, [%0 + 0x08]\n\t"
+		  "std %%o0, [%0 + 0x10]\n\t"
+		  "std %%o2, [%0 + 0x18]\n"
+		:
+		: "r" (p1), "r" (p2), "r" (p3)
+		: "g2", "g3", "g4", "g5",
+		  "o0", "o1", "o2", "o3", "o4", "o5",
+		  "l0", "l1", "l2", "l3", "l4", "l5");
+		p1 += 8;
+		p2 += 8;
+		p3 += 8;
+	} while (--lines > 0);
+}
+
+static void
+sparc_4(unsigned long bytes, unsigned long * __restrict p1,
+	const unsigned long * __restrict p2,
+	const unsigned long * __restrict p3,
+	const unsigned long * __restrict p4)
+{
+	int lines = bytes / (sizeof (long)) / 8;
+
+	do {
+		__asm__ __volatile__(
+		  "ldd [%0 + 0x00], %%g2\n\t"
+		  "ldd [%0 + 0x08], %%g4\n\t"
+		  "ldd [%0 + 0x10], %%o0\n\t"
+		  "ldd [%0 + 0x18], %%o2\n\t"
+		  "ldd [%1 + 0x00], %%o4\n\t"
+		  "ldd [%1 + 0x08], %%l0\n\t"
+		  "ldd [%1 + 0x10], %%l2\n\t"
+		  "ldd [%1 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "ldd [%2 + 0x00], %%o4\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "ldd [%2 + 0x08], %%l0\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "ldd [%2 + 0x10], %%l2\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "ldd [%2 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "ldd [%3 + 0x00], %%o4\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "ldd [%3 + 0x08], %%l0\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "ldd [%3 + 0x10], %%l2\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "ldd [%3 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "std %%g2, [%0 + 0x00]\n\t"
+		  "std %%g4, [%0 + 0x08]\n\t"
+		  "std %%o0, [%0 + 0x10]\n\t"
+		  "std %%o2, [%0 + 0x18]\n"
+		:
+		: "r" (p1), "r" (p2), "r" (p3), "r" (p4)
+		: "g2", "g3", "g4", "g5",
+		  "o0", "o1", "o2", "o3", "o4", "o5",
+		  "l0", "l1", "l2", "l3", "l4", "l5");
+		p1 += 8;
+		p2 += 8;
+		p3 += 8;
+		p4 += 8;
+	} while (--lines > 0);
+}
+
+static void
+sparc_5(unsigned long bytes, unsigned long * __restrict p1,
+	const unsigned long * __restrict p2,
+	const unsigned long * __restrict p3,
+	const unsigned long * __restrict p4,
+	const unsigned long * __restrict p5)
+{
+	int lines = bytes / (sizeof (long)) / 8;
+
+	do {
+		__asm__ __volatile__(
+		  "ldd [%0 + 0x00], %%g2\n\t"
+		  "ldd [%0 + 0x08], %%g4\n\t"
+		  "ldd [%0 + 0x10], %%o0\n\t"
+		  "ldd [%0 + 0x18], %%o2\n\t"
+		  "ldd [%1 + 0x00], %%o4\n\t"
+		  "ldd [%1 + 0x08], %%l0\n\t"
+		  "ldd [%1 + 0x10], %%l2\n\t"
+		  "ldd [%1 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "ldd [%2 + 0x00], %%o4\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "ldd [%2 + 0x08], %%l0\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "ldd [%2 + 0x10], %%l2\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "ldd [%2 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "ldd [%3 + 0x00], %%o4\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "ldd [%3 + 0x08], %%l0\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "ldd [%3 + 0x10], %%l2\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "ldd [%3 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "ldd [%4 + 0x00], %%o4\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "ldd [%4 + 0x08], %%l0\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "ldd [%4 + 0x10], %%l2\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "ldd [%4 + 0x18], %%l4\n\t"
+		  "xor %%g2, %%o4, %%g2\n\t"
+		  "xor %%g3, %%o5, %%g3\n\t"
+		  "xor %%g4, %%l0, %%g4\n\t"
+		  "xor %%g5, %%l1, %%g5\n\t"
+		  "xor %%o0, %%l2, %%o0\n\t"
+		  "xor %%o1, %%l3, %%o1\n\t"
+		  "xor %%o2, %%l4, %%o2\n\t"
+		  "xor %%o3, %%l5, %%o3\n\t"
+		  "std %%g2, [%0 + 0x00]\n\t"
+		  "std %%g4, [%0 + 0x08]\n\t"
+		  "std %%o0, [%0 + 0x10]\n\t"
+		  "std %%o2, [%0 + 0x18]\n"
+		:
+		: "r" (p1), "r" (p2), "r" (p3), "r" (p4), "r" (p5)
+		: "g2", "g3", "g4", "g5",
+		  "o0", "o1", "o2", "o3", "o4", "o5",
+		  "l0", "l1", "l2", "l3", "l4", "l5");
+		p1 += 8;
+		p2 += 8;
+		p3 += 8;
+		p4 += 8;
+		p5 += 8;
+	} while (--lines > 0);
+}
+
+struct xor_block_template xor_block_SPARC = {
+	.name	= "SPARC",
+	.do_2	= sparc_2,
+	.do_3	= sparc_3,
+	.do_4	= sparc_4,
+	.do_5	= sparc_5,
+};
diff --git a/lib/raid/xor/sparc/xor-sparc64-glue.c a/lib/raid/xor/sparc/xor-sparc64-glue.c
new file mode 100664
--- /dev/null
+++ a/lib/raid/xor/sparc/xor-sparc64-glue.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * High speed xor_block operation for RAID4/5 utilizing the
+ * UltraSparc Visual Instruction Set and Niagara block-init
+ * twin-load instructions.
+ *
+ * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
+ * Copyright (C) 2006 David S. Miller <davem@davemloft.net>
+ */
+
+#include <linux/raid/xor_impl.h>
+#include <asm/xor.h>
+
+void xor_vis_2(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2);
+void xor_vis_3(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2,
+	       const unsigned long * __restrict p3);
+void xor_vis_4(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2,
+	       const unsigned long * __restrict p3,
+	       const unsigned long * __restrict p4);
+void xor_vis_5(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2,
+	       const unsigned long * __restrict p3,
+	       const unsigned long * __restrict p4,
+	       const unsigned long * __restrict p5);
+
+/* XXX Ugh, write cheetah versions... -DaveM */
+
+struct xor_block_template xor_block_VIS = {
+        .name	= "VIS",
+        .do_2	= xor_vis_2,
+        .do_3	= xor_vis_3,
+        .do_4	= xor_vis_4,
+        .do_5	= xor_vis_5,
+};
+
+void xor_niagara_2(unsigned long bytes, unsigned long * __restrict p1,
+		   const unsigned long * __restrict p2);
+void xor_niagara_3(unsigned long bytes, unsigned long * __restrict p1,
+		   const unsigned long * __restrict p2,
+		   const unsigned long * __restrict p3);
+void xor_niagara_4(unsigned long bytes, unsigned long * __restrict p1,
+		   const unsigned long * __restrict p2,
+		   const unsigned long * __restrict p3,
+		   const unsigned long * __restrict p4);
+void xor_niagara_5(unsigned long bytes, unsigned long * __restrict p1,
+		   const unsigned long * __restrict p2,
+		   const unsigned long * __restrict p3,
+		   const unsigned long * __restrict p4,
+		   const unsigned long * __restrict p5);
+
+struct xor_block_template xor_block_niagara = {
+        .name	= "Niagara",
+        .do_2	= xor_niagara_2,
+        .do_3	= xor_niagara_3,
+        .do_4	= xor_niagara_4,
+        .do_5	= xor_niagara_5,
+};
diff --git a/lib/raid/xor/sparc/xor-sparc64.S a/lib/raid/xor/sparc/xor-sparc64.S
new file mode 100664
--- /dev/null
+++ a/lib/raid/xor/sparc/xor-sparc64.S
@@ -0,0 +1,636 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * High speed xor_block operation for RAID4/5 utilizing the
+ * UltraSparc Visual Instruction Set and Niagara store-init/twin-load.
+ *
+ * Copyright (C) 1997, 1999 Jakub Jelinek (jj@ultra.linux.cz)
+ * Copyright (C) 2006 David S. Miller <davem@davemloft.net>
+ */
+
+#include <linux/export.h>
+#include <linux/linkage.h>
+#include <asm/visasm.h>
+#include <asm/asi.h>
+#include <asm/dcu.h>
+#include <asm/spitfire.h>
+
+/*
+ *	Requirements:
+ *	!(((long)dest | (long)sourceN) & (64 - 1)) &&
+ *	!(len & 127) && len >= 256
+ */
+	.text
+
+	/* VIS versions. */
+ENTRY(xor_vis_2)
+	rd	%fprs, %o5
+	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
+	be,pt	%icc, 0f
+	 sethi	%hi(VISenter), %g1
+	jmpl	%g1 + %lo(VISenter), %g7
+	 add	%g7, 8, %g7
+0:	wr	%g0, FPRS_FEF, %fprs
+	rd	%asi, %g1
+	wr	%g0, ASI_BLK_P, %asi
+	membar	#LoadStore|#StoreLoad|#StoreStore
+	sub	%o0, 128, %o0
+	ldda	[%o1] %asi, %f0
+	ldda	[%o2] %asi, %f16
+
+2:	ldda	[%o1 + 64] %asi, %f32
+	fxor	%f0, %f16, %f16
+	fxor	%f2, %f18, %f18
+	fxor	%f4, %f20, %f20
+	fxor	%f6, %f22, %f22
+	fxor	%f8, %f24, %f24
+	fxor	%f10, %f26, %f26
+	fxor	%f12, %f28, %f28
+	fxor	%f14, %f30, %f30
+	stda	%f16, [%o1] %asi
+	ldda	[%o2 + 64] %asi, %f48
+	ldda	[%o1 + 128] %asi, %f0
+	fxor	%f32, %f48, %f48
+	fxor	%f34, %f50, %f50
+	add	%o1, 128, %o1
+	fxor	%f36, %f52, %f52
+	add	%o2, 128, %o2
+	fxor	%f38, %f54, %f54
+	subcc	%o0, 128, %o0
+	fxor	%f40, %f56, %f56
+	fxor	%f42, %f58, %f58
+	fxor	%f44, %f60, %f60
+	fxor	%f46, %f62, %f62
+	stda	%f48, [%o1 - 64] %asi
+	bne,pt	%xcc, 2b
+	 ldda	[%o2] %asi, %f16
+
+	ldda	[%o1 + 64] %asi, %f32
+	fxor	%f0, %f16, %f16
+	fxor	%f2, %f18, %f18
+	fxor	%f4, %f20, %f20
+	fxor	%f6, %f22, %f22
+	fxor	%f8, %f24, %f24
+	fxor	%f10, %f26, %f26
+	fxor	%f12, %f28, %f28
+	fxor	%f14, %f30, %f30
+	stda	%f16, [%o1] %asi
+	ldda	[%o2 + 64] %asi, %f48
+	membar	#Sync
+	fxor	%f32, %f48, %f48
+	fxor	%f34, %f50, %f50
+	fxor	%f36, %f52, %f52
+	fxor	%f38, %f54, %f54
+	fxor	%f40, %f56, %f56
+	fxor	%f42, %f58, %f58
+	fxor	%f44, %f60, %f60
+	fxor	%f46, %f62, %f62
+	stda	%f48, [%o1 + 64] %asi
+	membar	#Sync|#StoreStore|#StoreLoad
+	wr	%g1, %g0, %asi
+	retl
+	  wr	%g0, 0, %fprs
+ENDPROC(xor_vis_2)
+
+ENTRY(xor_vis_3)
+	rd	%fprs, %o5
+	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
+	be,pt	%icc, 0f
+	 sethi	%hi(VISenter), %g1
+	jmpl	%g1 + %lo(VISenter), %g7
+	 add	%g7, 8, %g7
+0:	wr	%g0, FPRS_FEF, %fprs
+	rd	%asi, %g1
+	wr	%g0, ASI_BLK_P, %asi
+	membar	#LoadStore|#StoreLoad|#StoreStore
+	sub	%o0, 64, %o0
+	ldda	[%o1] %asi, %f0
+	ldda	[%o2] %asi, %f16
+
+3:	ldda	[%o3] %asi, %f32
+	fxor	%f0, %f16, %f48
+	fxor	%f2, %f18, %f50
+	add	%o1, 64, %o1
+	fxor	%f4, %f20, %f52
+	fxor	%f6, %f22, %f54
+	add	%o2, 64, %o2
+	fxor	%f8, %f24, %f56
+	fxor	%f10, %f26, %f58
+	fxor	%f12, %f28, %f60
+	fxor	%f14, %f30, %f62
+	ldda	[%o1] %asi, %f0
+	fxor	%f48, %f32, %f48
+	fxor	%f50, %f34, %f50
+	fxor	%f52, %f36, %f52
+	fxor	%f54, %f38, %f54
+	add	%o3, 64, %o3
+	fxor	%f56, %f40, %f56
+	fxor	%f58, %f42, %f58
+	subcc	%o0, 64, %o0
+	fxor	%f60, %f44, %f60
+	fxor	%f62, %f46, %f62
+	stda	%f48, [%o1 - 64] %asi
+	bne,pt	%xcc, 3b
+	 ldda	[%o2] %asi, %f16
+
+	ldda	[%o3] %asi, %f32
+	fxor	%f0, %f16, %f48
+	fxor	%f2, %f18, %f50
+	fxor	%f4, %f20, %f52
+	fxor	%f6, %f22, %f54
+	fxor	%f8, %f24, %f56
+	fxor	%f10, %f26, %f58
+	fxor	%f12, %f28, %f60
+	fxor	%f14, %f30, %f62
+	membar	#Sync
+	fxor	%f48, %f32, %f48
+	fxor	%f50, %f34, %f50
+	fxor	%f52, %f36, %f52
+	fxor	%f54, %f38, %f54
+	fxor	%f56, %f40, %f56
+	fxor	%f58, %f42, %f58
+	fxor	%f60, %f44, %f60
+	fxor	%f62, %f46, %f62
+	stda	%f48, [%o1] %asi
+	membar	#Sync|#StoreStore|#StoreLoad
+	wr	%g1, %g0, %asi
+	retl
+	 wr	%g0, 0, %fprs
+ENDPROC(xor_vis_3)
+
+ENTRY(xor_vis_4)
+	rd	%fprs, %o5
+	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
+	be,pt	%icc, 0f
+	 sethi	%hi(VISenter), %g1
+	jmpl	%g1 + %lo(VISenter), %g7
+	 add	%g7, 8, %g7
+0:	wr	%g0, FPRS_FEF, %fprs
+	rd	%asi, %g1
+	wr	%g0, ASI_BLK_P, %asi
+	membar	#LoadStore|#StoreLoad|#StoreStore
+	sub	%o0, 64, %o0
+	ldda	[%o1] %asi, %f0
+	ldda	[%o2] %asi, %f16
+
+4:	ldda	[%o3] %asi, %f32
+	fxor	%f0, %f16, %f16
+	fxor	%f2, %f18, %f18
+	add	%o1, 64, %o1
+	fxor	%f4, %f20, %f20
+	fxor	%f6, %f22, %f22
+	add	%o2, 64, %o2
+	fxor	%f8, %f24, %f24
+	fxor	%f10, %f26, %f26
+	fxor	%f12, %f28, %f28
+	fxor	%f14, %f30, %f30
+	ldda	[%o4] %asi, %f48
+	fxor	%f16, %f32, %f32
+	fxor	%f18, %f34, %f34
+	fxor	%f20, %f36, %f36
+	fxor	%f22, %f38, %f38
+	add	%o3, 64, %o3
+	fxor	%f24, %f40, %f40
+	fxor	%f26, %f42, %f42
+	fxor	%f28, %f44, %f44
+	fxor	%f30, %f46, %f46
+	ldda	[%o1] %asi, %f0
+	fxor	%f32, %f48, %f48
+	fxor	%f34, %f50, %f50
+	fxor	%f36, %f52, %f52
+	add	%o4, 64, %o4
+	fxor	%f38, %f54, %f54
+	fxor	%f40, %f56, %f56
+	fxor	%f42, %f58, %f58
+	subcc	%o0, 64, %o0
+	fxor	%f44, %f60, %f60
+	fxor	%f46, %f62, %f62
+	stda	%f48, [%o1 - 64] %asi
+	bne,pt	%xcc, 4b
+	 ldda	[%o2] %asi, %f16
+
+	ldda	[%o3] %asi, %f32
+	fxor	%f0, %f16, %f16
+	fxor	%f2, %f18, %f18
+	fxor	%f4, %f20, %f20
+	fxor	%f6, %f22, %f22
+	fxor	%f8, %f24, %f24
+	fxor	%f10, %f26, %f26
+	fxor	%f12, %f28, %f28
+	fxor	%f14, %f30, %f30
+	ldda	[%o4] %asi, %f48
+	fxor	%f16, %f32, %f32
+	fxor	%f18, %f34, %f34
+	fxor	%f20, %f36, %f36
+	fxor	%f22, %f38, %f38
+	fxor	%f24, %f40, %f40
+	fxor	%f26, %f42, %f42
+	fxor	%f28, %f44, %f44
+	fxor	%f30, %f46, %f46
+	membar	#Sync
+	fxor	%f32, %f48, %f48
+	fxor	%f34, %f50, %f50
+	fxor	%f36, %f52, %f52
+	fxor	%f38, %f54, %f54
+	fxor	%f40, %f56, %f56
+	fxor	%f42, %f58, %f58
+	fxor	%f44, %f60, %f60
+	fxor	%f46, %f62, %f62
+	stda	%f48, [%o1] %asi
+	membar	#Sync|#StoreStore|#StoreLoad
+	wr	%g1, %g0, %asi
+	retl
+	 wr	%g0, 0, %fprs
+ENDPROC(xor_vis_4)
+
+ENTRY(xor_vis_5)
+	save	%sp, -192, %sp
+	rd	%fprs, %o5
+	andcc	%o5, FPRS_FEF|FPRS_DU, %g0
+	be,pt	%icc, 0f
+	 sethi	%hi(VISenter), %g1
+	jmpl	%g1 + %lo(VISenter), %g7
+	 add	%g7, 8, %g7
+0:	wr	%g0, FPRS_FEF, %fprs
+	rd	%asi, %g1
+	wr	%g0, ASI_BLK_P, %asi
+	membar	#LoadStore|#StoreLoad|#StoreStore
+	sub	%i0, 64, %i0
+	ldda	[%i1] %asi, %f0
+	ldda	[%i2] %asi, %f16
+
+5:	ldda	[%i3] %asi, %f32
+	fxor	%f0, %f16, %f48
+	fxor	%f2, %f18, %f50
+	add	%i1, 64, %i1
+	fxor	%f4, %f20, %f52
+	fxor	%f6, %f22, %f54
+	add	%i2, 64, %i2
+	fxor	%f8, %f24, %f56
+	fxor	%f10, %f26, %f58
+	fxor	%f12, %f28, %f60
+	fxor	%f14, %f30, %f62
+	ldda	[%i4] %asi, %f16
+	fxor	%f48, %f32, %f48
+	fxor	%f50, %f34, %f50
+	fxor	%f52, %f36, %f52
+	fxor	%f54, %f38, %f54
+	add	%i3, 64, %i3
+	fxor	%f56, %f40, %f56
+	fxor	%f58, %f42, %f58
+	fxor	%f60, %f44, %f60
+	fxor	%f62, %f46, %f62
+	ldda	[%i5] %asi, %f32
+	fxor	%f48, %f16, %f48
+	fxor	%f50, %f18, %f50
+	add	%i4, 64, %i4
+	fxor	%f52, %f20, %f52
+	fxor	%f54, %f22, %f54
+	add	%i5, 64, %i5
+	fxor	%f56, %f24, %f56
+	fxor	%f58, %f26, %f58
+	fxor	%f60, %f28, %f60
+	fxor	%f62, %f30, %f62
+	ldda	[%i1] %asi, %f0
+	fxor	%f48, %f32, %f48
+	fxor	%f50, %f34, %f50
+	fxor	%f52, %f36, %f52
+	fxor	%f54, %f38, %f54
+	fxor	%f56, %f40, %f56
+	fxor	%f58, %f42, %f58
+	subcc	%i0, 64, %i0
+	fxor	%f60, %f44, %f60
+	fxor	%f62, %f46, %f62
+	stda	%f48, [%i1 - 64] %asi
+	bne,pt	%xcc, 5b
+	 ldda	[%i2] %asi, %f16
+
+	ldda	[%i3] %asi, %f32
+	fxor	%f0, %f16, %f48
+	fxor	%f2, %f18, %f50
+	fxor	%f4, %f20, %f52
+	fxor	%f6, %f22, %f54
+	fxor	%f8, %f24, %f56
+	fxor	%f10, %f26, %f58
+	fxor	%f12, %f28, %f60
+	fxor	%f14, %f30, %f62
+	ldda	[%i4] %asi, %f16
+	fxor	%f48, %f32, %f48
+	fxor	%f50, %f34, %f50
+	fxor	%f52, %f36, %f52
+	fxor	%f54, %f38, %f54
+	fxor	%f56, %f40, %f56
+	fxor	%f58, %f42, %f58
+	fxor	%f60, %f44, %f60
+	fxor	%f62, %f46, %f62
+	ldda	[%i5] %asi, %f32
+	fxor	%f48, %f16, %f48
+	fxor	%f50, %f18, %f50
+	fxor	%f52, %f20, %f52
+	fxor	%f54, %f22, %f54
+	fxor	%f56, %f24, %f56
+	fxor	%f58, %f26, %f58
+	fxor	%f60, %f28, %f60
+	fxor	%f62, %f30, %f62
+	membar	#Sync
+	fxor	%f48, %f32, %f48
+	fxor	%f50, %f34, %f50
+	fxor	%f52, %f36, %f52
+	fxor	%f54, %f38, %f54
+	fxor	%f56, %f40, %f56
+	fxor	%f58, %f42, %f58
+	fxor	%f60, %f44, %f60
+	fxor	%f62, %f46, %f62
+	stda	%f48, [%i1] %asi
+	membar	#Sync|#StoreStore|#StoreLoad
+	wr	%g1, %g0, %asi
+	wr	%g0, 0, %fprs
+	ret
+	 restore
+ENDPROC(xor_vis_5)
+
+	/* Niagara versions. */
+ENTRY(xor_niagara_2) /* %o0=bytes, %o1=dest, %o2=src */
+	save		%sp, -192, %sp
+	prefetch	[%i1], #n_writes
+	prefetch	[%i2], #one_read
+	rd		%asi, %g7
+	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
+	srlx		%i0, 6, %g1
+	mov		%i1, %i0
+	mov		%i2, %i1
+1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src  + 0x00 */
+	ldda		[%i1 + 0x10] %asi, %i4	/* %i4/%i5 = src  + 0x10 */
+	ldda		[%i1 + 0x20] %asi, %g2	/* %g2/%g3 = src  + 0x20 */
+	ldda		[%i1 + 0x30] %asi, %l0	/* %l0/%l1 = src  + 0x30 */
+	prefetch	[%i1 + 0x40], #one_read
+	ldda		[%i0 + 0x00] %asi, %o0  /* %o0/%o1 = dest + 0x00 */
+	ldda		[%i0 + 0x10] %asi, %o2  /* %o2/%o3 = dest + 0x10 */
+	ldda		[%i0 + 0x20] %asi, %o4  /* %o4/%o5 = dest + 0x20 */
+	ldda		[%i0 + 0x30] %asi, %l2  /* %l2/%l3 = dest + 0x30 */
+	prefetch	[%i0 + 0x40], #n_writes
+	xor		%o0, %i2, %o0
+	xor		%o1, %i3, %o1
+	stxa		%o0, [%i0 + 0x00] %asi
+	stxa		%o1, [%i0 + 0x08] %asi
+	xor		%o2, %i4, %o2
+	xor		%o3, %i5, %o3
+	stxa		%o2, [%i0 + 0x10] %asi
+	stxa		%o3, [%i0 + 0x18] %asi
+	xor		%o4, %g2, %o4
+	xor		%o5, %g3, %o5
+	stxa		%o4, [%i0 + 0x20] %asi
+	stxa		%o5, [%i0 + 0x28] %asi
+	xor		%l2, %l0, %l2
+	xor		%l3, %l1, %l3
+	stxa		%l2, [%i0 + 0x30] %asi
+	stxa		%l3, [%i0 + 0x38] %asi
+	add		%i0, 0x40, %i0
+	subcc		%g1, 1, %g1
+	bne,pt		%xcc, 1b
+	 add		%i1, 0x40, %i1
+	membar		#Sync
+	wr		%g7, 0x0, %asi
+	ret
+	 restore
+ENDPROC(xor_niagara_2)
+
+ENTRY(xor_niagara_3) /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2 */
+	save		%sp, -192, %sp
+	prefetch	[%i1], #n_writes
+	prefetch	[%i2], #one_read
+	prefetch	[%i3], #one_read
+	rd		%asi, %g7
+	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
+	srlx		%i0, 6, %g1
+	mov		%i1, %i0
+	mov		%i2, %i1
+	mov		%i3, %l7
+1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src1 + 0x00 */
+	ldda		[%i1 + 0x10] %asi, %i4	/* %i4/%i5 = src1 + 0x10 */
+	ldda		[%l7 + 0x00] %asi, %g2	/* %g2/%g3 = src2 + 0x00 */
+	ldda		[%l7 + 0x10] %asi, %l0	/* %l0/%l1 = src2 + 0x10 */
+	ldda		[%i0 + 0x00] %asi, %o0  /* %o0/%o1 = dest + 0x00 */
+	ldda		[%i0 + 0x10] %asi, %o2  /* %o2/%o3 = dest + 0x10 */
+	xor		%g2, %i2, %g2
+	xor		%g3, %i3, %g3
+	xor		%o0, %g2, %o0
+	xor		%o1, %g3, %o1
+	stxa		%o0, [%i0 + 0x00] %asi
+	stxa		%o1, [%i0 + 0x08] %asi
+	ldda		[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
+	ldda		[%l7 + 0x20] %asi, %g2	/* %g2/%g3 = src2 + 0x20 */
+	ldda		[%i0 + 0x20] %asi, %o0	/* %o0/%o1 = dest + 0x20 */
+	xor		%l0, %i4, %l0
+	xor		%l1, %i5, %l1
+	xor		%o2, %l0, %o2
+	xor		%o3, %l1, %o3
+	stxa		%o2, [%i0 + 0x10] %asi
+	stxa		%o3, [%i0 + 0x18] %asi
+	ldda		[%i1 + 0x30] %asi, %i4	/* %i4/%i5 = src1 + 0x30 */
+	ldda		[%l7 + 0x30] %asi, %l0	/* %l0/%l1 = src2 + 0x30 */
+	ldda		[%i0 + 0x30] %asi, %o2	/* %o2/%o3 = dest + 0x30 */
+	prefetch	[%i1 + 0x40], #one_read
+	prefetch	[%l7 + 0x40], #one_read
+	prefetch	[%i0 + 0x40], #n_writes
+	xor		%g2, %i2, %g2
+	xor		%g3, %i3, %g3
+	xor		%o0, %g2, %o0
+	xor		%o1, %g3, %o1
+	stxa		%o0, [%i0 + 0x20] %asi
+	stxa		%o1, [%i0 + 0x28] %asi
+	xor		%l0, %i4, %l0
+	xor		%l1, %i5, %l1
+	xor		%o2, %l0, %o2
+	xor		%o3, %l1, %o3
+	stxa		%o2, [%i0 + 0x30] %asi
+	stxa		%o3, [%i0 + 0x38] %asi
+	add		%i0, 0x40, %i0
+	add		%i1, 0x40, %i1
+	subcc		%g1, 1, %g1
+	bne,pt		%xcc, 1b
+	 add		%l7, 0x40, %l7
+	membar		#Sync
+	wr		%g7, 0x0, %asi
+	ret
+	 restore
+ENDPROC(xor_niagara_3)
+
+ENTRY(xor_niagara_4) /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3 */
+	save		%sp, -192, %sp
+	prefetch	[%i1], #n_writes
+	prefetch	[%i2], #one_read
+	prefetch	[%i3], #one_read
+	prefetch	[%i4], #one_read
+	rd		%asi, %g7
+	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
+	srlx		%i0, 6, %g1
+	mov		%i1, %i0
+	mov		%i2, %i1
+	mov		%i3, %l7
+	mov		%i4, %l6
+1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src1 + 0x00 */
+	ldda		[%l7 + 0x00] %asi, %i4	/* %i4/%i5 = src2 + 0x00 */
+	ldda		[%l6 + 0x00] %asi, %g2	/* %g2/%g3 = src3 + 0x00 */
+	ldda		[%i0 + 0x00] %asi, %l0	/* %l0/%l1 = dest + 0x00 */
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	ldda		[%i1 + 0x10] %asi, %i2	/* %i2/%i3 = src1 + 0x10 */
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	ldda		[%l7 + 0x10] %asi, %i4	/* %i4/%i5 = src2 + 0x10 */
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	stxa		%l0, [%i0 + 0x00] %asi
+	stxa		%l1, [%i0 + 0x08] %asi
+	ldda		[%l6 + 0x10] %asi, %g2	/* %g2/%g3 = src3 + 0x10 */
+	ldda		[%i0 + 0x10] %asi, %l0	/* %l0/%l1 = dest + 0x10 */
+
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	ldda		[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	ldda		[%l7 + 0x20] %asi, %i4	/* %i4/%i5 = src2 + 0x20 */
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	stxa		%l0, [%i0 + 0x10] %asi
+	stxa		%l1, [%i0 + 0x18] %asi
+	ldda		[%l6 + 0x20] %asi, %g2	/* %g2/%g3 = src3 + 0x20 */
+	ldda		[%i0 + 0x20] %asi, %l0	/* %l0/%l1 = dest + 0x20 */
+
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	ldda		[%i1 + 0x30] %asi, %i2	/* %i2/%i3 = src1 + 0x30 */
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	ldda		[%l7 + 0x30] %asi, %i4	/* %i4/%i5 = src2 + 0x30 */
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	stxa		%l0, [%i0 + 0x20] %asi
+	stxa		%l1, [%i0 + 0x28] %asi
+	ldda		[%l6 + 0x30] %asi, %g2	/* %g2/%g3 = src3 + 0x30 */
+	ldda		[%i0 + 0x30] %asi, %l0	/* %l0/%l1 = dest + 0x30 */
+
+	prefetch	[%i1 + 0x40], #one_read
+	prefetch	[%l7 + 0x40], #one_read
+	prefetch	[%l6 + 0x40], #one_read
+	prefetch	[%i0 + 0x40], #n_writes
+
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	stxa		%l0, [%i0 + 0x30] %asi
+	stxa		%l1, [%i0 + 0x38] %asi
+
+	add		%i0, 0x40, %i0
+	add		%i1, 0x40, %i1
+	add		%l7, 0x40, %l7
+	subcc		%g1, 1, %g1
+	bne,pt		%xcc, 1b
+	 add		%l6, 0x40, %l6
+	membar		#Sync
+	wr		%g7, 0x0, %asi
+	ret
+	 restore
+ENDPROC(xor_niagara_4)
+
+ENTRY(xor_niagara_5) /* %o0=bytes, %o1=dest, %o2=src1, %o3=src2, %o4=src3, %o5=src4 */
+	save		%sp, -192, %sp
+	prefetch	[%i1], #n_writes
+	prefetch	[%i2], #one_read
+	prefetch	[%i3], #one_read
+	prefetch	[%i4], #one_read
+	prefetch	[%i5], #one_read
+	rd		%asi, %g7
+	wr		%g0, ASI_BLK_INIT_QUAD_LDD_P, %asi
+	srlx		%i0, 6, %g1
+	mov		%i1, %i0
+	mov		%i2, %i1
+	mov		%i3, %l7
+	mov		%i4, %l6
+	mov		%i5, %l5
+1:	ldda		[%i1 + 0x00] %asi, %i2	/* %i2/%i3 = src1 + 0x00 */
+	ldda		[%l7 + 0x00] %asi, %i4	/* %i4/%i5 = src2 + 0x00 */
+	ldda		[%l6 + 0x00] %asi, %g2	/* %g2/%g3 = src3 + 0x00 */
+	ldda		[%l5 + 0x00] %asi, %l0	/* %l0/%l1 = src4 + 0x00 */
+	ldda		[%i0 + 0x00] %asi, %l2	/* %l2/%l3 = dest + 0x00 */
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	ldda		[%i1 + 0x10] %asi, %i2	/* %i2/%i3 = src1 + 0x10 */
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	ldda		[%l7 + 0x10] %asi, %i4	/* %i4/%i5 = src2 + 0x10 */
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	ldda		[%l6 + 0x10] %asi, %g2	/* %g2/%g3 = src3 + 0x10 */
+	xor		%l2, %l0, %l2
+	xor		%l3, %l1, %l3
+	stxa		%l2, [%i0 + 0x00] %asi
+	stxa		%l3, [%i0 + 0x08] %asi
+	ldda		[%l5 + 0x10] %asi, %l0	/* %l0/%l1 = src4 + 0x10 */
+	ldda		[%i0 + 0x10] %asi, %l2	/* %l2/%l3 = dest + 0x10 */
+
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	ldda		[%i1 + 0x20] %asi, %i2	/* %i2/%i3 = src1 + 0x20 */
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	ldda		[%l7 + 0x20] %asi, %i4	/* %i4/%i5 = src2 + 0x20 */
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	ldda		[%l6 + 0x20] %asi, %g2	/* %g2/%g3 = src3 + 0x20 */
+	xor		%l2, %l0, %l2
+	xor		%l3, %l1, %l3
+	stxa		%l2, [%i0 + 0x10] %asi
+	stxa		%l3, [%i0 + 0x18] %asi
+	ldda		[%l5 + 0x20] %asi, %l0	/* %l0/%l1 = src4 + 0x20 */
+	ldda		[%i0 + 0x20] %asi, %l2	/* %l2/%l3 = dest + 0x20 */
+
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	ldda		[%i1 + 0x30] %asi, %i2	/* %i2/%i3 = src1 + 0x30 */
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	ldda		[%l7 + 0x30] %asi, %i4	/* %i4/%i5 = src2 + 0x30 */
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	ldda		[%l6 + 0x30] %asi, %g2	/* %g2/%g3 = src3 + 0x30 */
+	xor		%l2, %l0, %l2
+	xor		%l3, %l1, %l3
+	stxa		%l2, [%i0 + 0x20] %asi
+	stxa		%l3, [%i0 + 0x28] %asi
+	ldda		[%l5 + 0x30] %asi, %l0	/* %l0/%l1 = src4 + 0x30 */
+	ldda		[%i0 + 0x30] %asi, %l2	/* %l2/%l3 = dest + 0x30 */
+
+	prefetch	[%i1 + 0x40], #one_read
+	prefetch	[%l7 + 0x40], #one_read
+	prefetch	[%l6 + 0x40], #one_read
+	prefetch	[%l5 + 0x40], #one_read
+	prefetch	[%i0 + 0x40], #n_writes
+
+	xor		%i4, %i2, %i4
+	xor		%i5, %i3, %i5
+	xor		%g2, %i4, %g2
+	xor		%g3, %i5, %g3
+	xor		%l0, %g2, %l0
+	xor		%l1, %g3, %l1
+	xor		%l2, %l0, %l2
+	xor		%l3, %l1, %l3
+	stxa		%l2, [%i0 + 0x30] %asi
+	stxa		%l3, [%i0 + 0x38] %asi
+
+	add		%i0, 0x40, %i0
+	add		%i1, 0x40, %i1
+	add		%l7, 0x40, %l7
+	add		%l6, 0x40, %l6
+	subcc		%g1, 1, %g1
+	bne,pt		%xcc, 1b
+	 add		%l5, 0x40, %l5
+	membar		#Sync
+	wr		%g7, 0x0, %asi
+	ret
+	 restore
+ENDPROC(xor_niagara_5)
_

Patches currently in -mm which might be from hch@lst.de are

                 reply	other threads:[~2026-04-03  6:41 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260403064158.AB8A1C4CEF7@smtp.kernel.org \
    --to=akpm@linux-foundation.org \
    --cc=agordeev@linux.ibm.com \
    --cc=alex@ghiti.fr \
    --cc=andreas@gaisler.com \
    --cc=anton.ivanov@cambridgegreys.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=borntraeger@linux.ibm.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=chenhuacai@kernel.org \
    --cc=clm@fb.com \
    --cc=dan.j.williams@intel.com \
    --cc=davem@davemloft.net \
    --cc=dsterba@suse.com \
    --cc=ebiggers@kernel.org \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=herbert@gondor.apana.org.au \
    --cc=hpa@zytor.com \
    --cc=jason@zx2c4.com \
    --cc=johannes@sipsolutions.net \
    --cc=kernel@xen0n.name \
    --cc=linan122@huawei.com \
    --cc=linmag7@gmail.com \
    --cc=linux@armlinux.org.uk \
    --cc=maddy@linux.ibm.com \
    --cc=mattst88@gmail.com \
    --cc=mingo@redhat.com \
    --cc=mm-commits@vger.kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=palmer@dabbelt.com \
    --cc=richard.henderson@linaro.org \
    --cc=richard@nod.at \
    --cc=song@kernel.org \
    --cc=svens@linux.ibm.com \
    --cc=tytso@mit.edu \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox