[PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
@ 2024-09-24 12:31 Zhang Boyang
  2024-09-24 12:31 ` [PATCH 1/5] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-24 12:31 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Randy Dunlap

Hello,

I have made several bug fixes and improvements to reed-solomon library.

There seems to be no maintainers for reed-solomon library, so I have to send this patch series directly to Linus Torvalds.

p.s. Revival of https://lore.kernel.org/all/20220806162510.157196-1-zhangboyang.id@gmail.com/

Best Regards,
Zhang Boyang

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] rslib: Fix incorrect documentation of rs_modnn()
  2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
@ 2024-09-24 12:31 ` Zhang Boyang
  2024-09-24 12:31 ` [PATCH 2/5] rslib: Fix documentation of alpha_to[] and index_of[] Zhang Boyang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-24 12:31 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Randy Dunlap,
	Zhang Boyang

Previous documentation of rs_modnn() states simple arithmetic modulo
return a wrong result for values >= (3 * rs->nn). However, that is not
true. The rs_modnn() does the exactly same job as (x % rs->nn). This can
be proved from following loop invariants:

  while (x >= rs->nn) {
    x -= rs->nn; // (1)
    x = (x >> rs->mm) + (x & rs->nn); // (2)
  }

Let x0 denote the value of x before assignment. At (1), it is obvious
that x % nn == x0 % nn. At (2), because nn == ((1 << mm) - 1), we have

  x0 % nn == x0 % nn
  x0 % nn == (((x0 >> mm) << mm) + (x0 & nn)) % nn
  x0 % nn == ((x0 >> mm) * (nn + 1) + (x0 & nn)) % nn
  x0 % nn == ((x0 >> mm) * ((nn + 1) % nn) + (x0 & nn)) % nn
  x0 % nn == ((x0 >> mm) * 1 + (x0 & nn)) % nn   // let's assume nn > 1
  x0 % nn == ((x0 >> mm) + (x0 & nn)) % nn
  x0 % nn == x % nn

When the loop exits, it is obvious that 0 <= x < nn, so the return value
must equal to (x % rs->nn).

This patch also fixes the kernel-doc style of rs_modnn().

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index a04dacbdc8ae..f76e0fc097a4 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -106,7 +106,8 @@ struct rs_control *init_rs_non_canonical(int symsize, int (*func)(int),
 /* Release a rs control structure */
 void free_rs(struct rs_control *rs);
 
-/** modulo replacement for galois field arithmetics
+/**
+ * rs_modnn() - Modulo replacement for galois field arithmetics
  *
  *  @rs:	Pointer to the RS codec
  *  @x:		the value to reduce
@@ -115,8 +116,7 @@ void free_rs(struct rs_control *rs);
  *  rs->mm = number of bits per symbol
  *  rs->nn = (2^rs->mm) - 1
  *
- *  Simple arithmetic modulo would return a wrong result for values
- *  >= 3 * rs->nn
+ *  Calculate (x % rs->nn), without using a div instruction
 */
 static inline int rs_modnn(struct rs_codec *rs, int x)
 {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] rslib: Fix documentation of alpha_to[] and index_of[]
  2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
  2024-09-24 12:31 ` [PATCH 1/5] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
@ 2024-09-24 12:31 ` Zhang Boyang
  2024-09-24 12:31 ` [PATCH 3/5] rslib: Fix wrong result if gffunc(0) != 1 Zhang Boyang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-24 12:31 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Randy Dunlap,
	Zhang Boyang

Obviously, alpha_to[] is antilog table, and index_of[] is logarithm
table.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index f76e0fc097a4..908bf7d0eb58 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -18,8 +18,8 @@
  *
  * @mm:		Bits per symbol
  * @nn:		Symbols per block (= (1<<mm)-1)
- * @alpha_to:	log lookup table
- * @index_of:	Antilog lookup table
+ * @alpha_to:	Antilog lookup table
+ * @index_of:	log lookup table
  * @genpoly:	Generator polynomial
  * @nroots:	Number of generator roots = number of parity symbols
  * @fcr:	First consecutive root, index form
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] rslib: Fix wrong result if gffunc(0) != 1
  2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
  2024-09-24 12:31 ` [PATCH 1/5] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
  2024-09-24 12:31 ` [PATCH 2/5] rslib: Fix documentation of alpha_to[] and index_of[] Zhang Boyang
@ 2024-09-24 12:31 ` Zhang Boyang
  2024-09-24 12:31 ` [PATCH 4/5] rslib: Improve the performance of encode_rs.c Zhang Boyang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-24 12:31 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Randy Dunlap,
	Zhang Boyang

The rslib allows customizing the finite field by the `gffunc' parameter
of init_rs_non_canonical(). However, there are several places in rslib
use hard-coded 1, leading to errors if gffunc(0) != 1. This patch
replaces hard-coded 1 with alpha_to[0] to fix this problem. One of such
`gffunc' might be gffunc'(x) = swab16(gffunc(swab16(x))), as
gffunc'(0) = swab16(1). This special gffunc'(x) is useful when
implementing RS coder for 16 bit foreign-endian symbols.

Fixes: d7e5a5462f68 ("[RSLIB] Support non-canonical GF representations")
Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 lib/reed_solomon/decode_rs.c    | 4 ++--
 lib/reed_solomon/reed_solomon.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c
index 805de84ae83d..6c1d53d1b702 100644
--- a/lib/reed_solomon/decode_rs.c
+++ b/lib/reed_solomon/decode_rs.c
@@ -104,7 +104,7 @@
 
  decode:
 	memset(&lambda[1], 0, nroots * sizeof(lambda[0]));
-	lambda[0] = 1;
+	lambda[0] = alpha_to[0];
 
 	if (no_eras > 0) {
 		/* Init lambda to be the erasure locator polynomial */
@@ -198,7 +198,7 @@
 	memcpy(&reg[1], &lambda[1], nroots * sizeof(reg[0]));
 	count = 0;		/* Number of roots of lambda(x) */
 	for (i = 1, k = iprim - 1; i <= nn; i++, k = rs_modnn(rs, k + iprim)) {
-		q = 1;		/* lambda[0] is always 0 */
+		q = alpha_to[0];	/* lambda[0] is always 0 */
 		for (j = deg_lambda; j > 0; j--) {
 			if (reg[j] != nn) {
 				reg[j] = rs_modnn(rs, reg[j] + j);
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index bbc01bad3053..bb4f44c8edba 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -131,9 +131,9 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 	rs->iprim = iprim / prim;
 
 	/* Form RS code generator polynomial from its roots */
-	rs->genpoly[0] = 1;
+	rs->genpoly[0] = rs->alpha_to[0];
 	for (i = 0, root = fcr * prim; i < nroots; i++, root += prim) {
-		rs->genpoly[i + 1] = 1;
+		rs->genpoly[i + 1] = rs->alpha_to[0];
 		/* Multiply rs->genpoly[] by  @**(root + x) */
 		for (j = i; j > 0; j--) {
 			if (rs->genpoly[j] != 0) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] rslib: Improve the performance of encode_rs.c
  2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
                   ` (2 preceding siblings ...)
  2024-09-24 12:31 ` [PATCH 3/5] rslib: Fix wrong result if gffunc(0) != 1 Zhang Boyang
@ 2024-09-24 12:31 ` Zhang Boyang
  2024-09-24 12:31 ` [PATCH 5/5] rslib: Fix 16-bit symbol support Zhang Boyang
  2024-09-26 17:03 ` [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Randy Dunlap
  5 siblings, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-24 12:31 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Randy Dunlap,
	Zhang Boyang

This patch enhances the performance of RS encoder by following points:

1) Avoid memmove(). The shifting operation done by memmove() can be
   merged into the calculation loop above.

2) Introduce rs_modnn_fast(). The original rs_modnn() contains a loop
   which may be slow. Since (fb + genpoly[...]) is always strictly less
   than (2 * rs->nn), we can use a ternary operator to do the same
   calculation. The new faster function is named rs_modnn_fast(). The
   new rs_modnn_fast(x) requires 0 <= x < 2*nn, in contrast, original
   rs_modnn(x) only requires x >= 0. To make things clear, the
   documentation of original rs_modnn() is also updated.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h        | 15 ++++++++++++++-
 lib/reed_solomon/encode_rs.c | 21 ++++++++++-----------
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index 908bf7d0eb58..d228ece01069 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -110,7 +110,7 @@ void free_rs(struct rs_control *rs);
  * rs_modnn() - Modulo replacement for galois field arithmetics
  *
  *  @rs:	Pointer to the RS codec
- *  @x:		the value to reduce
+ *  @x:		the value to reduce (requires x >= 0)
  *
  *  where
  *  rs->mm = number of bits per symbol
@@ -127,4 +127,17 @@ static inline int rs_modnn(struct rs_codec *rs, int x)
 	return x;
 }
 
+/**
+ * rs_modnn_fast() - Modulo replacement for galois field arithmetics
+ *
+ *  @rs:	Pointer to the RS codec
+ *  @x:		the value to reduce (requires 0 <= x < 2*nn)
+ *
+ *  Same as rs_modnn(x), but faster, at the cost of limited value range of @x
+*/
+static inline int rs_modnn_fast(struct rs_codec *rs, int x)
+{
+	return x - rs->nn < 0 ? x : x - rs->nn;
+}
+
 #endif
diff --git a/lib/reed_solomon/encode_rs.c b/lib/reed_solomon/encode_rs.c
index 9112d46e869e..6e3847b17ad4 100644
--- a/lib/reed_solomon/encode_rs.c
+++ b/lib/reed_solomon/encode_rs.c
@@ -27,19 +27,18 @@
 
 	for (i = 0; i < len; i++) {
 		fb = index_of[((((uint16_t) data[i])^invmsk) & msk) ^ par[0]];
-		/* feedback term is non-zero */
 		if (fb != nn) {
-			for (j = 1; j < nroots; j++) {
-				par[j] ^= alpha_to[rs_modnn(rs, fb +
-							 genpoly[nroots - j])];
-			}
-		}
-		/* Shift */
-		memmove(&par[0], &par[1], sizeof(uint16_t) * (nroots - 1));
-		if (fb != nn) {
-			par[nroots - 1] = alpha_to[rs_modnn(rs,
-							    fb + genpoly[0])];
+			/* feedback term is non-zero */
+			for (j = 1; j < nroots; j++)
+				par[j - 1] = par[j] ^ alpha_to[rs_modnn_fast(rs,
+						      fb +
+						      genpoly[nroots - j])];
+			par[nroots - 1] = alpha_to[rs_modnn_fast(rs,
+					  fb +
+					  genpoly[0])];
 		} else {
+			for (j = 1; j < nroots; j++)
+				par[j - 1] = par[j];
 			par[nroots - 1] = 0;
 		}
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] rslib: Fix 16-bit symbol support
  2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
                   ` (3 preceding siblings ...)
  2024-09-24 12:31 ` [PATCH 4/5] rslib: Improve the performance of encode_rs.c Zhang Boyang
@ 2024-09-24 12:31 ` Zhang Boyang
  2024-09-26 17:03 ` [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Randy Dunlap
  5 siblings, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-24 12:31 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Randy Dunlap,
	Zhang Boyang

Current rslib support symsize up to 16, so the max value of rs->nn can
be up to 0xFFFF. Since fcr <= nn, prim <= nn, multiplications on them
might overflow and lead to wrong results, e.g. fcr*root[j], fcr*prim.

This patch fixes these overflows by introducing rs_modnn_mul(a, b). This
function is same as rs_modnn(a*b) but it avoids overflow when
calculating a*b. It requires 0 <= a <= nn && 0 <= b <= nn. As it use
uint32_t to do the multiplication internally, there will be no overflow
as long as 0 <= a <= nn <= 0xFFFF && 0 <= b <= nn <= 0xFFFF. In fact, if
we use `unsigned int' everywhere, there is no need to introduce
rs_modnn_mul(). But the `unsigned int' approach has poor scalability and
it may bring us to the mess of signed and unsigned integers.

With rs_modnn(), the intermediate result is now restricted to [0, nn).
This enables us to use rs_modnn_fast(a+b) to replace rs_modnn(a+b), as
long as 0 <= a+b < 2*nn. The most common case is one addend in [0, nn]
and the other addend in [0, nn). The examples of values in [0, nn] are
fcr, prim, indexes taken from rs->index_of[0...nn], etc. The examples of
values in [0, nn) are results from rs_modnn(), indexes taken from
rs->index_of[1...nn], etc.

Since the roots of RS generator polynomial, i.e. (fcr+i)*prim%nn, is
often used. It's now precomputed into rs->genroot[], to avoid writing
rs_modnn_mul(rs, rs_modnn_fast(rs, fcr + i), prim) everywhere.

The algorithm of searching for rs->iprim is also changed. Instead of
searching for (1+what*nn)%prim == 0, then iprim = (1+what*nn)/prim, it
now searches for iprim*prim%nn == 1 directly.

A new test case is also added to test_rslib.c to ensure correctness.

Signed-off-by: Zhang Boyang <zhangboyang.id@gmail.com>
---
 include/linux/rslib.h           | 23 +++++++++++++
 lib/reed_solomon/decode_rs.c    | 60 +++++++++++++++++++--------------
 lib/reed_solomon/reed_solomon.c | 28 +++++++++++----
 lib/reed_solomon/test_rslib.c   |  8 ++---
 4 files changed, 82 insertions(+), 37 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index d228ece01069..4fb27d5bdb55 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -21,6 +21,7 @@
  * @alpha_to:	Antilog lookup table
  * @index_of:	log lookup table
  * @genpoly:	Generator polynomial
+ * @genroot:	Roots of generator polynomial, index form
  * @nroots:	Number of generator roots = number of parity symbols
  * @fcr:	First consecutive root, index form
  * @prim:	Primitive element, index form
@@ -36,6 +37,7 @@ struct rs_codec {
 	uint16_t	*alpha_to;
 	uint16_t	*index_of;
 	uint16_t	*genpoly;
+	uint16_t	*genroot;
 	int		nroots;
 	int		fcr;
 	int		prim;
@@ -127,6 +129,27 @@ static inline int rs_modnn(struct rs_codec *rs, int x)
 	return x;
 }
 
+/**
+ * rs_modnn_mul() - Modulo replacement for galois field arithmetics
+ *
+ *  @rs:	Pointer to the RS codec
+ *  @a:		a*b is the value to reduce (requires 0 <= a <= nn <= 0xFFFF)
+ *  @b:		a*b is the value to reduce (requires 0 <= b <= nn <= 0xFFFF)
+ *
+ *  Same as rs_modnn(a*b), but avoid integer overflow when calculating a*b
+*/
+static inline int rs_modnn_mul(struct rs_codec *rs, int a, int b)
+{
+	/* nn <= 0xFFFF, so (a * b) will not overflow uint32_t */
+	uint32_t x = (uint32_t)a * (uint32_t)b;
+	uint32_t nn = (uint32_t)rs->nn;
+	while (x >= nn) {
+		x -= nn;
+		x = (x >> rs->mm) + (x & nn);
+	}
+	return (int)x;
+}
+
 /**
  * rs_modnn_fast() - Modulo replacement for galois field arithmetics
  *
diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c
index 6c1d53d1b702..3387465ab429 100644
--- a/lib/reed_solomon/decode_rs.c
+++ b/lib/reed_solomon/decode_rs.c
@@ -20,6 +20,7 @@
 	int iprim = rs->iprim;
 	uint16_t *alpha_to = rs->alpha_to;
 	uint16_t *index_of = rs->index_of;
+	uint16_t *genroot = rs->genroot;
 	uint16_t u, q, tmp, num1, num2, den, discr_r, syn_error;
 	int count = 0;
 	int num_corrected;
@@ -69,8 +70,8 @@
 			} else {
 				syn[i] = ((((uint16_t) data[j]) ^
 					   invmsk) & msk) ^
-					alpha_to[rs_modnn(rs, index_of[syn[i]] +
-						       (fcr + i) * prim)];
+					alpha_to[rs_modnn_fast(rs,
+						index_of[syn[i]] + genroot[i])];
 			}
 		}
 	}
@@ -81,8 +82,8 @@
 				syn[i] = ((uint16_t) par[j]) & msk;
 			} else {
 				syn[i] = (((uint16_t) par[j]) & msk) ^
-					alpha_to[rs_modnn(rs, index_of[syn[i]] +
-						       (fcr+i)*prim)];
+					alpha_to[rs_modnn_fast(rs,
+						index_of[syn[i]] + genroot[i])];
 			}
 		}
 	}
@@ -108,15 +109,17 @@
 
 	if (no_eras > 0) {
 		/* Init lambda to be the erasure locator polynomial */
-		lambda[1] = alpha_to[rs_modnn(rs,
-					prim * (nn - 1 - (eras_pos[0] + pad)))];
+		lambda[1] = alpha_to[rs_modnn_mul(rs,
+					 prim, (nn - 1 - (eras_pos[0] + pad)))];
 		for (i = 1; i < no_eras; i++) {
-			u = rs_modnn(rs, prim * (nn - 1 - (eras_pos[i] + pad)));
+			u = rs_modnn_mul(rs,
+					 prim, (nn - 1 - (eras_pos[i] + pad)));
 			for (j = i + 1; j > 0; j--) {
 				tmp = index_of[lambda[j - 1]];
 				if (tmp != nn) {
 					lambda[j] ^=
-						alpha_to[rs_modnn(rs, u + tmp)];
+						alpha_to[rs_modnn_fast(rs,
+							 u + tmp)];
 				}
 			}
 		}
@@ -137,9 +140,9 @@
 		for (i = 0; i < r; i++) {
 			if ((lambda[i] != 0) && (s[r - i - 1] != nn)) {
 				discr_r ^=
-					alpha_to[rs_modnn(rs,
-							  index_of[lambda[i]] +
-							  s[r - i - 1])];
+					alpha_to[rs_modnn_fast(rs,
+						 index_of[lambda[i]] +
+						 s[r - i - 1])];
 			}
 		}
 		discr_r = index_of[discr_r];	/* Index form */
@@ -153,8 +156,8 @@
 			for (i = 0; i < nroots; i++) {
 				if (b[i] != nn) {
 					t[i + 1] = lambda[i + 1] ^
-						alpha_to[rs_modnn(rs, discr_r +
-								  b[i])];
+						alpha_to[rs_modnn_fast(rs,
+							 discr_r + b[i])];
 				} else
 					t[i + 1] = lambda[i + 1];
 			}
@@ -166,8 +169,9 @@
 				 */
 				for (i = 0; i <= nroots; i++) {
 					b[i] = (lambda[i] == 0) ? nn :
-						rs_modnn(rs, index_of[lambda[i]]
-							 - discr_r + nn);
+						rs_modnn_fast(rs,
+						        index_of[lambda[i]] +
+							nn - discr_r);
 				}
 			} else {
 				/* 2 lines below: B(x) <-- x*B(x) */
@@ -197,11 +201,11 @@
 	/* Find roots of error+erasure locator polynomial by Chien search */
 	memcpy(&reg[1], &lambda[1], nroots * sizeof(reg[0]));
 	count = 0;		/* Number of roots of lambda(x) */
-	for (i = 1, k = iprim - 1; i <= nn; i++, k = rs_modnn(rs, k + iprim)) {
+	for (i = 1, k = iprim-1; i <= nn; i++, k = rs_modnn_fast(rs, k+iprim)) {
 		q = alpha_to[0];	/* lambda[0] is always 0 */
 		for (j = deg_lambda; j > 0; j--) {
 			if (reg[j] != nn) {
-				reg[j] = rs_modnn(rs, reg[j] + j);
+				reg[j] = rs_modnn_fast(rs, reg[j] + j);
 				q ^= alpha_to[reg[j]];
 			}
 		}
@@ -238,8 +242,8 @@
 		tmp = 0;
 		for (j = i; j >= 0; j--) {
 			if ((s[i - j] != nn) && (lambda[j] != nn))
-				tmp ^=
-				    alpha_to[rs_modnn(rs, s[i - j] + lambda[j])];
+				tmp ^= alpha_to[rs_modnn_fast(rs,
+						s[i - j] + lambda[j])];
 		}
 		omega[i] = index_of[tmp];
 	}
@@ -254,8 +258,9 @@
 		num1 = 0;
 		for (i = deg_omega; i >= 0; i--) {
 			if (omega[i] != nn)
-				num1 ^= alpha_to[rs_modnn(rs, omega[i] +
-							i * root[j])];
+				num1 ^= alpha_to[rs_modnn_fast(rs,
+						 omega[i] +
+						 rs_modnn_mul(rs, i, root[j]))];
 		}
 
 		if (num1 == 0) {
@@ -264,15 +269,18 @@
 			continue;
 		}
 
-		num2 = alpha_to[rs_modnn(rs, root[j] * (fcr - 1) + nn)];
+		num2 = alpha_to[rs_modnn_fast(rs,
+				rs_modnn_mul(rs, root[j], fcr) +
+				nn - root[j])];
 		den = 0;
 
 		/* lambda[i+1] for i even is the formal derivative
 		 * lambda_pr of lambda[i] */
 		for (i = min(deg_lambda, nroots - 1) & ~1; i >= 0; i -= 2) {
 			if (lambda[i + 1] != nn) {
-				den ^= alpha_to[rs_modnn(rs, lambda[i + 1] +
-						       i * root[j])];
+				den ^= alpha_to[rs_modnn_fast(rs,
+						lambda[i + 1] +
+						rs_modnn_mul(rs, i, root[j]))];
 			}
 		}
 
@@ -292,8 +300,8 @@
 			if (b[j] == 0)
 				continue;
 
-			k = (fcr + i) * prim * (nn-loc[j]-1);
-			tmp ^= alpha_to[rs_modnn(rs, index_of[b[j]] + k)];
+			k = rs_modnn_mul(rs, genroot[i], nn - loc[j] - 1);
+			tmp ^= alpha_to[rs_modnn_fast(rs, index_of[b[j]] + k)];
 		}
 
 		if (tmp != alpha_to[s[i]])
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index bb4f44c8edba..b924eeb98685 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -56,7 +56,7 @@ static DEFINE_MUTEX(rslistlock);
 
 /**
  * codec_init - Initialize a Reed-Solomon codec
- * @symsize:	symbol size, bits (1-8)
+ * @symsize:	symbol size, bits (1-16)
  * @gfpoly:	Field generator polynomial coefficients
  * @gffunc:	Field generator function
  * @fcr:	first root of RS code generator polynomial, index form
@@ -100,6 +100,10 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 	if(rs->genpoly == NULL)
 		goto err;
 
+	rs->genroot = kmalloc_array(rs->nroots, sizeof(uint16_t), gfp);
+	if(rs->genroot == NULL)
+		goto err;
+
 	/* Generate Galois field lookup tables */
 	rs->index_of[0] = rs->nn;	/* log(zero) = -inf */
 	rs->alpha_to[rs->nn] = 0;	/* alpha**-inf = 0 */
@@ -126,26 +130,34 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 		goto err;
 
 	/* Find prim-th root of 1, used in decoding */
-	for(iprim = 1; (iprim % prim) != 0; iprim += rs->nn);
+	for (iprim = 1; rs_modnn_mul(rs, iprim, prim) != 1; iprim++);
 	/* prim-th root of 1, index form */
-	rs->iprim = iprim / prim;
+	rs->iprim = iprim;
+
+	/* Precompute generator polynomial roots */
+	root = rs_modnn_mul(rs, fcr, prim);
+	for (i = 0; i < nroots; i++) {
+		rs->genroot[i] = root; /*  = (fcr + i) * prim % nn  */
+		root = rs_modnn_fast(rs, root + prim);
+	}
 
 	/* Form RS code generator polynomial from its roots */
 	rs->genpoly[0] = rs->alpha_to[0];
-	for (i = 0, root = fcr * prim; i < nroots; i++, root += prim) {
+	for (i = 0; i < nroots; i++) {
+		root = rs->genroot[i];
 		rs->genpoly[i + 1] = rs->alpha_to[0];
 		/* Multiply rs->genpoly[] by  @**(root + x) */
 		for (j = i; j > 0; j--) {
 			if (rs->genpoly[j] != 0) {
-				rs->genpoly[j] = rs->genpoly[j -1] ^
-					rs->alpha_to[rs_modnn(rs,
+				rs->genpoly[j] = rs->genpoly[j - 1] ^
+					rs->alpha_to[rs_modnn_fast(rs,
 					rs->index_of[rs->genpoly[j]] + root)];
 			} else
 				rs->genpoly[j] = rs->genpoly[j - 1];
 		}
 		/* rs->genpoly[0] can never be zero */
 		rs->genpoly[0] =
-			rs->alpha_to[rs_modnn(rs,
+			rs->alpha_to[rs_modnn_fast(rs,
 				rs->index_of[rs->genpoly[0]] + root)];
 	}
 	/* convert rs->genpoly[] to index form for quicker encoding */
@@ -157,6 +169,7 @@ static struct rs_codec *codec_init(int symsize, int gfpoly, int (*gffunc)(int),
 	return rs;
 
 err:
+	kfree(rs->genroot);
 	kfree(rs->genpoly);
 	kfree(rs->index_of);
 	kfree(rs->alpha_to);
@@ -188,6 +201,7 @@ void free_rs(struct rs_control *rs)
 		kfree(cd->alpha_to);
 		kfree(cd->index_of);
 		kfree(cd->genpoly);
+		kfree(cd->genroot);
 		kfree(cd);
 	}
 	mutex_unlock(&rslistlock);
diff --git a/lib/reed_solomon/test_rslib.c b/lib/reed_solomon/test_rslib.c
index 75cb1adac884..d19c95bcd31d 100644
--- a/lib/reed_solomon/test_rslib.c
+++ b/lib/reed_solomon/test_rslib.c
@@ -55,6 +55,7 @@ static struct etab Tab[] = {
 	{8,	0x11d,	1,	1,	30,	100	},
 	{8,	0x187,	112,	11,	32,	100	},
 	{9,	0x211,	1,	1,	33,	80	},
+	{16,  0x1ffed,	65534,	65534,	50,	5	},
 	{0, 0, 0, 0, 0, 0},
 };
 
@@ -232,9 +233,8 @@ static void compute_syndrome(struct rs_control *rsc, uint16_t *data,
 	struct rs_codec *rs = rsc->codec;
 	uint16_t *alpha_to = rs->alpha_to;
 	uint16_t *index_of = rs->index_of;
+	uint16_t *genroot = rs->genroot;
 	int nroots = rs->nroots;
-	int prim = rs->prim;
-	int fcr = rs->fcr;
 	int i, j;
 
 	/* Calculating syndrome */
@@ -245,8 +245,8 @@ static void compute_syndrome(struct rs_control *rsc, uint16_t *data,
 				syn[i] = data[j];
 			} else {
 				syn[i] = data[j] ^
-					alpha_to[rs_modnn(rs, index_of[syn[i]]
-						+ (fcr + i) * prim)];
+					alpha_to[rs_modnn_fast(rs,
+						index_of[syn[i]] + genroot[i])];
 			}
 		}
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
  2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
                   ` (4 preceding siblings ...)
  2024-09-24 12:31 ` [PATCH 5/5] rslib: Fix 16-bit symbol support Zhang Boyang
@ 2024-09-26 17:03 ` Randy Dunlap
  2024-09-26 17:12   ` Linus Torvalds
  2024-09-26 23:53   ` Zhang Boyang
  5 siblings, 2 replies; 12+ messages in thread
From: Randy Dunlap @ 2024-09-26 17:03 UTC (permalink / raw)
  To: Zhang Boyang, Linus Torvalds, linux-kernel
  Cc: Thomas Gleixner, Ferdinand Blomqvist, Kees Cook, Andrew Morton



On 9/24/24 5:31 AM, Zhang Boyang wrote:
> Hello,
> 
> I have made several bug fixes and improvements to reed-solomon library.
> 
> There seems to be no maintainers for reed-solomon library, so I have to send this patch series directly to Linus Torvalds.

Hi,

If I were making this patch series, I would send it to Andrew Morton,
but if Linus takes it, I'm certainly OK with that.

Thanks.

> p.s. Revival of https://lore.kernel.org/all/20220806162510.157196-1-zhangboyang.id@gmail.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
  2024-09-26 17:03 ` [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Randy Dunlap
@ 2024-09-26 17:12   ` Linus Torvalds
  2024-09-26 23:54     ` Zhang Boyang
  2024-09-28 21:11     ` Kees Cook
  2024-09-26 23:53   ` Zhang Boyang
  1 sibling, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2024-09-26 17:12 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Zhang Boyang, linux-kernel, Thomas Gleixner, Ferdinand Blomqvist,
	Kees Cook, Andrew Morton

On Thu, 26 Sept 2024 at 10:03, Randy Dunlap <rdunlap@infradead.org> wrote:
>
>
> If I were making this patch series, I would send it to Andrew Morton,
> but if Linus takes it, I'm certainly OK with that.

I don't feel like I have the expertise of something like rslib, and
would actually suggest it go through one of the (very very few) users.

It's just pstore and some NAND chip drivers, afaik.

                 Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
  2024-09-26 17:03 ` [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Randy Dunlap
  2024-09-26 17:12   ` Linus Torvalds
@ 2024-09-26 23:53   ` Zhang Boyang
  1 sibling, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-26 23:53 UTC (permalink / raw)
  To: Randy Dunlap, Linus Torvalds, linux-kernel



On 2024/9/27 01:03, Randy Dunlap wrote:
> 
> 
> On 9/24/24 5:31 AM, Zhang Boyang wrote:
>> Hello,
>>
>> I have made several bug fixes and improvements to reed-solomon library.
>>
>> There seems to be no maintainers for reed-solomon library, so I have to send this patch series directly to Linus Torvalds.
> 
> Hi,
> 
> If I were making this patch series, I would send it to Andrew Morton,
> but if Linus takes it, I'm certainly OK with that.
> 

Oh, yes, I found it would be probably a good idea to let it go through 
Andrew after reading kernel docs carefully. (I decided the patch 
destination according to MAINTAINERS file and I'd like to say sorry to 
Linus)

Zhang Boyang

> Thanks.
> 
>> p.s. Revival of https://lore.kernel.org/all/20220806162510.157196-1-zhangboyang.id@gmail.com/


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
  2024-09-26 17:12   ` Linus Torvalds
@ 2024-09-26 23:54     ` Zhang Boyang
  2024-09-28 21:11     ` Kees Cook
  1 sibling, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-26 23:54 UTC (permalink / raw)
  To: Linus Torvalds, Randy Dunlap, Linux Kernel Mailing List

On 2024/9/27 01:12, Linus Torvalds wrote:
> On Thu, 26 Sept 2024 at 10:03, Randy Dunlap <rdunlap@infradead.org> wrote:
>>
>>
>> If I were making this patch series, I would send it to Andrew Morton,
>> but if Linus takes it, I'm certainly OK with that.
> 
> I don't feel like I have the expertise of something like rslib, and
> would actually suggest it go through one of the (very very few) users.
> 
> It's just pstore and some NAND chip drivers, afaik.
> 

I will try to send my patches to these rslib users. However, rslib users 
and rslib itself are different worlds (although rslib users can give it 
a test), and many of them are not under active development for years. I 
will try to send to Andrew as a last resort, and I will probably invite 
somebody from, for example, GNU Radio, to give my patches a review.

Thanks, and I'm sorry to disturb you.

Zhang Boyang

>                   Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
  2024-09-26 17:12   ` Linus Torvalds
  2024-09-26 23:54     ` Zhang Boyang
@ 2024-09-28 21:11     ` Kees Cook
  2024-09-29  3:22       ` Zhang Boyang
  1 sibling, 1 reply; 12+ messages in thread
From: Kees Cook @ 2024-09-28 21:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Randy Dunlap, Zhang Boyang, linux-kernel, Thomas Gleixner,
	Ferdinand Blomqvist, Andrew Morton

On Thu, Sep 26, 2024 at 10:12:39AM -0700, Linus Torvalds wrote:
> On Thu, 26 Sept 2024 at 10:03, Randy Dunlap <rdunlap@infradead.org> wrote:
> >
> >
> > If I were making this patch series, I would send it to Andrew Morton,
> > but if Linus takes it, I'm certainly OK with that.
> 
> I don't feel like I have the expertise of something like rslib, and
> would actually suggest it go through one of the (very very few) users.
> 
> It's just pstore and some NAND chip drivers, afaik.

I can review this soon and can take it via pstore; I'll probably go poke
tglx though, since IIRC, he was the original author.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library
  2024-09-28 21:11     ` Kees Cook
@ 2024-09-29  3:22       ` Zhang Boyang
  0 siblings, 0 replies; 12+ messages in thread
From: Zhang Boyang @ 2024-09-29  3:22 UTC (permalink / raw)
  To: Kees Cook
  Cc: Randy Dunlap, linux-kernel, Thomas Gleixner, Ferdinand Blomqvist,
	Andrew Morton, Linus Torvalds



On 2024/9/29 05:11, Kees Cook wrote:
> On Thu, Sep 26, 2024 at 10:12:39AM -0700, Linus Torvalds wrote:
>> On Thu, 26 Sept 2024 at 10:03, Randy Dunlap <rdunlap@infradead.org> wrote:
>>>
>>>
>>> If I were making this patch series, I would send it to Andrew Morton,
>>> but if Linus takes it, I'm certainly OK with that.
>>
>> I don't feel like I have the expertise of something like rslib, and
>> would actually suggest it go through one of the (very very few) users.
>>
>> It's just pstore and some NAND chip drivers, afaik.
> 
> I can review this soon and can take it via pstore; I'll probably go poke
> tglx though, since IIRC, he was the original author.
> 

Wow, that's great. Thanks for reviewing.

Zhang Boyang

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-09-29  3:22 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-24 12:31 [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Zhang Boyang
2024-09-24 12:31 ` [PATCH 1/5] rslib: Fix incorrect documentation of rs_modnn() Zhang Boyang
2024-09-24 12:31 ` [PATCH 2/5] rslib: Fix documentation of alpha_to[] and index_of[] Zhang Boyang
2024-09-24 12:31 ` [PATCH 3/5] rslib: Fix wrong result if gffunc(0) != 1 Zhang Boyang
2024-09-24 12:31 ` [PATCH 4/5] rslib: Improve the performance of encode_rs.c Zhang Boyang
2024-09-24 12:31 ` [PATCH 5/5] rslib: Fix 16-bit symbol support Zhang Boyang
2024-09-26 17:03 ` [PATCH 0/5] rslib: Bug fixes and improvements for Reed-Solomon library Randy Dunlap
2024-09-26 17:12   ` Linus Torvalds
2024-09-26 23:54     ` Zhang Boyang
2024-09-28 21:11     ` Kees Cook
2024-09-29  3:22       ` Zhang Boyang
2024-09-26 23:53   ` Zhang Boyang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox