linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH RFC 0/4] Reintroduce the sm2 algorithm
@ 2025-06-30 13:39 Gu Bowen
  2025-06-30 13:39 ` [PATCH RFC 1/4] Revert "Revert "lib/mpi: Extend the MPI library"" Gu Bowen
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Gu Bowen @ 2025-06-30 13:39 UTC
  To: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, Dan Carpenter
  Cc: keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi, Gu Bowen

To reintroduce the sm2 algorithm, this patch set does the following:
 - Reintroduce the MPI library extensions based on libgcrypt.
 - Reintroduce the EC implementation in the MPI library.
 - Rework the sm2 algorithm with the sig_alg backend.
 - Support SM2-with-SM3 verification of X.509 certificates.

Gu Bowen (4):
  Revert "Revert "lib/mpi: Extend the MPI library""
  Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
  crypto/sm2: Rework sm2 alg with sig_alg backend
  crypto/sm2: support SM2-with-SM3 verification of X.509 certificates

 certs/system_keyring.c                   |    8 +
 crypto/Kconfig                           |   18 +
 crypto/Makefile                          |    8 +
 crypto/asymmetric_keys/public_key.c      |    7 +
 crypto/asymmetric_keys/x509_public_key.c |   27 +-
 crypto/sm2.c                             |  492 +++++++
 crypto/sm2signature.asn1                 |    4 +
 crypto/testmgr.c                         |    6 +
 crypto/testmgr.h                         |   57 +
 include/crypto/sm2.h                     |   31 +
 include/keys/system_keyring.h            |   13 +
 include/linux/mpi.h                      |  170 +++
 lib/crypto/mpi/Makefile                  |    2 +
 lib/crypto/mpi/ec.c                      | 1507 ++++++++++++++++++++++
 lib/crypto/mpi/mpi-add.c                 |   50 +
 lib/crypto/mpi/mpi-bit.c                 |  143 ++
 lib/crypto/mpi/mpi-cmp.c                 |   46 +-
 lib/crypto/mpi/mpi-div.c                 |   29 +
 lib/crypto/mpi/mpi-internal.h            |   10 +
 lib/crypto/mpi/mpi-inv.c                 |  143 ++
 lib/crypto/mpi/mpi-mod.c                 |  144 +++
 lib/crypto/mpi/mpicoder.c                |  336 +++++
 lib/crypto/mpi/mpih-mul.c                |   25 +
 lib/crypto/mpi/mpiutil.c                 |  182 +++
 24 files changed, 3447 insertions(+), 11 deletions(-)
 create mode 100644 crypto/sm2.c
 create mode 100644 crypto/sm2signature.asn1
 create mode 100644 include/crypto/sm2.h
 create mode 100644 lib/crypto/mpi/ec.c
 create mode 100644 lib/crypto/mpi/mpi-inv.c

-- 
2.25.1




* [PATCH RFC 1/4] Revert "Revert "lib/mpi: Extend the MPI library""
  2025-06-30 13:39 [PATCH RFC 0/4] Reintroduce the sm2 algorithm Gu Bowen
@ 2025-06-30 13:39 ` Gu Bowen
  2025-07-03  9:18   ` Xi Ruoyao
  2025-06-30 13:39 ` [PATCH RFC 2/4] Revert "Revert "lib/mpi: Introduce ec implementation to " Gu Bowen
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Gu Bowen @ 2025-06-30 13:39 UTC
  To: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, Dan Carpenter
  Cc: keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi, Gu Bowen

This reverts commit fca5cb4dd2b4a9423cb6d112cc71c33899955a1f.

Reintroduce the mpi library based on libgcrypt to support sm2.

Signed-off-by: Gu Bowen <gubowen5@huawei.com>
---
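Illustration only, not part of the patch: one of the helpers restored
here is mpi_invm(x, a, n), which solves 1 = (a*x) mod n.  The sketch
below shows the same contract on word-sized integers using the textbook
extended Euclidean algorithm.  It is not the binary variant that
mpi-inv.c implements, and the name invm_u32 is hypothetical.

```c
#include <stdint.h>

/*
 * Illustration only: textbook extended Euclid on 32-bit values.
 * invm_u32 is a hypothetical name, not a kernel or libgcrypt function.
 * On success, stores x with (a * x) % n == 1 in *x and returns 1.
 * Returns 0 when no inverse exists, following the 1/0 return
 * convention of mpi_invm in this patch.
 */
int invm_u32(uint32_t *x, uint32_t a, uint32_t n)
{
	int64_t r0, r1, t0 = 0, t1 = 1, q, tmp;

	if (n < 2)
		return 0;	/* no inverse modulo 0 or 1 */

	r0 = n;
	r1 = a % n;

	/* Invariant: r0 == t0 * a (mod n) and r1 == t1 * a (mod n). */
	while (r1 != 0) {
		q = r0 / r1;
		tmp = r0 - q * r1;
		r0 = r1;
		r1 = tmp;
		tmp = t0 - q * t1;
		t0 = t1;
		t1 = tmp;
	}

	if (r0 != 1)
		return 0;	/* gcd(a, n) != 1, so a is not invertible */

	/* Bring the coefficient into the range [0, n). */
	t0 %= n;
	if (t0 < 0)
		t0 += n;
	*x = (uint32_t)t0;
	return 1;
}
```

For a = 3 and n = 7 the sketch stores x = 5, since 3 * 5 = 15 = 2 * 7 + 1.
The MPI version works on arbitrary-precision values and avoids
division by using shifts, which is why its control flow looks different.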
 include/linux/mpi.h           |  65 +++++++
 lib/crypto/mpi/Makefile       |   1 +
 lib/crypto/mpi/mpi-add.c      |  50 +++++
 lib/crypto/mpi/mpi-bit.c      | 143 +++++++++++++++
 lib/crypto/mpi/mpi-cmp.c      |  46 ++++-
 lib/crypto/mpi/mpi-div.c      |  29 +++
 lib/crypto/mpi/mpi-internal.h |  10 +
 lib/crypto/mpi/mpi-inv.c      | 143 +++++++++++++++
 lib/crypto/mpi/mpi-mod.c      | 144 +++++++++++++++
 lib/crypto/mpi/mpicoder.c     | 336 ++++++++++++++++++++++++++++++++++
 lib/crypto/mpi/mpih-mul.c     |  25 +++
 lib/crypto/mpi/mpiutil.c      | 182 ++++++++++++++++++
 12 files changed, 1164 insertions(+), 10 deletions(-)
 create mode 100644 lib/crypto/mpi/mpi-inv.c

diff --git a/include/linux/mpi.h b/include/linux/mpi.h
index 47be46f36435..9ad7e7231ee9 100644
--- a/include/linux/mpi.h
+++ b/include/linux/mpi.h
@@ -40,33 +40,87 @@ struct gcry_mpi {
 typedef struct gcry_mpi *MPI;
 
 #define mpi_get_nlimbs(a)     ((a)->nlimbs)
+#define mpi_has_sign(a)       ((a)->sign)
 
 /*-- mpiutil.c --*/
 MPI mpi_alloc(unsigned nlimbs);
+void mpi_clear(MPI a);
 void mpi_free(MPI a);
 int mpi_resize(MPI a, unsigned nlimbs);
 
+static inline MPI mpi_new(unsigned int nbits)
+{
+	return mpi_alloc((nbits + BITS_PER_MPI_LIMB - 1) / BITS_PER_MPI_LIMB);
+}
+
 MPI mpi_copy(MPI a);
+MPI mpi_alloc_like(MPI a);
+void mpi_snatch(MPI w, MPI u);
+MPI mpi_set(MPI w, MPI u);
+MPI mpi_set_ui(MPI w, unsigned long u);
+MPI mpi_alloc_set_ui(unsigned long u);
+void mpi_swap_cond(MPI a, MPI b, unsigned long swap);
+
+/* Constants used to return constant MPIs.  See mpi_init if you
+ * want to add more constants.
+ */
+#define MPI_NUMBER_OF_CONSTANTS 6
+enum gcry_mpi_constants {
+	MPI_C_ZERO,
+	MPI_C_ONE,
+	MPI_C_TWO,
+	MPI_C_THREE,
+	MPI_C_FOUR,
+	MPI_C_EIGHT
+};
+
+MPI mpi_const(enum gcry_mpi_constants no);
 
 /*-- mpicoder.c --*/
+
+/* Different formats of external big integer representation. */
+enum gcry_mpi_format {
+	GCRYMPI_FMT_NONE = 0,
+	GCRYMPI_FMT_STD = 1,    /* Twos complement stored without length. */
+	GCRYMPI_FMT_PGP = 2,    /* As used by OpenPGP (unsigned only). */
+	GCRYMPI_FMT_SSH = 3,    /* As used by SSH (like STD but with length). */
+	GCRYMPI_FMT_HEX = 4,    /* Hex format. */
+	GCRYMPI_FMT_USG = 5,    /* Like STD but unsigned. */
+	GCRYMPI_FMT_OPAQUE = 8  /* Opaque format (some functions only). */
+};
+
 MPI mpi_read_raw_data(const void *xbuffer, size_t nbytes);
 MPI mpi_read_from_buffer(const void *buffer, unsigned *ret_nread);
+int mpi_fromstr(MPI val, const char *str);
+MPI mpi_scanval(const char *string);
 MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int len);
 void *mpi_get_buffer(MPI a, unsigned *nbytes, int *sign);
 int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
 		    int *sign);
 int mpi_write_to_sgl(MPI a, struct scatterlist *sg, unsigned nbytes,
 		     int *sign);
+int mpi_print(enum gcry_mpi_format format, unsigned char *buffer,
+			size_t buflen, size_t *nwritten, MPI a);
 
 /*-- mpi-mod.c --*/
 int mpi_mod(MPI rem, MPI dividend, MPI divisor);
 
+/* Context used with Barrett reduction.  */
+struct barrett_ctx_s;
+typedef struct barrett_ctx_s *mpi_barrett_t;
+
+mpi_barrett_t mpi_barrett_init(MPI m, int copy);
+void mpi_barrett_free(mpi_barrett_t ctx);
+void mpi_mod_barrett(MPI r, MPI x, mpi_barrett_t ctx);
+void mpi_mul_barrett(MPI w, MPI u, MPI v, mpi_barrett_t ctx);
+
 /*-- mpi-pow.c --*/
 int mpi_powm(MPI res, MPI base, MPI exp, MPI mod);
 
 /*-- mpi-cmp.c --*/
 int mpi_cmp_ui(MPI u, ulong v);
 int mpi_cmp(MPI u, MPI v);
+int mpi_cmpabs(MPI u, MPI v);
 
 /*-- mpi-sub-ui.c --*/
 int mpi_sub_ui(MPI w, MPI u, unsigned long vval);
@@ -76,9 +130,16 @@ void mpi_normalize(MPI a);
 unsigned mpi_get_nbits(MPI a);
 int mpi_test_bit(MPI a, unsigned int n);
 int mpi_set_bit(MPI a, unsigned int n);
+void mpi_set_highbit(MPI a, unsigned int n);
+void mpi_clear_highbit(MPI a, unsigned int n);
+void mpi_clear_bit(MPI a, unsigned int n);
+void mpi_rshift_limbs(MPI a, unsigned int count);
 int mpi_rshift(MPI x, MPI a, unsigned int n);
+void mpi_lshift_limbs(MPI a, unsigned int count);
+void mpi_lshift(MPI x, MPI a, unsigned int n);
 
 /*-- mpi-add.c --*/
+void mpi_add_ui(MPI w, MPI u, unsigned long v);
 int mpi_add(MPI w, MPI u, MPI v);
 int mpi_sub(MPI w, MPI u, MPI v);
 int mpi_addm(MPI w, MPI u, MPI v, MPI m);
@@ -91,6 +152,10 @@ int mpi_mulm(MPI w, MPI u, MPI v, MPI m);
 /*-- mpi-div.c --*/
 int mpi_tdiv_r(MPI rem, MPI num, MPI den);
 int mpi_fdiv_r(MPI rem, MPI dividend, MPI divisor);
+void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor);
+
+/*-- mpi-inv.c --*/
+int mpi_invm(MPI x, MPI a, MPI n);
 
 /* inline functions */
 
diff --git a/lib/crypto/mpi/Makefile b/lib/crypto/mpi/Makefile
index 9ad84079025a..477debd7ed50 100644
--- a/lib/crypto/mpi/Makefile
+++ b/lib/crypto/mpi/Makefile
@@ -19,6 +19,7 @@ mpi-y = \
 	mpi-cmp.o			\
 	mpi-sub-ui.o			\
 	mpi-div.o			\
+	mpi-inv.o			\
 	mpi-mod.o			\
 	mpi-mul.o			\
 	mpih-cmp.o			\
diff --git a/lib/crypto/mpi/mpi-add.c b/lib/crypto/mpi/mpi-add.c
index 3015140d4860..020371891991 100644
--- a/lib/crypto/mpi/mpi-add.c
+++ b/lib/crypto/mpi/mpi-add.c
@@ -13,6 +13,56 @@
 
 #include "mpi-internal.h"
 
+/****************
+ * Add the unsigned integer V to the mpi-integer U and store the
+ * result in W. U and W may be the same.
+ */
+void mpi_add_ui(MPI w, MPI u, unsigned long v)
+{
+	mpi_ptr_t wp, up;
+	mpi_size_t usize, wsize;
+	int usign, wsign;
+
+	usize = u->nlimbs;
+	usign = u->sign;
+	wsign = 0;
+
+	/* If there is no space for W (and a possible carry), increase space.  */
+	wsize = usize + 1;
+	if (w->alloced < wsize)
+		mpi_resize(w, wsize);
+
+	/* These must be after realloc (U may be the same as W).  */
+	up = u->d;
+	wp = w->d;
+
+	if (!usize) {  /* simple */
+		wp[0] = v;
+		wsize = v ? 1:0;
+	} else if (!usign) {  /* mpi is not negative */
+		mpi_limb_t cy;
+		cy = mpihelp_add_1(wp, up, usize, v);
+		wp[usize] = cy;
+		wsize = usize + cy;
+	} else {
+		/* The signs are different.  Need exact comparison to determine
+		 * which operand to subtract from which.
+		 */
+		if (usize == 1 && up[0] < v) {
+			wp[0] = v - up[0];
+			wsize = 1;
+		} else {
+			mpihelp_sub_1(wp, up, usize, v);
+			/* Size can decrease by at most one limb. */
+			wsize = usize - (wp[usize-1] == 0);
+			wsign = 1;
+		}
+	}
+
+	w->nlimbs = wsize;
+	w->sign   = wsign;
+}
+
 int mpi_add(MPI w, MPI u, MPI v)
 {
 	mpi_ptr_t wp, up, vp;
diff --git a/lib/crypto/mpi/mpi-bit.c b/lib/crypto/mpi/mpi-bit.c
index 934d81311360..4790d5b8a216 100644
--- a/lib/crypto/mpi/mpi-bit.c
+++ b/lib/crypto/mpi/mpi-bit.c
@@ -32,6 +32,7 @@ void mpi_normalize(MPI a)
 	for (; a->nlimbs && !a->d[a->nlimbs - 1]; a->nlimbs--)
 		;
 }
+EXPORT_SYMBOL_GPL(mpi_normalize);
 
 /****************
  * Return the number of bits in A.
@@ -97,6 +98,85 @@ int mpi_set_bit(MPI a, unsigned int n)
 }
 EXPORT_SYMBOL_GPL(mpi_set_bit);
 
+/****************
+ * Set bit N of A and clear all bits above.
+ */
+void mpi_set_highbit(MPI a, unsigned int n)
+{
+	unsigned int i, limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs) {
+		for (i = a->nlimbs; i < a->alloced; i++)
+			a->d[i] = 0;
+		mpi_resize(a, limbno+1);
+		a->nlimbs = limbno+1;
+	}
+	a->d[limbno] |= (A_LIMB_1<<bitno);
+	for (bitno++; bitno < BITS_PER_MPI_LIMB; bitno++)
+		a->d[limbno] &= ~(A_LIMB_1 << bitno);
+	a->nlimbs = limbno+1;
+}
+EXPORT_SYMBOL_GPL(mpi_set_highbit);
+
+/****************
+ * clear bit N of A and all bits above
+ */
+void mpi_clear_highbit(MPI a, unsigned int n)
+{
+	unsigned int limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs)
+		return; /* not allocated, therefore no need to clear bits :-) */
+
+	for ( ; bitno < BITS_PER_MPI_LIMB; bitno++)
+		a->d[limbno] &= ~(A_LIMB_1 << bitno);
+	a->nlimbs = limbno+1;
+}
+
+/****************
+ * Clear bit N of A.
+ */
+void mpi_clear_bit(MPI a, unsigned int n)
+{
+	unsigned int limbno, bitno;
+
+	limbno = n / BITS_PER_MPI_LIMB;
+	bitno  = n % BITS_PER_MPI_LIMB;
+
+	if (limbno >= a->nlimbs)
+		return; /* No need to clear this bit, it is too far left.  */
+	a->d[limbno] &= ~(A_LIMB_1 << bitno);
+}
+EXPORT_SYMBOL_GPL(mpi_clear_bit);
+
+
+/****************
+ * Shift A by COUNT limbs to the right
+ * This is used only within the MPI library
+ */
+void mpi_rshift_limbs(MPI a, unsigned int count)
+{
+	mpi_ptr_t ap = a->d;
+	mpi_size_t n = a->nlimbs;
+	unsigned int i;
+
+	if (count >= n) {
+		a->nlimbs = 0;
+		return;
+	}
+
+	for (i = 0; i < n - count; i++)
+		ap[i] = ap[i+count];
+	ap[i] = 0;
+	a->nlimbs -= count;
+}
+
 /*
  * Shift A by N bits to the right.
  */
@@ -173,3 +253,66 @@ int mpi_rshift(MPI x, MPI a, unsigned int n)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(mpi_rshift);
+
+/****************
+ * Shift A by COUNT limbs to the left
+ * This is used only within the MPI library
+ */
+void mpi_lshift_limbs(MPI a, unsigned int count)
+{
+	mpi_ptr_t ap;
+	int n = a->nlimbs;
+	int i;
+
+	if (!count || !n)
+		return;
+
+	RESIZE_IF_NEEDED(a, n+count);
+
+	ap = a->d;
+	for (i = n-1; i >= 0; i--)
+		ap[i+count] = ap[i];
+	for (i = 0; i < count; i++)
+		ap[i] = 0;
+	a->nlimbs += count;
+}
+
+/*
+ * Shift A by N bits to the left.
+ */
+void mpi_lshift(MPI x, MPI a, unsigned int n)
+{
+	unsigned int nlimbs = (n/BITS_PER_MPI_LIMB);
+	unsigned int nbits = (n%BITS_PER_MPI_LIMB);
+
+	if (x == a && !n)
+		return;  /* In-place shift with an amount of zero.  */
+
+	if (x != a) {
+		/* Copy A to X.  */
+		unsigned int alimbs = a->nlimbs;
+		int asign = a->sign;
+		mpi_ptr_t xp, ap;
+
+		RESIZE_IF_NEEDED(x, alimbs+nlimbs+1);
+		xp = x->d;
+		ap = a->d;
+		MPN_COPY(xp, ap, alimbs);
+		x->nlimbs = alimbs;
+		x->flags = a->flags;
+		x->sign = asign;
+	}
+
+	if (nlimbs && !nbits) {
+		/* Shift a full number of limbs.  */
+		mpi_lshift_limbs(x, nlimbs);
+	} else if (n) {
+		/* We use a very dumb approach: Shift left by the number of
+		 * limbs plus one and then fix it up by an rshift.
+		 */
+		mpi_lshift_limbs(x, nlimbs+1);
+		mpi_rshift(x, x, BITS_PER_MPI_LIMB - nbits);
+	}
+
+	MPN_NORMALIZE(x->d, x->nlimbs);
+}
diff --git a/lib/crypto/mpi/mpi-cmp.c b/lib/crypto/mpi/mpi-cmp.c
index ceaebe181cd7..0835b6213235 100644
--- a/lib/crypto/mpi/mpi-cmp.c
+++ b/lib/crypto/mpi/mpi-cmp.c
@@ -45,28 +45,54 @@ int mpi_cmp_ui(MPI u, unsigned long v)
 }
 EXPORT_SYMBOL_GPL(mpi_cmp_ui);
 
-int mpi_cmp(MPI u, MPI v)
+static int do_mpi_cmp(MPI u, MPI v, int absmode)
 {
-	mpi_size_t usize, vsize;
+	mpi_size_t usize;
+	mpi_size_t vsize;
+	int usign;
+	int vsign;
 	int cmp;
 
 	mpi_normalize(u);
 	mpi_normalize(v);
+
 	usize = u->nlimbs;
 	vsize = v->nlimbs;
-	if (!u->sign && v->sign)
+	usign = absmode ? 0 : u->sign;
+	vsign = absmode ? 0 : v->sign;
+
+	/* Compare sign bits.  */
+
+	if (!usign && vsign)
 		return 1;
-	if (u->sign && !v->sign)
+	if (usign && !vsign)
 		return -1;
-	if (usize != vsize && !u->sign && !v->sign)
+
+	/* U and V are either both positive or both negative.  */
+
+	if (usize != vsize && !usign && !vsign)
 		return usize - vsize;
-	if (usize != vsize && u->sign && v->sign)
-		return vsize - usize;
+	if (usize != vsize && usign && vsign)
+		return vsize - usize;
 	if (!usize)
 		return 0;
 	cmp = mpihelp_cmp(u->d, v->d, usize);
-	if (u->sign)
-		return -cmp;
-	return cmp;
+	if (!cmp)
+		return 0;
+	if ((cmp < 0?1:0) == (usign?1:0))
+		return 1;
+
+	return -1;
+}
+
+int mpi_cmp(MPI u, MPI v)
+{
+	return do_mpi_cmp(u, v, 0);
 }
 EXPORT_SYMBOL_GPL(mpi_cmp);
+
+int mpi_cmpabs(MPI u, MPI v)
+{
+	return do_mpi_cmp(u, v, 1);
+}
+EXPORT_SYMBOL_GPL(mpi_cmpabs);
diff --git a/lib/crypto/mpi/mpi-div.c b/lib/crypto/mpi/mpi-div.c
index 6e5044e72595..05a67d43ebc2 100644
--- a/lib/crypto/mpi/mpi-div.c
+++ b/lib/crypto/mpi/mpi-div.c
@@ -15,6 +15,7 @@
 #include "longlong.h"
 
 int mpi_tdiv_qr(MPI quot, MPI rem, MPI num, MPI den);
+void mpi_fdiv_qr(MPI quot, MPI rem, MPI dividend, MPI divisor);
 
 int mpi_fdiv_r(MPI rem, MPI dividend, MPI divisor)
 {
@@ -46,6 +47,34 @@ int mpi_fdiv_r(MPI rem, MPI dividend, MPI divisor)
 	return err;
 }
 
+void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor)
+{
+	MPI tmp = mpi_alloc(mpi_get_nlimbs(quot));
+	mpi_fdiv_qr(quot, tmp, dividend, divisor);
+	mpi_free(tmp);
+}
+
+void mpi_fdiv_qr(MPI quot, MPI rem, MPI dividend, MPI divisor)
+{
+	int divisor_sign = divisor->sign;
+	MPI temp_divisor = NULL;
+
+	if (quot == divisor || rem == divisor) {
+		temp_divisor = mpi_copy(divisor);
+		divisor = temp_divisor;
+	}
+
+	mpi_tdiv_qr(quot, rem, dividend, divisor);
+
+	if ((divisor_sign ^ dividend->sign) && rem->nlimbs) {
+		mpi_sub_ui(quot, quot, 1);
+		mpi_add(rem, rem, divisor);
+	}
+
+	if (temp_divisor)
+		mpi_free(temp_divisor);
+}
+
 /* If den == quot, den needs temporary storage.
  * If den == rem, den needs temporary storage.
  * If num == quot, num needs temporary storage.
diff --git a/lib/crypto/mpi/mpi-internal.h b/lib/crypto/mpi/mpi-internal.h
index 8a4f49e3043c..e6cf87659e29 100644
--- a/lib/crypto/mpi/mpi-internal.h
+++ b/lib/crypto/mpi/mpi-internal.h
@@ -67,6 +67,14 @@ static inline int RESIZE_IF_NEEDED(MPI a, unsigned b)
 			(d)[_i] = (s)[_i];	\
 	} while (0)
 
+#define MPN_COPY_INCR(d, s, n)		\
+	do {					\
+		mpi_size_t _i;			\
+		for (_i = 0; _i < (n); _i++)	\
+			(d)[_i] = (s)[_i];	\
+	} while (0)
+
+
 #define MPN_COPY_DECR(d, s, n) \
 	do {					\
 		mpi_size_t _i;			\
@@ -174,6 +182,8 @@ int mpihelp_mul(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t usize,
 void mpih_sqr_n_basecase(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t size);
 void mpih_sqr_n(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t size,
 		mpi_ptr_t tspace);
+void mpihelp_mul_n(mpi_ptr_t prodp,
+		mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size);
 
 int mpihelp_mul_karatsuba_case(mpi_ptr_t prodp,
 			       mpi_ptr_t up, mpi_size_t usize,
diff --git a/lib/crypto/mpi/mpi-inv.c b/lib/crypto/mpi/mpi-inv.c
new file mode 100644
index 000000000000..61e37d18f793
--- /dev/null
+++ b/lib/crypto/mpi/mpi-inv.c
@@ -0,0 +1,143 @@
+/* mpi-inv.c  -  MPI functions
+ *	Copyright (C) 1998, 2001, 2002, 2003 Free Software Foundation, Inc.
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "mpi-internal.h"
+
+/****************
+ * Calculate the multiplicative inverse X of A mod N
+ * That is: Find the solution x for
+ *		1 = (a*x) mod n
+ */
+int mpi_invm(MPI x, MPI a, MPI n)
+{
+	/* Extended Euclid's algorithm (See TAOCP Vol II, 4.5.2, Alg X)
+	 * modified according to Michael Penk's solution for Exercise 35
+	 * with further enhancement
+	 */
+	MPI u, v, u1, u2 = NULL, u3, v1, v2 = NULL, v3, t1, t2 = NULL, t3;
+	unsigned int k;
+	int sign;
+	int odd;
+
+	if (!mpi_cmp_ui(a, 0))
+		return 0; /* Inverse does not exist.  */
+	if (!mpi_cmp_ui(n, 1))
+		return 0; /* Inverse does not exist.  */
+
+	u = mpi_copy(a);
+	v = mpi_copy(n);
+
+	for (k = 0; !mpi_test_bit(u, 0) && !mpi_test_bit(v, 0); k++) {
+		mpi_rshift(u, u, 1);
+		mpi_rshift(v, v, 1);
+	}
+	odd = mpi_test_bit(v, 0);
+
+	u1 = mpi_alloc_set_ui(1);
+	if (!odd)
+		u2 = mpi_alloc_set_ui(0);
+	u3 = mpi_copy(u);
+	v1 = mpi_copy(v);
+	if (!odd) {
+		v2 = mpi_alloc(mpi_get_nlimbs(u));
+		mpi_sub(v2, u1, u); /* u1 is used as const 1 */
+	}
+	v3 = mpi_copy(v);
+	if (mpi_test_bit(u, 0)) { /* u is odd */
+		t1 = mpi_alloc_set_ui(0);
+		if (!odd) {
+			t2 = mpi_alloc_set_ui(1);
+			t2->sign = 1;
+		}
+		t3 = mpi_copy(v);
+		t3->sign = !t3->sign;
+		goto Y4;
+	} else {
+		t1 = mpi_alloc_set_ui(1);
+		if (!odd)
+			t2 = mpi_alloc_set_ui(0);
+		t3 = mpi_copy(u);
+	}
+
+	do {
+		do {
+			if (!odd) {
+				if (mpi_test_bit(t1, 0) || mpi_test_bit(t2, 0)) {
+					/* one is odd */
+					mpi_add(t1, t1, v);
+					mpi_sub(t2, t2, u);
+				}
+				mpi_rshift(t1, t1, 1);
+				mpi_rshift(t2, t2, 1);
+				mpi_rshift(t3, t3, 1);
+			} else {
+				if (mpi_test_bit(t1, 0))
+					mpi_add(t1, t1, v);
+				mpi_rshift(t1, t1, 1);
+				mpi_rshift(t3, t3, 1);
+			}
+Y4:
+			;
+		} while (!mpi_test_bit(t3, 0)); /* while t3 is even */
+
+		if (!t3->sign) {
+			mpi_set(u1, t1);
+			if (!odd)
+				mpi_set(u2, t2);
+			mpi_set(u3, t3);
+		} else {
+			mpi_sub(v1, v, t1);
+			sign = u->sign; u->sign = !u->sign;
+			if (!odd)
+				mpi_sub(v2, u, t2);
+			u->sign = sign;
+			sign = t3->sign; t3->sign = !t3->sign;
+			mpi_set(v3, t3);
+			t3->sign = sign;
+		}
+		mpi_sub(t1, u1, v1);
+		if (!odd)
+			mpi_sub(t2, u2, v2);
+		mpi_sub(t3, u3, v3);
+		if (t1->sign) {
+			mpi_add(t1, t1, v);
+			if (!odd)
+				mpi_sub(t2, t2, u);
+		}
+	} while (mpi_cmp_ui(t3, 0)); /* while t3 != 0 */
+	/* mpi_lshift( u3, k ); */
+	mpi_set(x, u1);
+
+	mpi_free(u1);
+	mpi_free(v1);
+	mpi_free(t1);
+	if (!odd) {
+		mpi_free(u2);
+		mpi_free(v2);
+		mpi_free(t2);
+	}
+	mpi_free(u3);
+	mpi_free(v3);
+	mpi_free(t3);
+
+	mpi_free(u);
+	mpi_free(v);
+	return 1;
+}
+EXPORT_SYMBOL_GPL(mpi_invm);
diff --git a/lib/crypto/mpi/mpi-mod.c b/lib/crypto/mpi/mpi-mod.c
index d5fdaec3d0b6..5a6475f58fa7 100644
--- a/lib/crypto/mpi/mpi-mod.c
+++ b/lib/crypto/mpi/mpi-mod.c
@@ -5,9 +5,153 @@
  * This file is part of Libgcrypt.
  */
 
+
 #include "mpi-internal.h"
+#include "longlong.h"
+
+/* Context used with Barrett reduction.  */
+struct barrett_ctx_s {
+	MPI m;   /* The modulus - may not be modified. */
+	int m_copied;   /* If true, M needs to be released.  */
+	int k;
+	MPI y;
+	MPI r1;  /* Helper MPI. */
+	MPI r2;  /* Helper MPI. */
+	MPI r3;  /* Helper MPI allocated on demand. */
+};
+
+
 
 int mpi_mod(MPI rem, MPI dividend, MPI divisor)
 {
 	return mpi_fdiv_r(rem, dividend, divisor);
 }
+
+/* This function returns a new context for Barrett based operations on
+ * the modulus M.  This context needs to be released using
+ * mpi_barrett_free.  If COPY is true, M is copied into the context
+ * and the caller may change M afterwards.  If COPY is false, M may not
+ * be changed until mpi_barrett_free has been called.
+ */
+mpi_barrett_t mpi_barrett_init(MPI m, int copy)
+{
+	mpi_barrett_t ctx;
+	MPI tmp;
+
+	mpi_normalize(m);
+	ctx = kcalloc(1, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return NULL;
+
+	if (copy) {
+		ctx->m = mpi_copy(m);
+		ctx->m_copied = 1;
+	} else
+		ctx->m = m;
+
+	ctx->k = mpi_get_nlimbs(m);
+	tmp = mpi_alloc(ctx->k + 1);
+
+	/* Barrett precalculation: y = floor(b^(2k) / m). */
+	mpi_set_ui(tmp, 1);
+	mpi_lshift_limbs(tmp, 2 * ctx->k);
+	mpi_fdiv_q(tmp, tmp, m);
+
+	ctx->y  = tmp;
+	ctx->r1 = mpi_alloc(2 * ctx->k + 1);
+	ctx->r2 = mpi_alloc(2 * ctx->k + 1);
+
+	return ctx;
+}
+
+void mpi_barrett_free(mpi_barrett_t ctx)
+{
+	if (ctx) {
+		mpi_free(ctx->y);
+		mpi_free(ctx->r1);
+		mpi_free(ctx->r2);
+		if (ctx->r3)
+			mpi_free(ctx->r3);
+		if (ctx->m_copied)
+			mpi_free(ctx->m);
+		kfree(ctx);
+	}
+}
+
+
+/* R = X mod M
+ *
+ * Using Barrett reduction.  Before using this function
+ * mpi_barrett_init must have been called to do the
+ * precalculations.  CTX is the context created by this precalculation
+ * and also conveys M.  If the Barrett reduction cannot be done, a
+ * straightforward reduction method is used.
+ *
+ * We assume that these conditions are met:
+ * Input:  x =(x_2k-1 ...x_0)_b
+ *     m =(m_k-1 ....m_0)_b	  with m_k-1 != 0
+ * Output: r = x mod m
+ */
+void mpi_mod_barrett(MPI r, MPI x, mpi_barrett_t ctx)
+{
+	MPI m = ctx->m;
+	int k = ctx->k;
+	MPI y = ctx->y;
+	MPI r1 = ctx->r1;
+	MPI r2 = ctx->r2;
+	int sign;
+
+	mpi_normalize(x);
+	if (mpi_get_nlimbs(x) > 2*k) {
+		mpi_mod(r, x, m);
+		return;
+	}
+
+	sign = x->sign;
+	x->sign = 0;
+
+	/* 1. q1 = floor( x / b^(k-1) )
+	 *    q2 = q1 * y
+	 *    q3 = floor( q2 / b^(k+1) )
+	 * Actually, we don't need qx, we can work directly on r2
+	 */
+	mpi_set(r2, x);
+	mpi_rshift_limbs(r2, k-1);
+	mpi_mul(r2, r2, y);
+	mpi_rshift_limbs(r2, k+1);
+
+	/* 2. r1 = x mod b^(k+1)
+	 *	r2 = q3 * m mod b^(k+1)
+	 *	r  = r1 - r2
+	 * 3. if r < 0 then  r = r + b^(k+1)
+	 */
+	mpi_set(r1, x);
+	if (r1->nlimbs > k+1) /* Quick modulo operation.  */
+		r1->nlimbs = k+1;
+	mpi_mul(r2, r2, m);
+	if (r2->nlimbs > k+1) /* Quick modulo operation. */
+		r2->nlimbs = k+1;
+	mpi_sub(r, r1, r2);
+
+	if (mpi_has_sign(r)) {
+		if (!ctx->r3) {
+			ctx->r3 = mpi_alloc(k + 2);
+			mpi_set_ui(ctx->r3, 1);
+			mpi_lshift_limbs(ctx->r3, k + 1);
+		}
+		mpi_add(r, r, ctx->r3);
+	}
+
+	/* 4. while r >= m do r = r - m */
+	while (mpi_cmp(r, m) >= 0)
+		mpi_sub(r, r, m);
+
+	x->sign = sign;
+}
+
+
+void mpi_mul_barrett(MPI w, MPI u, MPI v, mpi_barrett_t ctx)
+{
+	mpi_mul(w, u, v);
+	mpi_mod_barrett(w, w, ctx);
+}
diff --git a/lib/crypto/mpi/mpicoder.c b/lib/crypto/mpi/mpicoder.c
index dde01030807d..3cb6bd148fa9 100644
--- a/lib/crypto/mpi/mpicoder.c
+++ b/lib/crypto/mpi/mpicoder.c
@@ -25,6 +25,7 @@
 #include <linux/string.h>
 #include "mpi-internal.h"
 
+#define MAX_EXTERN_SCAN_BYTES (16*1024*1024)
 #define MAX_EXTERN_MPI_BITS 16384
 
 /**
@@ -109,6 +110,112 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned *ret_nread)
 }
 EXPORT_SYMBOL_GPL(mpi_read_from_buffer);
 
+/****************
+ * Fill the mpi VAL from the hex string in STR.
+ */
+int mpi_fromstr(MPI val, const char *str)
+{
+	int sign = 0;
+	int prepend_zero = 0;
+	int i, j, c, c1, c2;
+	unsigned int nbits, nbytes, nlimbs;
+	mpi_limb_t a;
+
+	if (*str == '-') {
+		sign = 1;
+		str++;
+	}
+
+	/* Skip optional hex prefix.  */
+	if (*str == '0' && str[1] == 'x')
+		str += 2;
+
+	nbits = strlen(str);
+	if (nbits > MAX_EXTERN_SCAN_BYTES) {
+		mpi_clear(val);
+		return -EINVAL;
+	}
+	nbits *= 4;
+	if ((nbits % 8))
+		prepend_zero = 1;
+
+	nbytes = (nbits+7) / 8;
+	nlimbs = (nbytes+BYTES_PER_MPI_LIMB-1) / BYTES_PER_MPI_LIMB;
+
+	if (val->alloced < nlimbs)
+		mpi_resize(val, nlimbs);
+
+	i = BYTES_PER_MPI_LIMB - (nbytes % BYTES_PER_MPI_LIMB);
+	i %= BYTES_PER_MPI_LIMB;
+	j = val->nlimbs = nlimbs;
+	val->sign = sign;
+	for (; j > 0; j--) {
+		a = 0;
+		for (; i < BYTES_PER_MPI_LIMB; i++) {
+			if (prepend_zero) {
+				c1 = '0';
+				prepend_zero = 0;
+			} else
+				c1 = *str++;
+
+			if (!c1) {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			c2 = *str++;
+			if (!c2) {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			if (c1 >= '0' && c1 <= '9')
+				c = c1 - '0';
+			else if (c1 >= 'a' && c1 <= 'f')
+				c = c1 - 'a' + 10;
+			else if (c1 >= 'A' && c1 <= 'F')
+				c = c1 - 'A' + 10;
+			else {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			c <<= 4;
+			if (c2 >= '0' && c2 <= '9')
+				c |= c2 - '0';
+			else if (c2 >= 'a' && c2 <= 'f')
+				c |= c2 - 'a' + 10;
+			else if (c2 >= 'A' && c2 <= 'F')
+				c |= c2 - 'A' + 10;
+			else {
+				mpi_clear(val);
+				return -EINVAL;
+			}
+			a <<= 8;
+			a |= c;
+		}
+		i = 0;
+		val->d[j-1] = a;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mpi_fromstr);
+
+MPI mpi_scanval(const char *string)
+{
+	MPI a;
+
+	a = mpi_alloc(0);
+	if (!a)
+		return NULL;
+
+	if (mpi_fromstr(a, string)) {
+		mpi_free(a);
+		return NULL;
+	}
+	mpi_normalize(a);
+	return a;
+}
+EXPORT_SYMBOL_GPL(mpi_scanval);
+
 static int count_lzeros(MPI a)
 {
 	mpi_limb_t alimb;
@@ -414,3 +521,232 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
 	return val;
 }
 EXPORT_SYMBOL_GPL(mpi_read_raw_from_sgl);
+
+/* Perform a two's complement operation on buffer P of size N bytes.  */
+static void twocompl(unsigned char *p, unsigned int n)
+{
+	int i;
+
+	for (i = n-1; i >= 0 && !p[i]; i--)
+		;
+	if (i >= 0) {
+		if ((p[i] & 0x01))
+			p[i] = (((p[i] ^ 0xfe) | 0x01) & 0xff);
+		else if ((p[i] & 0x02))
+			p[i] = (((p[i] ^ 0xfc) | 0x02) & 0xfe);
+		else if ((p[i] & 0x04))
+			p[i] = (((p[i] ^ 0xf8) | 0x04) & 0xfc);
+		else if ((p[i] & 0x08))
+			p[i] = (((p[i] ^ 0xf0) | 0x08) & 0xf8);
+		else if ((p[i] & 0x10))
+			p[i] = (((p[i] ^ 0xe0) | 0x10) & 0xf0);
+		else if ((p[i] & 0x20))
+			p[i] = (((p[i] ^ 0xc0) | 0x20) & 0xe0);
+		else if ((p[i] & 0x40))
+			p[i] = (((p[i] ^ 0x80) | 0x40) & 0xc0);
+		else
+			p[i] = 0x80;
+
+		for (i--; i >= 0; i--)
+			p[i] ^= 0xff;
+	}
+}
+
+int mpi_print(enum gcry_mpi_format format, unsigned char *buffer,
+			size_t buflen, size_t *nwritten, MPI a)
+{
+	unsigned int nbits = mpi_get_nbits(a);
+	size_t len;
+	size_t dummy_nwritten;
+	int negative;
+
+	if (!nwritten)
+		nwritten = &dummy_nwritten;
+
+	/* Libgcrypt does not always care to clear the sign if the value
+	 * is 0.  For printing this is a bit of a surprise, in particular
+	 * because some of the formats don't support negative numbers but
+	 * should be able to print a zero.  Thus we need this extra test
+	 * for a negative number.
+	 */
+	if (a->sign && mpi_cmp_ui(a, 0))
+		negative = 1;
+	else
+		negative = 0;
+
+	len = buflen;
+	*nwritten = 0;
+	if (format == GCRYMPI_FMT_STD) {
+		unsigned char *tmp;
+		int extra = 0;
+		unsigned int n;
+
+		tmp = mpi_get_buffer(a, &n, NULL);
+		if (!tmp)
+			return -EINVAL;
+
+		if (negative) {
+			twocompl(tmp, n);
+			if (!(*tmp & 0x80)) {
+				/* Need to extend the sign.  */
+				n++;
+				extra = 2;
+			}
+		} else if (n && (*tmp & 0x80)) {
+			/* Positive but the high bit of the returned buffer is set.
+			 * Thus we need to print an extra leading 0x00 so that the
+			 * output is interpreted as a positive number.
+			 */
+			n++;
+			extra = 1;
+		}
+
+		if (buffer && n > len) {
+			/* The provided buffer is too short. */
+			kfree(tmp);
+			return -E2BIG;
+		}
+		if (buffer) {
+			unsigned char *s = buffer;
+
+			if (extra == 1)
+				*s++ = 0;
+			else if (extra)
+				*s++ = 0xff;
+			memcpy(s, tmp, n-!!extra);
+		}
+		kfree(tmp);
+		*nwritten = n;
+		return 0;
+	} else if (format == GCRYMPI_FMT_USG) {
+		unsigned int n = (nbits + 7)/8;
+
+		/* Note:  We ignore the sign for this format.  */
+		/* FIXME: for performance reasons we should put this into
+		 * mpi_aprint because we can then use the buffer directly.
+		 */
+
+		if (buffer && n > len)
+			return -E2BIG;
+		if (buffer) {
+			unsigned char *tmp;
+
+			tmp = mpi_get_buffer(a, &n, NULL);
+			if (!tmp)
+				return -EINVAL;
+			memcpy(buffer, tmp, n);
+			kfree(tmp);
+		}
+		*nwritten = n;
+		return 0;
+	} else if (format == GCRYMPI_FMT_PGP) {
+		unsigned int n = (nbits + 7)/8;
+
+		/* The PGP format can only handle unsigned integers.  */
+		if (negative)
+			return -EINVAL;
+
+		if (buffer && n+2 > len)
+			return -E2BIG;
+
+		if (buffer) {
+			unsigned char *tmp;
+			unsigned char *s = buffer;
+
+			s[0] = nbits >> 8;
+			s[1] = nbits;
+
+			tmp = mpi_get_buffer(a, &n, NULL);
+			if (!tmp)
+				return -EINVAL;
+			memcpy(s+2, tmp, n);
+			kfree(tmp);
+		}
+		*nwritten = n+2;
+		return 0;
+	} else if (format == GCRYMPI_FMT_SSH) {
+		unsigned char *tmp;
+		int extra = 0;
+		unsigned int n;
+
+		tmp = mpi_get_buffer(a, &n, NULL);
+		if (!tmp)
+			return -EINVAL;
+
+		if (negative) {
+			twocompl(tmp, n);
+			if (!(*tmp & 0x80)) {
+				/* Need to extend the sign.  */
+				n++;
+				extra = 2;
+			}
+		} else if (n && (*tmp & 0x80)) {
+			n++;
+			extra = 1;
+		}
+
+		if (buffer && n+4 > len) {
+			kfree(tmp);
+			return -E2BIG;
+		}
+
+		if (buffer) {
+			unsigned char *s = buffer;
+
+			*s++ = n >> 24;
+			*s++ = n >> 16;
+			*s++ = n >> 8;
+			*s++ = n;
+			if (extra == 1)
+				*s++ = 0;
+			else if (extra)
+				*s++ = 0xff;
+			memcpy(s, tmp, n-!!extra);
+		}
+		kfree(tmp);
+		*nwritten = 4+n;
+		return 0;
+	} else if (format == GCRYMPI_FMT_HEX) {
+		unsigned char *tmp;
+		int i;
+		int extra = 0;
+		unsigned int n = 0;
+
+		tmp = mpi_get_buffer(a, &n, NULL);
+		if (!tmp)
+			return -EINVAL;
+		if (!n || (*tmp & 0x80))
+			extra = 2;
+
+		if (buffer && 2*n + extra + negative + 1 > len) {
+			kfree(tmp);
+			return -E2BIG;
+		}
+		if (buffer) {
+			unsigned char *s = buffer;
+
+			if (negative)
+				*s++ = '-';
+			if (extra) {
+				*s++ = '0';
+				*s++ = '0';
+			}
+
+			for (i = 0; i < n; i++) {
+				unsigned int c = tmp[i];
+
+				*s++ = (c >> 4) < 10 ? '0'+(c>>4) : 'A'+(c>>4)-10;
+				c &= 15;
+				*s++ = c < 10 ? '0'+c : 'A'+c-10;
+			}
+			*s++ = 0;
+			*nwritten = s - buffer;
+		} else {
+			*nwritten = 2*n + extra + negative + 1;
+		}
+		kfree(tmp);
+		return 0;
+	} else
+		return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(mpi_print);
diff --git a/lib/crypto/mpi/mpih-mul.c b/lib/crypto/mpi/mpih-mul.c
index a93647564054..e5f1c84e3c48 100644
--- a/lib/crypto/mpi/mpih-mul.c
+++ b/lib/crypto/mpi/mpih-mul.c
@@ -317,6 +317,31 @@ mpih_sqr_n(mpi_ptr_t prodp, mpi_ptr_t up, mpi_size_t size, mpi_ptr_t tspace)
 	}
 }
 
+
+void mpihelp_mul_n(mpi_ptr_t prodp,
+		mpi_ptr_t up, mpi_ptr_t vp, mpi_size_t size)
+{
+	if (up == vp) {
+		if (size < KARATSUBA_THRESHOLD)
+			mpih_sqr_n_basecase(prodp, up, size);
+		else {
+			mpi_ptr_t tspace;
+			tspace = mpi_alloc_limb_space(2 * size);
+			mpih_sqr_n(prodp, up, size, tspace);
+			mpi_free_limb_space(tspace);
+		}
+	} else {
+		if (size < KARATSUBA_THRESHOLD)
+			mul_n_basecase(prodp, up, vp, size);
+		else {
+			mpi_ptr_t tspace;
+			tspace = mpi_alloc_limb_space(2 * size);
+			mul_n(prodp, up, vp, size, tspace);
+			mpi_free_limb_space(tspace);
+		}
+	}
+}
+
 int
 mpihelp_mul_karatsuba_case(mpi_ptr_t prodp,
 			   mpi_ptr_t up, mpi_size_t usize,
diff --git a/lib/crypto/mpi/mpiutil.c b/lib/crypto/mpi/mpiutil.c
index 979ece5a81d2..c8ab00f6f4f4 100644
--- a/lib/crypto/mpi/mpiutil.c
+++ b/lib/crypto/mpi/mpiutil.c
@@ -20,6 +20,63 @@
 
 #include "mpi-internal.h"
 
+/* Constants allocated right away at startup.  */
+static MPI constants[MPI_NUMBER_OF_CONSTANTS];
+
+/* Initialize the MPI subsystem.  This is called early so that some
+ * initialization can be done without taking care of threading issues.
+ */
+static int __init mpi_init(void)
+{
+	int idx;
+	unsigned long value;
+
+	for (idx = 0; idx < MPI_NUMBER_OF_CONSTANTS; idx++) {
+		switch (idx) {
+		case MPI_C_ZERO:
+			value = 0;
+			break;
+		case MPI_C_ONE:
+			value = 1;
+			break;
+		case MPI_C_TWO:
+			value = 2;
+			break;
+		case MPI_C_THREE:
+			value = 3;
+			break;
+		case MPI_C_FOUR:
+			value = 4;
+			break;
+		case MPI_C_EIGHT:
+			value = 8;
+			break;
+		default:
+			pr_err("MPI: invalid mpi_const selector %d\n", idx);
+			return -EFAULT;
+		}
+		constants[idx] = mpi_alloc_set_ui(value);
+		constants[idx]->flags = (16|32);
+	}
+
+	return 0;
+}
+postcore_initcall(mpi_init);
+
+/* Return a constant MPI described by NO which is one of the
+ * MPI_C_xxx macros.  There is no need to copy this returned value; it
+ * may be used directly.
+ */
+MPI mpi_const(enum gcry_mpi_constants no)
+{
+	if ((int)no < 0 || no >= MPI_NUMBER_OF_CONSTANTS)
+		pr_err("MPI: invalid mpi_const selector %d\n", no);
+	if (!constants[no])
+		pr_err("MPI: MPI subsystem not initialized\n");
+	return constants[no];
+}
+EXPORT_SYMBOL_GPL(mpi_const);
+
 /****************
  * Note:  It was a bad idea to use the number of limbs to allocate
  *	  because on a alpha the limbs are large but we normally need
@@ -106,6 +163,15 @@ int mpi_resize(MPI a, unsigned nlimbs)
 	return 0;
 }
 
+void mpi_clear(MPI a)
+{
+	if (!a)
+		return;
+	a->nlimbs = 0;
+	a->flags = 0;
+}
+EXPORT_SYMBOL_GPL(mpi_clear);
+
 void mpi_free(MPI a)
 {
 	if (!a)
@@ -146,5 +212,121 @@ MPI mpi_copy(MPI a)
 	return b;
 }
 
+/****************
+ * This function allocates an MPI which is optimized to hold
+ * a value as large as the one given in the argument and allocates it
+ * with the same flags as A.
+ */
+MPI mpi_alloc_like(MPI a)
+{
+	MPI b;
+
+	if (a) {
+		b = mpi_alloc(a->nlimbs);
+		b->nlimbs = 0;
+		b->sign = 0;
+		b->flags = a->flags;
+	} else
+		b = NULL;
+
+	return b;
+}
+
+
+/* Set U into W and release U.  If W is NULL only U will be released. */
+void mpi_snatch(MPI w, MPI u)
+{
+	if (w) {
+		mpi_assign_limb_space(w, u->d, u->alloced);
+		w->nlimbs = u->nlimbs;
+		w->sign   = u->sign;
+		w->flags  = u->flags;
+		u->alloced = 0;
+		u->nlimbs = 0;
+		u->d = NULL;
+	}
+	mpi_free(u);
+}
+
+
+MPI mpi_set(MPI w, MPI u)
+{
+	mpi_ptr_t wp, up;
+	mpi_size_t usize = u->nlimbs;
+	int usign = u->sign;
+
+	if (!w)
+		w = mpi_alloc(mpi_get_nlimbs(u));
+	RESIZE_IF_NEEDED(w, usize);
+	wp = w->d;
+	up = u->d;
+	MPN_COPY(wp, up, usize);
+	w->nlimbs = usize;
+	w->flags = u->flags;
+	w->flags &= ~(16|32); /* Reset the immutable and constant flags.  */
+	w->sign = usign;
+	return w;
+}
+EXPORT_SYMBOL_GPL(mpi_set);
+
+MPI mpi_set_ui(MPI w, unsigned long u)
+{
+	if (!w)
+		w = mpi_alloc(1);
+	/* FIXME: If U is 0 we have no need to resize and thus possibly
+	 * allocate the limbs.
+	 */
+	RESIZE_IF_NEEDED(w, 1);
+	w->d[0] = u;
+	w->nlimbs = u ? 1 : 0;
+	w->sign = 0;
+	w->flags = 0;
+	return w;
+}
+EXPORT_SYMBOL_GPL(mpi_set_ui);
+
+MPI mpi_alloc_set_ui(unsigned long u)
+{
+	MPI w = mpi_alloc(1);
+	w->d[0] = u;
+	w->nlimbs = u ? 1 : 0;
+	w->sign = 0;
+	return w;
+}
+
+/****************
+ * Swap the value of A and B, when SWAP is 1.
+ * Leave the value when SWAP is 0.
+ * This implementation should be constant-time regardless of SWAP.
+ */
+void mpi_swap_cond(MPI a, MPI b, unsigned long swap)
+{
+	mpi_size_t i;
+	mpi_size_t nlimbs;
+	mpi_limb_t mask = ((mpi_limb_t)0) - swap;
+	mpi_limb_t x;
+
+	if (a->alloced > b->alloced)
+		nlimbs = b->alloced;
+	else
+		nlimbs = a->alloced;
+	if (a->nlimbs > nlimbs || b->nlimbs > nlimbs)
+		return;
+
+	for (i = 0; i < nlimbs; i++) {
+		x = mask & (a->d[i] ^ b->d[i]);
+		a->d[i] = a->d[i] ^ x;
+		b->d[i] = b->d[i] ^ x;
+	}
+
+	x = mask & (a->nlimbs ^ b->nlimbs);
+	a->nlimbs = a->nlimbs ^ x;
+	b->nlimbs = b->nlimbs ^ x;
+
+	x = mask & (a->sign ^ b->sign);
+	a->sign = a->sign ^ x;
+	b->sign = b->sign ^ x;
+}
+
 MODULE_DESCRIPTION("Multiprecision maths library");
 MODULE_LICENSE("GPL");
-- 
2.25.1




* [PATCH RFC 2/4] Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
  2025-06-30 13:39 [PATCH RFC 0/4] Reintroduce the sm2 algorithm Gu Bowen
  2025-06-30 13:39 ` [PATCH RFC 1/4] Revert "Revert "lib/mpi: Extend the MPI library"" Gu Bowen
@ 2025-06-30 13:39 ` Gu Bowen
  2025-07-02 15:18   ` Ignat Korchagin
  2025-06-30 13:39 ` [PATCH RFC 3/4] crypto/sm2: Rework sm2 alg with sig_alg backend Gu Bowen
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Gu Bowen @ 2025-06-30 13:39 UTC (permalink / raw)
  To: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, Dan Carpenter
  Cc: keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi, Gu Bowen

This reverts commit da4fe6815aca25603944a64b0965310512e867d0.

Reintroduce ec implementation to MPI library to support sm2.

Signed-off-by: Gu Bowen <gubowen5@huawei.com>
---
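For orientation: mpi_ec_get_affine() in this patch converts a Weierstrass
point from Jacobian coordinates (X : Y : Z) to affine ones as
x = X * Z^-2 and y = Y * Z^-3 (mod p), and reports an error when Z is 0
(point at infinity).  Below is a minimal userspace sketch of that
arithmetic only.  It assumes a toy prime below 2^32 so that products fit
in uint64_t, and it computes the inverse with Fermat's little theorem as
a stand-in for mpi_invm().  The names jac_point, jac_to_affine, mulm,
powm and invm are hypothetical and do not exist in lib/crypto/mpi.

```c
#include <stdint.h>

/* Toy field GF(p); p must be a prime below 2^32 so products fit in 64 bits. */
static uint64_t mulm(uint64_t a, uint64_t b, uint64_t p)
{
	return ((a % p) * (b % p)) % p;
}

/* Square-and-multiply exponentiation: returns b^e mod p. */
static uint64_t powm(uint64_t b, uint64_t e, uint64_t p)
{
	uint64_t r = 1 % p;

	b %= p;
	while (e) {
		if (e & 1)
			r = mulm(r, b, p);
		b = mulm(b, b, p);
		e >>= 1;
	}
	return r;
}

/* Inverse via Fermat's little theorem; valid for prime p and a != 0 mod p. */
static uint64_t invm(uint64_t a, uint64_t p)
{
	return powm(a, p - 2, p);
}

/* Hypothetical stand-in for struct gcry_mpi_point. */
struct jac_point {
	uint64_t x, y, z;
};

/* Returns 0 on success, -1 if the point is at infinity (z == 0 mod p).
 * Either output pointer may be NULL if that coordinate is not needed.
 */
static int jac_to_affine(uint64_t *x, uint64_t *y,
			 const struct jac_point *pt, uint64_t p)
{
	uint64_t z1, z2, z3;

	if (pt->z % p == 0)
		return -1;

	z1 = invm(pt->z, p);		/* z1 = z^(-1) mod p */
	z2 = mulm(z1, z1, p);		/* z2 = z^(-2) mod p */

	if (x)
		*x = mulm(pt->x, z2, p);
	if (y) {
		z3 = mulm(z2, z1, p);	/* z3 = z^(-3) mod p */
		*y = mulm(pt->y, z3, p);
	}
	return 0;
}
```

With p = 97, the Jacobian triple (75, 71, 5) maps back to the affine pair
(3, 6), since 75 = 3 * 5^2 and 71 = 6 * 5^3 (mod 97).  The values are
chosen only to exercise the conversion and are not claimed to lie on any
particular curve.
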
 include/linux/mpi.h     |  105 +++
 lib/crypto/mpi/Makefile |    1 +
 lib/crypto/mpi/ec.c     | 1507 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 1613 insertions(+)
 create mode 100644 lib/crypto/mpi/ec.c
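
Both mpih_set_cond() in ec.c and mpi_swap_cond() (used by
point_swap_cond()) avoid a data-dependent branch by turning the 0/1 flag
into an all-zeros or all-ones limb mask.  A minimal userspace sketch of
that masking technique follows.  It assumes uint64_t in place of
mpi_limb_t, and the names limbs_swap_cond and limbs_set_cond are
hypothetical.  It illustrates the idea only and makes no claim about the
timing behaviour of any particular compiler or CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap a[] and b[] when swap == 1, leave both untouched when swap == 0.
 * The same instructions run in either case; only the mask differs.
 */
static void limbs_swap_cond(uint64_t *a, uint64_t *b, size_t n, uint64_t swap)
{
	uint64_t mask;
	size_t i;

	assert(swap <= 1);		/* flag must be exactly 0 or 1 */
	mask = (uint64_t)0 - swap;	/* 0 -> 0x00..00, 1 -> 0xff..ff */

	for (i = 0; i < n; i++) {
		uint64_t x = mask & (a[i] ^ b[i]);

		a[i] ^= x;
		b[i] ^= x;
	}
}

/* Copy src[] into dst[] when set == 1, leave dst[] untouched when set == 0. */
static void limbs_set_cond(uint64_t *dst, const uint64_t *src, size_t n,
			   uint64_t set)
{
	uint64_t mask;
	size_t i;

	assert(set <= 1);
	mask = (uint64_t)0 - set;

	for (i = 0; i < n; i++)
		dst[i] ^= mask & (dst[i] ^ src[i]);
}
```

The 25519 and 448 field routines use the second form to add back either
p or 0 after a subtraction, depending on the borrow, without branching
on it.
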

diff --git a/include/linux/mpi.h b/include/linux/mpi.h
index 9ad7e7231ee9..3317effe57ba 100644
--- a/include/linux/mpi.h
+++ b/include/linux/mpi.h
@@ -157,6 +157,111 @@ void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor);
 /*-- mpi-inv.c --*/
 int mpi_invm(MPI x, MPI a, MPI n);
 
+/*-- ec.c --*/
+
+/* Object to represent a point in projective coordinates */
+struct gcry_mpi_point {
+	MPI x;
+	MPI y;
+	MPI z;
+};
+
+typedef struct gcry_mpi_point *MPI_POINT;
+
+/* Models describing an elliptic curve */
+enum gcry_mpi_ec_models {
+	/* The Short Weierstrass equation is
+	 *      y^2 = x^3 + ax + b
+	 */
+	MPI_EC_WEIERSTRASS = 0,
+	/* The Montgomery equation is
+	 *      by^2 = x^3 + ax^2 + x
+	 */
+	MPI_EC_MONTGOMERY,
+	/* The Twisted Edwards equation is
+	 *      ax^2 + y^2 = 1 + bx^2y^2
+	 * Note that we use 'b' instead of the commonly used 'd'.
+	 */
+	MPI_EC_EDWARDS
+};
+
+/* Dialects used with elliptic curves */
+enum ecc_dialects {
+	ECC_DIALECT_STANDARD = 0,
+	ECC_DIALECT_ED25519,
+	ECC_DIALECT_SAFECURVE
+};
+
+/* This context is used with all our EC functions. */
+struct mpi_ec_ctx {
+	enum gcry_mpi_ec_models model; /* The model describing this curve. */
+	enum ecc_dialects dialect;     /* The ECC dialect used with the curve. */
+	int flags;                     /* Public key flags (not always used). */
+	unsigned int nbits;            /* Number of bits.  */
+
+	/* Domain parameters.  Note that they may not all be set and if set
+	 * the MPIs may be flagged as constant.
+	 */
+	MPI p;         /* Prime specifying the field GF(p).  */
+	MPI a;         /* First coefficient of the Weierstrass equation.  */
+	MPI b;         /* Second coefficient of the Weierstrass equation.  */
+	MPI_POINT G;   /* Base point (generator).  */
+	MPI n;         /* Order of G.  */
+	unsigned int h;       /* Cofactor.  */
+
+	/* The actual key.  May not be set.  */
+	MPI_POINT Q;   /* Public key.   */
+	MPI d;         /* Private key.  */
+
+	const char *name;      /* Name of the curve.  */
+
+	/* This structure is private to mpi/ec.c! */
+	struct {
+		struct {
+			unsigned int a_is_pminus3:1;
+			unsigned int two_inv_p:1;
+		} valid; /* Flags to help setting the helper vars below.  */
+
+		int a_is_pminus3;  /* True if A = P - 3. */
+
+		MPI two_inv_p;
+
+		mpi_barrett_t p_barrett;
+
+		/* Scratch variables.  */
+		MPI scratch[11];
+
+		/* Helper for fast reduction.  */
+		/*   int nist_nbits; /\* If this is a NIST curve, the # of bits. *\/ */
+		/*   MPI s[10]; */
+		/*   MPI c; */
+	} t;
+
+	/* Curve specific computation routines for the field.  */
+	void (*addm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*subm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ec);
+	void (*mulm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*pow2)(MPI w, const MPI b, struct mpi_ec_ctx *ctx);
+	void (*mul2)(MPI w, MPI u, struct mpi_ec_ctx *ctx);
+};
+
+void mpi_ec_init(struct mpi_ec_ctx *ctx, enum gcry_mpi_ec_models model,
+			enum ecc_dialects dialect,
+			int flags, MPI p, MPI a, MPI b);
+void mpi_ec_deinit(struct mpi_ec_ctx *ctx);
+MPI_POINT mpi_point_new(unsigned int nbits);
+void mpi_point_release(MPI_POINT p);
+void mpi_point_init(MPI_POINT p);
+void mpi_point_free_parts(MPI_POINT p);
+int mpi_ec_get_affine(MPI x, MPI y, MPI_POINT point, struct mpi_ec_ctx *ctx);
+void mpi_ec_add_points(MPI_POINT result,
+			MPI_POINT p1, MPI_POINT p2,
+			struct mpi_ec_ctx *ctx);
+void mpi_ec_mul_point(MPI_POINT result,
+			MPI scalar, MPI_POINT point,
+			struct mpi_ec_ctx *ctx);
+int mpi_ec_curve_point(MPI_POINT point, struct mpi_ec_ctx *ctx);
+
 /* inline functions */
 
 /**
diff --git a/lib/crypto/mpi/Makefile b/lib/crypto/mpi/Makefile
index 477debd7ed50..6e6ef9a34fe1 100644
--- a/lib/crypto/mpi/Makefile
+++ b/lib/crypto/mpi/Makefile
@@ -13,6 +13,7 @@ mpi-y = \
 	generic_mpih-rshift.o		\
 	generic_mpih-sub1.o		\
 	generic_mpih-add1.o		\
+	ec.o				\
 	mpicoder.o			\
 	mpi-add.o			\
 	mpi-bit.o			\
diff --git a/lib/crypto/mpi/ec.c b/lib/crypto/mpi/ec.c
new file mode 100644
index 000000000000..4781f00982ef
--- /dev/null
+++ b/lib/crypto/mpi/ec.c
@@ -0,0 +1,1507 @@
+/* ec.c -  Elliptic Curve functions
+ * Copyright (C) 2007 Free Software Foundation, Inc.
+ * Copyright (C) 2013 g10 Code GmbH
+ *
+ * This file is part of Libgcrypt.
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "mpi-internal.h"
+#include "longlong.h"
+
+#define point_init(a)  mpi_point_init((a))
+#define point_free(a)  mpi_point_free_parts((a))
+
+#define log_error(fmt, ...) pr_err(fmt, ##__VA_ARGS__)
+#define log_fatal(fmt, ...) pr_err(fmt, ##__VA_ARGS__)
+
+#define DIM(v) (sizeof(v)/sizeof((v)[0]))
+
+
+/* Create a new point object.  NBITS gives the size in bits of one
+ * coordinate; it is only used to pre-allocate some resources and
+ * might also be passed as 0 to use a default value.
+ */
+MPI_POINT mpi_point_new(unsigned int nbits)
+{
+	MPI_POINT p;
+
+	(void)nbits;  /* Currently not used.  */
+
+	p = kmalloc(sizeof(*p), GFP_KERNEL);
+	if (p)
+		mpi_point_init(p);
+	return p;
+}
+EXPORT_SYMBOL_GPL(mpi_point_new);
+
+/* Release the point object P.  P may be NULL. */
+void mpi_point_release(MPI_POINT p)
+{
+	if (p) {
+		mpi_point_free_parts(p);
+		kfree(p);
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_point_release);
+
+/* Initialize the fields of a point object.  mpi_point_free_parts
+ * may be used to release the fields.
+ */
+void mpi_point_init(MPI_POINT p)
+{
+	p->x = mpi_new(0);
+	p->y = mpi_new(0);
+	p->z = mpi_new(0);
+}
+EXPORT_SYMBOL_GPL(mpi_point_init);
+
+/* Release the parts of a point object. */
+void mpi_point_free_parts(MPI_POINT p)
+{
+	mpi_free(p->x); p->x = NULL;
+	mpi_free(p->y); p->y = NULL;
+	mpi_free(p->z); p->z = NULL;
+}
+EXPORT_SYMBOL_GPL(mpi_point_free_parts);
+
+/* Set the value from S into D.  */
+static void point_set(MPI_POINT d, MPI_POINT s)
+{
+	mpi_set(d->x, s->x);
+	mpi_set(d->y, s->y);
+	mpi_set(d->z, s->z);
+}
+
+static void point_resize(MPI_POINT p, struct mpi_ec_ctx *ctx)
+{
+	size_t nlimbs = ctx->p->nlimbs;
+
+	mpi_resize(p->x, nlimbs);
+	p->x->nlimbs = nlimbs;
+	mpi_resize(p->z, nlimbs);
+	p->z->nlimbs = nlimbs;
+
+	if (ctx->model != MPI_EC_MONTGOMERY) {
+		mpi_resize(p->y, nlimbs);
+		p->y->nlimbs = nlimbs;
+	}
+}
+
+static void point_swap_cond(MPI_POINT d, MPI_POINT s, unsigned long swap,
+		struct mpi_ec_ctx *ctx)
+{
+	mpi_swap_cond(d->x, s->x, swap);
+	if (ctx->model != MPI_EC_MONTGOMERY)
+		mpi_swap_cond(d->y, s->y, swap);
+	mpi_swap_cond(d->z, s->z, swap);
+}
+
+
+/* W = W mod P.  */
+static void ec_mod(MPI w, struct mpi_ec_ctx *ec)
+{
+	if (ec->t.p_barrett)
+		mpi_mod_barrett(w, w, ec->t.p_barrett);
+	else
+		mpi_mod(w, w, ec->p);
+}
+
+static void ec_addm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_add(w, u, v);
+	ec_mod(w, ctx);
+}
+
+static void ec_subm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ec)
+{
+	mpi_sub(w, u, v);
+	while (w->sign)
+		mpi_add(w, w, ec->p);
+	/*ec_mod(w, ec);*/
+}
+
+static void ec_mulm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_mul(w, u, v);
+	ec_mod(w, ctx);
+}
+
+/* W = 2 * U mod P.  */
+static void ec_mul2(MPI w, MPI u, struct mpi_ec_ctx *ctx)
+{
+	mpi_lshift(w, u, 1);
+	ec_mod(w, ctx);
+}
+
+static void ec_powm(MPI w, const MPI b, const MPI e,
+		struct mpi_ec_ctx *ctx)
+{
+	mpi_powm(w, b, e, ctx->p);
+	/* mpi_abs(w); */
+}
+
+/* Shortcut for
+ * ec_powm(B, B, mpi_const(MPI_C_TWO), ctx);
+ * for easier optimization.
+ */
+static void ec_pow2(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	/* Using mpi_mul is slightly faster (at least on amd64).  */
+	/* mpi_powm(w, b, mpi_const(MPI_C_TWO), ctx->p); */
+	ec_mulm(w, b, b, ctx);
+}
+
+/* Shortcut for
+ * ec_powm(B, B, mpi_const(MPI_C_THREE), ctx);
+ * for easier optimization.
+ */
+static void ec_pow3(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	mpi_powm(w, b, mpi_const(MPI_C_THREE), ctx->p);
+}
+
+static void ec_invm(MPI x, MPI a, struct mpi_ec_ctx *ctx)
+{
+	if (!mpi_invm(x, a, ctx->p))
+		log_error("ec_invm: inverse does not exist:\n");
+}
+
+static void mpih_set_cond(mpi_ptr_t wp, mpi_ptr_t up,
+		mpi_size_t usize, unsigned long set)
+{
+	mpi_size_t i;
+	mpi_limb_t mask = ((mpi_limb_t)0) - set;
+	mpi_limb_t x;
+
+	for (i = 0; i < usize; i++) {
+		x = mask & (wp[i] ^ up[i]);
+		wp[i] = wp[i] ^ x;
+	}
+}
+
+/* Routines for 2^255 - 19.  */
+
+#define LIMB_SIZE_25519 ((256+BITS_PER_MPI_LIMB-1)/BITS_PER_MPI_LIMB)
+
+static void ec_addm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_25519;
+	mpi_limb_t n[LIMB_SIZE_25519];
+	mpi_limb_t borrow;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("addm_25519: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	mpihelp_add_n(wp, up, vp, wsize);
+	borrow = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+}
+
+static void ec_subm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_25519;
+	mpi_limb_t n[LIMB_SIZE_25519];
+	mpi_limb_t borrow;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("subm_25519: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	borrow = mpihelp_sub_n(wp, up, vp, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+}
+
+static void ec_mulm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_25519;
+	mpi_limb_t n[LIMB_SIZE_25519*2];
+	mpi_limb_t m[LIMB_SIZE_25519+1];
+	mpi_limb_t cy;
+	int msb;
+
+	(void)ctx;
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("mulm_25519: different sizes\n");
+
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	mpihelp_mul_n(n, up, vp, wsize);
+	memcpy(wp, n, wsize * BYTES_PER_MPI_LIMB);
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+
+	memcpy(m, n+LIMB_SIZE_25519-1, (wsize+1) * BYTES_PER_MPI_LIMB);
+	mpihelp_rshift(m, m, LIMB_SIZE_25519+1, (255 % BITS_PER_MPI_LIMB));
+
+	memcpy(n, m, wsize * BYTES_PER_MPI_LIMB);
+	cy = mpihelp_lshift(m, m, LIMB_SIZE_25519, 4);
+	m[LIMB_SIZE_25519] = cy;
+	cy = mpihelp_add_n(m, m, n, wsize);
+	m[LIMB_SIZE_25519] += cy;
+	cy = mpihelp_add_n(m, m, n, wsize);
+	m[LIMB_SIZE_25519] += cy;
+	cy = mpihelp_add_n(m, m, n, wsize);
+	m[LIMB_SIZE_25519] += cy;
+
+	cy = mpihelp_add_n(wp, wp, m, wsize);
+	m[LIMB_SIZE_25519] += cy;
+
+	memset(m, 0, wsize * BYTES_PER_MPI_LIMB);
+	msb = (wp[LIMB_SIZE_25519-1] >> (255 % BITS_PER_MPI_LIMB));
+	m[0] = (m[LIMB_SIZE_25519] * 2 + msb) * 19;
+	wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
+	mpihelp_add_n(wp, wp, m, wsize);
+
+	m[0] = 0;
+	cy = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
+	mpih_set_cond(m, ctx->p->d, wsize, (cy != 0UL));
+	mpihelp_add_n(wp, wp, m, wsize);
+}
+
+static void ec_mul2_25519(MPI w, MPI u, struct mpi_ec_ctx *ctx)
+{
+	ec_addm_25519(w, u, u, ctx);
+}
+
+static void ec_pow2_25519(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	ec_mulm_25519(w, b, b, ctx);
+}
+
+/* Routines for 2^448 - 2^224 - 1.  */
+
+#define LIMB_SIZE_448 ((448+BITS_PER_MPI_LIMB-1)/BITS_PER_MPI_LIMB)
+#define LIMB_SIZE_HALF_448 ((LIMB_SIZE_448+1)/2)
+
+static void ec_addm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_448;
+	mpi_limb_t n[LIMB_SIZE_448];
+	mpi_limb_t cy;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("addm_448: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	cy = mpihelp_add_n(wp, up, vp, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (cy != 0UL));
+	mpihelp_sub_n(wp, wp, n, wsize);
+}
+
+static void ec_subm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_448;
+	mpi_limb_t n[LIMB_SIZE_448];
+	mpi_limb_t borrow;
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("subm_448: different sizes\n");
+
+	memset(n, 0, sizeof(n));
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	borrow = mpihelp_sub_n(wp, up, vp, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+}
+
+static void ec_mulm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
+{
+	mpi_ptr_t wp, up, vp;
+	mpi_size_t wsize = LIMB_SIZE_448;
+	mpi_limb_t n[LIMB_SIZE_448*2];
+	mpi_limb_t a2[LIMB_SIZE_HALF_448];
+	mpi_limb_t a3[LIMB_SIZE_HALF_448];
+	mpi_limb_t b0[LIMB_SIZE_HALF_448];
+	mpi_limb_t b1[LIMB_SIZE_HALF_448];
+	mpi_limb_t cy;
+	int i;
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	mpi_limb_t b1_rest, a3_rest;
+#endif
+
+	if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
+		log_bug("mulm_448: different sizes\n");
+
+	up = u->d;
+	vp = v->d;
+	wp = w->d;
+
+	mpihelp_mul_n(n, up, vp, wsize);
+
+	for (i = 0; i < (wsize + 1) / 2; i++) {
+		b0[i] = n[i];
+		b1[i] = n[i+wsize/2];
+		a2[i] = n[i+wsize];
+		a3[i] = n[i+wsize+wsize/2];
+	}
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	b0[LIMB_SIZE_HALF_448-1] &= ((mpi_limb_t)1UL << 32)-1;
+	a2[LIMB_SIZE_HALF_448-1] &= ((mpi_limb_t)1UL << 32)-1;
+
+	b1_rest = 0;
+	a3_rest = 0;
+
+	for (i = (wsize + 1) / 2 - 1; i >= 0; i--) {
+		mpi_limb_t b1v, a3v;
+		b1v = b1[i];
+		a3v = a3[i];
+		b1[i] = (b1_rest << 32) | (b1v >> 32);
+		a3[i] = (a3_rest << 32) | (a3v >> 32);
+		b1_rest = b1v & (((mpi_limb_t)1UL << 32)-1);
+		a3_rest = a3v & (((mpi_limb_t)1UL << 32)-1);
+	}
+#endif
+
+	cy = mpihelp_add_n(b0, b0, a2, LIMB_SIZE_HALF_448);
+	cy += mpihelp_add_n(b0, b0, a3, LIMB_SIZE_HALF_448);
+	for (i = 0; i < (wsize + 1) / 2; i++)
+		wp[i] = b0[i];
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	wp[LIMB_SIZE_HALF_448-1] &= (((mpi_limb_t)1UL << 32)-1);
+#endif
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	cy = b0[LIMB_SIZE_HALF_448-1] >> 32;
+#endif
+
+	cy = mpihelp_add_1(b1, b1, LIMB_SIZE_HALF_448, cy);
+	cy += mpihelp_add_n(b1, b1, a2, LIMB_SIZE_HALF_448);
+	cy += mpihelp_add_n(b1, b1, a3, LIMB_SIZE_HALF_448);
+	cy += mpihelp_add_n(b1, b1, a3, LIMB_SIZE_HALF_448);
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	b1_rest = 0;
+	for (i = (wsize + 1) / 2 - 1; i >= 0; i--) {
+		mpi_limb_t b1v = b1[i];
+		b1[i] = (b1_rest << 32) | (b1v >> 32);
+		b1_rest = b1v & (((mpi_limb_t)1UL << 32)-1);
+	}
+	wp[LIMB_SIZE_HALF_448-1] |= (b1_rest << 32);
+#endif
+	for (i = 0; i < wsize / 2; i++)
+		wp[i+(wsize + 1) / 2] = b1[i];
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	cy = b1[LIMB_SIZE_HALF_448-1];
+#endif
+
+	memset(n, 0, wsize * BYTES_PER_MPI_LIMB);
+
+#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
+	n[LIMB_SIZE_HALF_448-1] = cy << 32;
+#else
+	n[LIMB_SIZE_HALF_448] = cy;
+#endif
+	n[0] = cy;
+	mpihelp_add_n(wp, wp, n, wsize);
+
+	memset(n, 0, wsize * BYTES_PER_MPI_LIMB);
+	cy = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
+	mpih_set_cond(n, ctx->p->d, wsize, (cy != 0UL));
+	mpihelp_add_n(wp, wp, n, wsize);
+}
+
+static void ec_mul2_448(MPI w, MPI u, struct mpi_ec_ctx *ctx)
+{
+	ec_addm_448(w, u, u, ctx);
+}
+
+static void ec_pow2_448(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
+{
+	ec_mulm_448(w, b, b, ctx);
+}
+
+struct field_table {
+	const char *p;
+
+	/* computation routines for the field.  */
+	void (*addm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*subm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*mulm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
+	void (*mul2)(MPI w, MPI u, struct mpi_ec_ctx *ctx);
+	void (*pow2)(MPI w, const MPI b, struct mpi_ec_ctx *ctx);
+};
+
+static const struct field_table field_table[] = {
+	{
+		"0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED",
+		ec_addm_25519,
+		ec_subm_25519,
+		ec_mulm_25519,
+		ec_mul2_25519,
+		ec_pow2_25519
+	},
+	{
+		"0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE"
+		"FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
+		ec_addm_448,
+		ec_subm_448,
+		ec_mulm_448,
+		ec_mul2_448,
+		ec_pow2_448
+	},
+	{ NULL, NULL, NULL, NULL, NULL, NULL },
+};
+
+/* Force recomputation of all helper variables.  */
+static void mpi_ec_get_reset(struct mpi_ec_ctx *ec)
+{
+	ec->t.valid.a_is_pminus3 = 0;
+	ec->t.valid.two_inv_p = 0;
+}
+
+/* Accessor for helper variable.  */
+static int ec_get_a_is_pminus3(struct mpi_ec_ctx *ec)
+{
+	MPI tmp;
+
+	if (!ec->t.valid.a_is_pminus3) {
+		ec->t.valid.a_is_pminus3 = 1;
+		tmp = mpi_alloc_like(ec->p);
+		mpi_sub_ui(tmp, ec->p, 3);
+		ec->t.a_is_pminus3 = !mpi_cmp(ec->a, tmp);
+		mpi_free(tmp);
+	}
+
+	return ec->t.a_is_pminus3;
+}
+
+/* Accessor for helper variable.  */
+static MPI ec_get_two_inv_p(struct mpi_ec_ctx *ec)
+{
+	if (!ec->t.valid.two_inv_p) {
+		ec->t.valid.two_inv_p = 1;
+		if (!ec->t.two_inv_p)
+			ec->t.two_inv_p = mpi_alloc(0);
+		ec_invm(ec->t.two_inv_p, mpi_const(MPI_C_TWO), ec);
+	}
+	return ec->t.two_inv_p;
+}
+
+static const char *const curve25519_bad_points[] = {
+	"0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffed",
+	"0x0000000000000000000000000000000000000000000000000000000000000000",
+	"0x0000000000000000000000000000000000000000000000000000000000000001",
+	"0x00b8495f16056286fdb1329ceb8d09da6ac49ff1fae35616aeb8413b7c7aebe0",
+	"0x57119fd0dd4e22d8868e1c58c45c44045bef839c55b1d0b1248c50a3bc959c5f",
+	"0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffec",
+	"0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffee",
+	NULL
+};
+
+static const char *const curve448_bad_points[] = {
+	"0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffe"
+	"ffffffffffffffffffffffffffffffffffffffffffffffffffffffff",
+	"0x00000000000000000000000000000000000000000000000000000000"
+	"00000000000000000000000000000000000000000000000000000000",
+	"0x00000000000000000000000000000000000000000000000000000000"
+	"00000000000000000000000000000000000000000000000000000001",
+	"0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffe"
+	"fffffffffffffffffffffffffffffffffffffffffffffffffffffffe",
+	"0xffffffffffffffffffffffffffffffffffffffffffffffffffffffff"
+	"00000000000000000000000000000000000000000000000000000000",
+	NULL
+};
+
+static const char *const *bad_points_table[] = {
+	curve25519_bad_points,
+	curve448_bad_points,
+};
+
+static void mpi_ec_coefficient_normalize(MPI a, MPI p)
+{
+	if (a->sign) {
+		mpi_resize(a, p->nlimbs);
+		mpihelp_sub_n(a->d, p->d, a->d, p->nlimbs);
+		a->nlimbs = p->nlimbs;
+		a->sign = 0;
+	}
+}
+
+/* This function initializes a context for an elliptic curve based on the
+ * field GF(p).  P is the prime specifying this field, A is the first
+ * coefficient.  CTX is expected to be zeroized.
+ */
+void mpi_ec_init(struct mpi_ec_ctx *ctx, enum gcry_mpi_ec_models model,
+			enum ecc_dialects dialect,
+			int flags, MPI p, MPI a, MPI b)
+{
+	int i;
+	static int use_barrett = -1 /* TODO: 1 or -1 */;
+
+	mpi_ec_coefficient_normalize(a, p);
+	mpi_ec_coefficient_normalize(b, p);
+
+	/* Fixme: Do we want to check some constraints? e.g.  a < p  */
+
+	ctx->model = model;
+	ctx->dialect = dialect;
+	ctx->flags = flags;
+	if (dialect == ECC_DIALECT_ED25519)
+		ctx->nbits = 256;
+	else
+		ctx->nbits = mpi_get_nbits(p);
+	ctx->p = mpi_copy(p);
+	ctx->a = mpi_copy(a);
+	ctx->b = mpi_copy(b);
+
+	ctx->d = NULL;
+	ctx->t.two_inv_p = NULL;
+
+	ctx->t.p_barrett = use_barrett > 0 ? mpi_barrett_init(ctx->p, 0) : NULL;
+
+	mpi_ec_get_reset(ctx);
+
+	if (model == MPI_EC_MONTGOMERY) {
+		for (i = 0; i < DIM(bad_points_table); i++) {
+			MPI p_candidate = mpi_scanval(bad_points_table[i][0]);
+			int match_p = !mpi_cmp(ctx->p, p_candidate);
+			int j;
+
+			mpi_free(p_candidate);
+			if (!match_p)
+				continue;
+
+			for (j = 0; j < DIM(ctx->t.scratch) && bad_points_table[i][j]; j++)
+				ctx->t.scratch[j] = mpi_scanval(bad_points_table[i][j]);
+		}
+	} else {
+		/* Allocate scratch variables.  */
+		for (i = 0; i < DIM(ctx->t.scratch); i++)
+			ctx->t.scratch[i] = mpi_alloc_like(ctx->p);
+	}
+
+	ctx->addm = ec_addm;
+	ctx->subm = ec_subm;
+	ctx->mulm = ec_mulm;
+	ctx->mul2 = ec_mul2;
+	ctx->pow2 = ec_pow2;
+
+	for (i = 0; field_table[i].p; i++) {
+		MPI f_p;
+
+		f_p = mpi_scanval(field_table[i].p);
+		if (!f_p)
+			break;
+
+		if (!mpi_cmp(p, f_p)) {
+			ctx->addm = field_table[i].addm;
+			ctx->subm = field_table[i].subm;
+			ctx->mulm = field_table[i].mulm;
+			ctx->mul2 = field_table[i].mul2;
+			ctx->pow2 = field_table[i].pow2;
+			mpi_free(f_p);
+
+			mpi_resize(ctx->a, ctx->p->nlimbs);
+			ctx->a->nlimbs = ctx->p->nlimbs;
+
+			mpi_resize(ctx->b, ctx->p->nlimbs);
+			ctx->b->nlimbs = ctx->p->nlimbs;
+
+			for (i = 0; i < DIM(ctx->t.scratch) && ctx->t.scratch[i]; i++)
+				ctx->t.scratch[i]->nlimbs = ctx->p->nlimbs;
+
+			break;
+		}
+
+		mpi_free(f_p);
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_ec_init);
+
+void mpi_ec_deinit(struct mpi_ec_ctx *ctx)
+{
+	int i;
+
+	mpi_barrett_free(ctx->t.p_barrett);
+
+	/* Domain parameter.  */
+	mpi_free(ctx->p);
+	mpi_free(ctx->a);
+	mpi_free(ctx->b);
+	mpi_point_release(ctx->G);
+	mpi_free(ctx->n);
+
+	/* The key.  */
+	mpi_point_release(ctx->Q);
+	mpi_free(ctx->d);
+
+	/* Private data of ec.c.  */
+	mpi_free(ctx->t.two_inv_p);
+
+	for (i = 0; i < DIM(ctx->t.scratch); i++)
+		mpi_free(ctx->t.scratch[i]);
+}
+EXPORT_SYMBOL_GPL(mpi_ec_deinit);
+
+/* Compute the affine coordinates from the projective coordinates in
+ * POINT.  Set them into X and Y.  If one coordinate is not required,
+ * X or Y may be passed as NULL.  CTX is the usual context. Returns: 0
+ * on success or !0 if POINT is at infinity.
+ */
+int mpi_ec_get_affine(MPI x, MPI y, MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	if (!mpi_cmp_ui(point->z, 0))
+		return -1;
+
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS: /* Using Jacobian coordinates.  */
+		{
+			MPI z1, z2, z3;
+
+			z1 = mpi_new(0);
+			z2 = mpi_new(0);
+			ec_invm(z1, point->z, ctx);  /* z1 = z^(-1) mod p  */
+			ec_mulm(z2, z1, z1, ctx);    /* z2 = z^(-2) mod p  */
+
+			if (x)
+				ec_mulm(x, point->x, z2, ctx);
+
+			if (y) {
+				z3 = mpi_new(0);
+				ec_mulm(z3, z2, z1, ctx);      /* z3 = z^(-3) mod p */
+				ec_mulm(y, point->y, z3, ctx);
+				mpi_free(z3);
+			}
+
+			mpi_free(z2);
+			mpi_free(z1);
+		}
+		return 0;
+
+	case MPI_EC_MONTGOMERY:
+		{
+			if (x)
+				mpi_set(x, point->x);
+
+			if (y) {
+				log_fatal("%s: Getting Y-coordinate on %s is not supported\n",
+						"mpi_ec_get_affine", "Montgomery");
+				return -1;
+			}
+		}
+		return 0;
+
+	case MPI_EC_EDWARDS:
+		{
+			MPI z;
+
+			z = mpi_new(0);
+			ec_invm(z, point->z, ctx);
+
+			mpi_resize(z, ctx->p->nlimbs);
+			z->nlimbs = ctx->p->nlimbs;
+
+			if (x) {
+				mpi_resize(x, ctx->p->nlimbs);
+				x->nlimbs = ctx->p->nlimbs;
+				ctx->mulm(x, point->x, z, ctx);
+			}
+			if (y) {
+				mpi_resize(y, ctx->p->nlimbs);
+				y->nlimbs = ctx->p->nlimbs;
+				ctx->mulm(y, point->y, z, ctx);
+			}
+
+			mpi_free(z);
+		}
+		return 0;
+
+	default:
+		return -1;
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_ec_get_affine);
+
+/*  RESULT = 2 * POINT  (Weierstrass version). */
+static void dup_point_weierstrass(MPI_POINT result,
+		MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+#define x3 (result->x)
+#define y3 (result->y)
+#define z3 (result->z)
+#define t1 (ctx->t.scratch[0])
+#define t2 (ctx->t.scratch[1])
+#define t3 (ctx->t.scratch[2])
+#define l1 (ctx->t.scratch[3])
+#define l2 (ctx->t.scratch[4])
+#define l3 (ctx->t.scratch[5])
+
+	if (!mpi_cmp_ui(point->y, 0) || !mpi_cmp_ui(point->z, 0)) {
+		/* P_y == 0 || P_z == 0 => [1:1:0] */
+		mpi_set_ui(x3, 1);
+		mpi_set_ui(y3, 1);
+		mpi_set_ui(z3, 0);
+	} else {
+		if (ec_get_a_is_pminus3(ctx)) {
+			/* Use the faster case.  */
+			/* L1 = 3(X - Z^2)(X + Z^2) */
+			/*                          T1: used for Z^2. */
+			/*                          T2: used for the right term. */
+			ec_pow2(t1, point->z, ctx);
+			ec_subm(l1, point->x, t1, ctx);
+			ec_mulm(l1, l1, mpi_const(MPI_C_THREE), ctx);
+			ec_addm(t2, point->x, t1, ctx);
+			ec_mulm(l1, l1, t2, ctx);
+		} else {
+			/* Standard case. */
+			/* L1 = 3X^2 + aZ^4 */
+			/*                          T1: used for aZ^4. */
+			ec_pow2(l1, point->x, ctx);
+			ec_mulm(l1, l1, mpi_const(MPI_C_THREE), ctx);
+			ec_powm(t1, point->z, mpi_const(MPI_C_FOUR), ctx);
+			ec_mulm(t1, t1, ctx->a, ctx);
+			ec_addm(l1, l1, t1, ctx);
+		}
+		/* Z3 = 2YZ */
+		ec_mulm(z3, point->y, point->z, ctx);
+		ec_mul2(z3, z3, ctx);
+
+		/* L2 = 4XY^2 */
+		/*                              T2: used for Y^2; required later. */
+		ec_pow2(t2, point->y, ctx);
+		ec_mulm(l2, t2, point->x, ctx);
+		ec_mulm(l2, l2, mpi_const(MPI_C_FOUR), ctx);
+
+		/* X3 = L1^2 - 2L2 */
+		/*                              T1: used for 2L2. */
+		ec_pow2(x3, l1, ctx);
+		ec_mul2(t1, l2, ctx);
+		ec_subm(x3, x3, t1, ctx);
+
+		/* L3 = 8Y^4 */
+		/*                              T2: taken from above. */
+		ec_pow2(t2, t2, ctx);
+		ec_mulm(l3, t2, mpi_const(MPI_C_EIGHT), ctx);
+
+		/* Y3 = L1(L2 - X3) - L3 */
+		ec_subm(y3, l2, x3, ctx);
+		ec_mulm(y3, y3, l1, ctx);
+		ec_subm(y3, y3, l3, ctx);
+	}
+
+#undef x3
+#undef y3
+#undef z3
+#undef t1
+#undef t2
+#undef t3
+#undef l1
+#undef l2
+#undef l3
+}
+
+/*  RESULT = 2 * POINT  (Montgomery version). */
+static void dup_point_montgomery(MPI_POINT result,
+				MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	(void)result;
+	(void)point;
+	(void)ctx;
+	log_fatal("%s: %s not yet supported\n",
+			"mpi_ec_dup_point", "Montgomery");
+}
+
+/*  RESULT = 2 * POINT  (Twisted Edwards version). */
+static void dup_point_edwards(MPI_POINT result,
+		MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+#define X1 (point->x)
+#define Y1 (point->y)
+#define Z1 (point->z)
+#define X3 (result->x)
+#define Y3 (result->y)
+#define Z3 (result->z)
+#define B (ctx->t.scratch[0])
+#define C (ctx->t.scratch[1])
+#define D (ctx->t.scratch[2])
+#define E (ctx->t.scratch[3])
+#define F (ctx->t.scratch[4])
+#define H (ctx->t.scratch[5])
+#define J (ctx->t.scratch[6])
+
+	/* Compute: (X_3 : Y_3 : Z_3) = 2( X_1 : Y_1 : Z_1 ) */
+
+	/* B = (X_1 + Y_1)^2  */
+	ctx->addm(B, X1, Y1, ctx);
+	ctx->pow2(B, B, ctx);
+
+	/* C = X_1^2 */
+	/* D = Y_1^2 */
+	ctx->pow2(C, X1, ctx);
+	ctx->pow2(D, Y1, ctx);
+
+	/* E = aC */
+	if (ctx->dialect == ECC_DIALECT_ED25519)
+		ctx->subm(E, ctx->p, C, ctx);
+	else
+		ctx->mulm(E, ctx->a, C, ctx);
+
+	/* F = E + D */
+	ctx->addm(F, E, D, ctx);
+
+	/* H = Z_1^2 */
+	ctx->pow2(H, Z1, ctx);
+
+	/* J = F - 2H */
+	ctx->mul2(J, H, ctx);
+	ctx->subm(J, F, J, ctx);
+
+	/* X_3 = (B - C - D) · J */
+	ctx->subm(X3, B, C, ctx);
+	ctx->subm(X3, X3, D, ctx);
+	ctx->mulm(X3, X3, J, ctx);
+
+	/* Y_3 = F · (E - D) */
+	ctx->subm(Y3, E, D, ctx);
+	ctx->mulm(Y3, Y3, F, ctx);
+
+	/* Z_3 = F · J */
+	ctx->mulm(Z3, F, J, ctx);
+
+#undef X1
+#undef Y1
+#undef Z1
+#undef X3
+#undef Y3
+#undef Z3
+#undef B
+#undef C
+#undef D
+#undef E
+#undef F
+#undef H
+#undef J
+}
+
+/*  RESULT = 2 * POINT  */
+static void
+mpi_ec_dup_point(MPI_POINT result, MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS:
+		dup_point_weierstrass(result, point, ctx);
+		break;
+	case MPI_EC_MONTGOMERY:
+		dup_point_montgomery(result, point, ctx);
+		break;
+	case MPI_EC_EDWARDS:
+		dup_point_edwards(result, point, ctx);
+		break;
+	}
+}
+
+/* RESULT = P1 + P2  (Weierstrass version).*/
+static void add_points_weierstrass(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+#define x1 (p1->x)
+#define y1 (p1->y)
+#define z1 (p1->z)
+#define x2 (p2->x)
+#define y2 (p2->y)
+#define z2 (p2->z)
+#define x3 (result->x)
+#define y3 (result->y)
+#define z3 (result->z)
+#define l1 (ctx->t.scratch[0])
+#define l2 (ctx->t.scratch[1])
+#define l3 (ctx->t.scratch[2])
+#define l4 (ctx->t.scratch[3])
+#define l5 (ctx->t.scratch[4])
+#define l6 (ctx->t.scratch[5])
+#define l7 (ctx->t.scratch[6])
+#define l8 (ctx->t.scratch[7])
+#define l9 (ctx->t.scratch[8])
+#define t1 (ctx->t.scratch[9])
+#define t2 (ctx->t.scratch[10])
+
+	if ((!mpi_cmp(x1, x2)) && (!mpi_cmp(y1, y2)) && (!mpi_cmp(z1, z2))) {
+		/* Same point; need to call the duplicate function.  */
+		mpi_ec_dup_point(result, p1, ctx);
+	} else if (!mpi_cmp_ui(z1, 0)) {
+		/* P1 is at infinity.  */
+		mpi_set(x3, p2->x);
+		mpi_set(y3, p2->y);
+		mpi_set(z3, p2->z);
+	} else if (!mpi_cmp_ui(z2, 0)) {
+		/* P2 is at infinity.  */
+		mpi_set(x3, p1->x);
+		mpi_set(y3, p1->y);
+		mpi_set(z3, p1->z);
+	} else {
+		int z1_is_one = !mpi_cmp_ui(z1, 1);
+		int z2_is_one = !mpi_cmp_ui(z2, 1);
+
+		/* l1 = x1 z2^2  */
+		/* l2 = x2 z1^2  */
+		if (z2_is_one)
+			mpi_set(l1, x1);
+		else {
+			ec_pow2(l1, z2, ctx);
+			ec_mulm(l1, l1, x1, ctx);
+		}
+		if (z1_is_one)
+			mpi_set(l2, x2);
+		else {
+			ec_pow2(l2, z1, ctx);
+			ec_mulm(l2, l2, x2, ctx);
+		}
+		/* l3 = l1 - l2 */
+		ec_subm(l3, l1, l2, ctx);
+		/* l4 = y1 z2^3  */
+		ec_powm(l4, z2, mpi_const(MPI_C_THREE), ctx);
+		ec_mulm(l4, l4, y1, ctx);
+		/* l5 = y2 z1^3  */
+		ec_powm(l5, z1, mpi_const(MPI_C_THREE), ctx);
+		ec_mulm(l5, l5, y2, ctx);
+		/* l6 = l4 - l5  */
+		ec_subm(l6, l4, l5, ctx);
+
+		if (!mpi_cmp_ui(l3, 0)) {
+			if (!mpi_cmp_ui(l6, 0)) {
+				/* P1 and P2 are the same - use duplicate function. */
+				mpi_ec_dup_point(result, p1, ctx);
+			} else {
+				/* P1 is the inverse of P2.  */
+				mpi_set_ui(x3, 1);
+				mpi_set_ui(y3, 1);
+				mpi_set_ui(z3, 0);
+			}
+		} else {
+			/* l7 = l1 + l2  */
+			ec_addm(l7, l1, l2, ctx);
+			/* l8 = l4 + l5  */
+			ec_addm(l8, l4, l5, ctx);
+			/* z3 = z1 z2 l3  */
+			ec_mulm(z3, z1, z2, ctx);
+			ec_mulm(z3, z3, l3, ctx);
+			/* x3 = l6^2 - l7 l3^2  */
+			ec_pow2(t1, l6, ctx);
+			ec_pow2(t2, l3, ctx);
+			ec_mulm(t2, t2, l7, ctx);
+			ec_subm(x3, t1, t2, ctx);
+			/* l9 = l7 l3^2 - 2 x3  */
+			ec_mul2(t1, x3, ctx);
+			ec_subm(l9, t2, t1, ctx);
+			/* y3 = (l9 l6 - l8 l3^3)/2  */
+			ec_mulm(l9, l9, l6, ctx);
+			ec_powm(t1, l3, mpi_const(MPI_C_THREE), ctx); /* fixme: Use saved value*/
+			ec_mulm(t1, t1, l8, ctx);
+			ec_subm(y3, l9, t1, ctx);
+			ec_mulm(y3, y3, ec_get_two_inv_p(ctx), ctx);
+		}
+	}
+
+#undef x1
+#undef y1
+#undef z1
+#undef x2
+#undef y2
+#undef z2
+#undef x3
+#undef y3
+#undef z3
+#undef l1
+#undef l2
+#undef l3
+#undef l4
+#undef l5
+#undef l6
+#undef l7
+#undef l8
+#undef l9
+#undef t1
+#undef t2
+}
+
+/* RESULT = P1 + P2  (Montgomery version).*/
+static void add_points_montgomery(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+	(void)result;
+	(void)p1;
+	(void)p2;
+	(void)ctx;
+	log_fatal("%s: %s not yet supported\n",
+			"mpi_ec_add_points", "Montgomery");
+}
+
+/* RESULT = P1 + P2  (Twisted Edwards version).*/
+static void add_points_edwards(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+#define X1 (p1->x)
+#define Y1 (p1->y)
+#define Z1 (p1->z)
+#define X2 (p2->x)
+#define Y2 (p2->y)
+#define Z2 (p2->z)
+#define X3 (result->x)
+#define Y3 (result->y)
+#define Z3 (result->z)
+#define A (ctx->t.scratch[0])
+#define B (ctx->t.scratch[1])
+#define C (ctx->t.scratch[2])
+#define D (ctx->t.scratch[3])
+#define E (ctx->t.scratch[4])
+#define F (ctx->t.scratch[5])
+#define G (ctx->t.scratch[6])
+#define tmp (ctx->t.scratch[7])
+
+	point_resize(result, ctx);
+
+	/* Compute: (X_3 : Y_3 : Z_3) = (X_1 : Y_1 : Z_1) + (X_2 : Y_2 : Z_2) */
+
+	/* A = Z1 · Z2 */
+	ctx->mulm(A, Z1, Z2, ctx);
+
+	/* B = A^2 */
+	ctx->pow2(B, A, ctx);
+
+	/* C = X1 · X2 */
+	ctx->mulm(C, X1, X2, ctx);
+
+	/* D = Y1 · Y2 */
+	ctx->mulm(D, Y1, Y2, ctx);
+
+	/* E = d · C · D */
+	ctx->mulm(E, ctx->b, C, ctx);
+	ctx->mulm(E, E, D, ctx);
+
+	/* F = B - E */
+	ctx->subm(F, B, E, ctx);
+
+	/* G = B + E */
+	ctx->addm(G, B, E, ctx);
+
+	/* X_3 = A · F · ((X_1 + Y_1) · (X_2 + Y_2) - C - D) */
+	ctx->addm(tmp, X1, Y1, ctx);
+	ctx->addm(X3, X2, Y2, ctx);
+	ctx->mulm(X3, X3, tmp, ctx);
+	ctx->subm(X3, X3, C, ctx);
+	ctx->subm(X3, X3, D, ctx);
+	ctx->mulm(X3, X3, F, ctx);
+	ctx->mulm(X3, X3, A, ctx);
+
+	/* Y_3 = A · G · (D - aC) */
+	if (ctx->dialect == ECC_DIALECT_ED25519) {
+		ctx->addm(Y3, D, C, ctx);
+	} else {
+		ctx->mulm(Y3, ctx->a, C, ctx);
+		ctx->subm(Y3, D, Y3, ctx);
+	}
+	ctx->mulm(Y3, Y3, G, ctx);
+	ctx->mulm(Y3, Y3, A, ctx);
+
+	/* Z_3 = F · G */
+	ctx->mulm(Z3, F, G, ctx);
+
+
+#undef X1
+#undef Y1
+#undef Z1
+#undef X2
+#undef Y2
+#undef Z2
+#undef X3
+#undef Y3
+#undef Z3
+#undef A
+#undef B
+#undef C
+#undef D
+#undef E
+#undef F
+#undef G
+#undef tmp
+}
+
+/* Compute a step of Montgomery Ladder (only use X and Z in the point).
+ * Inputs:  P1, P2, and x-coordinate of DIF = P1 - P2.
+ * Outputs: PRD = 2 * P1 and  SUM = P1 + P2.
+ */
+static void montgomery_ladder(MPI_POINT prd, MPI_POINT sum,
+		MPI_POINT p1, MPI_POINT p2, MPI dif_x,
+		struct mpi_ec_ctx *ctx)
+{
+	ctx->addm(sum->x, p2->x, p2->z, ctx);
+	ctx->subm(p2->z, p2->x, p2->z, ctx);
+	ctx->addm(prd->x, p1->x, p1->z, ctx);
+	ctx->subm(p1->z, p1->x, p1->z, ctx);
+	ctx->mulm(p2->x, p1->z, sum->x, ctx);
+	ctx->mulm(p2->z, prd->x, p2->z, ctx);
+	ctx->pow2(p1->x, prd->x, ctx);
+	ctx->pow2(p1->z, p1->z, ctx);
+	ctx->addm(sum->x, p2->x, p2->z, ctx);
+	ctx->subm(p2->z, p2->x, p2->z, ctx);
+	ctx->mulm(prd->x, p1->x, p1->z, ctx);
+	ctx->subm(p1->z, p1->x, p1->z, ctx);
+	ctx->pow2(sum->x, sum->x, ctx);
+	ctx->pow2(sum->z, p2->z, ctx);
+	ctx->mulm(prd->z, p1->z, ctx->a, ctx); /* CTX->A: (a-2)/4 */
+	ctx->mulm(sum->z, sum->z, dif_x, ctx);
+	ctx->addm(prd->z, p1->x, prd->z, ctx);
+	ctx->mulm(prd->z, prd->z, p1->z, ctx);
+}
+
+/* RESULT = P1 + P2 */
+void mpi_ec_add_points(MPI_POINT result,
+		MPI_POINT p1, MPI_POINT p2,
+		struct mpi_ec_ctx *ctx)
+{
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS:
+		add_points_weierstrass(result, p1, p2, ctx);
+		break;
+	case MPI_EC_MONTGOMERY:
+		add_points_montgomery(result, p1, p2, ctx);
+		break;
+	case MPI_EC_EDWARDS:
+		add_points_edwards(result, p1, p2, ctx);
+		break;
+	}
+}
+EXPORT_SYMBOL_GPL(mpi_ec_add_points);
+
+/* Scalar point multiplication - the main function for ECC.  It takes
+ * an integer SCALAR and a POINT as well as the usual context CTX.
+ * RESULT will be set to the resulting point.
+ */
+void mpi_ec_mul_point(MPI_POINT result,
+			MPI scalar, MPI_POINT point,
+			struct mpi_ec_ctx *ctx)
+{
+	MPI x1, y1, z1, k, h, yy;
+	unsigned int i, loops;
+	struct gcry_mpi_point p1, p2, p1inv;
+
+	if (ctx->model == MPI_EC_EDWARDS) {
+		/* Simple left to right binary method.  Algorithm 3.27 from
+		 * {author={Hankerson, Darrel and Menezes, Alfred J. and Vanstone, Scott},
+		 *  title = {Guide to Elliptic Curve Cryptography},
+		 *  year = {2003}, isbn = {038795273X},
+		 *  url = {http://www.cacr.math.uwaterloo.ca/ecc/},
+		 *  publisher = {Springer-Verlag New York, Inc.}}
+		 */
+		unsigned int nbits;
+		int j;
+
+		if (mpi_cmp(scalar, ctx->p) >= 0)
+			nbits = mpi_get_nbits(scalar);
+		else
+			nbits = mpi_get_nbits(ctx->p);
+
+		mpi_set_ui(result->x, 0);
+		mpi_set_ui(result->y, 1);
+		mpi_set_ui(result->z, 1);
+		point_resize(point, ctx);
+
+		point_resize(result, ctx);
+		point_resize(point, ctx);
+
+		for (j = nbits-1; j >= 0; j--) {
+			mpi_ec_dup_point(result, result, ctx);
+			if (mpi_test_bit(scalar, j))
+				mpi_ec_add_points(result, result, point, ctx);
+		}
+		return;
+	} else if (ctx->model == MPI_EC_MONTGOMERY) {
+		unsigned int nbits;
+		int j;
+		struct gcry_mpi_point p1_, p2_;
+		MPI_POINT q1, q2, prd, sum;
+		unsigned long sw;
+		mpi_size_t rsize;
+
+		/* Compute scalar point multiplication with Montgomery Ladder.
+		 * Note that we don't use Y-coordinate in the points at all.
+		 * RESULT->Y will be filled by zero.
+		 */
+
+		nbits = mpi_get_nbits(scalar);
+		point_init(&p1);
+		point_init(&p2);
+		point_init(&p1_);
+		point_init(&p2_);
+		mpi_set_ui(p1.x, 1);
+		mpi_free(p2.x);
+		p2.x = mpi_copy(point->x);
+		mpi_set_ui(p2.z, 1);
+
+		point_resize(&p1, ctx);
+		point_resize(&p2, ctx);
+		point_resize(&p1_, ctx);
+		point_resize(&p2_, ctx);
+
+		mpi_resize(point->x, ctx->p->nlimbs);
+		point->x->nlimbs = ctx->p->nlimbs;
+
+		q1 = &p1;
+		q2 = &p2;
+		prd = &p1_;
+		sum = &p2_;
+
+		for (j = nbits-1; j >= 0; j--) {
+			sw = mpi_test_bit(scalar, j);
+			point_swap_cond(q1, q2, sw, ctx);
+			montgomery_ladder(prd, sum, q1, q2, point->x, ctx);
+			point_swap_cond(prd, sum, sw, ctx);
+			swap(q1, prd);
+			swap(q2, sum);
+		}
+
+		mpi_clear(result->y);
+		sw = (nbits & 1);
+		point_swap_cond(&p1, &p1_, sw, ctx);
+
+		rsize = p1.z->nlimbs;
+		MPN_NORMALIZE(p1.z->d, rsize);
+		if (rsize == 0) {
+			mpi_set_ui(result->x, 1);
+			mpi_set_ui(result->z, 0);
+		} else {
+			z1 = mpi_new(0);
+			ec_invm(z1, p1.z, ctx);
+			ec_mulm(result->x, p1.x, z1, ctx);
+			mpi_set_ui(result->z, 1);
+			mpi_free(z1);
+		}
+
+		point_free(&p1);
+		point_free(&p2);
+		point_free(&p1_);
+		point_free(&p2_);
+		return;
+	}
+
+	x1 = mpi_alloc_like(ctx->p);
+	y1 = mpi_alloc_like(ctx->p);
+	h  = mpi_alloc_like(ctx->p);
+	k  = mpi_copy(scalar);
+	yy = mpi_copy(point->y);
+
+	if (mpi_has_sign(k)) {
+		k->sign = 0;
+		ec_invm(yy, yy, ctx);
+	}
+
+	if (!mpi_cmp_ui(point->z, 1)) {
+		mpi_set(x1, point->x);
+		mpi_set(y1, yy);
+	} else {
+		MPI z2, z3;
+
+		z2 = mpi_alloc_like(ctx->p);
+		z3 = mpi_alloc_like(ctx->p);
+		ec_mulm(z2, point->z, point->z, ctx);
+		ec_mulm(z3, point->z, z2, ctx);
+		ec_invm(z2, z2, ctx);
+		ec_mulm(x1, point->x, z2, ctx);
+		ec_invm(z3, z3, ctx);
+		ec_mulm(y1, yy, z3, ctx);
+		mpi_free(z2);
+		mpi_free(z3);
+	}
+	z1 = mpi_copy(mpi_const(MPI_C_ONE));
+
+	mpi_mul(h, k, mpi_const(MPI_C_THREE)); /* h = 3k */
+	loops = mpi_get_nbits(h);
+	if (loops < 2) {
+		/* If SCALAR is zero, the above mpi_mul sets H to zero and thus
+		 * LOOPS will be zero.  To avoid an underflow of I in the main
+		 * loop we set LOOPS to 2 and the result to (0,0,0).
+		 */
+		loops = 2;
+		mpi_clear(result->x);
+		mpi_clear(result->y);
+		mpi_clear(result->z);
+	} else {
+		mpi_set(result->x, point->x);
+		mpi_set(result->y, yy);
+		mpi_set(result->z, point->z);
+	}
+	mpi_free(yy); yy = NULL;
+
+	p1.x = x1; x1 = NULL;
+	p1.y = y1; y1 = NULL;
+	p1.z = z1; z1 = NULL;
+	point_init(&p2);
+	point_init(&p1inv);
+
+	/* Invert point: y = p - y mod p  */
+	point_set(&p1inv, &p1);
+	ec_subm(p1inv.y, ctx->p, p1inv.y, ctx);
+
+	for (i = loops-2; i > 0; i--) {
+		mpi_ec_dup_point(result, result, ctx);
+		if (mpi_test_bit(h, i) == 1 && mpi_test_bit(k, i) == 0) {
+			point_set(&p2, result);
+			mpi_ec_add_points(result, &p2, &p1, ctx);
+		}
+		if (mpi_test_bit(h, i) == 0 && mpi_test_bit(k, i) == 1) {
+			point_set(&p2, result);
+			mpi_ec_add_points(result, &p2, &p1inv, ctx);
+		}
+	}
+
+	point_free(&p1);
+	point_free(&p2);
+	point_free(&p1inv);
+	mpi_free(h);
+	mpi_free(k);
+}
+EXPORT_SYMBOL_GPL(mpi_ec_mul_point);
+
+/* Return true if POINT is on the curve described by CTX.  */
+int mpi_ec_curve_point(MPI_POINT point, struct mpi_ec_ctx *ctx)
+{
+	int res = 0;
+	MPI x, y, w;
+
+	x = mpi_new(0);
+	y = mpi_new(0);
+	w = mpi_new(0);
+
+	/* Check that the point is in range.  This needs to be done here and
+	 * not after conversion to affine coordinates.
+	 */
+	if (mpi_cmpabs(point->x, ctx->p) >= 0)
+		goto leave;
+	if (mpi_cmpabs(point->y, ctx->p) >= 0)
+		goto leave;
+	if (mpi_cmpabs(point->z, ctx->p) >= 0)
+		goto leave;
+
+	switch (ctx->model) {
+	case MPI_EC_WEIERSTRASS:
+		{
+			MPI xxx;
+
+			if (mpi_ec_get_affine(x, y, point, ctx))
+				goto leave;
+
+			xxx = mpi_new(0);
+
+			/* y^2 == x^3 + a·x + b */
+			ec_pow2(y, y, ctx);
+
+			ec_pow3(xxx, x, ctx);
+			ec_mulm(w, ctx->a, x, ctx);
+			ec_addm(w, w, ctx->b, ctx);
+			ec_addm(w, w, xxx, ctx);
+
+			if (!mpi_cmp(y, w))
+				res = 1;
+
+			mpi_free(xxx);
+		}
+		break;
+
+	case MPI_EC_MONTGOMERY:
+		{
+#define xx y
+			/* With Montgomery curve, only X-coordinate is valid. */
+			if (mpi_ec_get_affine(x, NULL, point, ctx))
+				goto leave;
+
+			/* The equation is: b * y^2 == x^3 + a · x^2 + x */
+			/* We check whether the right-hand side is a quadratic
+			 * residue using Euler's criterion.
+			 */
+			/* CTX->A has (a-2)/4 and CTX->B has b^-1 */
+			ec_mulm(w, ctx->a, mpi_const(MPI_C_FOUR), ctx);
+			ec_addm(w, w, mpi_const(MPI_C_TWO), ctx);
+			ec_mulm(w, w, x, ctx);
+			ec_pow2(xx, x, ctx);
+			ec_addm(w, w, xx, ctx);
+			ec_addm(w, w, mpi_const(MPI_C_ONE), ctx);
+			ec_mulm(w, w, x, ctx);
+			ec_mulm(w, w, ctx->b, ctx);
+#undef xx
+			/* Compute Euler's criterion: w^((p-1)/2) */
+#define p_minus1 y
+			ec_subm(p_minus1, ctx->p, mpi_const(MPI_C_ONE), ctx);
+			mpi_rshift(p_minus1, p_minus1, 1);
+			ec_powm(w, w, p_minus1, ctx);
+
+			res = !mpi_cmp_ui(w, 1);
+#undef p_minus1
+		}
+		break;
+
+	case MPI_EC_EDWARDS:
+		{
+			if (mpi_ec_get_affine(x, y, point, ctx))
+				goto leave;
+
+			mpi_resize(w, ctx->p->nlimbs);
+			w->nlimbs = ctx->p->nlimbs;
+
+			/* a · x^2 + y^2 - 1 - b · x^2 · y^2 == 0 */
+			ctx->pow2(x, x, ctx);
+			ctx->pow2(y, y, ctx);
+			if (ctx->dialect == ECC_DIALECT_ED25519)
+				ctx->subm(w, ctx->p, x, ctx);
+			else
+				ctx->mulm(w, ctx->a, x, ctx);
+			ctx->addm(w, w, y, ctx);
+			ctx->mulm(x, x, y, ctx);
+			ctx->mulm(x, x, ctx->b, ctx);
+			ctx->subm(w, w, x, ctx);
+			if (!mpi_cmp_ui(w, 1))
+				res = 1;
+		}
+		break;
+	}
+
+leave:
+	mpi_free(w);
+	mpi_free(x);
+	mpi_free(y);
+
+	return res;
+}
+EXPORT_SYMBOL_GPL(mpi_ec_curve_point);
-- 
2.25.1




* [PATCH RFC 3/4] crypto/sm2: Rework sm2 alg with sig_alg backend
From: Gu Bowen @ 2025-06-30 13:39 UTC (permalink / raw)
  To: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, Dan Carpenter
  Cc: keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi, Gu Bowen

Based on the previous SM2 implementation, commit ea7ecb66440b ("crypto:
sm2 - introduce OSCCA SM2 asymmetric cipher algorithm"), rework the sm2
algorithm on top of the sig_alg backend.

Signed-off-by: Gu Bowen <gubowen5@huawei.com>
---
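Note (not part of the commit message): the check sequence in
_sm2_verify() is a range check on r and s, then t = (r + s) mod n, then
(x1, y1) = sG + tP, then R = (e + x1) mod n, accepting when R == r. A
minimal sketch of those steps follows. It is an illustration under stated
assumptions, not the kernel code: it uses a textbook toy curve
y^2 = x^3 + 2x + 2 over GF(17) with G = (5, 1) of order 19 instead of
sm2p256v1, plain affine arithmetic instead of the MPI/Jacobian routines,
and every name in it (struct pt, mod, inv, add, mul, toy_sm2_verify) is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy parameters only; these are NOT the sm2p256v1 values. */
enum { P = 17, A = 2, N = 19 };

struct pt { int64_t x, y; int inf; };	/* inf != 0: point at infinity */

static int64_t mod(int64_t v, int64_t m)
{
	v %= m;
	return v < 0 ? v + m : v;
}

/* Modular inverse via Fermat's little theorem; m must be prime. */
static int64_t inv(int64_t v, int64_t m)
{
	int64_t r = 1, b = mod(v, m), e = m - 2;

	assert(b != 0);
	for (; e; e >>= 1, b = b * b % m)
		if (e & 1)
			r = r * b % m;
	return r;
}

/* Affine point addition (also handles doubling and inverse points). */
static struct pt add(struct pt a, struct pt b)
{
	struct pt r = { 0, 0, 0 };
	int64_t l;

	if (a.inf)
		return b;
	if (b.inf)
		return a;
	if (a.x == b.x && mod(a.y + b.y, P) == 0) {
		r.inf = 1;	/* a == -b */
		return r;
	}
	if (a.x == b.x)		/* same point: tangent slope */
		l = mod((3 * a.x * a.x + A) * inv(2 * a.y, P), P);
	else			/* distinct points: chord slope */
		l = mod((b.y - a.y) * inv(b.x - a.x, P), P);
	r.x = mod(l * l - a.x - b.x, P);
	r.y = mod(l * (a.x - r.x) - a.y, P);
	return r;
}

/* Scalar multiplication, right-to-left double-and-add. */
static struct pt mul(int64_t k, struct pt p)
{
	struct pt r = { 0, 0, 1 };

	for (; k > 0; k >>= 1, p = add(p, p))
		if (k & 1)
			r = add(r, p);
	return r;
}

/* Returns 1 if (r, s) verifies for digest e under public key q, else 0. */
static int toy_sm2_verify(struct pt g, struct pt q,
			  int64_t e, int64_t r, int64_t s)
{
	struct pt x;
	int64_t t;

	if (r < 1 || r >= N || s < 1 || s >= N)	/* r, s in [1, n-1] */
		return 0;
	t = mod(r + s, N);			/* t = (r + s) mod n */
	if (t == 0)
		return 0;
	x = add(mul(s, g), mul(t, q));		/* (x1, y1) = sG + tP */
	if (x.inf)
		return 0;
	return mod(e + x.x, N) == r;		/* R = (e + x1) mod n */
}
```

As a worked case on this toy curve: with private key d = 7 (so
Q = 7G = (0, 6)), nonce k = 3 and digest e = 5, the standard SM2 signing
equations r = (e + x1) mod n and s = (1 + d)^-1 (k - r d) mod n give
r = 15 and s = 11, which toy_sm2_verify() accepts.
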
 crypto/Kconfig           |  18 ++
 crypto/Makefile          |   8 +
 crypto/sm2.c             | 492 +++++++++++++++++++++++++++++++++++++++
 crypto/sm2signature.asn1 |   4 +
 crypto/testmgr.c         |   6 +
 crypto/testmgr.h         |  57 +++++
 include/crypto/sm2.h     |  31 +++
 7 files changed, 616 insertions(+)
 create mode 100644 crypto/sm2.c
 create mode 100644 crypto/sm2signature.asn1
 create mode 100644 include/crypto/sm2.h
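
Note (not part of the patch): sm2_compute_z_digest() hashes
ENTLA | IDA | a | b | xG | yG | xA | yA, where ENTLA is the 16-bit
big-endian bit length of the user ID and each field element is fed to the
hash at exactly pbytes width, either left-padded with zeros or with
leading bytes skipped. The sketch below shows only that byte layout; the
helper names encode_entl and to_fixed_width are hypothetical and do not
exist in the patch.

```c
#include <stddef.h>
#include <string.h>

/* ENTL: bit length of the user ID as two big-endian bytes. */
static void encode_entl(unsigned char out[2], size_t id_len_bytes)
{
	size_t bits = id_len_bytes * 8;

	out[0] = (unsigned char)(bits >> 8);
	out[1] = (unsigned char)(bits & 0xff);
}

/*
 * Emit a big-endian integer at exactly `width` bytes: left-pad with
 * zeros when it is shorter, keep only the trailing `width` bytes when
 * it is longer (the dropped leading bytes are assumed to be zero).
 */
static void to_fixed_width(unsigned char *out, size_t width,
			   const unsigned char *in, size_t inlen)
{
	if (inlen < width) {
		memset(out, 0, width - inlen);
		memcpy(out + (width - inlen), in, inlen);
	} else {
		memcpy(out, in + (inlen - width), width);
	}
}
```

For the 16-byte default user ID, encode_entl() yields the bytes 0x00 0x80.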

diff --git a/crypto/Kconfig b/crypto/Kconfig
index e1cfd0d4cc8f..7bd6f025d29d 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -344,6 +344,24 @@ config CRYPTO_ECRDSA
 	  One of the Russian cryptographic standard algorithms (called GOST
 	  algorithms). Only signature verification is implemented.
 
+config CRYPTO_SM2
+        tristate "SM2 algorithm"
+        select CRYPTO_SM3
+        select CRYPTO_SIG
+        select CRYPTO_MANAGER
+        select MPILIB
+        select ASN1
+        help
+          Generic implementation of the SM2 public key algorithm, published
+          by the State Encryption Management Bureau, China, and specified
+          by OSCCA GM/T 0003.1-2012 -- 0003.5-2012.
+
+          References:
+          https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+          http://www.oscca.gov.cn/sca/xxgk/2010-12/17/content_1002386.shtml
+          http://www.gmbz.org.cn/main/bzlb.html
+
+
 config CRYPTO_CURVE25519
 	tristate "Curve25519"
 	select CRYPTO_KPP
diff --git a/crypto/Makefile b/crypto/Makefile
index 017df3a2e4bb..e36953356f68 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -52,6 +52,14 @@ rsa_generic-y += rsa-pkcs1pad.o
 rsa_generic-y += rsassa-pkcs1.o
 obj-$(CONFIG_CRYPTO_RSA) += rsa_generic.o
 
+$(obj)/sm2signature.asn1.o: $(obj)/sm2signature.asn1.c $(obj)/sm2signature.asn1.h
+$(obj)/sm2.o: $(obj)/sm2signature.asn1.h
+
+sm2_generic-y += sm2signature.asn1.o
+sm2_generic-y += sm2.o
+
+obj-$(CONFIG_CRYPTO_SM2) += sm2_generic.o
+
 $(obj)/ecdsasignature.asn1.o: $(obj)/ecdsasignature.asn1.c $(obj)/ecdsasignature.asn1.h
 $(obj)/ecdsa-x962.o: $(obj)/ecdsasignature.asn1.h
 ecdsa_generic-y += ecdsa.o
diff --git a/crypto/sm2.c b/crypto/sm2.c
new file mode 100644
index 000000000000..31e10fcee13c
--- /dev/null
+++ b/crypto/sm2.c
@@ -0,0 +1,492 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * SM2 asymmetric public-key algorithm
+ * as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
+ * described at https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+ *
+ * Copyright (c) 2020, Alibaba Group.
+ * Authors: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ *
+ * Copyright (c) 2025, Huawei Tech. Co., Ltd.
+ * Authors: Gu Bowen <gubowen5@huawei.com>
+ */
+
+#include <linux/module.h>
+#include <linux/mpi.h>
+#include <crypto/hash.h>
+#include <crypto/sm3_base.h>
+#include <crypto/rng.h>
+#include <crypto/sm2.h>
+#include <crypto/sig.h>
+#include <crypto/internal/sig.h>
+#include "sm2signature.asn1.h"
+
+/* The default user id as specified in GM/T 0009-2012 */
+#define SM2_DEFAULT_USERID "1234567812345678"
+#define SM2_DEFAULT_USERID_LEN 16
+
+#define MPI_NBYTES(m)   ((mpi_get_nbits(m) + 7) / 8)
+
+struct ecc_domain_parms {
+	const char *desc;           /* Description of the curve.  */
+	unsigned int nbits;         /* Number of bits.  */
+	unsigned int fips:1; /* True if this is a FIPS140-2 approved curve */
+
+	/* The model describing this curve.  This is mainly used to select
+	 * the group equation.
+	 */
+	enum gcry_mpi_ec_models model;
+
+	/* The actual ECC dialect used.  This is used for curve specific
+	 * optimizations and to select encodings etc.
+	 */
+	enum ecc_dialects dialect;
+
+	const char *p;              /* The prime defining the field.  */
+	const char *a, *b;          /* The coefficients.  For Twisted Edwards
+				     * Curves b is used for d.  For Montgomery
+				     * Curves (a,b) has ((A-2)/4,B^-1).
+				     */
+	const char *n;              /* The order of the base point.  */
+	const char *g_x, *g_y;      /* Base point.  */
+	unsigned int h;             /* Cofactor.  */
+};
+
+static const struct ecc_domain_parms sm2_ecp = {
+	.desc = "sm2p256v1",
+	.nbits = 256,
+	.fips = 0,
+	.model = MPI_EC_WEIERSTRASS,
+	.dialect = ECC_DIALECT_STANDARD,
+	.p   = "0xfffffffeffffffffffffffffffffffffffffffff00000000ffffffffffffffff",
+	.a   = "0xfffffffeffffffffffffffffffffffffffffffff00000000fffffffffffffffc",
+	.b   = "0x28e9fa9e9d9f5e344d5a9e4bcf6509a7f39789f515ab8f92ddbcbd414d940e93",
+	.n   = "0xfffffffeffffffffffffffffffffffff7203df6b21c6052b53bbf40939d54123",
+	.g_x = "0x32c4ae2c1f1981195f9904466a39c9948fe30bbff2660be1715a4589334c74c7",
+	.g_y = "0xbc3736a2f4f6779c59bdcee36b692153d0a9877cc62a474002df32e52139f0a0",
+	.h = 1
+};
+
+static int __sm2_set_pub_key(struct mpi_ec_ctx *ec,
+			     const void *key, unsigned int keylen);
+
+static int sm2_ec_ctx_init(struct mpi_ec_ctx *ec)
+{
+	const struct ecc_domain_parms *ecp = &sm2_ecp;
+	MPI p, a, b;
+	MPI x, y;
+	int rc = -EINVAL;
+
+	p = mpi_scanval(ecp->p);
+	a = mpi_scanval(ecp->a);
+	b = mpi_scanval(ecp->b);
+	if (!p || !a || !b)
+		goto free_p;
+
+	x = mpi_scanval(ecp->g_x);
+	y = mpi_scanval(ecp->g_y);
+	if (!x || !y)
+		goto free;
+
+	rc = -ENOMEM;
+
+	ec->Q = mpi_point_new(0);
+	if (!ec->Q)
+		goto free;
+
+	/* mpi_ec_setup_elliptic_curve */
+	ec->G = mpi_point_new(0);
+	if (!ec->G) {
+		mpi_point_release(ec->Q);
+		goto free;
+	}
+
+	mpi_set(ec->G->x, x);
+	mpi_set(ec->G->y, y);
+	mpi_set_ui(ec->G->z, 1);
+
+	rc = -EINVAL;
+	ec->n = mpi_scanval(ecp->n);
+	if (!ec->n) {
+		mpi_point_release(ec->Q);
+		mpi_point_release(ec->G);
+		goto free;
+	}
+
+	ec->h = ecp->h;
+	ec->name = ecp->desc;
+	mpi_ec_init(ec, ecp->model, ecp->dialect, 0, p, a, b);
+
+	rc = 0;
+
+free:
+	mpi_free(x);
+	mpi_free(y);
+free_p:
+	mpi_free(p);
+	mpi_free(a);
+	mpi_free(b);
+
+	return rc;
+}
+
+static void sm2_ec_ctx_deinit(struct mpi_ec_ctx *ec)
+{
+	mpi_ec_deinit(ec);
+
+	memset(ec, 0, sizeof(*ec));
+}
+
+/* RESULT must have been initialized and is set on success to the
+ * point given by VALUE.
+ */
+static int sm2_ecc_os2ec(MPI_POINT result, MPI value)
+{
+	int rc;
+	size_t n;
+	unsigned char *buf;
+	MPI x, y;
+
+	n = MPI_NBYTES(value);
+	buf = kmalloc(n, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	rc = mpi_print(GCRYMPI_FMT_USG, buf, n, &n, value);
+	if (rc)
+		goto err_freebuf;
+
+	rc = -EINVAL;
+	if (n < 1 || ((n - 1) % 2))
+		goto err_freebuf;
+	/* No support for point compression */
+	if (*buf != 0x4)
+		goto err_freebuf;
+
+	rc = -ENOMEM;
+	n = (n - 1) / 2;
+	x = mpi_read_raw_data(buf + 1, n);
+	if (!x)
+		goto err_freebuf;
+	y = mpi_read_raw_data(buf + 1 + n, n);
+	if (!y)
+		goto err_freex;
+
+	mpi_normalize(x);
+	mpi_normalize(y);
+	mpi_set(result->x, x);
+	mpi_set(result->y, y);
+	mpi_set_ui(result->z, 1);
+
+	rc = 0;
+
+	mpi_free(y);
+err_freex:
+	mpi_free(x);
+err_freebuf:
+	kfree(buf);
+	return rc;
+}
+
+struct sm2_signature_ctx {
+	MPI sig_r;
+	MPI sig_s;
+};
+
+int sm2_get_signature_r(void *context, size_t hdrlen, unsigned char tag,
+				const void *value, size_t vlen)
+{
+	struct sm2_signature_ctx *sig = context;
+
+	if (!value || !vlen)
+		return -EINVAL;
+
+	sig->sig_r = mpi_read_raw_data(value, vlen);
+	if (!sig->sig_r)
+		return -ENOMEM;
+
+	return 0;
+}
+
+int sm2_get_signature_s(void *context, size_t hdrlen, unsigned char tag,
+				const void *value, size_t vlen)
+{
+	struct sm2_signature_ctx *sig = context;
+
+	if (!value || !vlen)
+		return -EINVAL;
+
+	sig->sig_s = mpi_read_raw_data(value, vlen);
+	if (!sig->sig_s)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static int sm2_z_digest_update(struct shash_desc *desc,
+			MPI m, unsigned int pbytes)
+{
+	static const unsigned char zero[32];
+	unsigned char *in;
+	unsigned int inlen;
+	int err;
+
+	in = mpi_get_buffer(m, &inlen, NULL);
+	if (!in)
+		return -EINVAL;
+
+	if (inlen < pbytes) {
+		/* padding with zero */
+		err = crypto_shash_update(desc, zero, pbytes - inlen) ?:
+		      crypto_shash_update(desc, in, inlen);
+	} else if (inlen > pbytes) {
+		/* skip the starting zero */
+		err = crypto_shash_update(desc, in + inlen - pbytes, pbytes);
+	} else {
+		err = crypto_shash_update(desc, in, inlen);
+	}
+
+	kfree(in);
+	return err;
+}
+
+static int sm2_z_digest_update_point(struct shash_desc *desc,
+				     MPI_POINT point, struct mpi_ec_ctx *ec,
+				     unsigned int pbytes)
+{
+	MPI x, y;
+	int ret = -EINVAL;
+
+	x = mpi_new(0);
+	y = mpi_new(0);
+
+	ret = mpi_ec_get_affine(x, y, point, ec) ? -EINVAL :
+				sm2_z_digest_update(desc, x, pbytes) ?:
+				sm2_z_digest_update(desc, y, pbytes);
+
+	mpi_free(x);
+	mpi_free(y);
+	return ret;
+}
+
+int sm2_compute_z_digest(struct shash_desc *desc,
+			 const void *key, unsigned int keylen, void *dgst)
+{
+	struct mpi_ec_ctx *ec;
+	unsigned int bits_len;
+	unsigned int pbytes;
+	u8 entl[2];
+	int err;
+
+	ec = kmalloc(sizeof(*ec), GFP_KERNEL);
+	if (!ec)
+		return -ENOMEM;
+
+	err = sm2_ec_ctx_init(ec);
+	if (err)
+		goto out_free_ec;
+
+	err = __sm2_set_pub_key(ec, key, keylen);
+	if (err)
+		goto out_deinit_ec;
+
+	bits_len = SM2_DEFAULT_USERID_LEN * 8;
+	entl[0] = bits_len >> 8;
+	entl[1] = bits_len & 0xff;
+
+	pbytes = MPI_NBYTES(ec->p);
+
+	/* ZA = H256(ENTLA | IDA | a | b | xG | yG | xA | yA) */
+	err = crypto_shash_init(desc);
+	if (err)
+		goto out_deinit_ec;
+
+	err = crypto_shash_update(desc, entl, 2);
+	if (err)
+		goto out_deinit_ec;
+
+	err = crypto_shash_update(desc, SM2_DEFAULT_USERID,
+			SM2_DEFAULT_USERID_LEN);
+	if (err)
+		goto out_deinit_ec;
+
+	err = sm2_z_digest_update(desc, ec->a, pbytes) ?:
+		sm2_z_digest_update(desc, ec->b, pbytes) ?:
+		sm2_z_digest_update_point(desc, ec->G, ec, pbytes) ?:
+		sm2_z_digest_update_point(desc, ec->Q, ec, pbytes);
+	if (err)
+		goto out_deinit_ec;
+
+	err = crypto_shash_final(desc, dgst);
+
+out_deinit_ec:
+	sm2_ec_ctx_deinit(ec);
+out_free_ec:
+	kfree(ec);
+	return err;
+}
+EXPORT_SYMBOL_GPL(sm2_compute_z_digest);
+
+static int _sm2_verify(struct mpi_ec_ctx *ec, MPI hash, MPI sig_r, MPI sig_s)
+{
+	int rc = -EINVAL;
+	struct gcry_mpi_point sG, tP;
+	MPI t = NULL;
+	MPI x1 = NULL, y1 = NULL;
+
+	mpi_point_init(&sG);
+	mpi_point_init(&tP);
+	x1 = mpi_new(0);
+	y1 = mpi_new(0);
+	t = mpi_new(0);
+
+	/* r, s in [1, n-1] */
+	if (mpi_cmp_ui(sig_r, 1) < 0 || mpi_cmp(sig_r, ec->n) >= 0 ||
+		mpi_cmp_ui(sig_s, 1) < 0 || mpi_cmp(sig_s, ec->n) >= 0) {
+		goto leave;
+	}
+
+	/* t = (r + s) % n; reject if t == 0 */
+	mpi_addm(t, sig_r, sig_s, ec->n);
+	if (mpi_cmp_ui(t, 0) == 0)
+		goto leave;
+
+	/* sG + tP = (x1, y1) */
+	rc = -EBADMSG;
+	mpi_ec_mul_point(&sG, sig_s, ec->G, ec);
+	mpi_ec_mul_point(&tP, t, ec->Q, ec);
+	mpi_ec_add_points(&sG, &sG, &tP, ec);
+	if (mpi_ec_get_affine(x1, y1, &sG, ec))
+		goto leave;
+
+	/* R = (e + x1) % n */
+	mpi_addm(t, hash, x1, ec->n);
+
+	/* check R == r */
+	rc = -EKEYREJECTED;
+	if (mpi_cmp(t, sig_r))
+		goto leave;
+
+	rc = 0;
+
+leave:
+	mpi_point_free_parts(&sG);
+	mpi_point_free_parts(&tP);
+	mpi_free(x1);
+	mpi_free(y1);
+	mpi_free(t);
+
+	return rc;
+}
+
+static int sm2_verify(struct crypto_sig *tfm,
+		      const void *src, unsigned int slen,
+		      const void *digest, unsigned int dlen)
+{
+	struct mpi_ec_ctx *ec = crypto_sig_ctx(tfm);
+	struct sm2_signature_ctx sig;
+	MPI hash;
+	int ret;
+
+	if (unlikely(!ec->Q))
+		return -EINVAL;
+
+	sig.sig_r = NULL;
+	sig.sig_s = NULL;
+	ret = asn1_ber_decoder(&sm2signature_decoder, &sig, src, slen);
+	if (ret)
+		goto error;
+
+	ret = -ENOMEM;
+	hash = mpi_read_raw_data(digest, dlen);
+	if (!hash)
+		goto error;
+
+	ret = _sm2_verify(ec, hash, sig.sig_r, sig.sig_s);
+
+	mpi_free(hash);
+error:
+	mpi_free(sig.sig_r);
+	mpi_free(sig.sig_s);
+	return ret;
+}
+
+static int sm2_set_pub_key(struct crypto_sig *tfm,
+			   const void *key, unsigned int keylen)
+{
+	struct mpi_ec_ctx *ec = crypto_sig_ctx(tfm);
+
+	return __sm2_set_pub_key(ec, key, keylen);
+}
+
+static int __sm2_set_pub_key(struct mpi_ec_ctx *ec,
+			     const void *key, unsigned int keylen)
+{
+	MPI a;
+	int rc;
+
+	/* the key includes the uncompressed-point prefix 0x04 */
+	a = mpi_read_raw_data(key, keylen);
+	if (!a)
+		return -ENOMEM;
+
+	mpi_normalize(a);
+	rc = sm2_ecc_os2ec(ec->Q, a);
+	mpi_free(a);
+
+	return rc;
+}
+
+static unsigned int sm2_max_size(struct crypto_sig *tfm)
+{
+	/* Unlimited max size */
+	return PAGE_SIZE;
+}
+
+static int sm2_init_tfm(struct crypto_sig *tfm)
+{
+	struct mpi_ec_ctx *ec = crypto_sig_ctx(tfm);
+
+	return sm2_ec_ctx_init(ec);
+}
+
+static void sm2_exit_tfm(struct crypto_sig *tfm)
+{
+	struct mpi_ec_ctx *ec = crypto_sig_ctx(tfm);
+
+	sm2_ec_ctx_deinit(ec);
+}
+
+static struct sig_alg sm2 = {
+	.verify = sm2_verify,
+	.set_pub_key = sm2_set_pub_key,
+	.max_size = sm2_max_size,
+	.init = sm2_init_tfm,
+	.exit = sm2_exit_tfm,
+	.base = {
+		.cra_name = "sm2",
+		.cra_driver_name = "sm2-generic",
+		.cra_priority = 100,
+		.cra_module = THIS_MODULE,
+		.cra_ctxsize = sizeof(struct mpi_ec_ctx),
+	},
+};
+
+static int __init sm2_init(void)
+{
+	return crypto_register_sig(&sm2);
+}
+
+static void __exit sm2_exit(void)
+{
+	crypto_unregister_sig(&sm2);
+}
+
+subsys_initcall(sm2_init);
+module_exit(sm2_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
+MODULE_AUTHOR("Gu Bowen <gubowen5@huawei.com>");
+MODULE_DESCRIPTION("SM2 generic algorithm");
+MODULE_ALIAS_CRYPTO("sm2-generic");
diff --git a/crypto/sm2signature.asn1 b/crypto/sm2signature.asn1
new file mode 100644
index 000000000000..ab8c0b754d21
--- /dev/null
+++ b/crypto/sm2signature.asn1
@@ -0,0 +1,4 @@
+Sm2Signature ::= SEQUENCE {
+	sig_r	INTEGER ({ sm2_get_signature_r }),
+	sig_s	INTEGER ({ sm2_get_signature_s })
+}
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 32f753d6c430..9fde36527711 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -5505,6 +5505,12 @@ static const struct alg_test_desc alg_test_descs[] = {
 		.suite = {
 			.hash = __VECS(sha512_tv_template)
 		}
+	}, {
+		.alg = "sm2",
+		.test = alg_test_sig,
+		.suite = {
+			.sig = __VECS(sm2_tv_template)
+		}
 	}, {
 		.alg = "sm3",
 		.test = alg_test_hash,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 32d099ac9e73..68928b4fbd1c 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -6133,6 +6133,63 @@ static const struct hash_testvec hmac_streebog512_tv_template[] = {
 	},
 };
 
+/*
+ * SM2 test vectors.
+ */
+static const struct sig_testvec sm2_tv_template[] = {
+	{ /* Generated from openssl */
+	.key =
+	"\x04"
+	"\x8e\xa0\x33\x69\x91\x7e\x3d\xec\xad\x8e\xf0\x45\x5e\x13\x3e\x68"
+	"\x5b\x8c\xab\x5c\xc6\xc8\x50\xdf\x91\x00\xe0\x24\x73\x4d\x31\xf2"
+	"\x2e\xc0\xd5\x6b\xee\xda\x98\x93\xec\xd8\x36\xaa\xb9\xcf\x63\x82"
+	"\xef\xa7\x1a\x03\xed\x16\xba\x74\xb8\x8b\xf9\xe5\x70\x39\xa4\x70",
+	.key_len = 65,
+	.param_len = 0,
+	.c =
+	"\x30\x45"
+	"\x02\x20"
+	"\x70\xab\xb6\x7d\xd6\x54\x80\x64\x42\x7e\x2d\x05\x08\x36\xc9\x96"
+	"\x25\xc2\xbb\xff\x08\xe5\x43\x15\x5e\xf3\x06\xd9\x2b\x2f\x0a\x9f"
+	"\x02\x21"
+	"\x00"
+	"\xbf\x21\x5f\x7e\x5d\x3f\x1a\x4d\x8f\x84\xc2\xe9\xa6\x4c\xa4\x18"
+	"\xb2\xb8\x46\xf4\x32\x96\xfa\x57\xc6\x29\xd4\x89\xae\xcc\xda\xdb",
+	.c_size = 71,
+	.algo = OID_SM2_with_SM3,
+	.m =
+	"\x47\xa7\xbf\xd3\xda\xc4\x79\xee\xda\x8b\x4f\xe8\x40\x94\xd4\x32"
+	"\x8f\xf1\xcd\x68\x4d\xbd\x9b\x1d\xe0\xd8\x9a\x5d\xad\x85\x47\x5c",
+	.m_size = 32,
+	.public_key_vec = true,
+	},
+	{ /* From libgcrypt */
+	.key =
+	"\x04"
+	"\x87\x59\x38\x9a\x34\xaa\xad\x07\xec\xf4\xe0\xc8\xc2\x65\x0a\x44"
+	"\x59\xc8\xd9\x26\xee\x23\x78\x32\x4e\x02\x61\xc5\x25\x38\xcb\x47"
+	"\x75\x28\x10\x6b\x1e\x0b\x7c\x8d\xd5\xff\x29\xa9\xc8\x6a\x89\x06"
+	"\x56\x56\xeb\x33\x15\x4b\xc0\x55\x60\x91\xef\x8a\xc9\xd1\x7d\x78",
+	.key_len = 65,
+	.param_len = 0,
+	.c =
+	"\x30\x44"
+	"\x02\x20"
+	"\xd9\xec\xef\xe8\x5f\xee\x3c\x59\x57\x8e\x5b\xab\xb3\x02\xe1\x42"
+	"\x4b\x67\x2c\x0b\x26\xb6\x51\x2c\x3e\xfc\xc6\x49\xec\xfe\x89\xe5"
+	"\x02\x20"
+	"\x43\x45\xd0\xa5\xff\xe5\x13\x27\x26\xd0\xec\x37\xad\x24\x1e\x9a"
+	"\x71\x9a\xa4\x89\xb0\x7e\x0f\xc4\xbb\x2d\x50\xd0\xe5\x7f\x7a\x68",
+	.c_size = 70,
+	.algo = OID_SM2_with_SM3,
+	.m =
+	"\x11\x22\x33\x44\x55\x66\x77\x88\x99\xaa\xbb\xcc\xdd\xee\xff\x00"
+	"\x12\x34\x56\x78\x9a\xbc\xde\xf0\x12\x34\x56\x78\x9a\xbc\xde\xf0",
+	.m_size = 32,
+	.public_key_vec = true,
+	},
+};
+
 /* Example vectors below taken from
  * http://www.oscca.gov.cn/UpFile/20101222141857786.pdf
  *
diff --git a/include/crypto/sm2.h b/include/crypto/sm2.h
new file mode 100644
index 000000000000..a93c6fd395ff
--- /dev/null
+++ b/include/crypto/sm2.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * sm2.h - SM2 asymmetric public-key algorithm
+ * as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
+ * described at https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
+ *
+ * Copyright (c) 2020, Alibaba Group.
+ * Written by Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ *
+ * Copyright (c) 2025, Huawei Tech. Co., Ltd.
+ * Authors: Gu Bowen <gubowen5@huawei.com>
+ */
+
+#ifndef _CRYPTO_SM2_H
+#define _CRYPTO_SM2_H
+
+struct shash_desc;
+
+#if IS_REACHABLE(CONFIG_CRYPTO_SM2)
+int sm2_compute_z_digest(struct shash_desc *desc,
+			 const void *key, unsigned int keylen, void *dgst);
+#else
+static inline int sm2_compute_z_digest(struct shash_desc *desc,
+				       const void *key, unsigned int keylen,
+				       void *dgst)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+#endif /* _CRYPTO_SM2_H */
-- 
2.25.1




* [PATCH RFC 4/4] crypto/sm2: support SM2-with-SM3 verification of X.509 certificates
  2025-06-30 13:39 [PATCH RFC 0/4] Reintroduce the sm2 algorithm Gu Bowen
                   ` (2 preceding siblings ...)
  2025-06-30 13:39 ` [PATCH RFC 3/4] crypto/sm2: Rework sm2 alg with sig_alg backend Gu Bowen
@ 2025-06-30 13:39 ` Gu Bowen
  2025-06-30 19:41 ` [PATCH RFC 0/4] Reintroduce the sm2 algorithm Dan Carpenter
  2025-07-03 13:14 ` Jason A. Donenfeld
  5 siblings, 0 replies; 13+ messages in thread
From: Gu Bowen @ 2025-06-30 13:39 UTC (permalink / raw)
  To: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, Dan Carpenter
  Cc: keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi, Gu Bowen

The digest is calculated during certificate parsing, but the public key of
the signing certificate needs to be obtained before calculating the digest
in order to calculate the Z value correctly.

This patch tries to obtain that public key before computing the digest;
the approach has been tested and verified to be feasible.
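
For readers unfamiliar with the Z value: as described in
draft-shen-sm2-ecdsa, SM2 does not sign SM3(M) directly. It signs
e = SM3(Z || M), where Z = SM3(ENTL || ID || a || b || xG || yG || xA || yA)
and (xA, yA) is the signer's public key. That is why the signer's key
must be known before the TBS data can be digested. The user-space sketch
below only lays out the Z preimage; it is an illustration, not the code
in this patch. The helper name sm2_z_preimage(), the struct
sm2_z_params, and the 32-byte coordinate size are assumptions made for
the example, and the SM3 hash itself is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SM2_COORD_LEN 32	/* assumed: 256-bit field elements */

/* Hypothetical container; every pointer is a 32-byte big-endian value. */
struct sm2_z_params {
	const uint8_t *a, *b;	/* curve coefficients */
	const uint8_t *gx, *gy;	/* base point G */
	const uint8_t *px, *py;	/* signer's public key (xA, yA) */
};

/*
 * Hypothetical helper: write ENTL || ID || a || b || xG || yG || xA || yA
 * into out[]. Returns the number of bytes written, or 0 if the ID is too
 * long for the 16-bit ENTL field or the buffer is too small.
 */
static size_t sm2_z_preimage(uint8_t *out, size_t out_len,
			     const uint8_t *id, size_t id_len,
			     const struct sm2_z_params *p)
{
	const uint8_t *parts[6] = { p->a, p->b, p->gx, p->gy, p->px, p->py };
	size_t need = 2 + id_len + 6 * SM2_COORD_LEN;
	size_t off = 0, i;

	if (id_len > 0xffff / 8 || out_len < need)
		return 0;

	/* ENTL: bit length of the ID as a 16-bit big-endian integer */
	out[off++] = (uint8_t)((id_len * 8) >> 8);
	out[off++] = (uint8_t)((id_len * 8) & 0xff);

	memcpy(out + off, id, id_len);
	off += id_len;

	for (i = 0; i < 6; i++) {
		memcpy(out + off, parts[i], SM2_COORD_LEN);
		off += SM2_COORD_LEN;
	}

	return off;
}
```

Hashing this buffer with SM3 would give Z. The message digest then
starts a fresh SM3 computation, feeds in Z first and the TBS data
second, which mirrors the init/update/finup sequence in the
x509_get_sig_params() hunk of this patch.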

Signed-off-by: Gu Bowen <gubowen5@huawei.com>
---
 certs/system_keyring.c                   |  8 +++++++
 crypto/asymmetric_keys/public_key.c      |  7 ++++++
 crypto/asymmetric_keys/x509_public_key.c | 27 +++++++++++++++++++++++-
 include/keys/system_keyring.h            | 13 ++++++++++++
 4 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 9de610bf1f4b..adceb3f0928c 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -32,6 +32,14 @@ extern __initconst const u8 system_certificate_list[];
 extern __initconst const unsigned long system_certificate_list_size;
 extern __initconst const unsigned long module_cert_size;
 
+struct key *find_asymmetric_pub_key(const struct asymmetric_key_id *id_0,
+				    const struct asymmetric_key_id *id_1,
+				    const struct asymmetric_key_id *id_2)
+{
+	return find_asymmetric_key(builtin_trusted_keys, id_0,
+				   id_1, id_2, false);
+}
+
 /**
  * restrict_link_by_builtin_trusted - Restrict keyring addition by built-in CA
  * @dest_keyring: Keyring being linked to.
diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index e5b177c8e842..ca0bb32e093a 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -134,6 +134,13 @@ software_key_determine_akcipher(const struct public_key *pkey,
 		n = snprintf(alg_name, CRYPTO_MAX_ALG_NAME, "%s(%s)",
 			     encoding, pkey->pkey_algo);
 		return n >= CRYPTO_MAX_ALG_NAME ? -EINVAL : 0;
+	} else if (strcmp(pkey->pkey_algo, "sm2") == 0) {
+		if (strcmp(encoding, "raw") != 0)
+			return -EINVAL;
+		if (!hash_algo)
+			return -EINVAL;
+		if (strcmp(hash_algo, "sm3") != 0)
+			return -EINVAL;
 	} else if (strcmp(pkey->pkey_algo, "ecrdsa") == 0) {
 		if (strcmp(encoding, "raw") != 0)
 			return -EINVAL;
diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index 8409d7d36cb4..62bbc423d632 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -7,6 +7,7 @@
 
 #define pr_fmt(fmt) "X.509: "fmt
 #include <crypto/hash.h>
+#include <crypto/sm2.h>
 #include <keys/asymmetric-parser.h>
 #include <keys/asymmetric-subtype.h>
 #include <keys/system_keyring.h>
@@ -28,6 +29,8 @@ int x509_get_sig_params(struct x509_certificate *cert)
 	struct shash_desc *desc;
 	size_t desc_size;
 	int ret;
+	struct key *key;
+	struct public_key *pkey;
 
 	pr_devel("==>%s()\n", __func__);
 
@@ -63,8 +66,30 @@ int x509_get_sig_params(struct x509_certificate *cert)
 
 	desc->tfm = tfm;
 
-	ret = crypto_shash_digest(desc, cert->tbs, cert->tbs_size,
+	if (strcmp(cert->pub->pkey_algo, "sm2") == 0) {
+		if (!sig->auth_ids[0] && !sig->auth_ids[1] && !sig->auth_ids[2])
+			return -ENOKEY;
+
+		key = find_asymmetric_pub_key(sig->auth_ids[0], sig->auth_ids[1],
+					      sig->auth_ids[2]);
+		if (IS_ERR(key))
+			pkey = cert->pub;
+		else
+			pkey = key->payload.data[asym_crypto];
+
+		ret = strcmp(sig->hash_algo, "sm3") != 0 ? -EINVAL :
+			crypto_shash_init(desc) ?:
+			sm2_compute_z_digest(desc, pkey->key,
+					     pkey->keylen, sig->digest) ?:
+			crypto_shash_init(desc) ?:
+			crypto_shash_update(desc, sig->digest,
+					    sig->digest_size) ?:
+			crypto_shash_finup(desc, cert->tbs, cert->tbs_size,
+					   sig->digest);
+	} else {
+		ret = crypto_shash_digest(desc, cert->tbs, cert->tbs_size,
 				  sig->digest);
+	}
 
 	if (ret < 0)
 		goto error_2;
diff --git a/include/keys/system_keyring.h b/include/keys/system_keyring.h
index a6c2897bcc63..21b466e5d2f3 100644
--- a/include/keys/system_keyring.h
+++ b/include/keys/system_keyring.h
@@ -10,6 +10,8 @@
 
 #include <linux/key.h>
 
+struct asymmetric_key_id;
+
 enum blacklist_hash_type {
 	/* TBSCertificate hash */
 	BLACKLIST_HASH_X509_TBS = 1,
@@ -19,6 +21,10 @@ enum blacklist_hash_type {
 
 #ifdef CONFIG_SYSTEM_TRUSTED_KEYRING
 
+extern struct key *find_asymmetric_pub_key(const struct asymmetric_key_id *id_0,
+					   const struct asymmetric_key_id *id_1,
+					   const struct asymmetric_key_id *id_2);
+
 extern int restrict_link_by_builtin_trusted(struct key *keyring,
 					    const struct key_type *type,
 					    const union key_payload *payload,
@@ -30,6 +36,13 @@ int restrict_link_by_digsig_builtin(struct key *dest_keyring,
 extern __init int load_module_cert(struct key *keyring);
 
 #else
+static inline struct key *find_asymmetric_pub_key(const struct asymmetric_key_id *id_0,
+						  const struct asymmetric_key_id *id_1,
+						  const struct asymmetric_key_id *id_2)
+{
+	return NULL;
+}
+
 #define restrict_link_by_builtin_trusted restrict_link_reject
 #define restrict_link_by_digsig_builtin restrict_link_reject
 
-- 
2.25.1




* Re: [PATCH RFC 0/4] Reintroduce the sm2 algorithm
  2025-06-30 13:39 [PATCH RFC 0/4] Reintroduce the sm2 algorithm Gu Bowen
                   ` (3 preceding siblings ...)
  2025-06-30 13:39 ` [PATCH RFC 4/4] crypto/sm2: support SM2-with-SM3 verification of X.509 certificates Gu Bowen
@ 2025-06-30 19:41 ` Dan Carpenter
  2025-07-01  3:49   ` Gu Bowen
  2025-07-03 13:14 ` Jason A. Donenfeld
  5 siblings, 1 reply; 13+ messages in thread
From: Dan Carpenter @ 2025-06-30 19:41 UTC (permalink / raw)
  To: Gu Bowen
  Cc: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, keyrings,
	linux-kernel, linux-crypto, linux-stm32, linux-arm-kernel,
	Lu Jialin, GONG Ruiqi

On Mon, Jun 30, 2025 at 09:39:30PM +0800, Gu Bowen wrote:
> To reintroduce the sm2 algorithm, the patch set did the following:
>  - Reintroduce the mpi library based on libgcrypt.
>  - Reintroduce ec implementation to MPI library.
>  - Rework sm2 algorithm.
>  - Support verification of X.509 certificates.

Remind me, why did we remove these?

regards,
dan carpenter




* Re: [PATCH RFC 0/4] Reintroduce the sm2 algorithm
  2025-06-30 19:41 ` [PATCH RFC 0/4] Reintroduce the sm2 algorithm Dan Carpenter
@ 2025-07-01  3:49   ` Gu Bowen
  0 siblings, 0 replies; 13+ messages in thread
From: Gu Bowen @ 2025-07-01  3:49 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, keyrings,
	linux-kernel, linux-crypto, linux-stm32, linux-arm-kernel,
	Lu Jialin, GONG Ruiqi

Hi,

On 7/1/2025 3:41 AM, Dan Carpenter wrote:
> On Mon, Jun 30, 2025 at 09:39:30PM +0800, Gu Bowen wrote:
>> To reintroduce the sm2 algorithm, the patch set did the following:
>>   - Reintroduce the mpi library based on libgcrypt.
>>   - Reintroduce ec implementation to MPI library.
>>   - Rework sm2 algorithm.
>>   - Support verification of X.509 certificates.
> 
> Remind me, why did we remove these?
> 

At first, the digest calculation for SM2 certificates was coupled with
signature verification. That unreasonable arrangement was corrected by
commit e5221fa6a355 ("KEYS: asymmetric: Move sm2 code into
x509_public_key"). However, because of SM2's special implementation, that
commit also left SM2 unable to verify secondary certificates. The issue
was never resolved, which led to the removal of the sm2 algorithm.



* Re: [PATCH RFC 2/4] Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
  2025-06-30 13:39 ` [PATCH RFC 2/4] Revert "Revert "lib/mpi: Introduce ec implementation to " Gu Bowen
@ 2025-07-02 15:18   ` Ignat Korchagin
  0 siblings, 0 replies; 13+ messages in thread
From: Ignat Korchagin @ 2025-07-02 15:18 UTC (permalink / raw)
  To: Gu Bowen
  Cc: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	David S . Miller, Jarkko Sakkinen, Maxime Coquelin,
	Alexandre Torgue, Eric Biggers, Jason A . Donenfeld,
	Ard Biesheuvel, Tianjia Zhang, Dan Carpenter, keyrings,
	linux-kernel, linux-crypto, linux-stm32, linux-arm-kernel,
	Lu Jialin, GONG Ruiqi

On Mon, Jun 30, 2025 at 3:27 PM Gu Bowen <gubowen5@huawei.com> wrote:
>
> This reverts commit da4fe6815aca25603944a64b0965310512e867d0.
>
> Reintroduce ec implementation to MPI library to support sm2.

Sorry for my potential ignorance on the question, but can it be
implemented by extending existing ECC primitives (the ones used for
ECDSA)? Feels a bit weird having different ECC primitives for
different algorithms (well curve25519 being an exception...)

> Signed-off-by: Gu Bowen <gubowen5@huawei.com>
> ---
>  include/linux/mpi.h     |  105 +++
>  lib/crypto/mpi/Makefile |    1 +
>  lib/crypto/mpi/ec.c     | 1507 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 1613 insertions(+)
>  create mode 100644 lib/crypto/mpi/ec.c
>
> diff --git a/include/linux/mpi.h b/include/linux/mpi.h
> index 9ad7e7231ee9..3317effe57ba 100644
> --- a/include/linux/mpi.h
> +++ b/include/linux/mpi.h
> @@ -157,6 +157,111 @@ void mpi_fdiv_q(MPI quot, MPI dividend, MPI divisor);
>  /*-- mpi-inv.c --*/
>  int mpi_invm(MPI x, MPI a, MPI n);
>
> +/*-- ec.c --*/
> +
> +/* Object to represent a point in projective coordinates */
> +struct gcry_mpi_point {
> +       MPI x;
> +       MPI y;
> +       MPI z;
> +};
> +
> +typedef struct gcry_mpi_point *MPI_POINT;
> +
> +/* Models describing an elliptic curve */
> +enum gcry_mpi_ec_models {
> +       /* The Short Weierstrass equation is
> +        *      y^2 = x^3 + ax + b
> +        */
> +       MPI_EC_WEIERSTRASS = 0,
> +       /* The Montgomery equation is
> +        *      by^2 = x^3 + ax^2 + x
> +        */
> +       MPI_EC_MONTGOMERY,
> +       /* The Twisted Edwards equation is
> +        *      ax^2 + y^2 = 1 + bx^2y^2
> +        * Note that we use 'b' instead of the commonly used 'd'.
> +        */
> +       MPI_EC_EDWARDS
> +};
> +
> +/* Dialects used with elliptic curves */
> +enum ecc_dialects {
> +       ECC_DIALECT_STANDARD = 0,
> +       ECC_DIALECT_ED25519,
> +       ECC_DIALECT_SAFECURVE
> +};
> +
> +/* This context is used with all our EC functions. */
> +struct mpi_ec_ctx {
> +       enum gcry_mpi_ec_models model; /* The model describing this curve. */
> +       enum ecc_dialects dialect;     /* The ECC dialect used with the curve. */
> +       int flags;                     /* Public key flags (not always used). */
> +       unsigned int nbits;            /* Number of bits.  */
> +
> +       /* Domain parameters.  Note that they may not all be set and if set
> +        * the MPIs may be flagged as constant.
> +        */
> +       MPI p;         /* Prime specifying the field GF(p).  */
> +       MPI a;         /* First coefficient of the Weierstrass equation.  */
> +       MPI b;         /* Second coefficient of the Weierstrass equation.  */
> +       MPI_POINT G;   /* Base point (generator).  */
> +       MPI n;         /* Order of G.  */
> +       unsigned int h;       /* Cofactor.  */
> +
> +       /* The actual key.  May not be set.  */
> +       MPI_POINT Q;   /* Public key.   */
> +       MPI d;         /* Private key.  */
> +
> +       const char *name;      /* Name of the curve.  */
> +
> +       /* This structure is private to mpi/ec.c! */
> +       struct {
> +               struct {
> +                       unsigned int a_is_pminus3:1;
> +                       unsigned int two_inv_p:1;
> +               } valid; /* Flags to help setting the helper vars below.  */
> +
> +               int a_is_pminus3;  /* True if A = P - 3. */
> +
> +               MPI two_inv_p;
> +
> +               mpi_barrett_t p_barrett;
> +
> +               /* Scratch variables.  */
> +               MPI scratch[11];
> +
> +               /* Helper for fast reduction.  */
> +               /*   int nist_nbits; /\* If this is a NIST curve, the # of bits. *\/ */
> +               /*   MPI s[10]; */
> +               /*   MPI c; */
> +       } t;
> +
> +       /* Curve specific computation routines for the field.  */
> +       void (*addm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
> +       void (*subm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ec);
> +       void (*mulm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
> +       void (*pow2)(MPI w, const MPI b, struct mpi_ec_ctx *ctx);
> +       void (*mul2)(MPI w, MPI u, struct mpi_ec_ctx *ctx);
> +};
> +
> +void mpi_ec_init(struct mpi_ec_ctx *ctx, enum gcry_mpi_ec_models model,
> +                       enum ecc_dialects dialect,
> +                       int flags, MPI p, MPI a, MPI b);
> +void mpi_ec_deinit(struct mpi_ec_ctx *ctx);
> +MPI_POINT mpi_point_new(unsigned int nbits);
> +void mpi_point_release(MPI_POINT p);
> +void mpi_point_init(MPI_POINT p);
> +void mpi_point_free_parts(MPI_POINT p);
> +int mpi_ec_get_affine(MPI x, MPI y, MPI_POINT point, struct mpi_ec_ctx *ctx);
> +void mpi_ec_add_points(MPI_POINT result,
> +                       MPI_POINT p1, MPI_POINT p2,
> +                       struct mpi_ec_ctx *ctx);
> +void mpi_ec_mul_point(MPI_POINT result,
> +                       MPI scalar, MPI_POINT point,
> +                       struct mpi_ec_ctx *ctx);
> +int mpi_ec_curve_point(MPI_POINT point, struct mpi_ec_ctx *ctx);
> +
>  /* inline functions */
>
>  /**
> diff --git a/lib/crypto/mpi/Makefile b/lib/crypto/mpi/Makefile
> index 477debd7ed50..6e6ef9a34fe1 100644
> --- a/lib/crypto/mpi/Makefile
> +++ b/lib/crypto/mpi/Makefile
> @@ -13,6 +13,7 @@ mpi-y = \
>         generic_mpih-rshift.o           \
>         generic_mpih-sub1.o             \
>         generic_mpih-add1.o             \
> +       ec.o                            \
>         mpicoder.o                      \
>         mpi-add.o                       \
>         mpi-bit.o                       \
> diff --git a/lib/crypto/mpi/ec.c b/lib/crypto/mpi/ec.c
> new file mode 100644
> index 000000000000..4781f00982ef
> --- /dev/null
> +++ b/lib/crypto/mpi/ec.c
> @@ -0,0 +1,1507 @@
> +/* ec.c -  Elliptic Curve functions
> + * Copyright (C) 2007 Free Software Foundation, Inc.
> + * Copyright (C) 2013 g10 Code GmbH
> + *
> + * This file is part of Libgcrypt.
> + *
> + * Libgcrypt is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as
> + * published by the Free Software Foundation; either version 2.1 of
> + * the License, or (at your option) any later version.
> + *
> + * Libgcrypt is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "mpi-internal.h"
> +#include "longlong.h"
> +
> +#define point_init(a)  mpi_point_init((a))
> +#define point_free(a)  mpi_point_free_parts((a))
> +
> +#define log_error(fmt, ...) pr_err(fmt, ##__VA_ARGS__)
> +#define log_fatal(fmt, ...) pr_err(fmt, ##__VA_ARGS__)
> +
> +#define DIM(v) (sizeof(v)/sizeof((v)[0]))
> +
> +
> +/* Create a new point option.  NBITS gives the size in bits of one
> + * coordinate; it is only used to pre-allocate some resources and
> + * might also be passed as 0 to use a default value.
> + */
> +MPI_POINT mpi_point_new(unsigned int nbits)
> +{
> +       MPI_POINT p;
> +
> +       (void)nbits;  /* Currently not used.  */
> +
> +       p = kmalloc(sizeof(*p), GFP_KERNEL);
> +       if (p)
> +               mpi_point_init(p);
> +       return p;
> +}
> +EXPORT_SYMBOL_GPL(mpi_point_new);
> +
> +/* Release the point object P.  P may be NULL. */
> +void mpi_point_release(MPI_POINT p)
> +{
> +       if (p) {
> +               mpi_point_free_parts(p);
> +               kfree(p);
> +       }
> +}
> +EXPORT_SYMBOL_GPL(mpi_point_release);
> +
> +/* Initialize the fields of a point object.  gcry_mpi_point_free_parts
> + * may be used to release the fields.
> + */
> +void mpi_point_init(MPI_POINT p)
> +{
> +       p->x = mpi_new(0);
> +       p->y = mpi_new(0);
> +       p->z = mpi_new(0);
> +}
> +EXPORT_SYMBOL_GPL(mpi_point_init);
> +
> +/* Release the parts of a point object. */
> +void mpi_point_free_parts(MPI_POINT p)
> +{
> +       mpi_free(p->x); p->x = NULL;
> +       mpi_free(p->y); p->y = NULL;
> +       mpi_free(p->z); p->z = NULL;
> +}
> +EXPORT_SYMBOL_GPL(mpi_point_free_parts);
> +
> +/* Set the value from S into D.  */
> +static void point_set(MPI_POINT d, MPI_POINT s)
> +{
> +       mpi_set(d->x, s->x);
> +       mpi_set(d->y, s->y);
> +       mpi_set(d->z, s->z);
> +}
> +
> +static void point_resize(MPI_POINT p, struct mpi_ec_ctx *ctx)
> +{
> +       size_t nlimbs = ctx->p->nlimbs;
> +
> +       mpi_resize(p->x, nlimbs);
> +       p->x->nlimbs = nlimbs;
> +       mpi_resize(p->z, nlimbs);
> +       p->z->nlimbs = nlimbs;
> +
> +       if (ctx->model != MPI_EC_MONTGOMERY) {
> +               mpi_resize(p->y, nlimbs);
> +               p->y->nlimbs = nlimbs;
> +       }
> +}
> +
> +static void point_swap_cond(MPI_POINT d, MPI_POINT s, unsigned long swap,
> +               struct mpi_ec_ctx *ctx)
> +{
> +       mpi_swap_cond(d->x, s->x, swap);
> +       if (ctx->model != MPI_EC_MONTGOMERY)
> +               mpi_swap_cond(d->y, s->y, swap);
> +       mpi_swap_cond(d->z, s->z, swap);
> +}
> +
> +
> +/* W = W mod P.  */
> +static void ec_mod(MPI w, struct mpi_ec_ctx *ec)
> +{
> +       if (ec->t.p_barrett)
> +               mpi_mod_barrett(w, w, ec->t.p_barrett);
> +       else
> +               mpi_mod(w, w, ec->p);
> +}
> +
> +static void ec_addm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_add(w, u, v);
> +       ec_mod(w, ctx);
> +}
> +
> +static void ec_subm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ec)
> +{
> +       mpi_sub(w, u, v);
> +       while (w->sign)
> +               mpi_add(w, w, ec->p);
> +       /*ec_mod(w, ec);*/
> +}
> +
> +static void ec_mulm(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_mul(w, u, v);
> +       ec_mod(w, ctx);
> +}
> +
> +/* W = 2 * U mod P.  */
> +static void ec_mul2(MPI w, MPI u, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_lshift(w, u, 1);
> +       ec_mod(w, ctx);
> +}
> +
> +static void ec_powm(MPI w, const MPI b, const MPI e,
> +               struct mpi_ec_ctx *ctx)
> +{
> +       mpi_powm(w, b, e, ctx->p);
> +       /* mpi_abs(w); */
> +}
> +
> +/* Shortcut for
> + * ec_powm(B, B, mpi_const(MPI_C_TWO), ctx);
> + * for easier optimization.
> + */
> +static void ec_pow2(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
> +{
> +       /* Using mpi_mul is slightly faster (at least on amd64).  */
> +       /* mpi_powm(w, b, mpi_const(MPI_C_TWO), ctx->p); */
> +       ec_mulm(w, b, b, ctx);
> +}
> +
> +/* Shortcut for
> + * ec_powm(B, B, mpi_const(MPI_C_THREE), ctx);
> + * for easier optimization.
> + */
> +static void ec_pow3(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_powm(w, b, mpi_const(MPI_C_THREE), ctx->p);
> +}
> +
> +static void ec_invm(MPI x, MPI a, struct mpi_ec_ctx *ctx)
> +{
> +       if (!mpi_invm(x, a, ctx->p))
> +               log_error("ec_invm: inverse does not exist:\n");
> +}
> +
> +static void mpih_set_cond(mpi_ptr_t wp, mpi_ptr_t up,
> +               mpi_size_t usize, unsigned long set)
> +{
> +       mpi_size_t i;
> +       mpi_limb_t mask = ((mpi_limb_t)0) - set;
> +       mpi_limb_t x;
> +
> +       for (i = 0; i < usize; i++) {
> +               x = mask & (wp[i] ^ up[i]);
> +               wp[i] = wp[i] ^ x;
> +       }
> +}
> +
> +/* Routines for 2^255 - 19.  */
> +
> +#define LIMB_SIZE_25519 ((256+BITS_PER_MPI_LIMB-1)/BITS_PER_MPI_LIMB)
> +
> +static void ec_addm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_ptr_t wp, up, vp;
> +       mpi_size_t wsize = LIMB_SIZE_25519;
> +       mpi_limb_t n[LIMB_SIZE_25519];
> +       mpi_limb_t borrow;
> +
> +       if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
> +               log_bug("addm_25519: different sizes\n");
> +
> +       memset(n, 0, sizeof(n));
> +       up = u->d;
> +       vp = v->d;
> +       wp = w->d;
> +
> +       mpihelp_add_n(wp, up, vp, wsize);
> +       borrow = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
> +       mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
> +       mpihelp_add_n(wp, wp, n, wsize);
> +       wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
> +}
> +
> +static void ec_subm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_ptr_t wp, up, vp;
> +       mpi_size_t wsize = LIMB_SIZE_25519;
> +       mpi_limb_t n[LIMB_SIZE_25519];
> +       mpi_limb_t borrow;
> +
> +       if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
> +               log_bug("subm_25519: different sizes\n");
> +
> +       memset(n, 0, sizeof(n));
> +       up = u->d;
> +       vp = v->d;
> +       wp = w->d;
> +
> +       borrow = mpihelp_sub_n(wp, up, vp, wsize);
> +       mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
> +       mpihelp_add_n(wp, wp, n, wsize);
> +       wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
> +}
> +
> +static void ec_mulm_25519(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_ptr_t wp, up, vp;
> +       mpi_size_t wsize = LIMB_SIZE_25519;
> +       mpi_limb_t n[LIMB_SIZE_25519*2];
> +       mpi_limb_t m[LIMB_SIZE_25519+1];
> +       mpi_limb_t cy;
> +       int msb;
> +
> +       (void)ctx;
> +       if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
> +               log_bug("mulm_25519: different sizes\n");
> +
> +       up = u->d;
> +       vp = v->d;
> +       wp = w->d;
> +
> +       mpihelp_mul_n(n, up, vp, wsize);
> +       memcpy(wp, n, wsize * BYTES_PER_MPI_LIMB);
> +       wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
> +
> +       memcpy(m, n+LIMB_SIZE_25519-1, (wsize+1) * BYTES_PER_MPI_LIMB);
> +       mpihelp_rshift(m, m, LIMB_SIZE_25519+1, (255 % BITS_PER_MPI_LIMB));
> +
> +       memcpy(n, m, wsize * BYTES_PER_MPI_LIMB);
> +       cy = mpihelp_lshift(m, m, LIMB_SIZE_25519, 4);
> +       m[LIMB_SIZE_25519] = cy;
> +       cy = mpihelp_add_n(m, m, n, wsize);
> +       m[LIMB_SIZE_25519] += cy;
> +       cy = mpihelp_add_n(m, m, n, wsize);
> +       m[LIMB_SIZE_25519] += cy;
> +       cy = mpihelp_add_n(m, m, n, wsize);
> +       m[LIMB_SIZE_25519] += cy;
> +
> +       cy = mpihelp_add_n(wp, wp, m, wsize);
> +       m[LIMB_SIZE_25519] += cy;
> +
> +       memset(m, 0, wsize * BYTES_PER_MPI_LIMB);
> +       msb = (wp[LIMB_SIZE_25519-1] >> (255 % BITS_PER_MPI_LIMB));
> +       m[0] = (m[LIMB_SIZE_25519] * 2 + msb) * 19;
> +       wp[LIMB_SIZE_25519-1] &= ~((mpi_limb_t)1 << (255 % BITS_PER_MPI_LIMB));
> +       mpihelp_add_n(wp, wp, m, wsize);
> +
> +       m[0] = 0;
> +       cy = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
> +       mpih_set_cond(m, ctx->p->d, wsize, (cy != 0UL));
> +       mpihelp_add_n(wp, wp, m, wsize);
> +}
> +
> +static void ec_mul2_25519(MPI w, MPI u, struct mpi_ec_ctx *ctx)
> +{
> +       ec_addm_25519(w, u, u, ctx);
> +}
> +
> +static void ec_pow2_25519(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
> +{
> +       ec_mulm_25519(w, b, b, ctx);
> +}
> +
> +/* Routines for 2^448 - 2^224 - 1.  */
> +
> +#define LIMB_SIZE_448 ((448+BITS_PER_MPI_LIMB-1)/BITS_PER_MPI_LIMB)
> +#define LIMB_SIZE_HALF_448 ((LIMB_SIZE_448+1)/2)
> +
> +static void ec_addm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_ptr_t wp, up, vp;
> +       mpi_size_t wsize = LIMB_SIZE_448;
> +       mpi_limb_t n[LIMB_SIZE_448];
> +       mpi_limb_t cy;
> +
> +       if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
> +               log_bug("addm_448: different sizes\n");
> +
> +       memset(n, 0, sizeof(n));
> +       up = u->d;
> +       vp = v->d;
> +       wp = w->d;
> +
> +       cy = mpihelp_add_n(wp, up, vp, wsize);
> +       mpih_set_cond(n, ctx->p->d, wsize, (cy != 0UL));
> +       mpihelp_sub_n(wp, wp, n, wsize);
> +}
> +
> +static void ec_subm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_ptr_t wp, up, vp;
> +       mpi_size_t wsize = LIMB_SIZE_448;
> +       mpi_limb_t n[LIMB_SIZE_448];
> +       mpi_limb_t borrow;
> +
> +       if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
> +               log_bug("subm_448: different sizes\n");
> +
> +       memset(n, 0, sizeof(n));
> +       up = u->d;
> +       vp = v->d;
> +       wp = w->d;
> +
> +       borrow = mpihelp_sub_n(wp, up, vp, wsize);
> +       mpih_set_cond(n, ctx->p->d, wsize, (borrow != 0UL));
> +       mpihelp_add_n(wp, wp, n, wsize);
> +}
> +
> +static void ec_mulm_448(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx)
> +{
> +       mpi_ptr_t wp, up, vp;
> +       mpi_size_t wsize = LIMB_SIZE_448;
> +       mpi_limb_t n[LIMB_SIZE_448*2];
> +       mpi_limb_t a2[LIMB_SIZE_HALF_448];
> +       mpi_limb_t a3[LIMB_SIZE_HALF_448];
> +       mpi_limb_t b0[LIMB_SIZE_HALF_448];
> +       mpi_limb_t b1[LIMB_SIZE_HALF_448];
> +       mpi_limb_t cy;
> +       int i;
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       mpi_limb_t b1_rest, a3_rest;
> +#endif
> +
> +       if (w->nlimbs != wsize || u->nlimbs != wsize || v->nlimbs != wsize)
> +               log_bug("mulm_448: different sizes\n");
> +
> +       up = u->d;
> +       vp = v->d;
> +       wp = w->d;
> +
> +       mpihelp_mul_n(n, up, vp, wsize);
> +
> +       for (i = 0; i < (wsize + 1) / 2; i++) {
> +               b0[i] = n[i];
> +               b1[i] = n[i+wsize/2];
> +               a2[i] = n[i+wsize];
> +               a3[i] = n[i+wsize+wsize/2];
> +       }
> +
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       b0[LIMB_SIZE_HALF_448-1] &= ((mpi_limb_t)1UL << 32)-1;
> +       a2[LIMB_SIZE_HALF_448-1] &= ((mpi_limb_t)1UL << 32)-1;
> +
> +       b1_rest = 0;
> +       a3_rest = 0;
> +
> +       for (i = (wsize + 1) / 2 - 1; i >= 0; i--) {
> +               mpi_limb_t b1v, a3v;
> +               b1v = b1[i];
> +               a3v = a3[i];
> +               b1[i] = (b1_rest << 32) | (b1v >> 32);
> +               a3[i] = (a3_rest << 32) | (a3v >> 32);
> +               b1_rest = b1v & (((mpi_limb_t)1UL << 32)-1);
> +               a3_rest = a3v & (((mpi_limb_t)1UL << 32)-1);
> +       }
> +#endif
> +
> +       cy = mpihelp_add_n(b0, b0, a2, LIMB_SIZE_HALF_448);
> +       cy += mpihelp_add_n(b0, b0, a3, LIMB_SIZE_HALF_448);
> +       for (i = 0; i < (wsize + 1) / 2; i++)
> +               wp[i] = b0[i];
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       wp[LIMB_SIZE_HALF_448-1] &= (((mpi_limb_t)1UL << 32)-1);
> +#endif
> +
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       cy = b0[LIMB_SIZE_HALF_448-1] >> 32;
> +#endif
> +
> +       cy = mpihelp_add_1(b1, b1, LIMB_SIZE_HALF_448, cy);
> +       cy += mpihelp_add_n(b1, b1, a2, LIMB_SIZE_HALF_448);
> +       cy += mpihelp_add_n(b1, b1, a3, LIMB_SIZE_HALF_448);
> +       cy += mpihelp_add_n(b1, b1, a3, LIMB_SIZE_HALF_448);
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       b1_rest = 0;
> +       for (i = (wsize + 1) / 2 - 1; i >= 0; i--) {
> +               mpi_limb_t b1v = b1[i];
> +               b1[i] = (b1_rest << 32) | (b1v >> 32);
> +               b1_rest = b1v & (((mpi_limb_t)1UL << 32)-1);
> +       }
> +       wp[LIMB_SIZE_HALF_448-1] |= (b1_rest << 32);
> +#endif
> +       for (i = 0; i < wsize / 2; i++)
> +               wp[i+(wsize + 1) / 2] = b1[i];
> +
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       cy = b1[LIMB_SIZE_HALF_448-1];
> +#endif
> +
> +       memset(n, 0, wsize * BYTES_PER_MPI_LIMB);
> +
> +#if (LIMB_SIZE_HALF_448 > LIMB_SIZE_448/2)
> +       n[LIMB_SIZE_HALF_448-1] = cy << 32;
> +#else
> +       n[LIMB_SIZE_HALF_448] = cy;
> +#endif
> +       n[0] = cy;
> +       mpihelp_add_n(wp, wp, n, wsize);
> +
> +       memset(n, 0, wsize * BYTES_PER_MPI_LIMB);
> +       cy = mpihelp_sub_n(wp, wp, ctx->p->d, wsize);
> +       mpih_set_cond(n, ctx->p->d, wsize, (cy != 0UL));
> +       mpihelp_add_n(wp, wp, n, wsize);
> +}
> +
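
For readers following the limb shuffling in ec_mulm_448(): the reduction relies on p = 2^448 - 2^224 - 1, so 2^448 is congruent to 2^224 + 1 modulo p. The sketch below applies the same folding to a toy modulus 2^16 - 2^8 - 1 with 8-bit "halves". It is not part of the patch: TOY_P, toy_reduce() and toy_reduce_selftest() are made-up names, and the final loop only stands in for the patch's subtract-then-conditionally-add-back step.

```c
#include <assert.h>
#include <stdint.h>

/* Toy modulus shaped like the Curve448 prime: 2^16 - 2^8 - 1 (assumption). */
#define TOY_P 65279u

/*
 * Reduce a 32-bit value modulo TOY_P.  Split it into four 8-bit chunks,
 * n = b0 + b1*2^8 + a2*2^16 + a3*2^24, and use
 * 2^16 == 2^8 + 1 and 2^24 == 2*2^8 + 1 (mod TOY_P).
 */
static uint32_t toy_reduce(uint32_t n)
{
	uint32_t b0 = n & 0xff;
	uint32_t b1 = (n >> 8) & 0xff;
	uint32_t a2 = (n >> 16) & 0xff;
	uint32_t a3 = n >> 24;
	uint32_t lo = b0 + a2 + a3;      /* low half:  b0 + a2 + a3   */
	uint32_t hi = b1 + a2 + 2 * a3;  /* high half: b1 + a2 + 2*a3 */
	uint32_t r = lo + (hi << 8);

	/* Plain, non-constant-time loop for clarity only. */
	while (r >= TOY_P)
		r -= TOY_P;
	return r;
}

/* Compare against the generic remainder for a few inputs. */
void toy_reduce_selftest(void)
{
	assert(toy_reduce(0) == 0);
	assert(toy_reduce(TOY_P) == 0);
	assert(toy_reduce(0xffffffffu) == 0xffffffffu % TOY_P);
}
```

In the toy version the top chunk a3 contributes twice to the high half, which mirrors the two consecutive mpihelp_add_n(b1, b1, a3, ...) calls in the quoted function.
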
> +static void ec_mul2_448(MPI w, MPI u, struct mpi_ec_ctx *ctx)
> +{
> +       ec_addm_448(w, u, u, ctx);
> +}
> +
> +static void ec_pow2_448(MPI w, const MPI b, struct mpi_ec_ctx *ctx)
> +{
> +       ec_mulm_448(w, b, b, ctx);
> +}
> +
> +struct field_table {
> +       const char *p;
> +
> +       /* computation routines for the field.  */
> +       void (*addm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
> +       void (*subm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
> +       void (*mulm)(MPI w, MPI u, MPI v, struct mpi_ec_ctx *ctx);
> +       void (*mul2)(MPI w, MPI u, struct mpi_ec_ctx *ctx);
> +       void (*pow2)(MPI w, const MPI b, struct mpi_ec_ctx *ctx);
> +};
> +
> +static const struct field_table field_table[] = {
> +       {
> +               "0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFED",
> +               ec_addm_25519,
> +               ec_subm_25519,
> +               ec_mulm_25519,
> +               ec_mul2_25519,
> +               ec_pow2_25519
> +       },
> +       {
> +               "0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE"
> +               "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
> +               ec_addm_448,
> +               ec_subm_448,
> +               ec_mulm_448,
> +               ec_mul2_448,
> +               ec_pow2_448
> +       },
> +       { NULL, NULL, NULL, NULL, NULL, NULL },
> +};
> +
> +/* Force recomputation of all helper variables.  */
> +static void mpi_ec_get_reset(struct mpi_ec_ctx *ec)
> +{
> +       ec->t.valid.a_is_pminus3 = 0;
> +       ec->t.valid.two_inv_p = 0;
> +}
> +
> +/* Accessor for helper variable.  */
> +static int ec_get_a_is_pminus3(struct mpi_ec_ctx *ec)
> +{
> +       MPI tmp;
> +
> +       if (!ec->t.valid.a_is_pminus3) {
> +               ec->t.valid.a_is_pminus3 = 1;
> +               tmp = mpi_alloc_like(ec->p);
> +               mpi_sub_ui(tmp, ec->p, 3);
> +               ec->t.a_is_pminus3 = !mpi_cmp(ec->a, tmp);
> +               mpi_free(tmp);
> +       }
> +
> +       return ec->t.a_is_pminus3;
> +}
> +
> +/* Accessor for helper variable.  */
> +static MPI ec_get_two_inv_p(struct mpi_ec_ctx *ec)
> +{
> +       if (!ec->t.valid.two_inv_p) {
> +               ec->t.valid.two_inv_p = 1;
> +               if (!ec->t.two_inv_p)
> +                       ec->t.two_inv_p = mpi_alloc(0);
> +               ec_invm(ec->t.two_inv_p, mpi_const(MPI_C_TWO), ec);
> +       }
> +       return ec->t.two_inv_p;
> +}
> +
> +static const char *const curve25519_bad_points[] = {
> +       "0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffed",
> +       "0x0000000000000000000000000000000000000000000000000000000000000000",
> +       "0x0000000000000000000000000000000000000000000000000000000000000001",
> +       "0x00b8495f16056286fdb1329ceb8d09da6ac49ff1fae35616aeb8413b7c7aebe0",
> +       "0x57119fd0dd4e22d8868e1c58c45c44045bef839c55b1d0b1248c50a3bc959c5f",
> +       "0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffec",
> +       "0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffee",
> +       NULL
> +};
> +
> +static const char *const curve448_bad_points[] = {
> +       "0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffe"
> +       "ffffffffffffffffffffffffffffffffffffffffffffffffffffffff",
> +       "0x00000000000000000000000000000000000000000000000000000000"
> +       "00000000000000000000000000000000000000000000000000000000",
> +       "0x00000000000000000000000000000000000000000000000000000000"
> +       "00000000000000000000000000000000000000000000000000000001",
> +       "0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffe"
> +       "fffffffffffffffffffffffffffffffffffffffffffffffffffffffe",
> +       "0xffffffffffffffffffffffffffffffffffffffffffffffffffffffff"
> +       "00000000000000000000000000000000000000000000000000000000",
> +       NULL
> +};
> +
> +static const char *const *bad_points_table[] = {
> +       curve25519_bad_points,
> +       curve448_bad_points,
> +};
> +
> +static void mpi_ec_coefficient_normalize(MPI a, MPI p)
> +{
> +       if (a->sign) {
> +               mpi_resize(a, p->nlimbs);
> +               mpihelp_sub_n(a->d, p->d, a->d, p->nlimbs);
> +               a->nlimbs = p->nlimbs;
> +               a->sign = 0;
> +       }
> +}
> +
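
As a small illustration of why mpi_ec_coefficient_normalize() matters for the a == p - 3 fast path: a curve given with a = -3 only compares equal to p - 3 after the sign has been folded into [0, p). The sketch below shows this on machine integers. It is not part of the patch; coeff_normalize(), a_is_pminus3() and coeff_selftest() are hypothetical helpers, and it assumes |a| < p and p < 2^31.

```c
#include <assert.h>
#include <stdint.h>

/* Map a signed coefficient into [0, p), assuming |a| < p. */
static uint32_t coeff_normalize(int32_t a, uint32_t p)
{
	if (a < 0) {
		/* Magnitude computed in unsigned arithmetic to avoid overflow. */
		uint32_t mag = 0u - (uint32_t)a;

		return p - mag;
	}
	return (uint32_t)a;
}

/* Nonzero if the normalized coefficient equals p - 3. */
static int a_is_pminus3(int32_t a, uint32_t p)
{
	return coeff_normalize(a, p) == p - 3;
}

/* A few spot checks over the toy prime 97. */
void coeff_selftest(void)
{
	assert(coeff_normalize(-3, 97) == 94);
	assert(a_is_pminus3(-3, 97));
	assert(!a_is_pminus3(2, 97));
}
```
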
> +/* This function initializes a context for an elliptic curve based on the
> + * field GF(p).  P is the prime specifying this field, A is the first
> + * coefficient.  CTX is expected to be zeroized.
> + */
> +void mpi_ec_init(struct mpi_ec_ctx *ctx, enum gcry_mpi_ec_models model,
> +                       enum ecc_dialects dialect,
> +                       int flags, MPI p, MPI a, MPI b)
> +{
> +       int i;
> +       static int use_barrett = -1 /* TODO: 1 or -1 */;
> +
> +       mpi_ec_coefficient_normalize(a, p);
> +       mpi_ec_coefficient_normalize(b, p);
> +
> +       /* Fixme: Do we want to check some constraints? e.g.  a < p  */
> +
> +       ctx->model = model;
> +       ctx->dialect = dialect;
> +       ctx->flags = flags;
> +       if (dialect == ECC_DIALECT_ED25519)
> +               ctx->nbits = 256;
> +       else
> +               ctx->nbits = mpi_get_nbits(p);
> +       ctx->p = mpi_copy(p);
> +       ctx->a = mpi_copy(a);
> +       ctx->b = mpi_copy(b);
> +
> +       ctx->d = NULL;
> +       ctx->t.two_inv_p = NULL;
> +
> +       ctx->t.p_barrett = use_barrett > 0 ? mpi_barrett_init(ctx->p, 0) : NULL;
> +
> +       mpi_ec_get_reset(ctx);
> +
> +       if (model == MPI_EC_MONTGOMERY) {
> +               for (i = 0; i < DIM(bad_points_table); i++) {
> +                       MPI p_candidate = mpi_scanval(bad_points_table[i][0]);
> +                       int match_p = !mpi_cmp(ctx->p, p_candidate);
> +                       int j;
> +
> +                       mpi_free(p_candidate);
> +                       if (!match_p)
> +                               continue;
> +
> +                       for (j = 0; j < DIM(ctx->t.scratch) && bad_points_table[i][j]; j++)
> +                               ctx->t.scratch[j] = mpi_scanval(bad_points_table[i][j]);
> +               }
> +       } else {
> +               /* Allocate scratch variables.  */
> +               for (i = 0; i < DIM(ctx->t.scratch); i++)
> +                       ctx->t.scratch[i] = mpi_alloc_like(ctx->p);
> +       }
> +
> +       ctx->addm = ec_addm;
> +       ctx->subm = ec_subm;
> +       ctx->mulm = ec_mulm;
> +       ctx->mul2 = ec_mul2;
> +       ctx->pow2 = ec_pow2;
> +
> +       for (i = 0; field_table[i].p; i++) {
> +               MPI f_p;
> +
> +               f_p = mpi_scanval(field_table[i].p);
> +               if (!f_p)
> +                       break;
> +
> +               if (!mpi_cmp(p, f_p)) {
> +                       ctx->addm = field_table[i].addm;
> +                       ctx->subm = field_table[i].subm;
> +                       ctx->mulm = field_table[i].mulm;
> +                       ctx->mul2 = field_table[i].mul2;
> +                       ctx->pow2 = field_table[i].pow2;
> +                       mpi_free(f_p);
> +
> +                       mpi_resize(ctx->a, ctx->p->nlimbs);
> +                       ctx->a->nlimbs = ctx->p->nlimbs;
> +
> +                       mpi_resize(ctx->b, ctx->p->nlimbs);
> +                       ctx->b->nlimbs = ctx->p->nlimbs;
> +
> +                       for (i = 0; i < DIM(ctx->t.scratch) && ctx->t.scratch[i]; i++)
> +                               ctx->t.scratch[i]->nlimbs = ctx->p->nlimbs;
> +
> +                       break;
> +               }
> +
> +               mpi_free(f_p);
> +       }
> +}
> +EXPORT_SYMBOL_GPL(mpi_ec_init);
> +
> +void mpi_ec_deinit(struct mpi_ec_ctx *ctx)
> +{
> +       int i;
> +
> +       mpi_barrett_free(ctx->t.p_barrett);
> +
> +       /* Domain parameter.  */
> +       mpi_free(ctx->p);
> +       mpi_free(ctx->a);
> +       mpi_free(ctx->b);
> +       mpi_point_release(ctx->G);
> +       mpi_free(ctx->n);
> +
> +       /* The key.  */
> +       mpi_point_release(ctx->Q);
> +       mpi_free(ctx->d);
> +
> +       /* Private data of ec.c.  */
> +       mpi_free(ctx->t.two_inv_p);
> +
> +       for (i = 0; i < DIM(ctx->t.scratch); i++)
> +               mpi_free(ctx->t.scratch[i]);
> +}
> +EXPORT_SYMBOL_GPL(mpi_ec_deinit);
> +
> +/* Compute the affine coordinates from the projective coordinates in
> + * POINT.  Set them into X and Y.  If one coordinate is not required,
> + * X or Y may be passed as NULL.  CTX is the usual context. Returns: 0
> + * on success or !0 if POINT is at infinity.
> + */
> +int mpi_ec_get_affine(MPI x, MPI y, MPI_POINT point, struct mpi_ec_ctx *ctx)
> +{
> +       if (!mpi_cmp_ui(point->z, 0))
> +               return -1;
> +
> +       switch (ctx->model) {
> +       case MPI_EC_WEIERSTRASS: /* Using Jacobian coordinates.  */
> +               {
> +                       MPI z1, z2, z3;
> +
> +                       z1 = mpi_new(0);
> +                       z2 = mpi_new(0);
> +                       ec_invm(z1, point->z, ctx);  /* z1 = z^(-1) mod p  */
> +                       ec_mulm(z2, z1, z1, ctx);    /* z2 = z^(-2) mod p  */
> +
> +                       if (x)
> +                               ec_mulm(x, point->x, z2, ctx);
> +
> +                       if (y) {
> +                               z3 = mpi_new(0);
> +                               ec_mulm(z3, z2, z1, ctx);      /* z3 = z^(-3) mod p */
> +                               ec_mulm(y, point->y, z3, ctx);
> +                               mpi_free(z3);
> +                       }
> +
> +                       mpi_free(z2);
> +                       mpi_free(z1);
> +               }
> +               return 0;
> +
> +       case MPI_EC_MONTGOMERY:
> +               {
> +                       if (x)
> +                               mpi_set(x, point->x);
> +
> +                       if (y) {
> +                               log_fatal("%s: Getting Y-coordinate on %s is not supported\n",
> +                                               "mpi_ec_get_affine", "Montgomery");
> +                               return -1;
> +                       }
> +               }
> +               return 0;
> +
> +       case MPI_EC_EDWARDS:
> +               {
> +                       MPI z;
> +
> +                       z = mpi_new(0);
> +                       ec_invm(z, point->z, ctx);
> +
> +                       mpi_resize(z, ctx->p->nlimbs);
> +                       z->nlimbs = ctx->p->nlimbs;
> +
> +                       if (x) {
> +                               mpi_resize(x, ctx->p->nlimbs);
> +                               x->nlimbs = ctx->p->nlimbs;
> +                               ctx->mulm(x, point->x, z, ctx);
> +                       }
> +                       if (y) {
> +                               mpi_resize(y, ctx->p->nlimbs);
> +                               y->nlimbs = ctx->p->nlimbs;
> +                               ctx->mulm(y, point->y, z, ctx);
> +                       }
> +
> +                       mpi_free(z);
> +               }
> +               return 0;
> +
> +       default:
> +               return -1;
> +       }
> +}
> +EXPORT_SYMBOL_GPL(mpi_ec_get_affine);
> +
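
The Weierstrass branch of mpi_ec_get_affine() computes x = X / Z^2 and y = Y / Z^3. A minimal sketch of that conversion over a toy prime field follows. It is not part of the patch: TOY_P and the toy_* names are assumptions made for illustration, and the inverse uses Fermat's little theorem rather than whatever ec_invm() does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_P 97u  /* toy prime modulus (assumption) */

static uint32_t toy_mulm(uint32_t a, uint32_t b)
{
	return (a * b) % TOY_P;
}

/* Inverse of a nonzero element via a^(p-2) mod p. */
static uint32_t toy_invm(uint32_t a)
{
	uint32_t r = 1, e = TOY_P - 2;

	while (e) {
		if (e & 1)
			r = toy_mulm(r, a);
		a = toy_mulm(a, a);
		e >>= 1;
	}
	return r;
}

/* Returns 0 on success, -1 if the point is at infinity (Z == 0). */
static int toy_get_affine(uint32_t X, uint32_t Y, uint32_t Z,
			  uint32_t *x, uint32_t *y)
{
	uint32_t z1, z2, z3;

	if (Z % TOY_P == 0)
		return -1;

	z1 = toy_invm(Z % TOY_P);  /* z1 = Z^(-1) */
	z2 = toy_mulm(z1, z1);     /* z2 = Z^(-2) */
	z3 = toy_mulm(z2, z1);     /* z3 = Z^(-3) */

	if (x)
		*x = toy_mulm(X % TOY_P, z2);
	if (y)
		*y = toy_mulm(Y % TOY_P, z3);
	return 0;
}

/* Jacobian (74 : 14 : 12) is the affine point (80, 10) over GF(97). */
void toy_get_affine_selftest(void)
{
	uint32_t x = 0, y = 0;

	assert(toy_get_affine(74, 14, 12, &x, &y) == 0);
	assert(x == 80 && y == 10);
	assert(toy_get_affine(1, 1, 0, &x, &y) == -1);
}
```

As in the quoted function, either output pointer may be NULL when only one coordinate is needed.
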
> +/*  RESULT = 2 * POINT  (Weierstrass version). */
> +static void dup_point_weierstrass(MPI_POINT result,
> +               MPI_POINT point, struct mpi_ec_ctx *ctx)
> +{
> +#define x3 (result->x)
> +#define y3 (result->y)
> +#define z3 (result->z)
> +#define t1 (ctx->t.scratch[0])
> +#define t2 (ctx->t.scratch[1])
> +#define t3 (ctx->t.scratch[2])
> +#define l1 (ctx->t.scratch[3])
> +#define l2 (ctx->t.scratch[4])
> +#define l3 (ctx->t.scratch[5])
> +
> +       if (!mpi_cmp_ui(point->y, 0) || !mpi_cmp_ui(point->z, 0)) {
> +               /* P_y == 0 || P_z == 0 => [1:1:0] */
> +               mpi_set_ui(x3, 1);
> +               mpi_set_ui(y3, 1);
> +               mpi_set_ui(z3, 0);
> +       } else {
> +               if (ec_get_a_is_pminus3(ctx)) {
> +                       /* Use the faster case.  */
> +                       /* L1 = 3(X - Z^2)(X + Z^2) */
> +                       /*                          T1: used for Z^2. */
> +                       /*                          T2: used for the right term. */
> +                       ec_pow2(t1, point->z, ctx);
> +                       ec_subm(l1, point->x, t1, ctx);
> +                       ec_mulm(l1, l1, mpi_const(MPI_C_THREE), ctx);
> +                       ec_addm(t2, point->x, t1, ctx);
> +                       ec_mulm(l1, l1, t2, ctx);
> +               } else {
> +                       /* Standard case. */
> +                       /* L1 = 3X^2 + aZ^4 */
> +                       /*                          T1: used for aZ^4. */
> +                       ec_pow2(l1, point->x, ctx);
> +                       ec_mulm(l1, l1, mpi_const(MPI_C_THREE), ctx);
> +                       ec_powm(t1, point->z, mpi_const(MPI_C_FOUR), ctx);
> +                       ec_mulm(t1, t1, ctx->a, ctx);
> +                       ec_addm(l1, l1, t1, ctx);
> +               }
> +               /* Z3 = 2YZ */
> +               ec_mulm(z3, point->y, point->z, ctx);
> +               ec_mul2(z3, z3, ctx);
> +
> +               /* L2 = 4XY^2 */
> +               /*                              T2: used for Y^2; required later. */
> +               ec_pow2(t2, point->y, ctx);
> +               ec_mulm(l2, t2, point->x, ctx);
> +               ec_mulm(l2, l2, mpi_const(MPI_C_FOUR), ctx);
> +
> +               /* X3 = L1^2 - 2L2 */
> +               /*                              T1: used for 2L2. */
> +               ec_pow2(x3, l1, ctx);
> +               ec_mul2(t1, l2, ctx);
> +               ec_subm(x3, x3, t1, ctx);
> +
> +               /* L3 = 8Y^4 */
> +               /*                              T2: taken from above. */
> +               ec_pow2(t2, t2, ctx);
> +               ec_mulm(l3, t2, mpi_const(MPI_C_EIGHT), ctx);
> +
> +               /* Y3 = L1(L2 - X3) - L3 */
> +               ec_subm(y3, l2, x3, ctx);
> +               ec_mulm(y3, y3, l1, ctx);
> +               ec_subm(y3, y3, l3, ctx);
> +       }
> +
> +#undef x3
> +#undef y3
> +#undef z3
> +#undef t1
> +#undef t2
> +#undef t3
> +#undef l1
> +#undef l2
> +#undef l3
> +}
> +
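
The standard-case formulas in dup_point_weierstrass() (L1 = 3X^2 + aZ^4, Z3 = 2YZ, L2 = 4XY^2, X3 = L1^2 - 2L2, L3 = 8Y^4, Y3 = L1(L2 - X3) - L3) can be checked by hand on a small curve. The sketch below uses y^2 = x^3 + 2x + 3 over GF(97), chosen only for illustration. It is not part of the patch, all toy_* names are hypothetical, and inputs are assumed to be already reduced modulo TOY_P.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_P 97u  /* toy prime modulus (assumption) */
#define TOY_A 2u   /* coefficient a of y^2 = x^3 + 2x + 3 */

struct toy_point {
	uint32_t x, y, z;  /* Jacobian coordinates */
};

static uint32_t toy_addm(uint32_t a, uint32_t b) { return (a + b) % TOY_P; }
static uint32_t toy_subm(uint32_t a, uint32_t b) { return (a + TOY_P - b) % TOY_P; }
static uint32_t toy_mulm(uint32_t a, uint32_t b) { return (a * b) % TOY_P; }

/* Returns 2 * p in Jacobian coordinates. */
static struct toy_point toy_dup(struct toy_point p)
{
	struct toy_point r;
	uint32_t l1, l2, l3, t1, t2;

	if (p.y == 0 || p.z == 0) {
		/* Same convention as the quoted code: infinity is [1:1:0]. */
		r.x = 1;
		r.y = 1;
		r.z = 0;
		return r;
	}

	/* L1 = 3X^2 + aZ^4 */
	l1 = toy_mulm(3, toy_mulm(p.x, p.x));
	t1 = toy_mulm(p.z, p.z);
	t1 = toy_mulm(t1, t1);
	l1 = toy_addm(l1, toy_mulm(TOY_A, t1));

	/* Z3 = 2YZ */
	r.z = toy_mulm(2, toy_mulm(p.y, p.z));

	/* L2 = 4XY^2, keeping Y^2 in t2 for later */
	t2 = toy_mulm(p.y, p.y);
	l2 = toy_mulm(4, toy_mulm(t2, p.x));

	/* X3 = L1^2 - 2L2 */
	r.x = toy_subm(toy_mulm(l1, l1), toy_mulm(2, l2));

	/* L3 = 8Y^4 */
	l3 = toy_mulm(8, toy_mulm(t2, t2));

	/* Y3 = L1(L2 - X3) - L3 */
	r.y = toy_subm(toy_mulm(l1, toy_subm(l2, r.x)), l3);
	return r;
}

/* Doubling the affine point (3, 6) gives Jacobian (74 : 14 : 12). */
void toy_dup_selftest(void)
{
	struct toy_point p = { 3, 6, 1 };
	struct toy_point r = toy_dup(p);

	assert(r.x == 74 && r.y == 14 && r.z == 12);
}
```

Dividing by Z^2 = 144 and Z^3 = 1728 modulo 97 turns (74 : 14 : 12) into the affine point (80, 10), which agrees with the textbook affine doubling rule for this curve.
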
> +/*  RESULT = 2 * POINT  (Montgomery version). */
> +static void dup_point_montgomery(MPI_POINT result,
> +                               MPI_POINT point, struct mpi_ec_ctx *ctx)
> +{
> +       (void)result;
> +       (void)point;
> +       (void)ctx;
> +       log_fatal("%s: %s not yet supported\n",
> +                       "mpi_ec_dup_point", "Montgomery");
> +}
> +
> +/*  RESULT = 2 * POINT  (Twisted Edwards version). */
> +static void dup_point_edwards(MPI_POINT result,
> +               MPI_POINT point, struct mpi_ec_ctx *ctx)
> +{
> +#define X1 (point->x)
> +#define Y1 (point->y)
> +#define Z1 (point->z)
> +#define X3 (result->x)
> +#define Y3 (result->y)
> +#define Z3 (result->z)
> +#define B (ctx->t.scratch[0])
> +#define C (ctx->t.scratch[1])
> +#define D (ctx->t.scratch[2])
> +#define E (ctx->t.scratch[3])
> +#define F (ctx->t.scratch[4])
> +#define H (ctx->t.scratch[5])
> +#define J (ctx->t.scratch[6])
> +
> +       /* Compute: (X_3 : Y_3 : Z_3) = 2( X_1 : Y_1 : Z_1 ) */
> +
> +       /* B = (X_1 + Y_1)^2  */
> +       ctx->addm(B, X1, Y1, ctx);
> +       ctx->pow2(B, B, ctx);
> +
> +       /* C = X_1^2 */
> +       /* D = Y_1^2 */
> +       ctx->pow2(C, X1, ctx);
> +       ctx->pow2(D, Y1, ctx);
> +
> +       /* E = aC */
> +       if (ctx->dialect == ECC_DIALECT_ED25519)
> +               ctx->subm(E, ctx->p, C, ctx);
> +       else
> +               ctx->mulm(E, ctx->a, C, ctx);
> +
> +       /* F = E + D */
> +       ctx->addm(F, E, D, ctx);
> +
> +       /* H = Z_1^2 */
> +       ctx->pow2(H, Z1, ctx);
> +
> +       /* J = F - 2H */
> +       ctx->mul2(J, H, ctx);
> +       ctx->subm(J, F, J, ctx);
> +
> +       /* X_3 = (B - C - D) · J */
> +       ctx->subm(X3, B, C, ctx);
> +       ctx->subm(X3, X3, D, ctx);
> +       ctx->mulm(X3, X3, J, ctx);
> +
> +       /* Y_3 = F · (E - D) */
> +       ctx->subm(Y3, E, D, ctx);
> +       ctx->mulm(Y3, Y3, F, ctx);
> +
> +       /* Z_3 = F · J */
> +       ctx->mulm(Z3, F, J, ctx);
> +
> +#undef X1
> +#undef Y1
> +#undef Z1
> +#undef X3
> +#undef Y3
> +#undef Z3
> +#undef B
> +#undef C
> +#undef D
> +#undef E
> +#undef F
> +#undef H
> +#undef J
> +}
> +
> +/*  RESULT = 2 * POINT  */
> +static void
> +mpi_ec_dup_point(MPI_POINT result, MPI_POINT point, struct mpi_ec_ctx *ctx)
> +{
> +       switch (ctx->model) {
> +       case MPI_EC_WEIERSTRASS:
> +               dup_point_weierstrass(result, point, ctx);
> +               break;
> +       case MPI_EC_MONTGOMERY:
> +               dup_point_montgomery(result, point, ctx);
> +               break;
> +       case MPI_EC_EDWARDS:
> +               dup_point_edwards(result, point, ctx);
> +               break;
> +       }
> +}
> +
> +/* RESULT = P1 + P2  (Weierstrass version).*/
> +static void add_points_weierstrass(MPI_POINT result,
> +               MPI_POINT p1, MPI_POINT p2,
> +               struct mpi_ec_ctx *ctx)
> +{
> +#define x1 (p1->x)
> +#define y1 (p1->y)
> +#define z1 (p1->z)
> +#define x2 (p2->x)
> +#define y2 (p2->y)
> +#define z2 (p2->z)
> +#define x3 (result->x)
> +#define y3 (result->y)
> +#define z3 (result->z)
> +#define l1 (ctx->t.scratch[0])
> +#define l2 (ctx->t.scratch[1])
> +#define l3 (ctx->t.scratch[2])
> +#define l4 (ctx->t.scratch[3])
> +#define l5 (ctx->t.scratch[4])
> +#define l6 (ctx->t.scratch[5])
> +#define l7 (ctx->t.scratch[6])
> +#define l8 (ctx->t.scratch[7])
> +#define l9 (ctx->t.scratch[8])
> +#define t1 (ctx->t.scratch[9])
> +#define t2 (ctx->t.scratch[10])
> +
> +       if ((!mpi_cmp(x1, x2)) && (!mpi_cmp(y1, y2)) && (!mpi_cmp(z1, z2))) {
> +               /* Same point; need to call the duplicate function.  */
> +               mpi_ec_dup_point(result, p1, ctx);
> +       } else if (!mpi_cmp_ui(z1, 0)) {
> +               /* P1 is at infinity.  */
> +               mpi_set(x3, p2->x);
> +               mpi_set(y3, p2->y);
> +               mpi_set(z3, p2->z);
> +       } else if (!mpi_cmp_ui(z2, 0)) {
> +               /* P2 is at infinity.  */
> +               mpi_set(x3, p1->x);
> +               mpi_set(y3, p1->y);
> +               mpi_set(z3, p1->z);
> +       } else {
> +               int z1_is_one = !mpi_cmp_ui(z1, 1);
> +               int z2_is_one = !mpi_cmp_ui(z2, 1);
> +
> +               /* l1 = x1 z2^2  */
> +               /* l2 = x2 z1^2  */
> +               if (z2_is_one)
> +                       mpi_set(l1, x1);
> +               else {
> +                       ec_pow2(l1, z2, ctx);
> +                       ec_mulm(l1, l1, x1, ctx);
> +               }
> +               if (z1_is_one)
> +                       mpi_set(l2, x2);
> +               else {
> +                       ec_pow2(l2, z1, ctx);
> +                       ec_mulm(l2, l2, x2, ctx);
> +               }
> +               /* l3 = l1 - l2 */
> +               ec_subm(l3, l1, l2, ctx);
> +               /* l4 = y1 z2^3  */
> +               ec_powm(l4, z2, mpi_const(MPI_C_THREE), ctx);
> +               ec_mulm(l4, l4, y1, ctx);
> +               /* l5 = y2 z1^3  */
> +               ec_powm(l5, z1, mpi_const(MPI_C_THREE), ctx);
> +               ec_mulm(l5, l5, y2, ctx);
> +               /* l6 = l4 - l5  */
> +               ec_subm(l6, l4, l5, ctx);
> +
> +               if (!mpi_cmp_ui(l3, 0)) {
> +                       if (!mpi_cmp_ui(l6, 0)) {
> +                               /* P1 and P2 are the same - use duplicate function. */
> +                               mpi_ec_dup_point(result, p1, ctx);
> +                       } else {
> +                               /* P1 is the inverse of P2.  */
> +                               mpi_set_ui(x3, 1);
> +                               mpi_set_ui(y3, 1);
> +                               mpi_set_ui(z3, 0);
> +                       }
> +               } else {
> +                       /* l7 = l1 + l2  */
> +                       ec_addm(l7, l1, l2, ctx);
> +                       /* l8 = l4 + l5  */
> +                       ec_addm(l8, l4, l5, ctx);
> +                       /* z3 = z1 z2 l3  */
> +                       ec_mulm(z3, z1, z2, ctx);
> +                       ec_mulm(z3, z3, l3, ctx);
> +                       /* x3 = l6^2 - l7 l3^2  */
> +                       ec_pow2(t1, l6, ctx);
> +                       ec_pow2(t2, l3, ctx);
> +                       ec_mulm(t2, t2, l7, ctx);
> +                       ec_subm(x3, t1, t2, ctx);
> +                       /* l9 = l7 l3^2 - 2 x3  */
> +                       ec_mul2(t1, x3, ctx);
> +                       ec_subm(l9, t2, t1, ctx);
> +                       /* y3 = (l9 l6 - l8 l3^3)/2  */
> +                       ec_mulm(l9, l9, l6, ctx);
> +                       ec_powm(t1, l3, mpi_const(MPI_C_THREE), ctx); /* fixme: Use saved value*/
> +                       ec_mulm(t1, t1, l8, ctx);
> +                       ec_subm(y3, l9, t1, ctx);
> +                       ec_mulm(y3, y3, ec_get_two_inv_p(ctx), ctx);
> +               }
> +       }
> +
> +#undef x1
> +#undef y1
> +#undef z1
> +#undef x2
> +#undef y2
> +#undef z2
> +#undef x3
> +#undef y3
> +#undef z3
> +#undef l1
> +#undef l2
> +#undef l3
> +#undef l4
> +#undef l5
> +#undef l6
> +#undef l7
> +#undef l8
> +#undef l9
> +#undef t1
> +#undef t2
> +}
> +
> +/* RESULT = P1 + P2  (Montgomery version).*/
> +static void add_points_montgomery(MPI_POINT result,
> +               MPI_POINT p1, MPI_POINT p2,
> +               struct mpi_ec_ctx *ctx)
> +{
> +       (void)result;
> +       (void)p1;
> +       (void)p2;
> +       (void)ctx;
> +       log_fatal("%s: %s not yet supported\n",
> +                       "mpi_ec_add_points", "Montgomery");
> +}
> +
> +/* RESULT = P1 + P2  (Twisted Edwards version).*/
> +static void add_points_edwards(MPI_POINT result,
> +               MPI_POINT p1, MPI_POINT p2,
> +               struct mpi_ec_ctx *ctx)
> +{
> +#define X1 (p1->x)
> +#define Y1 (p1->y)
> +#define Z1 (p1->z)
> +#define X2 (p2->x)
> +#define Y2 (p2->y)
> +#define Z2 (p2->z)
> +#define X3 (result->x)
> +#define Y3 (result->y)
> +#define Z3 (result->z)
> +#define A (ctx->t.scratch[0])
> +#define B (ctx->t.scratch[1])
> +#define C (ctx->t.scratch[2])
> +#define D (ctx->t.scratch[3])
> +#define E (ctx->t.scratch[4])
> +#define F (ctx->t.scratch[5])
> +#define G (ctx->t.scratch[6])
> +#define tmp (ctx->t.scratch[7])
> +
> +       point_resize(result, ctx);
> +
> +       /* Compute: (X_3 : Y_3 : Z_3) = (X_1 : Y_1 : Z_1) + (X_2 : Y_2 : Z_2) */
> +
> +       /* A = Z1 · Z2 */
> +       ctx->mulm(A, Z1, Z2, ctx);
> +
> +       /* B = A^2 */
> +       ctx->pow2(B, A, ctx);
> +
> +       /* C = X1 · X2 */
> +       ctx->mulm(C, X1, X2, ctx);
> +
> +       /* D = Y1 · Y2 */
> +       ctx->mulm(D, Y1, Y2, ctx);
> +
> +       /* E = d · C · D */
> +       ctx->mulm(E, ctx->b, C, ctx);
> +       ctx->mulm(E, E, D, ctx);
> +
> +       /* F = B - E */
> +       ctx->subm(F, B, E, ctx);
> +
> +       /* G = B + E */
> +       ctx->addm(G, B, E, ctx);
> +
> +       /* X_3 = A · F · ((X_1 + Y_1) · (X_2 + Y_2) - C - D) */
> +       ctx->addm(tmp, X1, Y1, ctx);
> +       ctx->addm(X3, X2, Y2, ctx);
> +       ctx->mulm(X3, X3, tmp, ctx);
> +       ctx->subm(X3, X3, C, ctx);
> +       ctx->subm(X3, X3, D, ctx);
> +       ctx->mulm(X3, X3, F, ctx);
> +       ctx->mulm(X3, X3, A, ctx);
> +
> +       /* Y_3 = A · G · (D - aC) */
> +       if (ctx->dialect == ECC_DIALECT_ED25519) {
> +               ctx->addm(Y3, D, C, ctx);
> +       } else {
> +               ctx->mulm(Y3, ctx->a, C, ctx);
> +               ctx->subm(Y3, D, Y3, ctx);
> +       }
> +       ctx->mulm(Y3, Y3, G, ctx);
> +       ctx->mulm(Y3, Y3, A, ctx);
> +
> +       /* Z_3 = F · G */
> +       ctx->mulm(Z3, F, G, ctx);
> +
> +
> +#undef X1
> +#undef Y1
> +#undef Z1
> +#undef X2
> +#undef Y2
> +#undef Z2
> +#undef X3
> +#undef Y3
> +#undef Z3
> +#undef A
> +#undef B
> +#undef C
> +#undef D
> +#undef E
> +#undef F
> +#undef G
> +#undef tmp
> +}
> +
> +/* Compute a step of Montgomery Ladder (only use X and Z in the point).
> + * Inputs:  P1, P2, and x-coordinate of DIF = P1 - P2.
> + * Outputs: PRD = 2 * P1 and  SUM = P1 + P2.
> + */
> +static void montgomery_ladder(MPI_POINT prd, MPI_POINT sum,
> +               MPI_POINT p1, MPI_POINT p2, MPI dif_x,
> +               struct mpi_ec_ctx *ctx)
> +{
> +       ctx->addm(sum->x, p2->x, p2->z, ctx);
> +       ctx->subm(p2->z, p2->x, p2->z, ctx);
> +       ctx->addm(prd->x, p1->x, p1->z, ctx);
> +       ctx->subm(p1->z, p1->x, p1->z, ctx);
> +       ctx->mulm(p2->x, p1->z, sum->x, ctx);
> +       ctx->mulm(p2->z, prd->x, p2->z, ctx);
> +       ctx->pow2(p1->x, prd->x, ctx);
> +       ctx->pow2(p1->z, p1->z, ctx);
> +       ctx->addm(sum->x, p2->x, p2->z, ctx);
> +       ctx->subm(p2->z, p2->x, p2->z, ctx);
> +       ctx->mulm(prd->x, p1->x, p1->z, ctx);
> +       ctx->subm(p1->z, p1->x, p1->z, ctx);
> +       ctx->pow2(sum->x, sum->x, ctx);
> +       ctx->pow2(sum->z, p2->z, ctx);
> +       ctx->mulm(prd->z, p1->z, ctx->a, ctx); /* CTX->A: (a-2)/4 */
> +       ctx->mulm(sum->z, sum->z, dif_x, ctx);
> +       ctx->addm(prd->z, p1->x, prd->z, ctx);
> +       ctx->mulm(prd->z, prd->z, p1->z, ctx);
> +}
> +
> +/* RESULT = P1 + P2 */
> +void mpi_ec_add_points(MPI_POINT result,
> +               MPI_POINT p1, MPI_POINT p2,
> +               struct mpi_ec_ctx *ctx)
> +{
> +       switch (ctx->model) {
> +       case MPI_EC_WEIERSTRASS:
> +               add_points_weierstrass(result, p1, p2, ctx);
> +               break;
> +       case MPI_EC_MONTGOMERY:
> +               add_points_montgomery(result, p1, p2, ctx);
> +               break;
> +       case MPI_EC_EDWARDS:
> +               add_points_edwards(result, p1, p2, ctx);
> +               break;
> +       }
> +}
> +EXPORT_SYMBOL_GPL(mpi_ec_add_points);
> +
> +/* Scalar point multiplication - the main function for ECC.  It takes
> + * an integer SCALAR and a POINT as well as the usual context CTX.
> + * RESULT will be set to the resulting point.
> + */
> +void mpi_ec_mul_point(MPI_POINT result,
> +                       MPI scalar, MPI_POINT point,
> +                       struct mpi_ec_ctx *ctx)
> +{
> +       MPI x1, y1, z1, k, h, yy;
> +       unsigned int i, loops;
> +       struct gcry_mpi_point p1, p2, p1inv;
> +
> +       if (ctx->model == MPI_EC_EDWARDS) {
> +               /* Simple left to right binary method.  Algorithm 3.27 from
> +                * {author={Hankerson, Darrel and Menezes, Alfred J. and Vanstone, Scott},
> +                *  title = {Guide to Elliptic Curve Cryptography},
> +                *  year = {2003}, isbn = {038795273X},
> +                *  url = {http://www.cacr.math.uwaterloo.ca/ecc/},
> +                *  publisher = {Springer-Verlag New York, Inc.}}
> +                */
> +               unsigned int nbits;
> +               int j;
> +
> +               if (mpi_cmp(scalar, ctx->p) >= 0)
> +                       nbits = mpi_get_nbits(scalar);
> +               else
> +                       nbits = mpi_get_nbits(ctx->p);
> +
> +               mpi_set_ui(result->x, 0);
> +               mpi_set_ui(result->y, 1);
> +               mpi_set_ui(result->z, 1);
> +               point_resize(point, ctx);
> +
> +               point_resize(result, ctx);
> +               point_resize(point, ctx);
> +
> +               for (j = nbits-1; j >= 0; j--) {
> +                       mpi_ec_dup_point(result, result, ctx);
> +                       if (mpi_test_bit(scalar, j))
> +                               mpi_ec_add_points(result, result, point, ctx);
> +               }
> +               return;
> +       } else if (ctx->model == MPI_EC_MONTGOMERY) {
> +               unsigned int nbits;
> +               int j;
> +               struct gcry_mpi_point p1_, p2_;
> +               MPI_POINT q1, q2, prd, sum;
> +               unsigned long sw;
> +               mpi_size_t rsize;
> +
> +               /* Compute scalar point multiplication with Montgomery Ladder.
> +                * Note that we don't use Y-coordinate in the points at all.
> +                * RESULT->Y will be filled by zero.
> +                */
> +
> +               nbits = mpi_get_nbits(scalar);
> +               point_init(&p1);
> +               point_init(&p2);
> +               point_init(&p1_);
> +               point_init(&p2_);
> +               mpi_set_ui(p1.x, 1);
> +               mpi_free(p2.x);
> +               p2.x = mpi_copy(point->x);
> +               mpi_set_ui(p2.z, 1);
> +
> +               point_resize(&p1, ctx);
> +               point_resize(&p2, ctx);
> +               point_resize(&p1_, ctx);
> +               point_resize(&p2_, ctx);
> +
> +               mpi_resize(point->x, ctx->p->nlimbs);
> +               point->x->nlimbs = ctx->p->nlimbs;
> +
> +               q1 = &p1;
> +               q2 = &p2;
> +               prd = &p1_;
> +               sum = &p2_;
> +
> +               for (j = nbits-1; j >= 0; j--) {
> +                       sw = mpi_test_bit(scalar, j);
> +                       point_swap_cond(q1, q2, sw, ctx);
> +                       montgomery_ladder(prd, sum, q1, q2, point->x, ctx);
> +                       point_swap_cond(prd, sum, sw, ctx);
> +                       swap(q1, prd);
> +                       swap(q2, sum);
> +               }
> +
> +               mpi_clear(result->y);
> +               sw = (nbits & 1);
> +               point_swap_cond(&p1, &p1_, sw, ctx);
> +
> +               rsize = p1.z->nlimbs;
> +               MPN_NORMALIZE(p1.z->d, rsize);
> +               if (rsize == 0) {
> +                       mpi_set_ui(result->x, 1);
> +                       mpi_set_ui(result->z, 0);
> +               } else {
> +                       z1 = mpi_new(0);
> +                       ec_invm(z1, p1.z, ctx);
> +                       ec_mulm(result->x, p1.x, z1, ctx);
> +                       mpi_set_ui(result->z, 1);
> +                       mpi_free(z1);
> +               }
> +
> +               point_free(&p1);
> +               point_free(&p2);
> +               point_free(&p1_);
> +               point_free(&p2_);
> +               return;
> +       }
> +
> +       x1 = mpi_alloc_like(ctx->p);
> +       y1 = mpi_alloc_like(ctx->p);
> +       h  = mpi_alloc_like(ctx->p);
> +       k  = mpi_copy(scalar);
> +       yy = mpi_copy(point->y);
> +
> +       if (mpi_has_sign(k)) {
> +               k->sign = 0;
> +               ec_invm(yy, yy, ctx);
> +       }
> +
> +       if (!mpi_cmp_ui(point->z, 1)) {
> +               mpi_set(x1, point->x);
> +               mpi_set(y1, yy);
> +       } else {
> +               MPI z2, z3;
> +
> +               z2 = mpi_alloc_like(ctx->p);
> +               z3 = mpi_alloc_like(ctx->p);
> +               ec_mulm(z2, point->z, point->z, ctx);
> +               ec_mulm(z3, point->z, z2, ctx);
> +               ec_invm(z2, z2, ctx);
> +               ec_mulm(x1, point->x, z2, ctx);
> +               ec_invm(z3, z3, ctx);
> +               ec_mulm(y1, yy, z3, ctx);
> +               mpi_free(z2);
> +               mpi_free(z3);
> +       }
> +       z1 = mpi_copy(mpi_const(MPI_C_ONE));
> +
> +       mpi_mul(h, k, mpi_const(MPI_C_THREE)); /* h = 3k */
> +       loops = mpi_get_nbits(h);
> +       if (loops < 2) {
> +               /* If SCALAR is zero, the above mpi_mul sets H to zero and thus
> +                * LOOPs will be zero.  To avoid an underflow of I in the main
> +                * loop we set LOOP to 2 and the result to (0,0,0).
> +                */
> +               loops = 2;
> +               mpi_clear(result->x);
> +               mpi_clear(result->y);
> +               mpi_clear(result->z);
> +       } else {
> +               mpi_set(result->x, point->x);
> +               mpi_set(result->y, yy);
> +               mpi_set(result->z, point->z);
> +       }
> +       mpi_free(yy); yy = NULL;
> +
> +       p1.x = x1; x1 = NULL;
> +       p1.y = y1; y1 = NULL;
> +       p1.z = z1; z1 = NULL;
> +       point_init(&p2);
> +       point_init(&p1inv);
> +
> +       /* Invert point: y = p - y mod p  */
> +       point_set(&p1inv, &p1);
> +       ec_subm(p1inv.y, ctx->p, p1inv.y, ctx);
> +
> +       for (i = loops-2; i > 0; i--) {
> +               mpi_ec_dup_point(result, result, ctx);
> +               if (mpi_test_bit(h, i) == 1 && mpi_test_bit(k, i) == 0) {
> +                       point_set(&p2, result);
> +                       mpi_ec_add_points(result, &p2, &p1, ctx);
> +               }
> +               if (mpi_test_bit(h, i) == 0 && mpi_test_bit(k, i) == 1) {
> +                       point_set(&p2, result);
> +                       mpi_ec_add_points(result, &p2, &p1inv, ctx);
> +               }
> +       }
> +
> +       point_free(&p1);
> +       point_free(&p2);
> +       point_free(&p1inv);
> +       mpi_free(h);
> +       mpi_free(k);
> +}
> +EXPORT_SYMBOL_GPL(mpi_ec_mul_point);
> +
> +/* Return true if POINT is on the curve described by CTX.  */
> +int mpi_ec_curve_point(MPI_POINT point, struct mpi_ec_ctx *ctx)
> +{
> +       int res = 0;
> +       MPI x, y, w;
> +
> +       x = mpi_new(0);
> +       y = mpi_new(0);
> +       w = mpi_new(0);
> +
> +       /* Check that the point is in range.  This needs to be done here and
> +        * not after conversion to affine coordinates.
> +        */
> +       if (mpi_cmpabs(point->x, ctx->p) >= 0)
> +               goto leave;
> +       if (mpi_cmpabs(point->y, ctx->p) >= 0)
> +               goto leave;
> +       if (mpi_cmpabs(point->z, ctx->p) >= 0)
> +               goto leave;
> +
> +       switch (ctx->model) {
> +       case MPI_EC_WEIERSTRASS:
> +               {
> +                       MPI xxx;
> +
> +                       if (mpi_ec_get_affine(x, y, point, ctx))
> +                               goto leave;
> +
> +                       xxx = mpi_new(0);
> +
> +                       /* y^2 == x^3 + a·x + b */
> +                       ec_pow2(y, y, ctx);
> +
> +                       ec_pow3(xxx, x, ctx);
> +                       ec_mulm(w, ctx->a, x, ctx);
> +                       ec_addm(w, w, ctx->b, ctx);
> +                       ec_addm(w, w, xxx, ctx);
> +
> +                       if (!mpi_cmp(y, w))
> +                               res = 1;
> +
> +                       mpi_free(xxx);
> +               }
> +               break;
> +
> +       case MPI_EC_MONTGOMERY:
> +               {
> +#define xx y
> +                       /* With Montgomery curve, only X-coordinate is valid. */
> +                       if (mpi_ec_get_affine(x, NULL, point, ctx))
> +                               goto leave;
> +
> +                       /* The equation is: b * y^2 == x^3 + a · x^2 + x */
> +                       /* We check if right hand is quadratic residue or not by
> +                        * Euler's criterion.
> +                        */
> +                       /* CTX->A has (a-2)/4 and CTX->B has b^-1 */
> +                       ec_mulm(w, ctx->a, mpi_const(MPI_C_FOUR), ctx);
> +                       ec_addm(w, w, mpi_const(MPI_C_TWO), ctx);
> +                       ec_mulm(w, w, x, ctx);
> +                       ec_pow2(xx, x, ctx);
> +                       ec_addm(w, w, xx, ctx);
> +                       ec_addm(w, w, mpi_const(MPI_C_ONE), ctx);
> +                       ec_mulm(w, w, x, ctx);
> +                       ec_mulm(w, w, ctx->b, ctx);
> +#undef xx
> +                       /* Compute Euler's criterion: w^(p-1)/2 */
> +#define p_minus1 y
> +                       ec_subm(p_minus1, ctx->p, mpi_const(MPI_C_ONE), ctx);
> +                       mpi_rshift(p_minus1, p_minus1, 1);
> +                       ec_powm(w, w, p_minus1, ctx);
> +
> +                       res = !mpi_cmp_ui(w, 1);
> +#undef p_minus1
> +               }
> +               break;
> +
> +       case MPI_EC_EDWARDS:
> +               {
> +                       if (mpi_ec_get_affine(x, y, point, ctx))
> +                               goto leave;
> +
> +                       mpi_resize(w, ctx->p->nlimbs);
> +                       w->nlimbs = ctx->p->nlimbs;
> +
> +                       /* a · x^2 + y^2 - 1 - b · x^2 · y^2 == 0 */
> +                       ctx->pow2(x, x, ctx);
> +                       ctx->pow2(y, y, ctx);
> +                       if (ctx->dialect == ECC_DIALECT_ED25519)
> +                               ctx->subm(w, ctx->p, x, ctx);
> +                       else
> +                               ctx->mulm(w, ctx->a, x, ctx);
> +                       ctx->addm(w, w, y, ctx);
> +                       ctx->mulm(x, x, y, ctx);
> +                       ctx->mulm(x, x, ctx->b, ctx);
> +                       ctx->subm(w, w, x, ctx);
> +                       if (!mpi_cmp_ui(w, 1))
> +                               res = 1;
> +               }
> +               break;
> +       }
> +
> +leave:
> +       mpi_free(w);
> +       mpi_free(x);
> +       mpi_free(y);
> +
> +       return res;
> +}
> +EXPORT_SYMBOL_GPL(mpi_ec_curve_point);
> --
> 2.25.1
>
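
As a worked instance of the Weierstrass branch of mpi_ec_curve_point()
quoted above, here is a minimal toy sketch over a small prime, using plain
64-bit integers instead of MPI. The helper names (mulmod, toy_on_curve) and
the curve y^2 = x^3 + 2x + 3 over GF(97) are assumptions chosen purely for
illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* (a * b) mod p for p < 2^32, so the product always fits in 64 bits. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t p)
{
	return (a % p) * (b % p) % p;
}

/*
 * Return 1 if affine (x, y) satisfies y^2 == x^3 + a*x + b (mod p).
 * Coordinates must already be reduced, mirroring the range check the
 * quoted function performs before it evaluates the curve equation.
 */
static int toy_on_curve(uint64_t x, uint64_t y,
			uint64_t a, uint64_t b, uint64_t p)
{
	uint64_t lhs, rhs;

	/* Toy precondition: odd-sized small modulus only. */
	assert(p > 2 && p < (1ULL << 32));

	if (x >= p || y >= p)
		return 0;

	lhs = mulmod(y, y, p);                  /* y^2   */
	rhs = mulmod(mulmod(x, x, p), x, p);    /* x^3   */
	rhs = (rhs + mulmod(a, x, p)) % p;      /* + a*x */
	rhs = (rhs + b % p) % p;                /* + b   */

	return lhs == rhs;
}
```

With p = 97, a = 2, b = 3, the point (3, 6) passes because 6^2 = 36 and
3^3 + 2*3 + 3 = 36, while (3, 7) fails the equation and (100, 6) is
rejected by the range check.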

Ignat



* Re: [PATCH RFC 1/4] Revert "Revert "lib/mpi: Extend the MPI library""
  2025-06-30 13:39 ` [PATCH RFC 1/4] Revert "Revert "lib/mpi: Extend the MPI library"" Gu Bowen
@ 2025-07-03  9:18   ` Xi Ruoyao
  0 siblings, 0 replies; 13+ messages in thread
From: Xi Ruoyao @ 2025-07-03  9:18 UTC (permalink / raw)
  To: Gu Bowen, Herbert Xu, David Howells, David Woodhouse,
	Lukas Wunner, Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers,
	Jason A . Donenfeld, Ard Biesheuvel, Tianjia Zhang, Dan Carpenter
  Cc: keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi

On Mon, 2025-06-30 at 21:39 +0800, Gu Bowen wrote:
> This reverts commit fca5cb4dd2b4a9423cb6d112cc71c33899955a1f.
> 
> Reintroduce the mpi library based on libgcrypt to support sm2.

If you use a newer version of Git, the subject would be 'Reapply
"lib/mpi: Extend the MPI library"' and IMO it would be better.

-- 
Xi Ruoyao <xry111@xry111.site>



* Re: [PATCH RFC 0/4] Reintroduce the sm2 algorithm
  2025-06-30 13:39 [PATCH RFC 0/4] Reintroduce the sm2 algorithm Gu Bowen
                   ` (4 preceding siblings ...)
  2025-06-30 19:41 ` [PATCH RFC 0/4] Reintroduce the sm2 algorithm Dan Carpenter
@ 2025-07-03 13:14 ` Jason A. Donenfeld
  2025-07-03 13:29   ` Jason A. Donenfeld
  2025-07-11  2:14   ` Gu Bowen
  5 siblings, 2 replies; 13+ messages in thread
From: Jason A. Donenfeld @ 2025-07-03 13:14 UTC (permalink / raw)
  To: Gu Bowen
  Cc: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers, Ard Biesheuvel,
	Tianjia Zhang, Dan Carpenter, keyrings, linux-kernel,
	linux-crypto, linux-stm32, linux-arm-kernel, Lu Jialin,
	GONG Ruiqi

Hi,

On Mon, Jun 30, 2025 at 09:39:30PM +0800, Gu Bowen wrote:
> To reintroduce the sm2 algorithm, the patch set did the following:
>  - Reintroduce the mpi library based on libgcrypt.
>  - Reintroduce ec implementation to MPI library.
>  - Rework sm2 algorithm.
>  - Support verification of X.509 certificates.
> 
> Gu Bowen (4):
>   Revert "Revert "lib/mpi: Extend the MPI library""
>   Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
>   crypto/sm2: Rework sm2 alg with sig_alg backend
>   crypto/sm2: support SM2-with-SM3 verification of X.509 certificates

I am less than enthusiastic about this. Firstly, I'm kind of biased
against the whole "national flag algorithms" thing. But I don't know how
much weight that argument will have here. More importantly, however,
implementing this atop MPI sounds very bad. The more MPI we can get rid
of, the better.

Is MPI constant time? Usually the good way to implement EC algorithms
like this is to very carefully work out constant time (and fast!) field
arithmetic routines, verify their correctness, and then implement your
ECC atop that. At this point, there's *lots* of work out there on doing
fast verified ECC and a bunch of different frameworks for producing good
implementations. There are also other implementations out there you
could look at that people have presumably studied a lot. This is old
news. (In 3 minutes of scrolling around, I noticed that
count_leading_zeros() on a value is used as a loop index, for example.
Maybe fine, maybe not, I dunno; this stuff requires analysis.)
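
For illustration only (this is not code from the patch set or from
lib/crypto/mpi): a minimal sketch of the kind of branch-free primitive that
constant-time field code is built from, a conditional swap of two limb
arrays. The names ct_cswap and ct_cswap_selfcheck are made up for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Swap a[] and b[] when swap == 1, leave them untouched when swap == 0.
 * The same loads, XORs and stores run in both cases, so neither control
 * flow nor the memory access pattern depends on the value of `swap`.
 */
static void ct_cswap(uint64_t *a, uint64_t *b, size_t nlimbs, uint64_t swap)
{
	uint64_t mask = (uint64_t)0 - (swap & 1); /* all ones or all zeros */
	size_t i;

	for (i = 0; i < nlimbs; i++) {
		uint64_t t = mask & (a[i] ^ b[i]);

		a[i] ^= t;
		b[i] ^= t;
	}
}

/* Hypothetical self-check, only here to show the calling convention. */
static void ct_cswap_selfcheck(void)
{
	uint64_t x[2] = { 1, 2 }, y[2] = { 3, 4 };

	ct_cswap(x, y, 2, 0);
	assert(x[0] == 1 && x[1] == 2 && y[0] == 3 && y[1] == 4);

	ct_cswap(x, y, 2, 1);
	assert(x[0] == 3 && x[1] == 4 && y[0] == 1 && y[1] == 2);
}
```

A ladder step would call ct_cswap(q1, q2, nlimbs, bit) once per scalar bit
over a fixed number of iterations. Whether a given compiler keeps the
masking branch-free still has to be checked in the generated code.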

On the other hand, maybe you don't care because you only implement
verification, not signing, so all info is public? If so, the fact that
you don't care about CT should probably be made pretty visible. But
either way, you should still be concerned with having an actually good &
correct implementation of which you feel strongly about the correctness.

Secondly, the MPI stuff you're proposing here adds a 25519 and 448
implementation, and support for Weierstrass, Montgomery, and Edwards,
and... surely you don't need all of this for SM-2. Why add all this
unused code? Presumably because you don't really understand or "own" all
of the code that you're proposing to add. And that gives me a lot of
hesitation, because somebody is going to have to maintain this, and if
the person sending patches with it isn't fully on top of it, we're not
off to a good start.

Lastly, just to nip in the bud the argument, "but Weierstrass is all
the same, so why not just have one library to do all possible
Weierstrass curves?" -- the fact that this series reintroduces the
removed "generic EC library" indicates there's actually not another user
of it, even before we get into questions of whether it's a good idea.

Jason



* Re: [PATCH RFC 0/4] Reintroduce the sm2 algorithm
  2025-07-03 13:14 ` Jason A. Donenfeld
@ 2025-07-03 13:29   ` Jason A. Donenfeld
  2025-07-03 13:33     ` Ignat Korchagin
  2025-07-11  2:14   ` Gu Bowen
  1 sibling, 1 reply; 13+ messages in thread
From: Jason A. Donenfeld @ 2025-07-03 13:29 UTC (permalink / raw)
  To: Gu Bowen
  Cc: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers, Ard Biesheuvel,
	Tianjia Zhang, Dan Carpenter, keyrings, linux-kernel,
	linux-crypto, linux-stm32, linux-arm-kernel, Lu Jialin,
	GONG Ruiqi

On Thu, Jul 03, 2025 at 03:14:52PM +0200, Jason A. Donenfeld wrote:
> Hi,
> 
> On Mon, Jun 30, 2025 at 09:39:30PM +0800, Gu Bowen wrote:
> > To reintroduce the sm2 algorithm, the patch set did the following:
> >  - Reintroduce the mpi library based on libgcrypt.
> >  - Reintroduce ec implementation to MPI library.
> >  - Rework sm2 algorithm.
> >  - Support verification of X.509 certificates.
> > 
> > Gu Bowen (4):
> >   Revert "Revert "lib/mpi: Extend the MPI library""
> >   Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
> >   crypto/sm2: Rework sm2 alg with sig_alg backend
> >   crypto/sm2: support SM2-with-SM3 verification of X.509 certificates
> 
> I am less than enthusiastic about this. Firstly, I'm kind of biased
> against the whole "national flag algorithms" thing. But I don't know how
> much weight that argument will have here. More importantly, however,
> implementing this atop MPI sounds very bad. The more MPI we can get rid
> of, the better.
> 
> Is MPI constant time? Usually the good way to implement EC algorithms
> like this is to very carefully work out constant time (and fast!) field
> arithmetic routines, verify their correctness, and then implement your
> ECC atop that. At this point, there's *lots* of work out there on doing
> fast verified ECC and a bunch of different frameworks for producing good
> implementations. There are also other implementations out there you
> could look at that people have presumably studied a lot. This is old
> news. (In 3 minutes of scrolling around, I noticed that
> count_leading_zeros() on a value is used as a loop index, for example.
> Maybe fine, maybe not, I dunno; this stuff requires analysis.)
> 
> On the other hand, maybe you don't care because you only implement
> verification, not signing, so all info is public? If so, the fact that
> you don't care about CT should probably be made pretty visible. But
> either way, you should still be concerned with having an actually good &
> correct implementation of which you feel strongly about the correctness.
> 
> Secondly, the MPI stuff you're proposing here adds a 25519 and 448
> implementation, and support for Weierstrass, Montgomery, and Edwards,
> and... surely you don't need all of this for SM-2. Why add all this
> unused code? Presumably because you don't really understand or "own" all
> of the code that you're proposing to add. And that gives me a lot of
> hesitation, because somebody is going to have to maintain this, and if
> the person sending patches with it isn't fully on top of it, we're not
> off to a good start.
> 
> Lastly, just to nip in the bud the argument, "but Weierstrass is all
> the same, so why not just have one library to do all possible
> Weierstrass curves?" -- the fact that this series reintroduces the
> removed "generic EC library" indicates there's actually not another user
> of it, even before we get into questions of whether it's a good idea.

I went looking for reference implementations and came across this
"GmSSL" project and located:

https://github.com/guanzhi/GmSSL/blob/master/src/sm2_sign.c#L271
which uses some routines from
https://github.com/guanzhi/GmSSL/blob/master/src/sm2_z256.c

I have no idea what the deal actually is here -- is this any good? has
anybody looked at it? is it a random github? -- but it certainly
_resembles_ something more comfortable than the MPI code. Who knows, it
could be terrible, but you get the idea.

Jason



* Re: [PATCH RFC 0/4] Reintroduce the sm2 algorithm
  2025-07-03 13:29   ` Jason A. Donenfeld
@ 2025-07-03 13:33     ` Ignat Korchagin
  0 siblings, 0 replies; 13+ messages in thread
From: Ignat Korchagin @ 2025-07-03 13:33 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Gu Bowen, Herbert Xu, David Howells, David Woodhouse,
	Lukas Wunner, David S . Miller, Jarkko Sakkinen, Maxime Coquelin,
	Alexandre Torgue, Eric Biggers, Ard Biesheuvel, Tianjia Zhang,
	Dan Carpenter, keyrings, linux-kernel, linux-crypto, linux-stm32,
	linux-arm-kernel, Lu Jialin, GONG Ruiqi

On Thu, Jul 3, 2025 at 3:29 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Thu, Jul 03, 2025 at 03:14:52PM +0200, Jason A. Donenfeld wrote:
> > Hi,
> >
> > On Mon, Jun 30, 2025 at 09:39:30PM +0800, Gu Bowen wrote:
> > > To reintroduce the sm2 algorithm, the patch set did the following:
> > >  - Reintroduce the mpi library based on libgcrypt.
> > >  - Reintroduce ec implementation to MPI library.
> > >  - Rework sm2 algorithm.
> > >  - Support verification of X.509 certificates.
> > >
> > > Gu Bowen (4):
> > >   Revert "Revert "lib/mpi: Extend the MPI library""
> > >   Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
> > >   crypto/sm2: Rework sm2 alg with sig_alg backend
> > >   crypto/sm2: support SM2-with-SM3 verification of X.509 certificates
> >
> > I am less than enthusiastic about this. Firstly, I'm kind of biased
> > against the whole "national flag algorithms" thing. But I don't know how
> > much weight that argument will have here. More importantly, however,
> > implementing this atop MPI sounds very bad. The more MPI we can get rid
> > of, the better.
> >
> > Is MPI constant time? Usually the good way to implement EC algorithms
> > like this is to very carefully work out constant time (and fast!) field
> > arithmetic routines, verify their correctness, and then implement your
> > ECC atop that. At this point, there's *lots* of work out there on doing
> > fast verified ECC and a bunch of different frameworks for producing good
> > implementations. There are also other implementations out there you
> > could look at that people have presumably studied a lot. This is old
> > news. (In 3 minutes of scrolling around, I noticed that
> > count_leading_zeros() on a value is used as a loop index, for example.
> > Maybe fine, maybe not, I dunno; this stuff requires analysis.)
> >
> > On the other hand, maybe you don't care because you only implement
> > verification, not signing, so all info is public? If so, the fact that
> > you don't care about CT should probably be made pretty visible. But
> > either way, you should still be concerned with having an actually good &
> > correct implementation of which you feel strongly about the correctness.
> >
> > Secondly, the MPI stuff you're proposing here adds a 25519 and 448
> > implementation, and support for Weierstrass, Montgomery, and Edwards,
> > and... surely you don't need all of this for SM-2. Why add all this
> > unused code? Presumably because you don't really understand or "own" all
> > of the code that you're proposing to add. And that gives me a lot of
> > hesitation, because somebody is going to have to maintain this, and if
> > the person sending patches with it isn't fully on top of it, we're not
> > off to a good start.
> >
> > Lastly, just to nip in the bud the argument, "but Weierstrass is all
> > the same, so why not just have one library to do all possible
> > Weierstrass curves?" -- the fact that this series reintroduces the
> > removed "generic EC library" indicates there's actually not another user
> > of it, even before we get into questions of whether it's a good idea.
>
> I went looking for reference implementations and came across this
> "GmSSL" project and located:
>
> https://github.com/guanzhi/GmSSL/blob/master/src/sm2_sign.c#L271
> which uses some routines from
> https://github.com/guanzhi/GmSSL/blob/master/src/sm2_z256.c
>
> I have no idea what the deal actually is here -- is this any good? has
> anybody looked at it? is it a random github? -- but it certainly
> _resembles_ something more comfortable than the MPI code. Who knows, it
> could be terrible, but you get the idea.

One thing to keep in mind with this project (and other projects) is
license compatibility with GPLv2 (I don't think the above project is
compatible).

>
> Jason



* Re: [PATCH RFC 0/4] Reintroduce the sm2 algorithm
  2025-07-03 13:14 ` Jason A. Donenfeld
  2025-07-03 13:29   ` Jason A. Donenfeld
@ 2025-07-11  2:14   ` Gu Bowen
  1 sibling, 0 replies; 13+ messages in thread
From: Gu Bowen @ 2025-07-11  2:14 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Herbert Xu, David Howells, David Woodhouse, Lukas Wunner,
	Ignat Korchagin, David S . Miller, Jarkko Sakkinen,
	Maxime Coquelin, Alexandre Torgue, Eric Biggers, Ard Biesheuvel,
	Tianjia Zhang, Dan Carpenter, keyrings, linux-kernel,
	linux-crypto, linux-stm32, linux-arm-kernel, Lu Jialin,
	GONG Ruiqi

Hi,

On 7/3/2025 9:14 PM, Jason A. Donenfeld wrote:
> Hi,
> 
> On Mon, Jun 30, 2025 at 09:39:30PM +0800, Gu Bowen wrote:
>> To reintroduce the sm2 algorithm, the patch set did the following:
>>   - Reintroduce the mpi library based on libgcrypt.
>>   - Reintroduce ec implementation to MPI library.
>>   - Rework sm2 algorithm.
>>   - Support verification of X.509 certificates.
>>
>> Gu Bowen (4):
>>    Revert "Revert "lib/mpi: Extend the MPI library""
>>    Revert "Revert "lib/mpi: Introduce ec implementation to MPI library""
>>    crypto/sm2: Rework sm2 alg with sig_alg backend
>>    crypto/sm2: support SM2-with-SM3 verification of X.509 certificates
> 
> I am less than enthusiastic about this. Firstly, I'm kind of biased
> against the whole "national flag algorithms" thing. But I don't know how
> much weight that argument will have here. More importantly, however,
> implementing this atop MPI sounds very bad. The more MPI we can get rid
> of, the better.
> 
> Is MPI constant time? Usually the good way to implement EC algorithms
> like this is to very carefully work out constant time (and fast!) field
> arithmetic routines, verify their correctness, and then implement your
> ECC atop that. At this point, there's *lots* of work out there on doing
> fast verified ECC and a bunch of different frameworks for producing good
> implementations. There are also other implementations out there you
> could look at that people have presumably studied a lot. This is old
> news. (In 3 minutes of scrolling around, I noticed that
> count_leading_zeros() on a value is used as a loop index, for example.
> Maybe fine, maybe not, I dunno; this stuff requires analysis.)

To be honest, I was not very familiar with MPI. The previous sm2
implementation was built on MPI, so I used it as well. Perhaps I could
try the ECC implementation that is already in the kernel.

> On the other hand, maybe you don't care because you only implement
> verification, not signing, so all info is public? If so, the fact that
> you don't care about CT should probably be made pretty visible. But
> either way, you should still be concerned with having an actually good &
> correct implementation of which you feel strongly about the correctness.
> 
> Secondly, the MPI stuff you're proposing here adds a 25519 and 448
> implementation, and support for Weierstrass, Montgomery, and Edwards,
> and... surely you don't need all of this for SM-2. Why add all this
> unused code? Presumably because you don't really understand or "own" all
> of the code that you're proposing to add. And that gives me a lot of
> hesitation, because somebody is going to have to maintain this, and if
> the person sending patches with it isn't fully on top of it, we're not
> off to a good start.
> 
> Lastly, just to nip in the bud the argument, "but Weierstrass is all
> the same, so why not just have one library to do all possible
> Weierstrass curves?" -- the fact that this series reintroduces the
> removed "generic EC library" indicates there's actually not another user
> of it, even before we get into questions of whether it's a good idea.

Thank you for your advice. It has been very helpful, as I have only just
started participating in the community. I will try to implement the
functionality with more robust code and only submit parts that I fully
understand.

Best Regards,
Guber





Thread overview: 13+ messages
2025-06-30 13:39 [PATCH RFC 0/4] Reintroduce the sm2 algorithm Gu Bowen
2025-06-30 13:39 ` [PATCH RFC 1/4] Revert "Revert "lib/mpi: Extend the MPI library"" Gu Bowen
2025-07-03  9:18   ` Xi Ruoyao
2025-06-30 13:39 ` [PATCH RFC 2/4] Revert "Revert "lib/mpi: Introduce ec implementation to " Gu Bowen
2025-07-02 15:18   ` Ignat Korchagin
2025-06-30 13:39 ` [PATCH RFC 3/4] crypto/sm2: Rework sm2 alg with sig_alg backend Gu Bowen
2025-06-30 13:39 ` [PATCH RFC 4/4] crypto/sm2: support SM2-with-SM3 verification of X.509 certificates Gu Bowen
2025-06-30 19:41 ` [PATCH RFC 0/4] Reintroduce the sm2 algorithm Dan Carpenter
2025-07-01  3:49   ` Gu Bowen
2025-07-03 13:14 ` Jason A. Donenfeld
2025-07-03 13:29   ` Jason A. Donenfeld
2025-07-03 13:33     ` Ignat Korchagin
2025-07-11  2:14   ` Gu Bowen
