Linux cryptographic layer development
 help / color / mirror / Atom feed
* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Stephan Müller @ 2017-01-13 11:12 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20170113110417.GB23617@gondor.apana.org.au>

Am Freitag, 13. Januar 2017, 19:04:17 CET schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jan 13, 2017 at 11:58:24AM +0100, Stephan Müller wrote:
> > May I ask how the in-place code path can be invoked by algif_aead or
> > algif_skcipher? As far as I understand, this code path is only invoked
> > when
> > the cipher implementation sees that the src and dst SGLs are identical.
> 
> It's not right now but it isn't difficult to add the code to allow
> it by comparing SGLs.

Adding such code should IMHO not be impaired by pointing to the AAD held in 
the src SGL by the dst SGL as offered with the older patch mentioned before.

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
From: Herbert Xu @ 2017-01-13 11:12 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <5603535.pO8F2Xs7xC@positron.chronox.de>

On Fri, Jan 13, 2017 at 12:10:02PM +0100, Stephan Müller wrote:
>
> > Well if ordering is not guaranteed that I don't see how your code
> > can work either.  Or am I missing something?
> 
> The patch simply stores all data it gets from sendmsg in the src SGL. In 
> addition it maintains an offset pointer into that src SGLs.
> 
> When the recvmsg call comes in and the dst SGL is prepared, it simply takes as 
> much data from the src SGL as needed to cover the request defined by the dst 
> SGL. After completing that operation, the offset pointer is moved forward to 
> point to a yet unused part of the src SGL. If another recvmsg comes in without 
> an intermediate sendmsg, it simply starts using the data from the src SGL 
> starting from the offset.
> 
> Therefore, the code should now be able to handle a write / write / read / read 
> scenario. Or it can handle, say, a write(32 bytes) / read (16 bytes) / read 
> (16 bytes). At least my tests covered a successful testing of that scenario 
> which always crashed the kernel before.

Are you making separate read calls or just a single one? If you're
making separate calls, then this is no differnt to just doing
write/read pairs.  You're not saving any overhead.

If you're making a single call, what guarantees the ordering?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Herbert Xu @ 2017-01-13 11:14 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <1593277.7QY5i6DVS0@positron.chronox.de>

On Fri, Jan 13, 2017 at 12:12:39PM +0100, Stephan Müller wrote:
>
> Adding such code should IMHO not be impaired by pointing to the AAD held in 
> the src SGL by the dst SGL as offered with the older patch mentioned before.

The point is you're turning what could otherwise be a linear SGL
into a non-linear one by forcing the same AAD on both sides.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
From: Stephan Müller @ 2017-01-13 11:16 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20170113111259.GA23800@gondor.apana.org.au>

Am Freitag, 13. Januar 2017, 19:12:59 CET schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jan 13, 2017 at 12:10:02PM +0100, Stephan Müller wrote:
> > > Well if ordering is not guaranteed that I don't see how your code
> > > can work either.  Or am I missing something?
> > 
> > The patch simply stores all data it gets from sendmsg in the src SGL. In
> > addition it maintains an offset pointer into that src SGLs.
> > 
> > When the recvmsg call comes in and the dst SGL is prepared, it simply
> > takes as much data from the src SGL as needed to cover the request
> > defined by the dst SGL. After completing that operation, the offset
> > pointer is moved forward to point to a yet unused part of the src SGL. If
> > another recvmsg comes in without an intermediate sendmsg, it simply
> > starts using the data from the src SGL starting from the offset.
> > 
> > Therefore, the code should now be able to handle a write / write / read /
> > read scenario. Or it can handle, say, a write(32 bytes) / read (16 bytes)
> > / read (16 bytes). At least my tests covered a successful testing of that
> > scenario which always crashed the kernel before.
> 
> Are you making separate read calls or just a single one? If you're
> making separate calls, then this is no differnt to just doing
> write/read pairs.  You're not saving any overhead.

I make one read call.
> 
> If you're making a single call, what guarantees the ordering?

Technically, io_submit is the syscall that triggers the recvmsg. Are you 
saying that this syscall does not maintain ordering? At least the man page 
does not add any hints that it would not (unlike the lio_list man page).

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Stephan Müller @ 2017-01-13 11:19 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20170113111406.GB23800@gondor.apana.org.au>

Am Freitag, 13. Januar 2017, 19:14:06 CET schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jan 13, 2017 at 12:12:39PM +0100, Stephan Müller wrote:
> > Adding such code should IMHO not be impaired by pointing to the AAD held
> > in
> > the src SGL by the dst SGL as offered with the older patch mentioned
> > before.
> The point is you're turning what could otherwise be a linear SGL
> into a non-linear one by forcing the same AAD on both sides.

That is correct, but I thought that playing with pointers is always faster 
than doing memcpy. Are you saying that this assumption is not true when we 
somehow have the code to try to perform an in-place operation?

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
From: Herbert Xu @ 2017-01-13 11:25 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <2278925.gEXngSfgVI@positron.chronox.de>

On Fri, Jan 13, 2017 at 12:16:27PM +0100, Stephan Müller wrote:
>
> > If you're making a single call, what guarantees the ordering?
> 
> Technically, io_submit is the syscall that triggers the recvmsg. Are you 
> saying that this syscall does not maintain ordering? At least the man page 
> does not add any hints that it would not (unlike the lio_list man page).

The code certainly does.  But my point is that you can do the
same thing using the current API.  Just make your list be pairs
of write/read and it should work.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Herbert Xu @ 2017-01-13 11:26 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <2328881.s8hPz576Kn@positron.chronox.de>

On Fri, Jan 13, 2017 at 12:19:48PM +0100, Stephan Müller wrote:
> 
> That is correct, but I thought that playing with pointers is always faster 
> than doing memcpy. Are you saying that this assumption is not true when we 
> somehow have the code to try to perform an in-place operation?

If you're doing crypto the overhead in copying the AAD is the
last thing you need to worry about.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [RFC PATCH 5/6] crypto: aesni-intel - Add bulk request support
From: Ondrej Mosnáček @ 2017-01-13 11:27 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Herbert Xu, linux-crypto, dm-devel, Mike Snitzer, Milan Broz,
	Mikulas Patocka, Binoy Jayan
In-Reply-To: <20170113031933.GA4956@zzz>

Hi Eric,

2017-01-13 4:19 GMT+01:00 Eric Biggers <ebiggers3@gmail.com>:
> To what extent does the performance benefit of this patchset result from just
> the reduced numbers of calls to kernel_fpu_begin() and kernel_fpu_end()?
>
> If it's most of the benefit, would it make any sense to optimize
> kernel_fpu_begin() and kernel_fpu_end() instead?
>
> And if there are other examples besides kernel_fpu_begin/kernel_fpu_end where
> the bulk API would provide a significant performance boost, can you mention
> them?

In the case of AES-NI ciphers, this is the only benefit. However, this
change is not intended solely (or primarily) for AES-NI ciphers, but
also for other drivers that have a high per-request overhead.

This patchset is in fact a reaction to Binoy Jayan's efforts (see
[1]). The problem with small requests to HW crypto drivers comes up
for example in Qualcomm's Android [2], where they actually hacked
together their own version of dm-crypt (called 'dm-req-crypt'), which
in turn used a driver-specific crypto mode, which does the IV
generation on its own, and thereby is able to process several sectors
at once. The goal is to extend the crypto API so that vendors don't
have to roll out their own workarounds to have efficient disk
encryption.

> Interestingly, the arm64 equivalent to kernel_fpu_begin()
> (kernel_neon_begin_partial() in arch/arm64/kernel/fpsimd.c) appears to have an
> optimization where the SIMD registers aren't saved if they were already saved.
> I wonder why something similar isn't done on x86.

AFAIK, there can't be done much about the kernel_fpu_* functions, see e.g. [3].

Regards,
Ondrej

[1] https://lkml.org/lkml/2016/12/20/111
[2] https://nelenkov.blogspot.com/2015/05/hardware-accelerated-disk-encryption-in.html
[3] https://lkml.org/lkml/2016/12/21/354

>
> Eric

^ permalink raw reply

* (unknown), 
From: service @ 2017-01-13 11:28 UTC (permalink / raw)
  To: linux-crypto

[-- Attachment #1: INFO_4883709289_linux-crypto.zip --]
[-- Type: application/zip, Size: 45688 bytes --]

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Herbert Xu @ 2017-01-13 11:04 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <9279288.fr5DIlfFLr@positron.chronox.de>

On Fri, Jan 13, 2017 at 11:58:24AM +0100, Stephan Müller wrote:
>
> May I ask how the in-place code path can be invoked by algif_aead or 
> algif_skcipher? As far as I understand, this code path is only invoked when 
> the cipher implementation sees that the src and dst SGLs are identical.

It's not right now but it isn't difficult to add the code to allow
it by comparing SGLs.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Stephan Müller @ 2017-01-13 11:30 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20170113112623.GB23928@gondor.apana.org.au>

Am Freitag, 13. Januar 2017, 19:26:23 CET schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jan 13, 2017 at 12:19:48PM +0100, Stephan Müller wrote:
> > That is correct, but I thought that playing with pointers is always faster
> > than doing memcpy. Are you saying that this assumption is not true when we
> > somehow have the code to try to perform an in-place operation?
> 
> If you're doing crypto the overhead in copying the AAD is the
> last thing you need to worry about.

So, the patch set you want to see is:

- remove the AAD copy operation from authenc and not add it to any AEAD 
implementations

- add the AAD copy operation to algif_aead

Shall the null cipher context we need for that be anchored in the crypto_aead 
data structure as done in patch 01 of this series (this would allow an easy 
addition to the copy call to every AEAD implementation if deemed needed) or 
shall it be anchored within algif_aead?

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Herbert Xu @ 2017-01-13 11:33 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <1704568.lHqrOxcchT@positron.chronox.de>

On Fri, Jan 13, 2017 at 12:30:15PM +0100, Stephan Müller wrote:
>
> So, the patch set you want to see is:
> 
> - remove the AAD copy operation from authenc and not add it to any AEAD 
> implementations

Why would you remove it from authenc?

> - add the AAD copy operation to algif_aead

No just copy everything and then do in-place crypto.

> Shall the null cipher context we need for that be anchored in the crypto_aead 
> data structure as done in patch 01 of this series (this would allow an easy 
> addition to the copy call to every AEAD implementation if deemed needed) or 
> shall it be anchored within algif_aead?

It should be in algif_aead.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Herbert Xu @ 2017-01-13 11:39 UTC (permalink / raw)
  To: Stephan Müller; +Cc: linux-crypto
In-Reply-To: <2829634.RJknVxGPvY@positron.chronox.de>

On Fri, Jan 13, 2017 at 12:36:56PM +0100, Stephan Müller wrote:
>
> I thought I understood that you would not want to see it in any 
> implementation. But, ok, if you want to leave it.

If you remove it from authenc then authenc will be broken.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
From: Stephan Müller @ 2017-01-13 11:36 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20170113113324.GA24021@gondor.apana.org.au>

Am Freitag, 13. Januar 2017, 19:33:24 CET schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jan 13, 2017 at 12:30:15PM +0100, Stephan Müller wrote:
> > So, the patch set you want to see is:
> > 
> > - remove the AAD copy operation from authenc and not add it to any AEAD
> > implementations
> 
> Why would you remove it from authenc?

I thought I understood that you would not want to see it in any 
implementation. But, ok, if you want to leave it.

Ciao
Stephan

^ permalink raw reply

* Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
From: Stephan Müller @ 2017-01-13 11:51 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-crypto
In-Reply-To: <20170113112539.GA23928@gondor.apana.org.au>

Am Freitag, 13. Januar 2017, 19:25:39 CET schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jan 13, 2017 at 12:16:27PM +0100, Stephan Müller wrote:
> > > If you're making a single call, what guarantees the ordering?
> > 
> > Technically, io_submit is the syscall that triggers the recvmsg. Are you
> > saying that this syscall does not maintain ordering? At least the man page
> > does not add any hints that it would not (unlike the lio_list man page).
> 
> The code certainly does.  But my point is that you can do the
> same thing using the current API.  Just make your list be pairs
> of write/read and it should work.

How shall these pairs come into existence? The read/write system calls may 
come at unspecified times with potentially dissimilar buffer lengths.


Ciao
Stephan

^ permalink raw reply

* [PATCH] crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg
From: Salvatore Benedetto @ 2017-01-13 11:54 UTC (permalink / raw)
  To: herbert; +Cc: salvatore.benedetto, linux-crypto

Make sure CRYPTO_ALG_DEAD bit is cleared before proceeding with
the algorithm registration. This fixes qat-dh registration when
driver is restarted

Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
---
 crypto/algapi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/crypto/algapi.c b/crypto/algapi.c
index df939b5..1fad2a6 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -356,6 +356,7 @@ int crypto_register_alg(struct crypto_alg *alg)
 	struct crypto_larval *larval;
 	int err;
 
+	alg->cra_flags &= ~CRYPTO_ALG_DEAD;
 	err = crypto_check_alg(alg);
 	if (err)
 		return err;
-- 
2.7.4

^ permalink raw reply related

* Re: [RFC PATCH 0/6] Add bulk skcipher requests to crypto API and dm-crypt
From: Ondrej Mosnáček @ 2017-01-13 12:01 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-crypto, dm-devel, Mike Snitzer, Milan Broz, Mikulas Patocka,
	Binoy Jayan
In-Reply-To: <20170113104128.GA23497@gondor.apana.org.au>

2017-01-13 11:41 GMT+01:00 Herbert Xu <herbert@gondor.apana.org.au>:
> On Thu, Jan 12, 2017 at 01:59:52PM +0100, Ondrej Mosnacek wrote:
>> the goal of this patchset is to allow those skcipher API users that need to
>> process batches of small messages (especially dm-crypt) to do so efficiently.
>
> Please explain why this can't be done with the existing framework
> using IV generators similar to the ones used for IPsec.

As I already mentioned in another thread, there are basically two reasons:

1) Milan would like to add authenticated encryption support to
dm-crypt (see [1]) and as part of this change, a new random IV mode
would be introduced. This mode generates a random IV for each sector
write, includes it in the authenticated data and stores it in the
sector's metadata (in a separate part of the disk). In this case
dm-crypt will need to have control over the IV generation (or at least
be able to somehow retrieve it after the crypto operation... but
passing RNG responsibility to drivers doesn't seem to be a good idea
anyway).

2) With this API, drivers wouldn't have to provide implementations for
specific IV generation modes, and just implement bulk requests for the
common modes/algorithms (XTS, CBC, ...) while still getting
performance benefit.

Regards,
Ondrej

[1] https://www.redhat.com/archives/dm-devel/2017-January/msg00028.html

>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH v1 0/4]crypto:chcr- Bug Fixes for 4.10
From: Harsh Jain @ 2017-01-13 12:29 UTC (permalink / raw)
  To: hariprasad, netdev, herbert, linux-crypto; +Cc: Harsh Jain

This patch series is based on Herbert's cryptodev-2.6 tree.
It includes several critical bug fixes.

Atul Gupta (3):
  crypto:chcr-Change flow IDs
  crypto:chcr- Fix panic on dma_unmap_sg
  crypto:chcr- Check device is allocated before use
Julia Lawall (1):
  crypto:chcr-fix itnull.cocci warnings

 drivers/crypto/chelsio/chcr_algo.c            | 67 ++++++++++++++-------------
 drivers/crypto/chelsio/chcr_algo.h            |  9 ++--
 drivers/crypto/chelsio/chcr_core.c            | 18 ++++---
 drivers/crypto/chelsio/chcr_core.h            |  1 +
 drivers/crypto/chelsio/chcr_crypto.h          |  3 ++
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h |  8 ++++
 6 files changed, 61 insertions(+), 45 deletions(-)

-- 
1.8.2.3

^ permalink raw reply

* [PATCH v1 2/4] crypto:chcr- Fix panic on dma_unmap_sg
From: Harsh Jain @ 2017-01-13 12:29 UTC (permalink / raw)
  To: hariprasad, netdev, herbert, linux-crypto; +Cc: Harsh Jain, Atul Gupta
In-Reply-To: <cover.1484309965.git.harsh@chelsio.com>

Save DMA mapped sg list addresses to request context buffer.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
---
 drivers/crypto/chelsio/chcr_algo.c   | 49 +++++++++++++++++++-----------------
 drivers/crypto/chelsio/chcr_crypto.h |  3 +++
 2 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 1d7dfcf..deec7c0 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -158,7 +158,7 @@ int chcr_handle_resp(struct crypto_async_request *req, unsigned char *input,
 	case CRYPTO_ALG_TYPE_AEAD:
 		ctx_req.req.aead_req = (struct aead_request *)req;
 		ctx_req.ctx.reqctx = aead_request_ctx(ctx_req.req.aead_req);
-		dma_unmap_sg(&u_ctx->lldi.pdev->dev, ctx_req.req.aead_req->dst,
+		dma_unmap_sg(&u_ctx->lldi.pdev->dev, ctx_req.ctx.reqctx->dst,
 			     ctx_req.ctx.reqctx->dst_nents, DMA_FROM_DEVICE);
 		if (ctx_req.ctx.reqctx->skb) {
 			kfree_skb(ctx_req.ctx.reqctx->skb);
@@ -1364,8 +1364,7 @@ static struct sk_buff *create_authenc_wr(struct aead_request *req,
 	struct chcr_wr *chcr_req;
 	struct cpl_rx_phys_dsgl *phys_cpl;
 	struct phys_sge_parm sg_param;
-	struct scatterlist *src, *dst;
-	struct scatterlist src_sg[2], dst_sg[2];
+	struct scatterlist *src;
 	unsigned int frags = 0, transhdr_len;
 	unsigned int ivsize = crypto_aead_ivsize(tfm), dst_size = 0;
 	unsigned int   kctx_len = 0;
@@ -1385,19 +1384,21 @@ static struct sk_buff *create_authenc_wr(struct aead_request *req,
 
 	if (sg_nents_for_len(req->src, req->assoclen + req->cryptlen) < 0)
 		goto err;
-	src = scatterwalk_ffwd(src_sg, req->src, req->assoclen);
-	dst = src;
+	src = scatterwalk_ffwd(reqctx->srcffwd, req->src, req->assoclen);
+	reqctx->dst = src;
+
 	if (req->src != req->dst) {
 		err = chcr_copy_assoc(req, aeadctx);
 		if (err)
 			return ERR_PTR(err);
-		dst = scatterwalk_ffwd(dst_sg, req->dst, req->assoclen);
+		reqctx->dst = scatterwalk_ffwd(reqctx->dstffwd, req->dst,
+					       req->assoclen);
 	}
 	if (get_aead_subtype(tfm) == CRYPTO_ALG_SUB_TYPE_AEAD_NULL) {
 		null = 1;
 		assoclen = 0;
 	}
-	reqctx->dst_nents = sg_nents_for_len(dst, req->cryptlen +
+	reqctx->dst_nents = sg_nents_for_len(reqctx->dst, req->cryptlen +
 					     (op_type ? -authsize : authsize));
 	if (reqctx->dst_nents <= 0) {
 		pr_err("AUTHENC:Invalid Destination sg entries\n");
@@ -1462,7 +1463,7 @@ static struct sk_buff *create_authenc_wr(struct aead_request *req,
 	sg_param.obsize = req->cryptlen + (op_type ? -authsize : authsize);
 	sg_param.qid = qid;
 	sg_param.align = 0;
-	if (map_writesg_phys_cpl(&u_ctx->lldi.pdev->dev, phys_cpl, dst,
+	if (map_writesg_phys_cpl(&u_ctx->lldi.pdev->dev, phys_cpl, reqctx->dst,
 				  &sg_param))
 		goto dstmap_fail;
 
@@ -1713,8 +1714,7 @@ static struct sk_buff *create_aead_ccm_wr(struct aead_request *req,
 	struct chcr_wr *chcr_req;
 	struct cpl_rx_phys_dsgl *phys_cpl;
 	struct phys_sge_parm sg_param;
-	struct scatterlist *src, *dst;
-	struct scatterlist src_sg[2], dst_sg[2];
+	struct scatterlist *src;
 	unsigned int frags = 0, transhdr_len, ivsize = AES_BLOCK_SIZE;
 	unsigned int dst_size = 0, kctx_len;
 	unsigned int sub_type;
@@ -1730,17 +1730,19 @@ static struct sk_buff *create_aead_ccm_wr(struct aead_request *req,
 	if (sg_nents_for_len(req->src, req->assoclen + req->cryptlen) < 0)
 		goto err;
 	sub_type = get_aead_subtype(tfm);
-	src = scatterwalk_ffwd(src_sg, req->src, req->assoclen);
-	dst = src;
+	src = scatterwalk_ffwd(reqctx->srcffwd, req->src, req->assoclen);
+	reqctx->dst = src;
+
 	if (req->src != req->dst) {
 		err = chcr_copy_assoc(req, aeadctx);
 		if (err) {
 			pr_err("AAD copy to destination buffer fails\n");
 			return ERR_PTR(err);
 		}
-		dst = scatterwalk_ffwd(dst_sg, req->dst, req->assoclen);
+		reqctx->dst = scatterwalk_ffwd(reqctx->dstffwd, req->dst,
+					       req->assoclen);
 	}
-	reqctx->dst_nents = sg_nents_for_len(dst, req->cryptlen +
+	reqctx->dst_nents = sg_nents_for_len(reqctx->dst, req->cryptlen +
 					     (op_type ? -authsize : authsize));
 	if (reqctx->dst_nents <= 0) {
 		pr_err("CCM:Invalid Destination sg entries\n");
@@ -1779,7 +1781,7 @@ static struct sk_buff *create_aead_ccm_wr(struct aead_request *req,
 	sg_param.obsize = req->cryptlen + (op_type ? -authsize : authsize);
 	sg_param.qid = qid;
 	sg_param.align = 0;
-	if (map_writesg_phys_cpl(&u_ctx->lldi.pdev->dev, phys_cpl, dst,
+	if (map_writesg_phys_cpl(&u_ctx->lldi.pdev->dev, phys_cpl, reqctx->dst,
 				  &sg_param))
 		goto dstmap_fail;
 
@@ -1811,8 +1813,7 @@ static struct sk_buff *create_gcm_wr(struct aead_request *req,
 	struct chcr_wr *chcr_req;
 	struct cpl_rx_phys_dsgl *phys_cpl;
 	struct phys_sge_parm sg_param;
-	struct scatterlist *src, *dst;
-	struct scatterlist src_sg[2], dst_sg[2];
+	struct scatterlist *src;
 	unsigned int frags = 0, transhdr_len;
 	unsigned int ivsize = AES_BLOCK_SIZE;
 	unsigned int dst_size = 0, kctx_len;
@@ -1834,13 +1835,14 @@ static struct sk_buff *create_gcm_wr(struct aead_request *req,
 	if (sg_nents_for_len(req->src, req->assoclen + req->cryptlen) < 0)
 		goto err;
 
-	src = scatterwalk_ffwd(src_sg, req->src, req->assoclen);
-	dst = src;
+	src = scatterwalk_ffwd(reqctx->srcffwd, req->src, req->assoclen);
+	reqctx->dst = src;
 	if (req->src != req->dst) {
 		err = chcr_copy_assoc(req, aeadctx);
 		if (err)
 			return	ERR_PTR(err);
-		dst = scatterwalk_ffwd(dst_sg, req->dst, req->assoclen);
+		reqctx->dst = scatterwalk_ffwd(reqctx->dstffwd, req->dst,
+					       req->assoclen);
 	}
 
 	if (!req->cryptlen)
@@ -1850,7 +1852,7 @@ static struct sk_buff *create_gcm_wr(struct aead_request *req,
 		crypt_len = AES_BLOCK_SIZE;
 	else
 		crypt_len = req->cryptlen;
-	reqctx->dst_nents = sg_nents_for_len(dst, req->cryptlen +
+	reqctx->dst_nents = sg_nents_for_len(reqctx->dst, req->cryptlen +
 					     (op_type ? -authsize : authsize));
 	if (reqctx->dst_nents <= 0) {
 		pr_err("GCM:Invalid Destination sg entries\n");
@@ -1925,7 +1927,7 @@ static struct sk_buff *create_gcm_wr(struct aead_request *req,
 	sg_param.obsize = req->cryptlen + (op_type ? -authsize : authsize);
 	sg_param.qid = qid;
 	sg_param.align = 0;
-	if (map_writesg_phys_cpl(&u_ctx->lldi.pdev->dev, phys_cpl, dst,
+	if (map_writesg_phys_cpl(&u_ctx->lldi.pdev->dev, phys_cpl, reqctx->dst,
 				  &sg_param))
 		goto dstmap_fail;
 
@@ -1939,7 +1941,8 @@ static struct sk_buff *create_gcm_wr(struct aead_request *req,
 		write_sg_to_skb(skb, &frags, src, req->cryptlen);
 	} else {
 		aes_gcm_empty_pld_pad(req->dst, authsize - 1);
-		write_sg_to_skb(skb, &frags, dst, crypt_len);
+		write_sg_to_skb(skb, &frags, reqctx->dst, crypt_len);
+
 	}
 
 	create_wreq(ctx, chcr_req, req, skb, kctx_len, size, 1,
diff --git a/drivers/crypto/chelsio/chcr_crypto.h b/drivers/crypto/chelsio/chcr_crypto.h
index d5af7d6..7ec0a8f 100644
--- a/drivers/crypto/chelsio/chcr_crypto.h
+++ b/drivers/crypto/chelsio/chcr_crypto.h
@@ -158,6 +158,9 @@ struct ablk_ctx {
 };
 struct chcr_aead_reqctx {
 	struct	sk_buff	*skb;
+	struct scatterlist *dst;
+	struct scatterlist srcffwd[2];
+	struct scatterlist dstffwd[2];
 	short int dst_nents;
 	u16 verify;
 	u8 iv[CHCR_MAX_CRYPTO_IV_LEN];
-- 
1.8.2.3

^ permalink raw reply related

* [PATCH v1 4/4] crypto:chcr-fix itnull.cocci warnings
From: Harsh Jain @ 2017-01-13 12:29 UTC (permalink / raw)
  To: hariprasad, netdev, herbert, linux-crypto
  Cc: Harsh Jain, Julia Lawall, Fengguang Wu
In-Reply-To: <cover.1484309965.git.harsh@chelsio.com>

The first argument to list_for_each_entry cannot be NULL.

Generated by: scripts/coccinelle/iterators/itnull.cocci

Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Harsh Jain <harsh@chelsio.com>
---
 drivers/crypto/chelsio/chcr_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/chelsio/chcr_core.c b/drivers/crypto/chelsio/chcr_core.c
index 1c65f07..2bfd61a 100644
--- a/drivers/crypto/chelsio/chcr_core.c
+++ b/drivers/crypto/chelsio/chcr_core.c
@@ -61,7 +61,7 @@ int assign_chcr_device(struct chcr_dev **dev)
 	 */
 	mutex_lock(&dev_mutex); /* TODO ? */
 	list_for_each_entry(u_ctx, &uld_ctx_list, entry)
-		if (u_ctx && u_ctx->dev) {
+		if (u_ctx->dev) {
 			*dev = u_ctx->dev;
 			ret = 0;
 			break;
-- 
1.8.2.3

^ permalink raw reply related

* [PATCH v1 1/4] crypto:chcr-Change flow IDs
From: Harsh Jain @ 2017-01-13 12:29 UTC (permalink / raw)
  To: hariprasad, netdev, herbert, linux-crypto; +Cc: Harsh Jain, Atul Gupta
In-Reply-To: <cover.1484309965.git.harsh@chelsio.com>

Change assign flowc id to each outgoing request.Firmware use flowc id
to schedule each request onto HW. FW reply may lost without this change.

Reviewed-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
---
 drivers/crypto/chelsio/chcr_algo.c            | 18 ++++++++++--------
 drivers/crypto/chelsio/chcr_algo.h            |  9 +++++----
 drivers/crypto/chelsio/chcr_core.h            |  1 +
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h |  8 ++++++++
 4 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 2ed1e24..1d7dfcf 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -542,10 +542,11 @@ static inline void create_wreq(struct chcr_context *ctx,
 				    (calc_tx_flits_ofld(skb) * 8), 16)));
 	chcr_req->wreq.cookie = cpu_to_be64((uintptr_t)req);
 	chcr_req->wreq.rx_chid_to_rx_q_id =
-		FILL_WR_RX_Q_ID(ctx->dev->tx_channel_id, qid,
-				is_iv ? iv_loc : IV_NOP);
+		FILL_WR_RX_Q_ID(ctx->dev->rx_channel_id, qid,
+				is_iv ? iv_loc : IV_NOP, ctx->tx_channel_id);
 
-	chcr_req->ulptx.cmd_dest = FILL_ULPTX_CMD_DEST(ctx->dev->tx_channel_id);
+	chcr_req->ulptx.cmd_dest = FILL_ULPTX_CMD_DEST(ctx->dev->tx_channel_id,
+						       qid);
 	chcr_req->ulptx.len = htonl((DIV_ROUND_UP((calc_tx_flits_ofld(skb) * 8),
 					16) - ((sizeof(chcr_req->wreq)) >> 4)));
 
@@ -606,7 +607,7 @@ static inline void create_wreq(struct chcr_context *ctx,
 	chcr_req = (struct chcr_wr *)__skb_put(skb, transhdr_len);
 	memset(chcr_req, 0, transhdr_len);
 	chcr_req->sec_cpl.op_ivinsrtofst =
-		FILL_SEC_CPL_OP_IVINSR(ctx->dev->tx_channel_id, 2, 1);
+		FILL_SEC_CPL_OP_IVINSR(ctx->dev->rx_channel_id, 2, 1);
 
 	chcr_req->sec_cpl.pldlen = htonl(ivsize + req->nbytes);
 	chcr_req->sec_cpl.aadstart_cipherstop_hi =
@@ -782,6 +783,7 @@ static int chcr_device_init(struct chcr_context *ctx)
 		spin_lock(&ctx->dev->lock_chcr_dev);
 		ctx->tx_channel_id = rxq_idx;
 		ctx->dev->tx_channel_id = !ctx->dev->tx_channel_id;
+		ctx->dev->rx_channel_id = 0;
 		spin_unlock(&ctx->dev->lock_chcr_dev);
 	}
 out:
@@ -874,7 +876,7 @@ static struct sk_buff *create_hash_wr(struct ahash_request *req,
 	memset(chcr_req, 0, transhdr_len);
 
 	chcr_req->sec_cpl.op_ivinsrtofst =
-		FILL_SEC_CPL_OP_IVINSR(ctx->dev->tx_channel_id, 2, 0);
+		FILL_SEC_CPL_OP_IVINSR(ctx->dev->rx_channel_id, 2, 0);
 	chcr_req->sec_cpl.pldlen = htonl(param->bfr_len + param->sg_len);
 
 	chcr_req->sec_cpl.aadstart_cipherstop_hi =
@@ -1424,7 +1426,7 @@ static struct sk_buff *create_authenc_wr(struct aead_request *req,
 	 * to the hardware spec
 	 */
 	chcr_req->sec_cpl.op_ivinsrtofst =
-		FILL_SEC_CPL_OP_IVINSR(ctx->dev->tx_channel_id, 2,
+		FILL_SEC_CPL_OP_IVINSR(ctx->dev->rx_channel_id, 2,
 				       (ivsize ? (assoclen + 1) : 0));
 	chcr_req->sec_cpl.pldlen = htonl(assoclen + ivsize + req->cryptlen);
 	chcr_req->sec_cpl.aadstart_cipherstop_hi = FILL_SEC_CPL_CIPHERSTOP_HI(
@@ -1600,7 +1602,7 @@ static void fill_sec_cpl_for_aead(struct cpl_tx_sec_pdu *sec_cpl,
 	unsigned int ivsize = AES_BLOCK_SIZE;
 	unsigned int cipher_mode = CHCR_SCMD_CIPHER_MODE_AES_CCM;
 	unsigned int mac_mode = CHCR_SCMD_AUTH_MODE_CBCMAC;
-	unsigned int c_id = chcrctx->dev->tx_channel_id;
+	unsigned int c_id = chcrctx->dev->rx_channel_id;
 	unsigned int ccm_xtra;
 	unsigned char tag_offset = 0, auth_offset = 0;
 	unsigned char hmac_ctrl = get_hmac(crypto_aead_authsize(tfm));
@@ -1875,7 +1877,7 @@ static struct sk_buff *create_gcm_wr(struct aead_request *req,
 
 	tag_offset = (op_type == CHCR_ENCRYPT_OP) ? 0 : authsize;
 	chcr_req->sec_cpl.op_ivinsrtofst = FILL_SEC_CPL_OP_IVINSR(
-					ctx->dev->tx_channel_id, 2, (ivsize ?
+					ctx->dev->rx_channel_id, 2, (ivsize ?
 					(req->assoclen + 1) : 0));
 	chcr_req->sec_cpl.pldlen = htonl(req->assoclen + ivsize + crypt_len);
 	chcr_req->sec_cpl.aadstart_cipherstop_hi = FILL_SEC_CPL_CIPHERSTOP_HI(
diff --git a/drivers/crypto/chelsio/chcr_algo.h b/drivers/crypto/chelsio/chcr_algo.h
index 3c7c51f..ba38bae 100644
--- a/drivers/crypto/chelsio/chcr_algo.h
+++ b/drivers/crypto/chelsio/chcr_algo.h
@@ -185,20 +185,21 @@
 			FW_CRYPTO_LOOKASIDE_WR_CCTX_LOC_V(1) | \
 			FW_CRYPTO_LOOKASIDE_WR_CCTX_SIZE_V((ctx_len)))
 
-#define FILL_WR_RX_Q_ID(cid, qid, wr_iv) \
+#define FILL_WR_RX_Q_ID(cid, qid, wr_iv, fid) \
 		htonl( \
 			FW_CRYPTO_LOOKASIDE_WR_RX_CHID_V((cid)) | \
 			FW_CRYPTO_LOOKASIDE_WR_RX_Q_ID_V((qid)) | \
 			FW_CRYPTO_LOOKASIDE_WR_LCB_V(0) | \
-			FW_CRYPTO_LOOKASIDE_WR_IV_V((wr_iv)))
+			FW_CRYPTO_LOOKASIDE_WR_IV_V((wr_iv)) | \
+			FW_CRYPTO_LOOKASIDE_WR_FQIDX_V(fid))
 
-#define FILL_ULPTX_CMD_DEST(cid) \
+#define FILL_ULPTX_CMD_DEST(cid, qid) \
 	htonl(ULPTX_CMD_V(ULP_TX_PKT) | \
 	      ULP_TXPKT_DEST_V(0) | \
 	      ULP_TXPKT_DATAMODIFY_V(0) | \
 	      ULP_TXPKT_CHANNELID_V((cid)) | \
 	      ULP_TXPKT_RO_V(1) | \
-	      ULP_TXPKT_FID_V(0))
+	      ULP_TXPKT_FID_V(qid))
 
 #define KEYCTX_ALIGN_PAD(bs) ({unsigned int _bs = (bs);\
 			      _bs == SHA1_DIGEST_SIZE ? 12 : 0; })
diff --git a/drivers/crypto/chelsio/chcr_core.h b/drivers/crypto/chelsio/chcr_core.h
index c7088a4..79da22b 100644
--- a/drivers/crypto/chelsio/chcr_core.h
+++ b/drivers/crypto/chelsio/chcr_core.h
@@ -75,6 +75,7 @@ struct chcr_dev {
 	spinlock_t lock_chcr_dev;
 	struct uld_ctx *u_ctx;
 	unsigned char tx_channel_id;
+	unsigned char rx_channel_id;
 };
 
 struct uld_ctx {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
index 8d9e4b7..ccc05f8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
@@ -3385,6 +3385,14 @@ struct fw_crypto_lookaside_wr {
 #define FW_CRYPTO_LOOKASIDE_WR_IV_G(x) \
 	(((x) >> FW_CRYPTO_LOOKASIDE_WR_IV_S) & FW_CRYPTO_LOOKASIDE_WR_IV_M)
 
+#define FW_CRYPTO_LOOKASIDE_WR_FQIDX_S   15
+#define FW_CRYPTO_LOOKASIDE_WR_FQIDX_M   0xff
+#define FW_CRYPTO_LOOKASIDE_WR_FQIDX_V(x) \
+	((x) << FW_CRYPTO_LOOKASIDE_WR_FQIDX_S)
+#define FW_CRYPTO_LOOKASIDE_WR_FQIDX_G(x) \
+	(((x) >> FW_CRYPTO_LOOKASIDE_WR_FQIDX_S) & \
+	 FW_CRYPTO_LOOKASIDE_WR_FQIDX_M)
+
 #define FW_CRYPTO_LOOKASIDE_WR_TX_CH_S 10
 #define FW_CRYPTO_LOOKASIDE_WR_TX_CH_M 0x3
 #define FW_CRYPTO_LOOKASIDE_WR_TX_CH_V(x) \
-- 
1.8.2.3

^ permalink raw reply related

* [PATCH v1 3/4] crypto:chcr- Check device is allocated before use
From: Harsh Jain @ 2017-01-13 12:29 UTC (permalink / raw)
  To: hariprasad, netdev, herbert, linux-crypto; +Cc: Harsh Jain, Atul Gupta
In-Reply-To: <cover.1484309965.git.harsh@chelsio.com>

Ensure dev is allocated for crypto uld context before using the device
for crypto operations.

Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
---
 drivers/crypto/chelsio/chcr_core.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_core.c b/drivers/crypto/chelsio/chcr_core.c
index 918da8e..1c65f07 100644
--- a/drivers/crypto/chelsio/chcr_core.c
+++ b/drivers/crypto/chelsio/chcr_core.c
@@ -52,6 +52,7 @@
 int assign_chcr_device(struct chcr_dev **dev)
 {
 	struct uld_ctx *u_ctx;
+	int ret = -ENXIO;
 
 	/*
 	 * Which device to use if multiple devices are available TODO
@@ -59,15 +60,14 @@ int assign_chcr_device(struct chcr_dev **dev)
 	 * must go to the same device to maintain the ordering.
 	 */
 	mutex_lock(&dev_mutex); /* TODO ? */
-	u_ctx = list_first_entry(&uld_ctx_list, struct uld_ctx, entry);
-	if (!u_ctx) {
-		mutex_unlock(&dev_mutex);
-		return -ENXIO;
+	list_for_each_entry(u_ctx, &uld_ctx_list, entry)
+		if (u_ctx && u_ctx->dev) {
+			*dev = u_ctx->dev;
+			ret = 0;
+			break;
 	}
-
-	*dev = u_ctx->dev;
 	mutex_unlock(&dev_mutex);
-	return 0;
+	return ret;
 }
 
 static int chcr_dev_add(struct uld_ctx *u_ctx)
@@ -202,10 +202,8 @@ static int chcr_uld_state_change(void *handle, enum cxgb4_state state)
 
 static int __init chcr_crypto_init(void)
 {
-	if (cxgb4_register_uld(CXGB4_ULD_CRYPTO, &chcr_uld_info)) {
+	if (cxgb4_register_uld(CXGB4_ULD_CRYPTO, &chcr_uld_info))
 		pr_err("ULD register fail: No chcr crypto support in cxgb4");
-		return -1;
-	}
 
 	return 0;
 }
-- 
1.8.2.3

^ permalink raw reply related

* Re: x86-64: Maintain 16-byte stack alignment
From: Josh Poimboeuf @ 2017-01-13 13:07 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Andy Lutomirski, Linus Torvalds, Linux Kernel Mailing List,
	Linux Crypto Mailing List, Ingo Molnar, Thomas Gleixner,
	Andy Lutomirski, Ard Biesheuvel
In-Reply-To: <20170113083648.GA22022@gondor.apana.org.au>

On Fri, Jan 13, 2017 at 04:36:48PM +0800, Herbert Xu wrote:
> On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote:
> >
> > I think we have some inline functions that do asm volatile ("call
> > ..."), and I don't see any credible way of forcing alignment short of
> > generating an entirely new stack frame and aligning that.  Ick.  This
> 
> A straight asm call from C should always work because gcc keeps
> the stack aligned in the prologue.
> 
> The only problem with inline assembly is when you start pushing
> things onto the stack directly.

I tried another approach.  I rebuilt the kernel with
-mpreferred-stack-boundary=4 and used awk (poor man's objtool) to find
all leaf functions with misaligned stacks.

  objdump -d ~/k/vmlinux | awk '/>:/ { f=$2; call=0; push=0 } /fentry/ { next } /callq/ { call=1 } /push/ { push=!push } /sub.*8,%rsp/ { push=!push } /^$/ && call == 0 && push == 0 { print f }'

It found a lot of functions.  Here's one of them:

  ffffffff814ab450 <mpihelp_add_n>:
  ffffffff814ab450:	55                   	push   %rbp
  ffffffff814ab451:	f7 d9                	neg    %ecx
  ffffffff814ab453:	31 c0                	xor    %eax,%eax
  ffffffff814ab455:	4c 63 c1             	movslq %ecx,%r8
  ffffffff814ab458:	48 89 e5             	mov    %rsp,%rbp
  ffffffff814ab45b:	53                   	push   %rbx
  ffffffff814ab45c:	4a 8d 1c c5 00 00 00 	lea    0x0(,%r8,8),%rbx
  ffffffff814ab463:	00 
  ffffffff814ab464:	eb 03                	jmp    ffffffff814ab469 <mpihelp_add_n+0x19>
  ffffffff814ab466:	4c 63 c1             	movslq %ecx,%r8
  ffffffff814ab469:	49 c1 e0 03          	shl    $0x3,%r8
  ffffffff814ab46d:	45 31 c9             	xor    %r9d,%r9d
  ffffffff814ab470:	49 29 d8             	sub    %rbx,%r8
  ffffffff814ab473:	4a 03 04 02          	add    (%rdx,%r8,1),%rax
  ffffffff814ab477:	41 0f 92 c1          	setb   %r9b
  ffffffff814ab47b:	4a 03 04 06          	add    (%rsi,%r8,1),%rax
  ffffffff814ab47f:	41 0f 92 c2          	setb   %r10b
  ffffffff814ab483:	49 89 c3             	mov    %rax,%r11
  ffffffff814ab486:	83 c1 01             	add    $0x1,%ecx
  ffffffff814ab489:	45 0f b6 d2          	movzbl %r10b,%r10d
  ffffffff814ab48d:	4e 89 1c 07          	mov    %r11,(%rdi,%r8,1)
  ffffffff814ab491:	4b 8d 04 0a          	lea    (%r10,%r9,1),%rax
  ffffffff814ab495:	75 cf                	jne    ffffffff814ab466 <mpihelp_add_n+0x16>
  ffffffff814ab497:	5b                   	pop    %rbx
  ffffffff814ab498:	5d                   	pop    %rbp
  ffffffff814ab499:	c3                   	retq   
  ffffffff814ab49a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

That's a leaf function which, as far as I can tell, doesn't use any
inline asm, but its prologue produces a misaligned stack.

I added inline asm with a call instruction and no operands or clobbers,
and got the same result.

So Andy's theory seems to be correct.  As long as we allow calls from
inline asm, we can't rely on aligned stacks.

-- 
Josh

^ permalink raw reply

* Re: [RFC PATCH 0/6] Add bulk skcipher requests to crypto API and dm-crypt
From: Herbert Xu @ 2017-01-13 14:29 UTC (permalink / raw)
  To: Ondrej Mosnáček
  Cc: linux-crypto, dm-devel, Mike Snitzer, Milan Broz, Mikulas Patocka,
	Binoy Jayan
In-Reply-To: <CAAUqJDtqN-ugcaz4VhuW0fzLbmKt4db8ggrRmaAq_zFvO+9Lvw@mail.gmail.com>

On Fri, Jan 13, 2017 at 01:01:56PM +0100, Ondrej Mosnáček wrote:
>
> As I already mentioned in another thread, there are basically two reasons:
> 
> 1) Milan would like to add authenticated encryption support to
> dm-crypt (see [1]) and as part of this change, a new random IV mode
> would be introduced. This mode generates a random IV for each sector
> write, includes it in the authenticated data and stores it in the
> sector's metadata (in a separate part of the disk). In this case
> dm-crypt will need to have control over the IV generation (or at least
> be able to somehow retrieve it after the crypto operation... but
> passing RNG responsibility to drivers doesn't seem to be a good idea
> anyway).

This sounds exactly like the IV generator for IPsec modes such as
CTR or GCM.  The only difference is that you deal with sectors
instead of packets.

> 2) With this API, drivers wouldn't have to provide implementations for
> specific IV generation modes, and just implement bulk requests for the
> common modes/algorithms (XTS, CBC, ...) while still getting
> performance benefit.

What if the driver had hardware support for generating these IVs?
With your scheme this cannot be supported at all.

Getting the IVs back is not actually that hard.  We could simply
change the algorithm definition for the IV generator so that
the IVs are embedded in the plaintext and ciphertext.  For
example, you could declare it so that the for n sectors the
first n*ivsize bytes would be the IV, and the actual plaintext
or ciphertext would follow.

With such a definition you could either generate the IVs in dm-crypt
or have them generated in the IV generator.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] crypto: testmgr - use calculated count for number of test vectors
From: Herbert Xu @ 2017-01-13 14:32 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux-crypto
In-Reply-To: <1484228439-28373-1-git-send-email-ard.biesheuvel@linaro.org>

On Thu, Jan 12, 2017 at 01:40:39PM +0000, Ard Biesheuvel wrote:
> When working on AES in CCM mode for ARM, my code passed the internal
> tcrypt test before I had even bothered to implement the AES-192 and
> AES-256 code paths, which is strange because the tcrypt does contain
> AES-192 and AES-256 test vectors for CCM.
> 
> As it turned out, the define AES_CCM_ENC_TEST_VECTORS was out of sync
> with the actual number of test vectors, causing only the AES-128 ones
> to be executed.
> 
> So get rid of the defines, and wrap the test vector references in a
> macro that calculates the number of vectors automatically.
> 
> The following test vector counts were out of sync with the respective
> defines:
> 
>     BF_CTR_ENC_TEST_VECTORS          2 ->  3
>     BF_CTR_DEC_TEST_VECTORS          2 ->  3
>     TF_CTR_ENC_TEST_VECTORS          2 ->  3
>     TF_CTR_DEC_TEST_VECTORS          2 ->  3
>     SERPENT_CTR_ENC_TEST_VECTORS     2 ->  3
>     SERPENT_CTR_DEC_TEST_VECTORS     2 ->  3
>     AES_CCM_ENC_TEST_VECTORS         8 -> 14
>     AES_CCM_DEC_TEST_VECTORS         7 -> 17
>     AES_CCM_4309_ENC_TEST_VECTORS    7 -> 23
>     AES_CCM_4309_DEC_TEST_VECTORS   10 -> 23
>     CAMELLIA_CTR_ENC_TEST_VECTORS    2 ->  3
>     CAMELLIA_CTR_DEC_TEST_VECTORS    2 ->  3
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox