* RFC: support for MV_CESA with TDMA @ 2012-05-25 16:08 Phil Sutter From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu

Hi,

The following patch series adds support for the TDMA engine built into Marvell's Kirkwood-based SoCs, and enhances mv_cesa.c to use it for speeding up crypto operations. Kirkwood hardware contains a security accelerator, which can control DMA as well as crypto engines. It allows for operation with minimal software intervention, which the following patches implement: using a chain of DMA descriptors, data input, configuration, engine startup and data output repeat fully automatically until the whole input data has been handled.

The reason this is an RFC is backwards compatibility: earlier hardware (Orion) ships a slightly different DMA engine (IDMA) along with the same crypto engine, so in fact mv_cesa.c is in use on those platforms, too. But since I don't possess hardware of this kind, I am not able to make this code IDMA-compatible. Also, due to the quite massive reorganisation of the code flow, I don't really see how to make TDMA support optional in mv_cesa.c.

Greetings, Phil
* [PATCH 01/13] mv_cesa: do not use scatterlist iterators From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu The problem with scatterlist iterators is that they cannot be used to iterate over DMA-mapped scatterlists, so get rid of them in order to add DMA functionality to mv_cesa. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 57 ++++++++++++++++++++++----------------------- 1 files changed, 28 insertions(+), 29 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 3cc9237..c305350 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -43,8 +43,8 @@ enum engine_status { /** * struct req_progress - used for every crypt request - * @src_sg_it: sg iterator for src - * @dst_sg_it: sg iterator for dst + * @src_sg: sg list for src + * @dst_sg: sg list for dst * @sg_src_left: bytes left in src to process (scatter list) * @src_start: offset to add to src start position (scatter list) * @crypt_len: length of current hw crypt/hash process @@ -59,8 +59,8 @@ enum engine_status { * track of progress within current scatterlist. 
*/ struct req_progress { - struct sg_mapping_iter src_sg_it; - struct sg_mapping_iter dst_sg_it; + struct scatterlist *src_sg; + struct scatterlist *dst_sg; void (*complete) (void); void (*process) (int is_first); @@ -210,19 +210,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher *cipher, const u8 *key, static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len) { - int ret; void *sbuf; int copy_len; while (len) { if (!p->sg_src_left) { - ret = sg_miter_next(&p->src_sg_it); - BUG_ON(!ret); - p->sg_src_left = p->src_sg_it.length; + /* next sg please */ + p->src_sg = sg_next(p->src_sg); + BUG_ON(!p->src_sg); + p->sg_src_left = p->src_sg->length; p->src_start = 0; } - sbuf = p->src_sg_it.addr + p->src_start; + sbuf = sg_virt(p->src_sg) + p->src_start; copy_len = min(p->sg_src_left, len); memcpy(dbuf, sbuf, copy_len); @@ -305,9 +305,6 @@ static void mv_crypto_algo_completion(void) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); - sg_miter_stop(&cpg->p.src_sg_it); - sg_miter_stop(&cpg->p.dst_sg_it); - if (req_ctx->op != COP_AES_CBC) return ; @@ -437,7 +434,6 @@ static void mv_hash_algo_completion(void) if (ctx->extra_bytes) copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes); - sg_miter_stop(&cpg->p.src_sg_it); if (likely(ctx->last_chunk)) { if (likely(ctx->count <= MAX_HW_HASH_SIZE)) { @@ -457,7 +453,6 @@ static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; void *buf; - int ret; cpg->p.hw_processed_bytes += cpg->p.crypt_len; if (cpg->p.copy_back) { int need_copy_len = cpg->p.crypt_len; @@ -466,14 +461,14 @@ static void dequeue_complete_req(void) int dst_copy; if (!cpg->p.sg_dst_left) { - ret = sg_miter_next(&cpg->p.dst_sg_it); - BUG_ON(!ret); - cpg->p.sg_dst_left = cpg->p.dst_sg_it.length; + /* next sg please */ + cpg->p.dst_sg = sg_next(cpg->p.dst_sg); + BUG_ON(!cpg->p.dst_sg); + cpg->p.sg_dst_left = cpg->p.dst_sg->length; 
cpg->p.dst_start = 0; } - buf = cpg->p.dst_sg_it.addr; - buf += cpg->p.dst_start; + buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start; dst_copy = min(need_copy_len, cpg->p.sg_dst_left); @@ -523,7 +518,6 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes) static void mv_start_new_crypt_req(struct ablkcipher_request *req) { struct req_progress *p = &cpg->p; - int num_sgs; cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); @@ -532,11 +526,14 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) p->process = mv_process_current_q; p->copy_back = 1; - num_sgs = count_sgs(req->src, req->nbytes); - sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG); - - num_sgs = count_sgs(req->dst, req->nbytes); - sg_miter_start(&p->dst_sg_it, req->dst, num_sgs, SG_MITER_TO_SG); + p->src_sg = req->src; + p->dst_sg = req->dst; + if (req->nbytes) { + BUG_ON(!req->src); + BUG_ON(!req->dst); + p->sg_src_left = req->src->length; + p->sg_dst_left = req->dst->length; + } mv_process_current_q(1); } @@ -545,7 +542,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) { struct req_progress *p = &cpg->p; struct mv_req_hash_ctx *ctx = ahash_request_ctx(req); - int num_sgs, hw_bytes, old_extra_bytes, rc; + int hw_bytes, old_extra_bytes, rc; cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); hw_bytes = req->nbytes + ctx->extra_bytes; @@ -558,8 +555,11 @@ static void mv_start_new_hash_req(struct ahash_request *req) else ctx->extra_bytes = 0; - num_sgs = count_sgs(req->src, req->nbytes); - sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG); + p->src_sg = req->src; + if (req->nbytes) { + BUG_ON(!req->src); + p->sg_src_left = req->src->length; + } if (hw_bytes) { p->hw_nbytes = hw_bytes; @@ -576,7 +576,6 @@ static void mv_start_new_hash_req(struct ahash_request *req) } else { copy_src_to_buf(p, ctx->buffer + old_extra_bytes, ctx->extra_bytes - old_extra_bytes); - 
sg_miter_stop(&p->src_sg_it); if (ctx->last_chunk) rc = mv_hash_final_fallback(req); else -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter 2012-05-25 16:08 ` [PATCH 01/13] mv_cesa: do not use scatterlist iterators Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 03/13] mv_cesa: prepare the full sram config in dram Phil Sutter ` (12 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This is just to keep formatting changes out of the following commit, hopefully simplifying it a bit. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 14 ++++++-------- 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index c305350..3862a93 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -267,12 +267,10 @@ static void mv_process_current_q(int first_block) } if (req_ctx->decrypt) { op.config |= CFG_DIR_DEC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, - AES_KEY_LEN); + memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN); } else { op.config |= CFG_DIR_ENC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, - AES_KEY_LEN); + memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN); } switch (ctx->key_len) { @@ -333,9 +331,8 @@ static void mv_process_hash_current(int first_block) } op.mac_src_p = - MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32) - req_ctx-> - count); + MAC_SRC_DATA_P(SRAM_DATA_IN_START) | + MAC_SRC_TOTAL_LEN((u32)req_ctx->count); setup_data_in(); @@ -370,7 +367,8 @@ static void mv_process_hash_current(int first_block) } } - memcpy(cpg->sram + SRAM_CONFIG, &op, sizeof(struct sec_accel_config)); + memcpy(cpg->sram + SRAM_CONFIG, &op, + sizeof(struct sec_accel_config)); /* GO */ mv_setup_timer(); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 03/13] mv_cesa: prepare the full sram config in dram From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This way, reconfiguring the cryptographic accelerator consists of a single step (a memcpy here), which in the future can be done by the TDMA engine. This patch introduces some ugly IV copying, necessary for input buffers above 1920 bytes. But this will go away later. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 83 ++++++++++++++++++++++++++++----------------- 1 files changed, 52 insertions(+), 31 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 3862a93..68b83d8 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -76,6 +76,24 @@ struct req_progress { int hw_processed_bytes; }; +struct sec_accel_sram { + struct sec_accel_config op; + union { + struct { + u32 key[8]; + u32 iv[4]; + } crypt; + struct { + u32 ivi[5]; + u32 ivo[5]; + } hash; + } type; +#define sa_key type.crypt.key +#define sa_iv type.crypt.iv +#define sa_ivi type.hash.ivi +#define sa_ivo type.hash.ivo +} __attribute__((packed)); + struct crypto_priv { void __iomem *reg; void __iomem *sram; @@ -93,6 +111,8 @@ struct crypto_priv { int sram_size; int has_sha1; int has_hmac_sha1; + + struct sec_accel_sram sa_sram; }; static struct crypto_priv *cpg; @@ -250,48 +270,49 @@ static void mv_process_current_q(int first_block) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_ctx *ctx =
crypto_tfm_ctx(req->base.tfm); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); - struct sec_accel_config op; + struct sec_accel_config *op = &cpg->sa_sram.op; switch (req_ctx->op) { case COP_AES_ECB: - op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB; + op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB; break; case COP_AES_CBC: default: - op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; - op.enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | + op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; + op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF); - if (first_block) - memcpy(cpg->sram + SRAM_DATA_IV, req->info, 16); + if (!first_block) + memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); + memcpy(cpg->sa_sram.sa_iv, req->info, 16); break; } if (req_ctx->decrypt) { - op.config |= CFG_DIR_DEC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN); + op->config |= CFG_DIR_DEC; + memcpy(cpg->sa_sram.sa_key, ctx->aes_dec_key, AES_KEY_LEN); } else { - op.config |= CFG_DIR_ENC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN); + op->config |= CFG_DIR_ENC; + memcpy(cpg->sa_sram.sa_key, ctx->aes_enc_key, AES_KEY_LEN); } switch (ctx->key_len) { case AES_KEYSIZE_128: - op.config |= CFG_AES_LEN_128; + op->config |= CFG_AES_LEN_128; break; case AES_KEYSIZE_192: - op.config |= CFG_AES_LEN_192; + op->config |= CFG_AES_LEN_192; break; case AES_KEYSIZE_256: - op.config |= CFG_AES_LEN_256; + op->config |= CFG_AES_LEN_256; break; } - op.enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | + op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | ENC_P_DST(SRAM_DATA_OUT_START); - op.enc_key_p = SRAM_DATA_KEY_P; + op->enc_key_p = SRAM_DATA_KEY_P; setup_data_in(); - op.enc_len = cpg->p.crypt_len; - memcpy(cpg->sram + SRAM_CONFIG, &op, - sizeof(struct sec_accel_config)); + op->enc_len = cpg->p.crypt_len; + memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + sizeof(struct sec_accel_sram)); /* GO */ 
mv_setup_timer(); @@ -315,30 +336,30 @@ static void mv_process_hash_current(int first_block) const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; - struct sec_accel_config op = { 0 }; + struct sec_accel_config *op = &cpg->sa_sram.op; int is_last; switch (req_ctx->op) { case COP_SHA1: default: - op.config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; break; case COP_HMAC_SHA1: - op.config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; - memcpy(cpg->sram + SRAM_HMAC_IV_IN, + op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + memcpy(cpg->sa_sram.sa_ivi, tfm_ctx->ivs, sizeof(tfm_ctx->ivs)); break; } - op.mac_src_p = + op->mac_src_p = MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)req_ctx->count); setup_data_in(); - op.mac_digest = + op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - op.mac_iv = + op->mac_iv = MAC_INNER_IV_P(SRAM_HMAC_IV_IN) | MAC_OUTER_IV_P(SRAM_HMAC_IV_OUT); @@ -347,16 +368,16 @@ static void mv_process_hash_current(int first_block) && (req_ctx->count <= MAX_HW_HASH_SIZE); if (req_ctx->first_hash) { if (is_last) - op.config |= CFG_NOT_FRAG; + op->config |= CFG_NOT_FRAG; else - op.config |= CFG_FIRST_FRAG; + op->config |= CFG_FIRST_FRAG; req_ctx->first_hash = 0; } else { if (is_last) - op.config |= CFG_LAST_FRAG; + op->config |= CFG_LAST_FRAG; else - op.config |= CFG_MID_FRAG; + op->config |= CFG_MID_FRAG; if (first_block) { writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); @@ -367,8 +388,8 @@ static void mv_process_hash_current(int first_block) } } - memcpy(cpg->sram + SRAM_CONFIG, &op, - sizeof(struct sec_accel_config)); + memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + sizeof(struct sec_accel_sram)); /* GO */ mv_setup_timer(); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 04/13] mv_cesa: split up processing callbacks 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (2 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 03/13] mv_cesa: prepare the full sram config in dram Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 05/13] add a driver for the Marvell TDMA engine Phil Sutter ` (10 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Have a dedicated function initialising the full SRAM config, then use a minimal callback for changing only relevant parts of it. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 87 +++++++++++++++++++++++++++++++++------------ 1 files changed, 64 insertions(+), 23 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 68b83d8..4a989ea 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -62,7 +62,7 @@ struct req_progress { struct scatterlist *src_sg; struct scatterlist *dst_sg; void (*complete) (void); - void (*process) (int is_first); + void (*process) (void); /* src mostly */ int sg_src_left; @@ -265,9 +265,8 @@ static void setup_data_in(void) p->crypt_len = data_in_sram; } -static void mv_process_current_q(int first_block) +static void mv_init_crypt_config(struct ablkcipher_request *req) { - struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); struct sec_accel_config *op = &cpg->sa_sram.op; @@ -281,8 +280,6 @@ static void mv_process_current_q(int first_block) op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF); - if (!first_block) - memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); memcpy(cpg->sa_sram.sa_iv, req->info, 16); break; } @@ -308,9 
+305,8 @@ static void mv_process_current_q(int first_block) op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | ENC_P_DST(SRAM_DATA_OUT_START); op->enc_key_p = SRAM_DATA_KEY_P; - - setup_data_in(); op->enc_len = cpg->p.crypt_len; + memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, sizeof(struct sec_accel_sram)); @@ -319,6 +315,17 @@ static void mv_process_current_q(int first_block) writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } +static void mv_update_crypt_config(void) +{ + /* update the enc_len field only */ + memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32), + &cpg->p.crypt_len, sizeof(u32)); + + /* GO */ + mv_setup_timer(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); +} + static void mv_crypto_algo_completion(void) { struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); @@ -330,9 +337,8 @@ static void mv_crypto_algo_completion(void) memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); } -static void mv_process_hash_current(int first_block) +static void mv_init_hash_config(struct ahash_request *req) { - struct ahash_request *req = ahash_request_cast(cpg->cur_req); const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; @@ -355,8 +361,6 @@ static void mv_process_hash_current(int first_block) MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)req_ctx->count); - setup_data_in(); - op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); op->mac_iv = @@ -379,13 +383,11 @@ static void mv_process_hash_current(int first_block) else op->config |= CFG_MID_FRAG; - if (first_block) { - writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); - writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B); - writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C); - writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D); - writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); - } + 
writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); + writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B); + writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C); + writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D); + writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); } memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, @@ -396,6 +398,42 @@ static void mv_process_hash_current(int first_block) writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } +static void mv_update_hash_config(void) +{ + struct ahash_request *req = ahash_request_cast(cpg->cur_req); + struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); + struct req_progress *p = &cpg->p; + struct sec_accel_config *op = &cpg->sa_sram.op; + int is_last; + + /* update only the config (for changed fragment state) and + * mac_digest (for changed frag len) fields */ + + switch (req_ctx->op) { + case COP_SHA1: + default: + op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + break; + case COP_HMAC_SHA1: + op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + break; + } + + is_last = req_ctx->last_chunk + && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes) + && (req_ctx->count <= MAX_HW_HASH_SIZE); + + op->config |= is_last ? 
CFG_LAST_FRAG : CFG_MID_FRAG; + memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32)); + + op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); + memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32)); + + /* GO */ + mv_setup_timer(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); +} + static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx, struct shash_desc *desc) { @@ -507,7 +545,8 @@ static void dequeue_complete_req(void) if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { /* process next scatter list entry */ cpg->eng_st = ENGINE_BUSY; - cpg->p.process(0); + setup_data_in(); + cpg->p.process(); } else { cpg->p.complete(); cpg->eng_st = ENGINE_IDLE; @@ -542,7 +581,7 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) memset(p, 0, sizeof(struct req_progress)); p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; - p->process = mv_process_current_q; + p->process = mv_update_crypt_config; p->copy_back = 1; p->src_sg = req->src; @@ -554,7 +593,8 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) p->sg_dst_left = req->dst->length; } - mv_process_current_q(1); + setup_data_in(); + mv_init_crypt_config(req); } static void mv_start_new_hash_req(struct ahash_request *req) @@ -583,7 +623,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) if (hw_bytes) { p->hw_nbytes = hw_bytes; p->complete = mv_hash_algo_completion; - p->process = mv_process_hash_current; + p->process = mv_update_hash_config; if (unlikely(old_extra_bytes)) { memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer, @@ -591,7 +631,8 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->crypt_len = old_extra_bytes; } - mv_process_hash_current(1); + setup_data_in(); + mv_init_hash_config(req); } else { copy_src_to_buf(p, ctx->buffer + old_extra_bytes, ctx->extra_bytes - old_extra_bytes); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ 
messages in thread
* [PATCH 05/13] add a driver for the Marvell TDMA engine 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (3 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 04/13] mv_cesa: split up processing callbacks Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 06/13] mv_cesa: use TDMA engine for data transfers Phil Sutter ` (9 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This is a DMA engine integrated into the Marvell Kirkwood SoC, designed to offload data transfers from/to the CESA crypto engine. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- arch/arm/mach-kirkwood/common.c | 33 +++ arch/arm/mach-kirkwood/include/mach/irqs.h | 1 + drivers/crypto/Kconfig | 5 + drivers/crypto/Makefile | 3 +- drivers/crypto/mv_tdma.c | 377 ++++++++++++++++++++++++++++ drivers/crypto/mv_tdma.h | 50 ++++ 6 files changed, 468 insertions(+), 1 deletions(-) create mode 100644 drivers/crypto/mv_tdma.c create mode 100644 drivers/crypto/mv_tdma.h diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c index 3ad0373..adc6eff 100644 --- a/arch/arm/mach-kirkwood/common.c +++ b/arch/arm/mach-kirkwood/common.c @@ -269,9 +269,42 @@ void __init kirkwood_uart1_init(void) /***************************************************************************** * Cryptographic Engines and Security Accelerator (CESA) ****************************************************************************/ +static struct resource kirkwood_tdma_res[] = { + { + .name = "regs deco", + .start = CRYPTO_PHYS_BASE + 0xA00, + .end = CRYPTO_PHYS_BASE + 0xA24, + .flags = IORESOURCE_MEM, + }, { + .name = "regs control and error", + .start = CRYPTO_PHYS_BASE + 0x800, + .end = CRYPTO_PHYS_BASE + 0x8CF, + .flags = IORESOURCE_MEM, + }, { + .name = "crypto error", + .start = IRQ_KIRKWOOD_TDMA_ERR, + .end = IRQ_KIRKWOOD_TDMA_ERR, + .flags = IORESOURCE_IRQ, + }, 
+}; + +static u64 mv_tdma_dma_mask = 0xffffffffUL; + +static struct platform_device kirkwood_tdma_device = { + .name = "mv_tdma", + .id = -1, + .dev = { + .dma_mask = &mv_tdma_dma_mask, + .coherent_dma_mask = 0xffffffff, + }, + .num_resources = ARRAY_SIZE(kirkwood_tdma_res), + .resource = kirkwood_tdma_res, +}; + void __init kirkwood_crypto_init(void) { kirkwood_clk_ctrl |= CGC_CRYPTO; + platform_device_register(&kirkwood_tdma_device); orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE, KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO); } diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h b/arch/arm/mach-kirkwood/include/mach/irqs.h index 2bf8161..a66aa3f 100644 --- a/arch/arm/mach-kirkwood/include/mach/irqs.h +++ b/arch/arm/mach-kirkwood/include/mach/irqs.h @@ -51,6 +51,7 @@ #define IRQ_KIRKWOOD_GPIO_HIGH_16_23 41 #define IRQ_KIRKWOOD_GE00_ERR 46 #define IRQ_KIRKWOOD_GE01_ERR 47 +#define IRQ_KIRKWOOD_TDMA_ERR 49 #define IRQ_KIRKWOOD_RTC 53 /* diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index 1092a77..17becf3 100644 --- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -159,6 +159,10 @@ config CRYPTO_GHASH_S390 It is available as of z196. 
+config CRYPTO_DEV_MV_TDMA + tristate + default no + config CRYPTO_DEV_MV_CESA tristate "Marvell's Cryptographic Engine" depends on PLAT_ORION @@ -166,6 +170,7 @@ config CRYPTO_DEV_MV_CESA select CRYPTO_AES select CRYPTO_BLKCIPHER2 select CRYPTO_HASH + select CRYPTO_DEV_MV_TDMA help This driver allows you to utilize the Cryptographic Engines and Security Accelerator (CESA) which can be found on the Marvell Orion diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index 0139032..65806e8 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o n2_crypto-y := n2_core.o n2_asm.o obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o +obj-$(CONFIG_CRYPTO_DEV_MV_TDMA) += mv_tdma.o obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/ @@ -14,4 +15,4 @@ obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o obj-$(CONFIG_CRYPTO_DEV_TEGRA_AES) += tegra-aes.o -obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/ \ No newline at end of file +obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/ diff --git a/drivers/crypto/mv_tdma.c b/drivers/crypto/mv_tdma.c new file mode 100644 index 0000000..aa5316a --- /dev/null +++ b/drivers/crypto/mv_tdma.c @@ -0,0 +1,377 @@ +/* + * Support for Marvell's TDMA engine found on Kirkwood chips, + * used exclusively by the CESA crypto accelerator. + * + * Based on unpublished code for IDMA written by Sebastian Siewior. 
+ * + * Copyright (C) 2012 Phil Sutter <phil.sutter@viprinet.com> + * License: GPLv2 + */ + +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/dmapool.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/slab.h> +#include <linux/platform_device.h> + +#include "mv_tdma.h" + +#define MV_TDMA "MV-TDMA: " + +#define MV_DMA_INIT_POOLSIZE 16 +#define MV_DMA_ALIGN 16 + +struct tdma_desc { + u32 count; + u32 src; + u32 dst; + u32 next; +} __attribute__((packed)); + +struct desc_mempair { + struct tdma_desc *vaddr; + dma_addr_t daddr; +}; + +struct tdma_priv { + struct device *dev; + void __iomem *reg; + int irq; + /* protecting the dma descriptors and stuff */ + spinlock_t lock; + struct dma_pool *descpool; + struct desc_mempair *desclist; + int desclist_len; + int desc_usage; +} tpg; + +#define DESC(x) (tpg.desclist[x].vaddr) +#define DESC_DMA(x) (tpg.desclist[x].daddr) + +static inline int set_poolsize(int nelem) +{ + /* need to increase size first if requested */ + if (nelem > tpg.desclist_len) { + struct desc_mempair *newmem; + int newsize = nelem * sizeof(struct desc_mempair); + + newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + tpg.desclist = newmem; + } + + /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */ + for (; tpg.desclist_len < nelem; tpg.desclist_len++) { + DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool, + GFP_KERNEL, &DESC_DMA(tpg.desclist_len)); + if (!DESC((tpg.desclist_len))) + return -ENOMEM; + } + for (; tpg.desclist_len > nelem; tpg.desclist_len--) + dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1), + DESC_DMA(tpg.desclist_len - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(tpg.desclist); + tpg.desclist = 0; + } + return 0; +} + +static inline void wait_for_tdma_idle(void) +{ + while (readl(tpg.reg + TDMA_CTRL) & TDMA_CTRL_ACTIVE) + mdelay(100); +} + +static inline void switch_tdma_engine(bool 
state) +{ + u32 val = readl(tpg.reg + TDMA_CTRL); + + val |= ( state * TDMA_CTRL_ENABLE); + val &= ~(!state * TDMA_CTRL_ENABLE); + + writel(val, tpg.reg + TDMA_CTRL); +} + +static struct tdma_desc *get_new_last_desc(void) +{ + if (unlikely(tpg.desc_usage == tpg.desclist_len) && + set_poolsize(tpg.desclist_len << 1)) { + printk(KERN_ERR MV_TDMA "failed to increase DMA pool to %d\n", + tpg.desclist_len << 1); + return NULL; + } + + if (likely(tpg.desc_usage)) + DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage); + + return DESC(tpg.desc_usage++); +} + +static inline void mv_tdma_desc_dump(void) +{ + struct tdma_desc *tmp; + int i; + + if (!tpg.desc_usage) { + printk(KERN_WARNING MV_TDMA "DMA descriptor list is empty\n"); + return; + } + + printk(KERN_WARNING MV_TDMA "DMA descriptor list:\n"); + for (i = 0; i < tpg.desc_usage; i++) { + tmp = DESC(i); + printk(KERN_WARNING MV_TDMA "entry %d at 0x%x: dma addr 0x%x, " + "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i, + (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst, + tmp->count & ~TDMA_OWN_BIT, !!(tmp->count & TDMA_OWN_BIT), + tmp->next); + } +} + +static inline void mv_tdma_reg_dump(void) +{ +#define PRINTREG(offset) \ + printk(KERN_WARNING MV_TDMA "tpg.reg + " #offset " = 0x%x\n", \ + readl(tpg.reg + offset)) + + PRINTREG(TDMA_CTRL); + PRINTREG(TDMA_BYTE_COUNT); + PRINTREG(TDMA_SRC_ADDR); + PRINTREG(TDMA_DST_ADDR); + PRINTREG(TDMA_NEXT_DESC); + PRINTREG(TDMA_CURR_DESC); + +#undef PRINTREG +} + +void mv_tdma_clear(void) +{ + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + /* make sure tdma is idle */ + wait_for_tdma_idle(); + switch_tdma_engine(0); + wait_for_tdma_idle(); + + /* clear descriptor registers */ + writel(0, tpg.reg + TDMA_BYTE_COUNT); + writel(0, tpg.reg + TDMA_CURR_DESC); + writel(0, tpg.reg + TDMA_NEXT_DESC); + + tpg.desc_usage = 0; + + switch_tdma_engine(1); + + /* finally free system lock again */ + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_tdma_clear); + +void 
mv_tdma_trigger(void) +{ + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + writel(DESC_DMA(0), tpg.reg + TDMA_NEXT_DESC); + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_tdma_trigger); + +void mv_tdma_separator(void) +{ + struct tdma_desc *tmp; + + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + tmp = get_new_last_desc(); + memset(tmp, 0, sizeof(*tmp)); + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_tdma_separator); + +void mv_tdma_memcpy(dma_addr_t dst, dma_addr_t src, unsigned int size) +{ + struct tdma_desc *tmp; + + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + tmp = get_new_last_desc(); + tmp->count = size | TDMA_OWN_BIT; + tmp->src = src; + tmp->dst = dst; + tmp->next = 0; + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_tdma_memcpy); + +irqreturn_t tdma_int(int irq, void *priv) +{ + u32 val; + + val = readl(tpg.reg + TDMA_ERR_CAUSE); + + if (val & TDMA_INT_MISS) + printk(KERN_ERR MV_TDMA "%s: miss!\n", __func__); + if (val & TDMA_INT_DOUBLE_HIT) + printk(KERN_ERR MV_TDMA "%s: double hit!\n", __func__); + if (val & TDMA_INT_BOTH_HIT) + printk(KERN_ERR MV_TDMA "%s: both hit!\n", __func__); + if (val & TDMA_INT_DATA_ERROR) + printk(KERN_ERR MV_TDMA "%s: data error!\n", __func__); + if (val) { + mv_tdma_reg_dump(); + mv_tdma_desc_dump(); + } + + switch_tdma_engine(0); + wait_for_tdma_idle(); + + /* clear descriptor registers */ + writel(0, tpg.reg + TDMA_BYTE_COUNT); + writel(0, tpg.reg + TDMA_SRC_ADDR); + writel(0, tpg.reg + TDMA_DST_ADDR); + writel(0, tpg.reg + TDMA_CURR_DESC); + + /* clear error cause register */ + writel(0, tpg.reg + TDMA_ERR_CAUSE); + + /* initialize control register (also enables engine) */ + writel(TDMA_CTRL_INIT_VALUE, tpg.reg + TDMA_CTRL); + wait_for_tdma_idle(); + + return (val ? 
IRQ_HANDLED : IRQ_NONE); +} + +static int mv_probe(struct platform_device *pdev) +{ + struct resource *res; + int rc; + + if (tpg.dev) { + printk(KERN_ERR MV_TDMA "second TDMA device?!\n"); + return -ENXIO; + } + tpg.dev = &pdev->dev; + + res = platform_get_resource_byname(pdev, + IORESOURCE_MEM, "regs control and error"); + if (!res) + return -ENXIO; + + if (!(tpg.reg = ioremap(res->start, resource_size(res)))) + return -ENOMEM; + + tpg.irq = platform_get_irq(pdev, 0); + if (tpg.irq < 0 || tpg.irq == NO_IRQ) { + rc = -ENXIO; + goto out_unmap_reg; + } + + tpg.descpool = dma_pool_create("TDMA Descriptor Pool", tpg.dev, + sizeof(struct tdma_desc), MV_DMA_ALIGN, 0); + if (!tpg.descpool) { + rc = -ENOMEM; + goto out_free_irq; + } + set_poolsize(MV_DMA_INIT_POOLSIZE); + + platform_set_drvdata(pdev, &tpg); + + switch_tdma_engine(0); + wait_for_tdma_idle(); + + /* clear descriptor registers */ + writel(0, tpg.reg + TDMA_BYTE_COUNT); + writel(0, tpg.reg + TDMA_SRC_ADDR); + writel(0, tpg.reg + TDMA_DST_ADDR); + writel(0, tpg.reg + TDMA_CURR_DESC); + + /* have an ear for occurring errors */ + writel(TDMA_INT_ALL, tpg.reg + TDMA_ERR_MASK); + writel(0, tpg.reg + TDMA_ERR_CAUSE); + + /* initialize control register (also enables engine) */ + writel(TDMA_CTRL_INIT_VALUE, tpg.reg + TDMA_CTRL); + wait_for_tdma_idle(); + + if (request_irq(tpg.irq, tdma_int, IRQF_DISABLED, + dev_name(tpg.dev), &tpg)) { + rc = -ENXIO; + goto out_free_all; + } + + spin_lock_init(&tpg.lock); + + printk(KERN_INFO MV_TDMA "up and running, IRQ %d\n", tpg.irq); + return 0; +out_free_all: + switch_tdma_engine(0); + platform_set_drvdata(pdev, NULL); + set_poolsize(0); + dma_pool_destroy(tpg.descpool); +out_free_irq: + free_irq(tpg.irq, &tpg); +out_unmap_reg: + iounmap(tpg.reg); + tpg.dev = NULL; + return rc; +} + +static int mv_remove(struct platform_device *pdev) +{ + switch_tdma_engine(0); + platform_set_drvdata(pdev, NULL); + set_poolsize(0); + dma_pool_destroy(tpg.descpool); + free_irq(tpg.irq, &tpg); + 
iounmap(tpg.reg); + tpg.dev = NULL; + return 0; +} + +static struct platform_driver marvell_tdma = { + .probe = mv_probe, + .remove = mv_remove, + .driver = { + .owner = THIS_MODULE, + .name = "mv_tdma", + }, +}; +MODULE_ALIAS("platform:mv_tdma"); + +static int __init mv_tdma_init(void) +{ + return platform_driver_register(&marvell_tdma); +} +module_init(mv_tdma_init); + +static void __exit mv_tdma_exit(void) +{ + platform_driver_unregister(&marvell_tdma); +} +module_exit(mv_tdma_exit); + +MODULE_AUTHOR("Phil Sutter <phil.sutter@viprinet.com>"); +MODULE_DESCRIPTION("Support for Marvell's TDMA engine"); +MODULE_LICENSE("GPL"); + diff --git a/drivers/crypto/mv_tdma.h b/drivers/crypto/mv_tdma.h new file mode 100644 index 0000000..3efa44c3 --- /dev/null +++ b/drivers/crypto/mv_tdma.h @@ -0,0 +1,50 @@ +#ifndef _MV_TDMA_H +#define _MV_TDMA_H + +/* TDMA_CTRL register bits */ +#define TDMA_CTRL_DST_BURST(x) (x) +#define TDMA_CTRL_DST_BURST_32 TDMA_CTRL_DST_BURST(3) +#define TDMA_CTRL_DST_BURST_128 TDMA_CTRL_DST_BURST(4) +#define TDMA_CTRL_OUTST_RD_EN (1 << 4) +#define TDMA_CTRL_SRC_BURST(x) (x << 6) +#define TDMA_CTRL_SRC_BURST_32 TDMA_CTRL_SRC_BURST(3) +#define TDMA_CTRL_SRC_BURST_128 TDMA_CTRL_SRC_BURST(4) +#define TDMA_CTRL_NO_CHAIN_MODE (1 << 9) +#define TDMA_CTRL_NO_BYTE_SWAP (1 << 11) +#define TDMA_CTRL_ENABLE (1 << 12) +#define TDMA_CTRL_FETCH_ND (1 << 13) +#define TDMA_CTRL_ACTIVE (1 << 14) + +#define TDMA_CTRL_INIT_VALUE ( \ + TDMA_CTRL_DST_BURST_128 | TDMA_CTRL_SRC_BURST_128 | \ + TDMA_CTRL_NO_BYTE_SWAP | TDMA_CTRL_ENABLE \ +) + +/* TDMA_ERR_CAUSE bits */ +#define TDMA_INT_MISS (1 << 0) +#define TDMA_INT_DOUBLE_HIT (1 << 1) +#define TDMA_INT_BOTH_HIT (1 << 2) +#define TDMA_INT_DATA_ERROR (1 << 3) +#define TDMA_INT_ALL 0x0f + +/* offsets of registers, starting at "regs control and error" */ +#define TDMA_BYTE_COUNT 0x00 +#define TDMA_SRC_ADDR 0x10 +#define TDMA_DST_ADDR 0x20 +#define TDMA_NEXT_DESC 0x30 +#define TDMA_CTRL 0x40 +#define TDMA_CURR_DESC 0x70 +#define 
TDMA_ERR_CAUSE 0xc8 +#define TDMA_ERR_MASK 0xcc + +/* Owner bit in TDMA_BYTE_COUNT and descriptors' count field, used + * to signal TDMA in descriptor chain when input data is complete. */ +#define TDMA_OWN_BIT (1 << 31) + +extern void mv_tdma_memcpy(dma_addr_t, dma_addr_t, unsigned int); +extern void mv_tdma_separator(void); +extern void mv_tdma_clear(void); +extern void mv_tdma_trigger(void); + + +#endif /* _MV_TDMA_H */ -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
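The TDMA_OWN_BIT convention defined above (shared between TDMA_BYTE_COUNT and each descriptor's count field) is compact enough to model in plain user-space C. The sketch below is illustrative only, not driver code: the struct mirrors the tdma_desc layout implied by the patch, and the helpers reproduce what mv_tdma_memcpy() and mv_tdma_separator() store into a descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Ownership flag in bit 31 of the count field, as in mv_tdma.h. */
#define TDMA_OWN_BIT (1u << 31)

/* User-space stand-in for the hardware descriptor layout. */
struct tdma_desc {
	uint32_t count;	/* byte count | TDMA_OWN_BIT */
	uint32_t src;	/* source bus address */
	uint32_t dst;	/* destination bus address */
	uint32_t next;	/* bus address of next descriptor, 0 = end of chain */
};

/* What mv_tdma_memcpy() stores into a freshly allocated descriptor. */
static void desc_fill_memcpy(struct tdma_desc *d, uint32_t dst,
			     uint32_t src, uint32_t size)
{
	d->count = size | TDMA_OWN_BIT;
	d->src = src;
	d->dst = dst;
	d->next = 0;
}

/* mv_tdma_separator() queues an all-zero descriptor: no OWN bit and a
 * zero count, acting as a fence between stages of a request. */
static void desc_fill_separator(struct tdma_desc *d)
{
	memset(d, 0, sizeof(*d));
}

static uint32_t desc_byte_count(const struct tdma_desc *d)
{
	return d->count & ~TDMA_OWN_BIT;
}

static int desc_engine_owned(const struct tdma_desc *d)
{
	return !!(d->count & TDMA_OWN_BIT);
}
```

The same masking shows up in mv_tdma_desc_dump() above, which prints the count and OWN bit separately.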
* [PATCH 06/13] mv_cesa: use TDMA engine for data transfers 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (4 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 05/13] add a driver for the Marvell TDMA engine Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 07/13] mv_cesa: have TDMA copy back the digest result Phil Sutter ` (8 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Simply choose the same DMA mask value as mvsdio and ehci use. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- arch/arm/plat-orion/common.c | 6 + drivers/crypto/mv_cesa.c | 214 +++++++++++++++++++++++++++++++++--------- 2 files changed, 175 insertions(+), 45 deletions(-) diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c index 74daf5e..dd3a327 100644 --- a/arch/arm/plat-orion/common.c +++ b/arch/arm/plat-orion/common.c @@ -916,9 +916,15 @@ static struct resource orion_crypto_resources[] = { }, }; +static u64 mv_crypto_dmamask = DMA_BIT_MASK(32); + static struct platform_device orion_crypto = { .name = "mv_crypto", .id = -1, + .dev = { + .dma_mask = &mv_crypto_dmamask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, }; void __init orion_crypto_init(unsigned long mapbase, diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 4a989ea..e10da2b 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -9,6 +9,7 @@ #include <crypto/aes.h> #include <crypto/algapi.h> #include <linux/crypto.h> +#include <linux/dma-mapping.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/kthread.h> @@ -20,11 +21,14 @@ #include <crypto/sha.h> #include "mv_cesa.h" +#include "mv_tdma.h" #define MV_CESA "MV-CESA:" #define MAX_HW_HASH_SIZE 0xFFFF #define MV_CESA_EXPIRE 500 /* msec */ +static int count_sgs(struct scatterlist *, unsigned int); + /* * STM: * /---------------------------------------\ 
@@ -49,7 +53,6 @@ enum engine_status { * @src_start: offset to add to src start position (scatter list) * @crypt_len: length of current hw crypt/hash process * @hw_nbytes: total bytes to process in hw for this request - * @copy_back: whether to copy data back (crypt) or not (hash) * @sg_dst_left: bytes left dst to process in this scatter list * @dst_start: offset to add to dst start position (scatter list) * @hw_processed_bytes: number of bytes processed by hw (request). @@ -70,7 +73,6 @@ struct req_progress { int crypt_len; int hw_nbytes; /* dst mostly */ - int copy_back; int sg_dst_left; int dst_start; int hw_processed_bytes; @@ -95,8 +97,10 @@ struct sec_accel_sram { } __attribute__((packed)); struct crypto_priv { + struct device *dev; void __iomem *reg; void __iomem *sram; + u32 sram_phys; int irq; struct task_struct *queue_th; @@ -113,6 +117,7 @@ struct crypto_priv { int has_hmac_sha1; struct sec_accel_sram sa_sram; + dma_addr_t sa_sram_dma; }; static struct crypto_priv *cpg; @@ -181,6 +186,23 @@ static void mv_setup_timer(void) jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); } +static inline bool +mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) +{ + int nents = count_sgs(sg, nbytes); + + if (nbytes && dma_map_sg(cpg->dev, sg, nents, dir) != nents) + return false; + return true; +} + +static inline void +mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) +{ + if (nbytes) + dma_unmap_sg(cpg->dev, sg, count_sgs(sg, nbytes), dir); +} + static void compute_aes_dec_key(struct mv_ctx *ctx) { struct crypto_aes_ctx gen_aes_key; @@ -255,12 +277,66 @@ static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len) } } +static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int len) +{ + dma_addr_t sbuf; + int copy_len; + + while (len) { + if (!p->sg_src_left) { + /* next sg please */ + p->src_sg = sg_next(p->src_sg); + BUG_ON(!p->src_sg); + p->sg_src_left = sg_dma_len(p->src_sg); + 
p->src_start = 0; + } + + sbuf = sg_dma_address(p->src_sg) + p->src_start; + + copy_len = min(p->sg_src_left, len); + mv_tdma_memcpy(dbuf, sbuf, copy_len); + + p->src_start += copy_len; + p->sg_src_left -= copy_len; + + len -= copy_len; + dbuf += copy_len; + } +} + +static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int len) +{ + dma_addr_t dbuf; + int copy_len; + + while (len) { + if (!p->sg_dst_left) { + /* next sg please */ + p->dst_sg = sg_next(p->dst_sg); + BUG_ON(!p->dst_sg); + p->sg_dst_left = sg_dma_len(p->dst_sg); + p->dst_start = 0; + } + + dbuf = sg_dma_address(p->dst_sg) + p->dst_start; + + copy_len = min(p->sg_dst_left, len); + mv_tdma_memcpy(dbuf, sbuf, copy_len); + + p->dst_start += copy_len; + p->sg_dst_left -= copy_len; + + len -= copy_len; + sbuf += copy_len; + } +} + static void setup_data_in(void) { struct req_progress *p = &cpg->p; int data_in_sram = min(p->hw_nbytes - p->hw_processed_bytes, cpg->max_req_size); - copy_src_to_buf(p, cpg->sram + SRAM_DATA_IN_START + p->crypt_len, + dma_copy_src_to_buf(p, cpg->sram_phys + SRAM_DATA_IN_START + p->crypt_len, data_in_sram - p->crypt_len); p->crypt_len = data_in_sram; } @@ -307,22 +383,39 @@ static void mv_init_crypt_config(struct ablkcipher_request *req) op->enc_key_p = SRAM_DATA_KEY_P; op->enc_len = cpg->p.crypt_len; - memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); + mv_tdma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + /* GO */ mv_setup_timer(); + mv_tdma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void mv_update_crypt_config(void) { + struct sec_accel_config *op = &cpg->sa_sram.op; + /* update the enc_len field only */ - memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32), - 
&cpg->p.crypt_len, sizeof(u32)); + + op->enc_len = cpg->p.crypt_len; + + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32), + sizeof(u32), DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), + cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32)); + + mv_tdma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); /* GO */ mv_setup_timer(); + mv_tdma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } @@ -331,6 +424,13 @@ static void mv_crypto_algo_completion(void) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); + if (req->src == req->dst) { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL); + } else { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); + mv_dma_unmap_sg(req->dst, req->nbytes, DMA_FROM_DEVICE); + } + if (req_ctx->op != COP_AES_CBC) return ; @@ -390,11 +490,20 @@ static void mv_init_hash_config(struct ahash_request *req) writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); } - memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); + mv_tdma_separator(); + + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); + /* GO */ mv_setup_timer(); + mv_tdma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } @@ -424,13 +533,26 @@ static void mv_update_hash_config(void) && (req_ctx->count <= MAX_HW_HASH_SIZE); op->config |= is_last ? 
CFG_LAST_FRAG : CFG_MID_FRAG; - memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32)); + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(u32), DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, + cpg->sa_sram_dma, sizeof(u32)); op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32)); + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32), + sizeof(u32), DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), + cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32)); + + mv_tdma_separator(); + + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); /* GO */ mv_setup_timer(); + mv_tdma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } @@ -504,43 +626,18 @@ static void mv_hash_algo_completion(void) } else { mv_save_digest_state(ctx); } + + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); } static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; - void *buf; cpg->p.hw_processed_bytes += cpg->p.crypt_len; - if (cpg->p.copy_back) { - int need_copy_len = cpg->p.crypt_len; - int sram_offset = 0; - do { - int dst_copy; - - if (!cpg->p.sg_dst_left) { - /* next sg please */ - cpg->p.dst_sg = sg_next(cpg->p.dst_sg); - BUG_ON(!cpg->p.dst_sg); - cpg->p.sg_dst_left = cpg->p.dst_sg->length; - cpg->p.dst_start = 0; - } - - buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start; - - dst_copy = min(need_copy_len, cpg->p.sg_dst_left); - - memcpy(buf, - cpg->sram + SRAM_DATA_OUT_START + sram_offset, - dst_copy); - sram_offset += dst_copy; - cpg->p.sg_dst_left -= dst_copy; - need_copy_len -= dst_copy; - cpg->p.dst_start += dst_copy; - } while (need_copy_len > 0); - } - cpg->p.crypt_len = 0; + mv_tdma_clear(); + 
BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { /* process next scatter list entry */ @@ -582,15 +679,28 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; p->process = mv_update_crypt_config; - p->copy_back = 1; + + /* assume inplace request */ + if (req->src == req->dst) { + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL)) + return; + } else { + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) + return; + + if (!mv_dma_map_sg(req->dst, req->nbytes, DMA_FROM_DEVICE)) { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); + return; + } + } p->src_sg = req->src; p->dst_sg = req->dst; if (req->nbytes) { BUG_ON(!req->src); BUG_ON(!req->dst); - p->sg_src_left = req->src->length; - p->sg_dst_left = req->dst->length; + p->sg_src_left = sg_dma_len(req->src); + p->sg_dst_left = sg_dma_len(req->dst); } setup_data_in(); @@ -602,6 +712,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) struct req_progress *p = &cpg->p; struct mv_req_hash_ctx *ctx = ahash_request_ctx(req); int hw_bytes, old_extra_bytes, rc; + cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); hw_bytes = req->nbytes + ctx->extra_bytes; @@ -631,6 +742,11 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->crypt_len = old_extra_bytes; } + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { + printk(KERN_ERR "%s: out of memory\n", __func__); + return; + } + setup_data_in(); mv_init_hash_config(req); } else { @@ -966,14 +1082,14 @@ irqreturn_t crypto_int(int irq, void *priv) u32 val; val = readl(cpg->reg + SEC_ACCEL_INT_STATUS); - if (!(val & SEC_INT_ACCEL0_DONE)) + if (!(val & SEC_INT_ACC0_IDMA_DONE)) return IRQ_NONE; if (!del_timer(&cpg->completion_timer)) { printk(KERN_WARNING MV_CESA "got an interrupt but no pending timer?\n"); } - val &= ~SEC_INT_ACCEL0_DONE; + val &= ~SEC_INT_ACC0_IDMA_DONE; 
writel(val, cpg->reg + SEC_ACCEL_INT_STATUS); BUG_ON(cpg->eng_st != ENGINE_BUSY); cpg->eng_st = ENGINE_W_DEQUEUE; @@ -1112,6 +1228,7 @@ static int mv_probe(struct platform_device *pdev) } cp->sram_size = resource_size(res); cp->max_req_size = cp->sram_size - SRAM_CFG_SPACE; + cp->sram_phys = res->start; cp->sram = ioremap(res->start, cp->sram_size); if (!cp->sram) { ret = -ENOMEM; @@ -1127,6 +1244,7 @@ static int mv_probe(struct platform_device *pdev) platform_set_drvdata(pdev, cp); cpg = cp; + cpg->dev = &pdev->dev; cp->queue_th = kthread_run(queue_manag, cp, "mv_crypto"); if (IS_ERR(cp->queue_th)) { @@ -1140,10 +1258,14 @@ static int mv_probe(struct platform_device *pdev) goto err_thread; writel(0, cpg->reg + SEC_ACCEL_INT_STATUS); - writel(SEC_INT_ACCEL0_DONE, cpg->reg + SEC_ACCEL_INT_MASK); - writel(SEC_CFG_STOP_DIG_ERR, cpg->reg + SEC_ACCEL_CFG); + writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); + writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | + SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG); writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0); + cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + ret = crypto_register_alg(&mv_aes_alg_ecb); if (ret) { printk(KERN_WARNING MV_CESA @@ -1202,6 +1324,8 @@ static int mv_remove(struct platform_device *pdev) crypto_unregister_ahash(&mv_hmac_sha1_alg); kthread_stop(cp->queue_th); free_irq(cp->irq, cp); + dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); memset(cp->sram, 0, cp->sram_size); iounmap(cp->sram); iounmap(cp->reg); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
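The dma_copy_src_to_buf()/dma_copy_buf_to_dst() helpers added in this patch share one pattern: walk the DMA-mapped scatterlist and split a transfer into per-segment TDMA memcpy descriptors. Below is a minimal user-space sketch of that split; the segment layout and the chunk-recording arrays are invented for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a DMA-mapped scatterlist entry (invented layout). */
struct seg {
	uint32_t addr;	/* plays the role of sg_dma_address() */
	uint32_t len;	/* plays the role of sg_dma_len() */
};

/* Walk state, mirroring src_sg/sg_src_left/src_start in req_progress. */
struct sg_walk {
	const struct seg *sg;	/* current segment */
	uint32_t left;		/* bytes left in current segment */
	uint32_t start;		/* offset into current segment */
};

/* Split `len` bytes into per-segment chunks, the way the driver splits
 * a copy into mv_tdma_memcpy() calls.  Records each chunk's source
 * address and length; returns the number of chunks emitted. */
static int emit_chunks(struct sg_walk *w, uint32_t len,
		       uint32_t srcs[], uint32_t lens[])
{
	int n = 0;

	while (len) {
		uint32_t copy;

		if (!w->left) {		/* next sg please */
			w->sg++;
			w->left = w->sg->len;
			w->start = 0;
		}
		copy = w->left < len ? w->left : len;
		srcs[n] = w->sg->addr + w->start;
		lens[n] = copy;
		n++;

		w->start += copy;
		w->left -= copy;
		len -= copy;
	}
	return n;
}
```

A 24-byte transfer over two 16-byte segments therefore becomes two descriptors, which is exactly why the patch switches from scatterlist iterators to explicit sg_next() walks over dma-mapped lengths.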
* [PATCH 07/13] mv_cesa: have TDMA copy back the digest result 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (5 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 06/13] mv_cesa: use TDMA engine for data transfers Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 08/13] mv_cesa: fetch extra_bytes via TDMA engine, too Phil Sutter ` (7 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 40 +++++++++++++++++++++++++++++----------- 1 files changed, 29 insertions(+), 11 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index e10da2b..d099aa0 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -159,8 +159,10 @@ struct mv_req_hash_ctx { int first_hash; /* marks that we don't have previous state */ int last_chunk; /* marks that this is the 'final' request */ int extra_bytes; /* unprocessed bytes in buffer */ + int digestsize; /* size of the digest */ enum hash_op op; int count_add; + dma_addr_t result_dma; }; static void mv_completion_timer_callback(unsigned long unused) @@ -497,9 +499,17 @@ static void mv_init_hash_config(struct ahash_request *req) mv_tdma_separator(); - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); + if (req->result) { + req_ctx->result_dma = dma_map_single(cpg->dev, req->result, + req_ctx->digestsize, DMA_FROM_DEVICE); + mv_tdma_memcpy(req_ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_tdma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + } /* GO 
*/ mv_setup_timer(); @@ -546,9 +556,17 @@ static void mv_update_hash_config(void) mv_tdma_separator(); - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_tdma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); + if (req->result) { + req_ctx->result_dma = dma_map_single(cpg->dev, req->result, + req_ctx->digestsize, DMA_FROM_DEVICE); + mv_tdma_memcpy(req_ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_tdma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + } /* GO */ mv_setup_timer(); @@ -615,11 +633,10 @@ static void mv_hash_algo_completion(void) copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes); if (likely(ctx->last_chunk)) { - if (likely(ctx->count <= MAX_HW_HASH_SIZE)) { - memcpy(req->result, cpg->sram + SRAM_DIGEST_BUF, - crypto_ahash_digestsize(crypto_ahash_reqtfm - (req))); - } else { + dma_unmap_single(cpg->dev, ctx->result_dma, + ctx->digestsize, DMA_FROM_DEVICE); + + if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) { mv_save_digest_state(ctx); mv_hash_final_fallback(req); } @@ -717,6 +734,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) memset(p, 0, sizeof(struct req_progress)); hw_bytes = req->nbytes + ctx->extra_bytes; old_extra_bytes = ctx->extra_bytes; + ctx->digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req)); ctx->extra_bytes = hw_bytes % SHA1_BLOCK_SIZE; if (ctx->extra_bytes != 0 -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
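The change above boils down to one decision for the final descriptor: copy the digest out of SRAM when a result buffer is mapped, otherwise queue the 1-byte dummy copy that works around the engine quirk. A hedged user-space model of that branch follows; all addresses and the SRAM_DIGEST_BUF value are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Invented offset; stands in for the real SRAM_DIGEST_BUF. */
#define SRAM_DIGEST_BUF 0x5000u

/* One planned TDMA copy-back descriptor. */
struct copy_back {
	uint32_t dst, src, len;
};

/* Mirrors the if (req->result) branch added to mv_init_hash_config()
 * and mv_update_hash_config() in this patch. */
static struct copy_back plan_copy_back(uint32_t result_dma,
				       uint32_t sram_phys,
				       uint32_t dummy_dma,
				       uint32_t digestsize)
{
	struct copy_back cb;

	if (result_dma) {
		/* last descriptor copies the digest to the caller */
		cb.dst = result_dma;
		cb.src = sram_phys + SRAM_DIGEST_BUF;
		cb.len = digestsize;
	} else {
		/* dummy 1-byte copy; the data is ignored anyway */
		cb.dst = dummy_dma;
		cb.src = sram_phys;
		cb.len = 1;
	}
	return cb;
}
```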
* [PATCH 08/13] mv_cesa: fetch extra_bytes via TDMA engine, too 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (6 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 07/13] mv_cesa: have TDMA copy back the digest result Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now Phil Sutter ` (6 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 12 ++++++++++-- 1 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index d099aa0..bc2692e 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -156,6 +156,7 @@ struct mv_req_hash_ctx { u64 count; u32 state[SHA1_DIGEST_SIZE / 4]; u8 buffer[SHA1_BLOCK_SIZE]; + dma_addr_t buffer_dma; int first_hash; /* marks that we don't have previous state */ int last_chunk; /* marks that this is the 'final' request */ int extra_bytes; /* unprocessed bytes in buffer */ @@ -636,6 +637,9 @@ static void mv_hash_algo_completion(void) dma_unmap_single(cpg->dev, ctx->result_dma, ctx->digestsize, DMA_FROM_DEVICE); + dma_unmap_single(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) { mv_save_digest_state(ctx); mv_hash_final_fallback(req); @@ -755,8 +759,10 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->process = mv_update_hash_config; if (unlikely(old_extra_bytes)) { - memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer, - old_extra_bytes); + dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, + ctx->buffer_dma, old_extra_bytes); p->crypt_len = old_extra_bytes; } @@ -901,6 +907,8 @@ static void 
mv_init_hash_req_ctx(struct mv_req_hash_ctx *ctx, int op, ctx->first_hash = 1; ctx->last_chunk = is_last; ctx->count_add = count_add; + ctx->buffer_dma = dma_map_single(cpg->dev, ctx->buffer, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); } static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last, -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
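Patches 06 through 08 all lean on the same partial-block bookkeeping in mv_start_new_hash_req(): mid-stream, the engine only digests whole SHA1 blocks, so the remainder is carried in ctx->buffer (now DMA-mapped) until the next request. A simplified sketch of that accounting, under the stated assumption that a final chunk always flushes the remainder:

```c
#include <assert.h>

#define SHA1_BLOCK_SIZE 64

/* Carried-over bytes from the previous request (ctx->extra_bytes). */
struct hash_carry {
	unsigned int extra_bytes;
};

/* Returns how many bytes are handed to the hardware now and updates
 * the carry, mirroring the hw_bytes/extra_bytes logic in the driver
 * (simplified: a last chunk always flushes everything). */
static unsigned int plan_hash_request(struct hash_carry *c,
				      unsigned int nbytes, int last_chunk)
{
	unsigned int hw_bytes = nbytes + c->extra_bytes;

	if (last_chunk) {
		c->extra_bytes = 0;
		return hw_bytes;
	}
	c->extra_bytes = hw_bytes % SHA1_BLOCK_SIZE;
	return hw_bytes - c->extra_bytes;
}
```

The carried bytes are exactly what this patch now feeds to the engine through mv_tdma_memcpy() from ctx->buffer_dma instead of a CPU memcpy into SRAM.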
* [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (7 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 08/13] mv_cesa: fetch extra_bytes via TDMA engine, too Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit Phil Sutter ` (5 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This introduces a pool of four-byte DMA buffers for security accelerator config updates. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 134 ++++++++++++++++++++++++++++++++++++---------- drivers/crypto/mv_cesa.h | 1 + 2 files changed, 106 insertions(+), 29 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index bc2692e..8e66080 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -10,6 +10,7 @@ #include <crypto/algapi.h> #include <linux/crypto.h> #include <linux/dma-mapping.h> +#include <linux/dmapool.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/kthread.h> @@ -27,6 +28,9 @@ #define MAX_HW_HASH_SIZE 0xFFFF #define MV_CESA_EXPIRE 500 /* msec */ +#define MV_DMA_INIT_POOLSIZE 16 +#define MV_DMA_ALIGN 16 + static int count_sgs(struct scatterlist *, unsigned int); /* @@ -96,6 +100,11 @@ struct sec_accel_sram { #define sa_ivo type.hash.ivo } __attribute__((packed)); +struct u32_mempair { + u32 *vaddr; + dma_addr_t daddr; +}; + struct crypto_priv { struct device *dev; void __iomem *reg; @@ -118,6 +127,11 @@ struct crypto_priv { struct sec_accel_sram sa_sram; dma_addr_t sa_sram_dma; + + struct dma_pool *u32_pool; + struct u32_mempair *u32_list; + int u32_list_len; + int u32_usage; }; static struct crypto_priv *cpg; @@ -189,6 +203,54 @@ static void mv_setup_timer(void) jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); } +#define 
U32_ITEM(x) (cpg->u32_list[x].vaddr) +#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr) + +static inline int set_u32_poolsize(int nelem) +{ + /* need to increase size first if requested */ + if (nelem > cpg->u32_list_len) { + struct u32_mempair *newmem; + int newsize = nelem * sizeof(struct u32_mempair); + + newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + cpg->u32_list = newmem; + } + + /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */ + for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) { + U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool, + GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len)); + if (!U32_ITEM((cpg->u32_list_len))) + return -ENOMEM; + } + for (; cpg->u32_list_len > nelem; cpg->u32_list_len--) + dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1), + U32_ITEM_DMA(cpg->u32_list_len - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(cpg->u32_list); + cpg->u32_list = 0; + } + return 0; +} + +static inline void mv_tdma_u32_copy(dma_addr_t dst, u32 val) +{ + if (unlikely(cpg->u32_usage == cpg->u32_list_len) + && set_u32_poolsize(cpg->u32_list_len << 1)) { + printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n", + cpg->u32_list_len << 1); + return; + } + *(U32_ITEM(cpg->u32_usage)) = val; + mv_tdma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32)); + cpg->u32_usage++; +} + static inline bool mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) { @@ -390,36 +452,13 @@ static void mv_init_crypt_config(struct ablkcipher_request *req) sizeof(struct sec_accel_sram), DMA_TO_DEVICE); mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); - - mv_tdma_separator(); - dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); - - /* GO */ - mv_setup_timer(); - mv_tdma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void 
mv_update_crypt_config(void) { - struct sec_accel_config *op = &cpg->sa_sram.op; - /* update the enc_len field only */ - - op->enc_len = cpg->p.crypt_len; - - dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32), - sizeof(u32), DMA_TO_DEVICE); - mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), - cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32)); - - mv_tdma_separator(); - dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); - - /* GO */ - mv_setup_timer(); - mv_tdma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); + mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), + (u32)cpg->p.crypt_len); } static void mv_crypto_algo_completion(void) @@ -658,6 +697,7 @@ static void dequeue_complete_req(void) cpg->p.crypt_len = 0; mv_tdma_clear(); + cpg->u32_usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { @@ -699,7 +739,6 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) memset(p, 0, sizeof(struct req_progress)); p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; - p->process = mv_update_crypt_config; /* assume inplace request */ if (req->src == req->dst) { @@ -726,6 +765,24 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) setup_data_in(); mv_init_crypt_config(req); + mv_tdma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { + cpg->p.crypt_len = 0; + + setup_data_in(); + mv_update_crypt_config(); + mv_tdma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + } + + + /* GO */ + mv_setup_timer(); + mv_tdma_trigger(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void mv_start_new_hash_req(struct 
ahash_request *req) @@ -1285,18 +1342,29 @@ static int mv_probe(struct platform_device *pdev) writel(0, cpg->reg + SEC_ACCEL_INT_STATUS); writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); - writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | + writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_MP_CHAIN | SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG); writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0); cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + cpg->u32_pool = dma_pool_create("CESA U32 Item Pool", + &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0); + if (!cpg->u32_pool) { + ret = -ENOMEM; + goto err_mapping; + } + if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) { + printk(KERN_ERR MV_CESA "failed to initialise poolsize\n"); + goto err_pool; + } + ret = crypto_register_alg(&mv_aes_alg_ecb); if (ret) { printk(KERN_WARNING MV_CESA "Could not register aes-ecb driver\n"); - goto err_irq; + goto err_poolsize; } ret = crypto_register_alg(&mv_aes_alg_cbc); @@ -1323,7 +1391,13 @@ static int mv_probe(struct platform_device *pdev) return 0; err_unreg_ecb: crypto_unregister_alg(&mv_aes_alg_ecb); -err_irq: +err_poolsize: + set_u32_poolsize(0); +err_pool: + dma_pool_destroy(cpg->u32_pool); +err_mapping: + dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); free_irq(irq, cp); err_thread: kthread_stop(cp->queue_th); @@ -1352,6 +1426,8 @@ static int mv_remove(struct platform_device *pdev) free_irq(cp->irq, cp); dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + set_u32_poolsize(0); + dma_pool_destroy(cpg->u32_pool); memset(cp->sram, 0, cp->sram_size); iounmap(cp->sram); iounmap(cp->reg); diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h index 81ce109..83730ca 100644 --- a/drivers/crypto/mv_cesa.h +++ b/drivers/crypto/mv_cesa.h @@ -24,6 +24,7 @@ #define SEC_CFG_CH1_W_IDMA (1 << 8) #define 
SEC_CFG_ACT_CH0_IDMA (1 << 9) #define SEC_CFG_ACT_CH1_IDMA (1 << 10) +#define SEC_CFG_MP_CHAIN (1 << 11) #define SEC_ACCEL_STATUS 0xde0c #define SEC_ST_ACT_0 (1 << 0) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
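Patch 09's chain mode stages each per-fragment config word in its own DMA-able slot (a dma_pool of u32 items), doubling the pool when it runs full, and resets the usage counter per request. A userspace sketch of that staging scheme, with plain malloc standing in for dma_pool and hypothetical names (u32_pool_*, not the driver's API):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Growable pool of scratch u32 slots: each queued config word gets
 * its own stable slot so a DMA engine could copy it later. */
struct u32_pool {
	uint32_t **slots;
	size_t len;    /* allocated slots */
	size_t usage;  /* slots handed out since last reset */
};

static int u32_pool_resize(struct u32_pool *p, size_t nelem)
{
	/* need to increase the index array first if requested */
	if (nelem > p->len) {
		uint32_t **newmem = realloc(p->slots, nelem * sizeof(*p->slots));
		if (!newmem)
			return -1;
		p->slots = newmem;
	}
	/* allocate/free slots, adjusting p->len on the go */
	for (; p->len < nelem; p->len++) {
		p->slots[p->len] = malloc(sizeof(uint32_t));
		if (!p->slots[p->len])
			return -1;
	}
	for (; p->len > nelem; p->len--)
		free(p->slots[p->len - 1]);
	/* ignore size decreases but those to zero */
	if (!nelem) {
		free(p->slots);
		p->slots = NULL;
	}
	return 0;
}

/* Stage one value; double the pool when full, as mv_tdma_u32_copy()
 * does in the patch. Returns the staged slot, or NULL on failure. */
static uint32_t *u32_pool_stage(struct u32_pool *p, uint32_t val)
{
	if (p->usage == p->len &&
	    u32_pool_resize(p, p->len ? p->len << 1 : 4))
		return NULL;
	*p->slots[p->usage] = val;
	return p->slots[p->usage++];
}
```

The slots themselves are individually allocated, so pointers handed out earlier stay valid across a doubling of the index array; the same property lets the driver keep DMA addresses of already-queued words while the list grows.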
* [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (8 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too Phil Sutter ` (4 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Check and exit early for whether CESA can be used at all. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 61 +++++++++++++++++++++++++--------------------- 1 files changed, 33 insertions(+), 28 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 8e66080..5dba9df 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -804,35 +804,13 @@ static void mv_start_new_hash_req(struct ahash_request *req) else ctx->extra_bytes = 0; - p->src_sg = req->src; - if (req->nbytes) { - BUG_ON(!req->src); - p->sg_src_left = req->src->length; - } - - if (hw_bytes) { - p->hw_nbytes = hw_bytes; - p->complete = mv_hash_algo_completion; - p->process = mv_update_hash_config; - - if (unlikely(old_extra_bytes)) { - dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, - SHA1_BLOCK_SIZE, DMA_TO_DEVICE); - mv_tdma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, - ctx->buffer_dma, old_extra_bytes); - p->crypt_len = old_extra_bytes; + if (unlikely(!hw_bytes)) { /* too little data for CESA */ + if (req->nbytes) { + p->src_sg = req->src; + p->sg_src_left = req->src->length; + copy_src_to_buf(p, ctx->buffer + old_extra_bytes, + req->nbytes); } - - if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { - printk(KERN_ERR "%s: out of memory\n", __func__); - return; - } - - setup_data_in(); - mv_init_hash_config(req); - } else { - copy_src_to_buf(p, ctx->buffer + 
old_extra_bytes, - ctx->extra_bytes - old_extra_bytes); if (ctx->last_chunk) rc = mv_hash_final_fallback(req); else @@ -841,7 +819,34 @@ static void mv_start_new_hash_req(struct ahash_request *req) local_bh_disable(); req->base.complete(&req->base, rc); local_bh_enable(); + return; } + + if (likely(req->nbytes)) { + BUG_ON(!req->src); + + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { + printk(KERN_ERR "%s: out of memory\n", __func__); + return; + } + p->sg_src_left = sg_dma_len(req->src); + p->src_sg = req->src; + } + + p->hw_nbytes = hw_bytes; + p->complete = mv_hash_algo_completion; + p->process = mv_update_hash_config; + + if (unlikely(old_extra_bytes)) { + dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + mv_tdma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, + ctx->buffer_dma, old_extra_bytes); + p->crypt_len = old_extra_bytes; + } + + setup_data_in(); + mv_init_hash_config(req); } static int queue_manag(void *data) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
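The small-data fallback above relies on copy_src_to_buf(), which after patch 01 walks the scatterlist manually, tracking bytes left in the current entry and an offset, instead of using sg_miter. A userspace sketch of that walk, with a hypothetical linked-buffer struct standing in for struct scatterlist:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a scatterlist entry (illustrative, not the kernel type). */
struct sg_ent {
	const char *buf;
	size_t length;
	struct sg_ent *next;
};

/* Walk state, mirroring struct req_progress: current entry,
 * bytes left in it, and offset into it. */
struct walk {
	struct sg_ent *sg;
	size_t left;
	size_t start;
};

static void copy_from_sg(struct walk *w, char *dbuf, size_t len)
{
	while (len) {
		if (!w->left) {
			/* next sg please */
			w->sg = w->sg->next;
			w->left = w->sg->length;
			w->start = 0;
		}
		size_t copy = w->left < len ? w->left : len;
		memcpy(dbuf, w->sg->buf + w->start, copy);
		dbuf += copy;
		w->left -= copy;
		w->start += copy;
		len -= copy;
	}
}
```

The point of the manual bookkeeping is that the same (left, start) state works whether the addresses come from sg_virt() or, after mapping, from sg_dma_address()/sg_dma_len(), which sg_miter cannot provide.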
* [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (9 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 12/13] mv_cesa: drop the now unused process callback Phil Sutter ` (3 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 89 ++++++++++++++++++---------------------------- 1 files changed, 35 insertions(+), 54 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 5dba9df..9afed2d 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -536,34 +536,14 @@ static void mv_init_hash_config(struct ahash_request *req) sizeof(struct sec_accel_sram), DMA_TO_DEVICE); mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); - - mv_tdma_separator(); - - if (req->result) { - req_ctx->result_dma = dma_map_single(cpg->dev, req->result, - req_ctx->digestsize, DMA_FROM_DEVICE); - mv_tdma_memcpy(req_ctx->result_dma, - cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); - } else { - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_tdma_memcpy(cpg->sa_sram_dma, - cpg->sram_phys + SRAM_CONFIG, 1); - } - - /* GO */ - mv_setup_timer(); - mv_tdma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } -static void mv_update_hash_config(void) +static void mv_update_hash_config(struct ahash_request *req) { - struct ahash_request *req = ahash_request_cast(cpg->cur_req); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; - struct sec_accel_config *op = &cpg->sa_sram.op; 
int is_last; + u32 val; /* update only the config (for changed fragment state) and * mac_digest (for changed frag len) fields */ @@ -571,10 +551,10 @@ static void mv_update_hash_config(void) switch (req_ctx->op) { case COP_SHA1: default: - op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + val = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; break; case COP_HMAC_SHA1: - op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + val = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; break; } @@ -582,36 +562,11 @@ static void mv_update_hash_config(void) && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes) && (req_ctx->count <= MAX_HW_HASH_SIZE); - op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG; - dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, - sizeof(u32), DMA_TO_DEVICE); - mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG, - cpg->sa_sram_dma, sizeof(u32)); - - op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32), - sizeof(u32), DMA_TO_DEVICE); - mv_tdma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), - cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32)); - - mv_tdma_separator(); - - if (req->result) { - req_ctx->result_dma = dma_map_single(cpg->dev, req->result, - req_ctx->digestsize, DMA_FROM_DEVICE); - mv_tdma_memcpy(req_ctx->result_dma, - cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); - } else { - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_tdma_memcpy(cpg->sa_sram_dma, - cpg->sram_phys + SRAM_CONFIG, 1); - } + val |= is_last ? 
CFG_LAST_FRAG : CFG_MID_FRAG; + mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG, val); - /* GO */ - mv_setup_timer(); - mv_tdma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); + val = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); + mv_tdma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), val); } static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx, @@ -835,7 +790,6 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->hw_nbytes = hw_bytes; p->complete = mv_hash_algo_completion; - p->process = mv_update_hash_config; if (unlikely(old_extra_bytes)) { dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, @@ -847,6 +801,33 @@ static void mv_start_new_hash_req(struct ahash_request *req) setup_data_in(); mv_init_hash_config(req); + mv_tdma_separator(); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { + cpg->p.crypt_len = 0; + + setup_data_in(); + mv_update_hash_config(req); + mv_tdma_separator(); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + } + if (req->result) { + ctx->result_dma = dma_map_single(cpg->dev, req->result, + ctx->digestsize, DMA_FROM_DEVICE); + mv_tdma_memcpy(ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, + ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_tdma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + } + + /* GO */ + mv_setup_timer(); + mv_tdma_trigger(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static int queue_manag(void *data) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
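Patch 11 completes the move to building the whole descriptor chain up front: every fragment's config update and data copy is queued, separated by mv_tdma_separator(), and the engine is triggered once at the end. A minimal userspace sketch of the chaining itself, with plain structs and a fixed-size ring standing in for the TDMA descriptor pool (field names loosely follow mv_tdma.c; this is not the driver's actual allocation strategy):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define NDESC 8

/* In hardware, `next` holds the DMA address of the next descriptor;
 * a plain pointer stands in for it here. */
struct tdma_desc {
	uint32_t src, dst, count;
	struct tdma_desc *next;
};

static struct tdma_desc ring[NDESC];
static size_t usage;

static struct tdma_desc *get_new_last_desc(void)
{
	if (usage == NDESC)
		return NULL;
	if (usage)	/* link the previous descriptor to this one */
		ring[usage - 1].next = &ring[usage];
	return &ring[usage++];
}

static int queue_copy(uint32_t dst, uint32_t src, uint32_t count)
{
	struct tdma_desc *d = get_new_last_desc();

	if (!d)
		return -1;
	d->src = src;
	d->dst = dst;
	d->count = count;
	d->next = NULL;	/* terminates the chain until the next queue */
	return 0;
}
```

Because each newly queued descriptor re-terminates the chain and retroactively links its predecessor, the list is always in a triggerable state; a single write of the first descriptor's address then lets the engine run the whole request unattended.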
* [PATCH 12/13] mv_cesa: drop the now unused process callback 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (10 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-25 16:08 ` [PATCH 13/13] mv_cesa, mv_tdma: outsource common dma-pool handling code Phil Sutter ` (2 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu And while here, simplify dequeue_complete_req() a bit. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 21 ++++++--------------- 1 files changed, 6 insertions(+), 15 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 9afed2d..9a2f413 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -69,7 +69,6 @@ struct req_progress { struct scatterlist *src_sg; struct scatterlist *dst_sg; void (*complete) (void); - void (*process) (void); /* src mostly */ int sg_src_left; @@ -648,25 +647,17 @@ static void mv_hash_algo_completion(void) static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; - cpg->p.hw_processed_bytes += cpg->p.crypt_len; - cpg->p.crypt_len = 0; mv_tdma_clear(); cpg->u32_usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); - if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { - /* process next scatter list entry */ - cpg->eng_st = ENGINE_BUSY; - setup_data_in(); - cpg->p.process(); - } else { - cpg->p.complete(); - cpg->eng_st = ENGINE_IDLE; - local_bh_disable(); - req->complete(req, 0); - local_bh_enable(); - } + + cpg->p.complete(); + cpg->eng_st = ENGINE_IDLE; + local_bh_disable(); + req->complete(req, 0); + local_bh_enable(); } static int count_sgs(struct scatterlist *sl, unsigned int total_bytes) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 13/13] mv_cesa, mv_tdma: outsource common dma-pool handling code 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (11 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 12/13] mv_cesa: drop the now unused process callback Phil Sutter @ 2012-05-25 16:08 ` Phil Sutter 2012-05-27 14:03 ` RFC: support for MV_CESA with TDMA cloudy.linux 2012-06-12 10:04 ` Herbert Xu 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-25 16:08 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/dma_desclist.h | 79 +++++++++++++++++++++++++++++++++++ drivers/crypto/mv_cesa.c | 81 +++++++++---------------------------- drivers/crypto/mv_tdma.c | 91 ++++++++++++----------------------------- 3 files changed, 125 insertions(+), 126 deletions(-) create mode 100644 drivers/crypto/dma_desclist.h diff --git a/drivers/crypto/dma_desclist.h b/drivers/crypto/dma_desclist.h new file mode 100644 index 0000000..c471ad6 --- /dev/null +++ b/drivers/crypto/dma_desclist.h @@ -0,0 +1,79 @@ +#ifndef __DMA_DESCLIST__ +#define __DMA_DESCLIST__ + +struct dma_desc { + void *virt; + dma_addr_t phys; +}; + +struct dma_desclist { + struct dma_pool *itempool; + struct dma_desc *desclist; + unsigned long length; + unsigned long usage; +}; + +#define DESCLIST_ITEM(dl, x) ((dl).desclist[(x)].virt) +#define DESCLIST_ITEM_DMA(dl, x) ((dl).desclist[(x)].phys) +#define DESCLIST_FULL(dl) ((dl).length == (dl).usage) + +static inline int +init_dma_desclist(struct dma_desclist *dl, struct device *dev, + size_t size, size_t align, size_t boundary) +{ +#define STRX(x) #x +#define STR(x) STRX(x) + dl->itempool = dma_pool_create( + "DMA Desclist Pool at "__FILE__"("STR(__LINE__)")", + dev, size, align, boundary); +#undef STR +#undef STRX + if (!dl->itempool) + return 1; + dl->desclist = NULL; + dl->length = dl->usage = 0; + return 0; +} + +static inline int +set_dma_desclist_size(struct dma_desclist *dl, 
unsigned long nelem) +{ + /* need to increase size first if requested */ + if (nelem > dl->length) { + struct dma_desc *newmem; + int newsize = nelem * sizeof(struct dma_desc); + + newmem = krealloc(dl->desclist, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + dl->desclist = newmem; + } + + /* allocate/free dma descriptors, adjusting dl->length on the go */ + for (; dl->length < nelem; dl->length++) { + DESCLIST_ITEM(*dl, dl->length) = dma_pool_alloc(dl->itempool, + GFP_KERNEL, &DESCLIST_ITEM_DMA(*dl, dl->length)); + if (!DESCLIST_ITEM(*dl, dl->length)) + return -ENOMEM; + } + for (; dl->length > nelem; dl->length--) + dma_pool_free(dl->itempool, DESCLIST_ITEM(*dl, dl->length - 1), + DESCLIST_ITEM_DMA(*dl, dl->length - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(dl->desclist); + dl->desclist = 0; + } + return 0; +} + +static inline void +fini_dma_desclist(struct dma_desclist *dl) +{ + set_dma_desclist_size(dl, 0); + dma_pool_destroy(dl->itempool); + dl->length = dl->usage = 0; +} + +#endif /* __DMA_DESCLIST__ */ diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 9a2f413..367aa18 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -23,6 +23,7 @@ #include "mv_cesa.h" #include "mv_tdma.h" +#include "dma_desclist.h" #define MV_CESA "MV-CESA:" #define MAX_HW_HASH_SIZE 0xFFFF @@ -99,11 +100,6 @@ struct sec_accel_sram { #define sa_ivo type.hash.ivo } __attribute__((packed)); -struct u32_mempair { - u32 *vaddr; - dma_addr_t daddr; -}; - struct crypto_priv { struct device *dev; void __iomem *reg; @@ -127,14 +123,14 @@ struct crypto_priv { struct sec_accel_sram sa_sram; dma_addr_t sa_sram_dma; - struct dma_pool *u32_pool; - struct u32_mempair *u32_list; - int u32_list_len; - int u32_usage; + struct dma_desclist desclist; }; static struct crypto_priv *cpg; +#define ITEM(x) ((u32 *)DESCLIST_ITEM(cpg->desclist, x)) +#define ITEM_DMA(x) DESCLIST_ITEM_DMA(cpg->desclist, x) + struct mv_ctx { u8 
aes_enc_key[AES_KEY_LEN]; u32 aes_dec_key[8]; @@ -202,52 +198,17 @@ static void mv_setup_timer(void) jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); } -#define U32_ITEM(x) (cpg->u32_list[x].vaddr) -#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr) - -static inline int set_u32_poolsize(int nelem) -{ - /* need to increase size first if requested */ - if (nelem > cpg->u32_list_len) { - struct u32_mempair *newmem; - int newsize = nelem * sizeof(struct u32_mempair); - - newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL); - if (!newmem) - return -ENOMEM; - cpg->u32_list = newmem; - } - - /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */ - for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) { - U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool, - GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len)); - if (!U32_ITEM((cpg->u32_list_len))) - return -ENOMEM; - } - for (; cpg->u32_list_len > nelem; cpg->u32_list_len--) - dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1), - U32_ITEM_DMA(cpg->u32_list_len - 1)); - - /* ignore size decreases but those to zero */ - if (!nelem) { - kfree(cpg->u32_list); - cpg->u32_list = 0; - } - return 0; -} - static inline void mv_tdma_u32_copy(dma_addr_t dst, u32 val) { - if (unlikely(cpg->u32_usage == cpg->u32_list_len) - && set_u32_poolsize(cpg->u32_list_len << 1)) { - printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n", - cpg->u32_list_len << 1); + if (unlikely(DESCLIST_FULL(cpg->desclist)) && + set_dma_desclist_size(&cpg->desclist, cpg->desclist.length << 1)) { + printk(KERN_ERR MV_CESA "resizing poolsize to %lu failed\n", + cpg->desclist.length << 1); return; } - *(U32_ITEM(cpg->u32_usage)) = val; - mv_tdma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32)); - cpg->u32_usage++; + *ITEM(cpg->desclist.usage) = val; + mv_tdma_memcpy(dst, ITEM_DMA(cpg->desclist.usage), sizeof(u32)); + cpg->desclist.usage++; } static inline bool @@ -649,7 +610,7 @@ static void dequeue_complete_req(void) struct 
crypto_async_request *req = cpg->cur_req; mv_tdma_clear(); - cpg->u32_usage = 0; + cpg->desclist.usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); @@ -1326,13 +1287,12 @@ static int mv_probe(struct platform_device *pdev) cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); - cpg->u32_pool = dma_pool_create("CESA U32 Item Pool", - &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0); - if (!cpg->u32_pool) { + if (init_dma_desclist(&cpg->desclist, &pdev->dev, + sizeof(u32), MV_DMA_ALIGN, 0)) { ret = -ENOMEM; goto err_mapping; } - if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) { + if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) { printk(KERN_ERR MV_CESA "failed to initialise poolsize\n"); goto err_pool; } @@ -1341,7 +1301,7 @@ static int mv_probe(struct platform_device *pdev) if (ret) { printk(KERN_WARNING MV_CESA "Could not register aes-ecb driver\n"); - goto err_poolsize; + goto err_pool; } ret = crypto_register_alg(&mv_aes_alg_cbc); @@ -1368,10 +1328,8 @@ static int mv_probe(struct platform_device *pdev) return 0; err_unreg_ecb: crypto_unregister_alg(&mv_aes_alg_ecb); -err_poolsize: - set_u32_poolsize(0); err_pool: - dma_pool_destroy(cpg->u32_pool); + fini_dma_desclist(&cpg->desclist); err_mapping: dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); @@ -1403,8 +1361,7 @@ static int mv_remove(struct platform_device *pdev) free_irq(cp->irq, cp); dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); - set_u32_poolsize(0); - dma_pool_destroy(cpg->u32_pool); + fini_dma_desclist(&cpg->desclist); memset(cp->sram, 0, cp->sram_size); iounmap(cp->sram); iounmap(cp->reg); diff --git a/drivers/crypto/mv_tdma.c b/drivers/crypto/mv_tdma.c index aa5316a..d8e8c3f 100644 --- a/drivers/crypto/mv_tdma.c +++ b/drivers/crypto/mv_tdma.c @@ -17,6 +17,7 @@ #include <linux/platform_device.h> #include "mv_tdma.h" +#include "dma_desclist.h" #define 
MV_TDMA "MV-TDMA: " @@ -30,57 +31,17 @@ struct tdma_desc { u32 next; } __attribute__((packed)); -struct desc_mempair { - struct tdma_desc *vaddr; - dma_addr_t daddr; -}; - struct tdma_priv { struct device *dev; void __iomem *reg; int irq; /* protecting the dma descriptors and stuff */ spinlock_t lock; - struct dma_pool *descpool; - struct desc_mempair *desclist; - int desclist_len; - int desc_usage; + struct dma_desclist desclist; } tpg; -#define DESC(x) (tpg.desclist[x].vaddr) -#define DESC_DMA(x) (tpg.desclist[x].daddr) - -static inline int set_poolsize(int nelem) -{ - /* need to increase size first if requested */ - if (nelem > tpg.desclist_len) { - struct desc_mempair *newmem; - int newsize = nelem * sizeof(struct desc_mempair); - - newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL); - if (!newmem) - return -ENOMEM; - tpg.desclist = newmem; - } - - /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */ - for (; tpg.desclist_len < nelem; tpg.desclist_len++) { - DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool, - GFP_KERNEL, &DESC_DMA(tpg.desclist_len)); - if (!DESC((tpg.desclist_len))) - return -ENOMEM; - } - for (; tpg.desclist_len > nelem; tpg.desclist_len--) - dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1), - DESC_DMA(tpg.desclist_len - 1)); - - /* ignore size decreases but those to zero */ - if (!nelem) { - kfree(tpg.desclist); - tpg.desclist = 0; - } - return 0; -} +#define ITEM(x) ((struct tdma_desc *)DESCLIST_ITEM(tpg.desclist, x)) +#define ITEM_DMA(x) DESCLIST_ITEM_DMA(tpg.desclist, x) static inline void wait_for_tdma_idle(void) { @@ -100,17 +61,18 @@ static inline void switch_tdma_engine(bool state) static struct tdma_desc *get_new_last_desc(void) { - if (unlikely(tpg.desc_usage == tpg.desclist_len) && - set_poolsize(tpg.desclist_len << 1)) { - printk(KERN_ERR MV_TDMA "failed to increase DMA pool to %d\n", - tpg.desclist_len << 1); + if (unlikely(DESCLIST_FULL(tpg.desclist)) && + set_dma_desclist_size(&tpg.desclist, 
tpg.desclist.length << 1)) { + printk(KERN_ERR MV_TDMA "failed to increase DMA pool to %lu\n", + tpg.desclist.length << 1); return NULL; } - if (likely(tpg.desc_usage)) - DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage); + if (likely(tpg.desclist.usage)) + ITEM(tpg.desclist.usage - 1)->next = + ITEM_DMA(tpg.desclist.usage); - return DESC(tpg.desc_usage++); + return ITEM(tpg.desclist.usage++); } static inline void mv_tdma_desc_dump(void) @@ -118,17 +80,17 @@ static inline void mv_tdma_desc_dump(void) struct tdma_desc *tmp; int i; - if (!tpg.desc_usage) { + if (!tpg.desclist.usage) { printk(KERN_WARNING MV_TDMA "DMA descriptor list is empty\n"); return; } printk(KERN_WARNING MV_TDMA "DMA descriptor list:\n"); - for (i = 0; i < tpg.desc_usage; i++) { - tmp = DESC(i); + for (i = 0; i < tpg.desclist.usage; i++) { + tmp = ITEM(i); printk(KERN_WARNING MV_TDMA "entry %d at 0x%x: dma addr 0x%x, " "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i, - (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst, + (u32)tmp, ITEM_DMA(i) , tmp->src, tmp->dst, tmp->count & ~TDMA_OWN_BIT, !!(tmp->count & TDMA_OWN_BIT), tmp->next); } @@ -167,7 +129,7 @@ void mv_tdma_clear(void) writel(0, tpg.reg + TDMA_CURR_DESC); writel(0, tpg.reg + TDMA_NEXT_DESC); - tpg.desc_usage = 0; + tpg.desclist.usage = 0; switch_tdma_engine(1); @@ -183,7 +145,7 @@ void mv_tdma_trigger(void) spin_lock(&tpg.lock); - writel(DESC_DMA(0), tpg.reg + TDMA_NEXT_DESC); + writel(ITEM_DMA(0), tpg.reg + TDMA_NEXT_DESC); spin_unlock(&tpg.lock); } @@ -287,13 +249,15 @@ static int mv_probe(struct platform_device *pdev) goto out_unmap_reg; } - tpg.descpool = dma_pool_create("TDMA Descriptor Pool", tpg.dev, - sizeof(struct tdma_desc), MV_DMA_ALIGN, 0); - if (!tpg.descpool) { + if (init_dma_desclist(&tpg.desclist, tpg.dev, + sizeof(struct tdma_desc), MV_DMA_ALIGN, 0)) { rc = -ENOMEM; goto out_free_irq; } - set_poolsize(MV_DMA_INIT_POOLSIZE); + if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) { + rc = -ENOMEM; + 
goto out_free_desclist; + } platform_set_drvdata(pdev, &tpg); @@ -327,8 +291,8 @@ static int mv_probe(struct platform_device *pdev) out_free_all: switch_tdma_engine(0); platform_set_drvdata(pdev, NULL); - set_poolsize(0); - dma_pool_destroy(tpg.descpool); +out_free_desclist: + fini_dma_desclist(&tpg.desclist); out_free_irq: free_irq(tpg.irq, &tpg); out_unmap_reg: @@ -341,8 +305,7 @@ static int mv_remove(struct platform_device *pdev) { switch_tdma_engine(0); platform_set_drvdata(pdev, NULL); - set_poolsize(0); - dma_pool_destroy(tpg.descpool); + fini_dma_desclist(&tpg.desclist); free_irq(tpg.irq, &tpg); iounmap(tpg.reg); tpg.dev = NULL; -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
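One small detail in dma_desclist.h above deserves a note: building the pool name from __FILE__ and __LINE__ requires the two-level STRX/STR macro pair, because the # operator stringizes its argument before expansion; a single-level macro would yield the literal token "__LINE__" rather than the line number. A compilable sketch of the same trick:

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: STR() expands its argument first,
 * then STRX() stringizes the expansion. */
#define STRX(x) #x
#define STR(x) STRX(x)

static const char *pool_name(void)
{
	/* Adjacent string literals concatenate at compile time. */
	return "DMA Desclist Pool at " __FILE__ "(" STR(__LINE__) ")";
}
```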
* Re: RFC: support for MV_CESA with TDMA 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (12 preceding siblings ...) 2012-05-25 16:08 ` [PATCH 13/13] mv_cesa, mv_tdma: outsource common dma-pool handling code Phil Sutter @ 2012-05-27 14:03 ` cloudy.linux 2012-05-29 11:34 ` Phil Sutter 2012-06-12 10:04 ` Herbert Xu 14 siblings, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-05-27 14:03 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, Herbert Xu On 2012-5-26 0:08, Phil Sutter wrote: > Hi, > > The following patch series adds support for the TDMA engine built into > Marvell's Kirkwood-based SoCs, and enhances mv_cesa.c in order to use it > for speeding up crypto operations. Kirkwood hardware contains a security > accelerator, which can control DMA as well as crypto engines. It allows > for operation with minimal software intervenience, which the following > patches implement: using a chain of DMA descriptors, data input, > configuration, engine startup and data output repeat fully automatically > until the whole input data has been handled. > > The point for this being RFC is backwards-compatibility: earlier > hardware (Orion) ships a (slightly) different DMA engine (IDMA) along > with the same crypto engine, so in fact mv_cesa.c is in use on these > platforms, too. But since I don't possess hardware of this kind, I am > not able to make this code IDMA-compatible. Also, due to the quite > massive reorganisation of code flow, I don't really see how to make TDMA > support optional in mv_cesa.c. > > Greetings, Phil > -- > To unsubscribe from this list: send the line "unsubscribe linux-crypto" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Could the source code from the manufacturers of hardwares using kirkwood be helpful? I saw the source code of ls-wvl from buffalo contains driver for CESA. And it deals with both IDMA and TDMA. 
If you need it, I can send you the download link. I also have to point out that the CESA in some Orion revisions has hardware flaws that need to be addressed, which the current driver does not do. Information about those flaws can be found in 88F5182_Functional_Errata.pdf, which is available on the net. ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: RFC: support for MV_CESA with TDMA 2012-05-27 14:03 ` RFC: support for MV_CESA with TDMA cloudy.linux @ 2012-05-29 11:34 ` Phil Sutter 0 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-05-29 11:34 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, Herbert Xu Hi, On Sun, May 27, 2012 at 10:03:07PM +0800, cloudy.linux wrote: > Could the source code from the manufacturers of hardwares using kirkwood > be helpful? > I saw the source code of ls-wvl from buffalo contains driver for CESA. > And it deals with both IDMA and TDMA. If you need, I can send you the > download link. Actually, I do have the sources. Just had doubts about how useful it would be to write code for something I couldn't test at all. OTOH, that's probably a better start than nothing. > I also have to point out that CESA of some orion revisions has hardware > flaws that needs to be addressed which currently doesn't. Information > about those flaws can be found in 88F5182_Functional_Errata.pdf which is > available on the net. OK, thanks for the pointer! Looks like implementing combined (crypto/digest) operation for Orion will be no fun at least. Greetings, Phil Phil Sutter Software Engineer -- Viprinet GmbH Mainzer Str. 43 55411 Bingen am Rhein Germany Phone/Zentrale: +49-6721-49030-0 Direct line/Durchwahl: +49-6721-49030-134 Fax: +49-6721-49030-209 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein Commercial register/Handelsregister: Amtsgericht Mainz HRB40380 CEO/Geschäftsführer: Simon Kissel ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: RFC: support for MV_CESA with TDMA 2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter ` (13 preceding siblings ...) 2012-05-27 14:03 ` RFC: support for MV_CESA with TDMA cloudy.linux @ 2012-06-12 10:04 ` Herbert Xu 2012-06-12 10:24 ` Phil Sutter 14 siblings, 1 reply; 67+ messages in thread From: Herbert Xu @ 2012-06-12 10:04 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto On Fri, May 25, 2012 at 06:08:26PM +0200, Phil Sutter wrote: > > The point for this being RFC is backwards-compatibility: earlier > hardware (Orion) ships a (slightly) different DMA engine (IDMA) along > with the same crypto engine, so in fact mv_cesa.c is in use on these > platforms, too. But since I don't possess hardware of this kind, I am > not able to make this code IDMA-compatible. Also, due to the quite > massive reorganisation of code flow, I don't really see how to make TDMA > support optional in mv_cesa.c. So does this break existing functionality or not? Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: RFC: support for MV_CESA with TDMA 2012-06-12 10:04 ` Herbert Xu @ 2012-06-12 10:24 ` Phil Sutter 2012-06-12 11:39 ` Herbert Xu 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-06-12 10:24 UTC (permalink / raw) To: Herbert Xu; +Cc: linux-crypto On Tue, Jun 12, 2012 at 06:04:37PM +0800, Herbert Xu wrote: > On Fri, May 25, 2012 at 06:08:26PM +0200, Phil Sutter wrote: > > > > The point for this being RFC is backwards-compatibility: earlier > > hardware (Orion) ships a (slightly) different DMA engine (IDMA) along > > with the same crypto engine, so in fact mv_cesa.c is in use on these > > platforms, too. But since I don't possess hardware of this kind, I am > > not able to make this code IDMA-compatible. Also, due to the quite > > massive reorganisation of code flow, I don't really see how to make TDMA > > support optional in mv_cesa.c. > > So does this break existing functionality or not? It does break mv_cesa on Orion-based devices (precisely those with IDMA instead of TDMA). I am currently working on a version which supports IDMA, too. Since all CESA-equipped hardware comes with either TDMA or IDMA, that version then should improve all platforms without breaking any. Greetings, Phil Phil Sutter Software Engineer -- Viprinet GmbH Mainzer Str. 43 55411 Bingen am Rhein Germany Phone/Zentrale: +49-6721-49030-0 Direct line/Durchwahl: +49-6721-49030-134 Fax: +49-6721-49030-209 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein Commercial register/Handelsregister: Amtsgericht Mainz HRB40380 CEO/Geschäftsführer: Simon Kissel ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: RFC: support for MV_CESA with TDMA 2012-06-12 10:24 ` Phil Sutter @ 2012-06-12 11:39 ` Herbert Xu 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: Herbert Xu @ 2012-06-12 11:39 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto On Tue, Jun 12, 2012 at 12:24:52PM +0200, Phil Sutter wrote: > > It does break mv_cesa on Orion-based devices (precisely those with IDMA > instead of TDMA). I am currently working on a version which supports > IDMA, too. Since all CESA-equipped hardware comes with either TDMA or > IDMA, that version then should improve all platforms without breaking > any. Thanks for the explanation. I'll wait for your new patches :) -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 67+ messages in thread
* RFC: support for MV_CESA with IDMA or TDMA 2012-06-12 11:39 ` Herbert Xu @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 01/13] mv_cesa: do not use scatterlist iterators Phil Sutter ` (14 more replies) 0 siblings, 15 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Hi, The following patch series adds support for the TDMA engine built into Marvell's Kirkwood-based SoCs as well as the IDMA engine built into Marvell's Orion-based SoCs and enhances mv_cesa.c in order to use it for speeding up crypto operations. The hardware contains a security accelerator, which can control DMA as well as crypto engines. It allows for operation with minimal software intervention, which the following patches implement: using a chain of DMA descriptors, data input, configuration, engine startup and data output repeat fully automatically until the whole input data has been handled. The point for this being RFC is lack of hardware on my side for testing the IDMA support. I'd highly appreciate if someone with Orion hardware could test this, preferably using the hmac_comp tool shipped with cryptodev-linux as it does a more extensive testing (with bigger buffer sizes at least) than tcrypt or the standard kernel-internal use cases. Greetings, Phil ^ permalink raw reply [flat|nested] 67+ messages in thread
* [PATCH 01/13] mv_cesa: do not use scatterlist iterators 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon Phil Sutter ` (13 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu The big problem is they cannot be used to iterate over DMA mapped scatterlists, so get rid of them in order to add DMA functionality to mv_cesa. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 57 ++++++++++++++++++++++----------------------- 1 files changed, 28 insertions(+), 29 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 0d40717..818a5c7 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -44,8 +44,8 @@ enum engine_status { /** * struct req_progress - used for every crypt request - * @src_sg_it: sg iterator for src - * @dst_sg_it: sg iterator for dst + * @src_sg: sg list for src + * @dst_sg: sg list for dst * @sg_src_left: bytes left in src to process (scatter list) * @src_start: offset to add to src start position (scatter list) * @crypt_len: length of current hw crypt/hash process @@ -60,8 +60,8 @@ enum engine_status { * track of progress within current scatterlist. 
*/ struct req_progress { - struct sg_mapping_iter src_sg_it; - struct sg_mapping_iter dst_sg_it; + struct scatterlist *src_sg; + struct scatterlist *dst_sg; void (*complete) (void); void (*process) (int is_first); @@ -212,19 +212,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher *cipher, const u8 *key, static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len) { - int ret; void *sbuf; int copy_len; while (len) { if (!p->sg_src_left) { - ret = sg_miter_next(&p->src_sg_it); - BUG_ON(!ret); - p->sg_src_left = p->src_sg_it.length; + /* next sg please */ + p->src_sg = sg_next(p->src_sg); + BUG_ON(!p->src_sg); + p->sg_src_left = p->src_sg->length; p->src_start = 0; } - sbuf = p->src_sg_it.addr + p->src_start; + sbuf = sg_virt(p->src_sg) + p->src_start; copy_len = min(p->sg_src_left, len); memcpy(dbuf, sbuf, copy_len); @@ -307,9 +307,6 @@ static void mv_crypto_algo_completion(void) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); - sg_miter_stop(&cpg->p.src_sg_it); - sg_miter_stop(&cpg->p.dst_sg_it); - if (req_ctx->op != COP_AES_CBC) return ; @@ -439,7 +436,6 @@ static void mv_hash_algo_completion(void) if (ctx->extra_bytes) copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes); - sg_miter_stop(&cpg->p.src_sg_it); if (likely(ctx->last_chunk)) { if (likely(ctx->count <= MAX_HW_HASH_SIZE)) { @@ -459,7 +455,6 @@ static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; void *buf; - int ret; cpg->p.hw_processed_bytes += cpg->p.crypt_len; if (cpg->p.copy_back) { int need_copy_len = cpg->p.crypt_len; @@ -468,14 +463,14 @@ static void dequeue_complete_req(void) int dst_copy; if (!cpg->p.sg_dst_left) { - ret = sg_miter_next(&cpg->p.dst_sg_it); - BUG_ON(!ret); - cpg->p.sg_dst_left = cpg->p.dst_sg_it.length; + /* next sg please */ + cpg->p.dst_sg = sg_next(cpg->p.dst_sg); + BUG_ON(!cpg->p.dst_sg); + cpg->p.sg_dst_left = cpg->p.dst_sg->length; 
cpg->p.dst_start = 0; } - buf = cpg->p.dst_sg_it.addr; - buf += cpg->p.dst_start; + buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start; dst_copy = min(need_copy_len, cpg->p.sg_dst_left); @@ -525,7 +520,6 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes) static void mv_start_new_crypt_req(struct ablkcipher_request *req) { struct req_progress *p = &cpg->p; - int num_sgs; cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); @@ -534,11 +528,14 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) p->process = mv_process_current_q; p->copy_back = 1; - num_sgs = count_sgs(req->src, req->nbytes); - sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG); - - num_sgs = count_sgs(req->dst, req->nbytes); - sg_miter_start(&p->dst_sg_it, req->dst, num_sgs, SG_MITER_TO_SG); + p->src_sg = req->src; + p->dst_sg = req->dst; + if (req->nbytes) { + BUG_ON(!req->src); + BUG_ON(!req->dst); + p->sg_src_left = req->src->length; + p->sg_dst_left = req->dst->length; + } mv_process_current_q(1); } @@ -547,7 +544,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) { struct req_progress *p = &cpg->p; struct mv_req_hash_ctx *ctx = ahash_request_ctx(req); - int num_sgs, hw_bytes, old_extra_bytes, rc; + int hw_bytes, old_extra_bytes, rc; cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); hw_bytes = req->nbytes + ctx->extra_bytes; @@ -560,8 +557,11 @@ static void mv_start_new_hash_req(struct ahash_request *req) else ctx->extra_bytes = 0; - num_sgs = count_sgs(req->src, req->nbytes); - sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG); + p->src_sg = req->src; + if (req->nbytes) { + BUG_ON(!req->src); + p->sg_src_left = req->src->length; + } if (hw_bytes) { p->hw_nbytes = hw_bytes; @@ -578,7 +578,6 @@ static void mv_start_new_hash_req(struct ahash_request *req) } else { copy_src_to_buf(p, ctx->buffer + old_extra_bytes, ctx->extra_bytes - old_extra_bytes); - 
sg_miter_stop(&p->src_sg_it); if (ctx->last_chunk) rc = mv_hash_final_fallback(req); else -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter 2012-06-12 17:17 ` [PATCH 01/13] mv_cesa: do not use scatterlist iterators Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 03/13] mv_cesa: prepare the full sram config in dram Phil Sutter ` (12 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This is just to keep formatting changes out of the following commit, hopefully simplifying it a bit. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 14 ++++++-------- 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 818a5c7..59c2ed2 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -269,12 +269,10 @@ static void mv_process_current_q(int first_block) } if (req_ctx->decrypt) { op.config |= CFG_DIR_DEC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, - AES_KEY_LEN); + memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN); } else { op.config |= CFG_DIR_ENC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, - AES_KEY_LEN); + memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN); } switch (ctx->key_len) { @@ -335,9 +333,8 @@ static void mv_process_hash_current(int first_block) } op.mac_src_p = - MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32) - req_ctx-> - count); + MAC_SRC_DATA_P(SRAM_DATA_IN_START) | + MAC_SRC_TOTAL_LEN((u32)req_ctx->count); setup_data_in(); @@ -372,7 +369,8 @@ static void mv_process_hash_current(int first_block) } } - memcpy(cpg->sram + SRAM_CONFIG, &op, sizeof(struct sec_accel_config)); + memcpy(cpg->sram + SRAM_CONFIG, &op, + sizeof(struct sec_accel_config)); /* GO */ mv_setup_timer(); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages 
in thread
* [PATCH 03/13] mv_cesa: prepare the full sram config in dram 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter 2012-06-12 17:17 ` [PATCH 01/13] mv_cesa: do not use scatterlist iterators Phil Sutter 2012-06-12 17:17 ` [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 04/13] mv_cesa: split up processing callbacks Phil Sutter ` (11 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This way reconfiguring the cryptographic accelerator consists of a single step (memcpy here), which in future can be done by the tdma engine. This patch introduces some ugly IV copying, necessary for input buffers above 1920bytes. But this will go away later. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 83 ++++++++++++++++++++++++++++----------------- 1 files changed, 52 insertions(+), 31 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 59c2ed2..80dcf16 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -77,6 +77,24 @@ struct req_progress { int hw_processed_bytes; }; +struct sec_accel_sram { + struct sec_accel_config op; + union { + struct { + u32 key[8]; + u32 iv[4]; + } crypt; + struct { + u32 ivi[5]; + u32 ivo[5]; + } hash; + } type; +#define sa_key type.crypt.key +#define sa_iv type.crypt.iv +#define sa_ivi type.hash.ivi +#define sa_ivo type.hash.ivo +} __attribute__((packed)); + struct crypto_priv { void __iomem *reg; void __iomem *sram; @@ -95,6 +113,8 @@ struct crypto_priv { int sram_size; int has_sha1; int has_hmac_sha1; + + struct sec_accel_sram sa_sram; }; static struct crypto_priv *cpg; @@ -252,48 +272,49 @@ static void mv_process_current_q(int first_block) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_ctx *ctx = 
crypto_tfm_ctx(req->base.tfm); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); - struct sec_accel_config op; + struct sec_accel_config *op = &cpg->sa_sram.op; switch (req_ctx->op) { case COP_AES_ECB: - op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB; + op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB; break; case COP_AES_CBC: default: - op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; - op.enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | + op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; + op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF); - if (first_block) - memcpy(cpg->sram + SRAM_DATA_IV, req->info, 16); + if (!first_block) + memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); + memcpy(cpg->sa_sram.sa_iv, req->info, 16); break; } if (req_ctx->decrypt) { - op.config |= CFG_DIR_DEC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN); + op->config |= CFG_DIR_DEC; + memcpy(cpg->sa_sram.sa_key, ctx->aes_dec_key, AES_KEY_LEN); } else { - op.config |= CFG_DIR_ENC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN); + op->config |= CFG_DIR_ENC; + memcpy(cpg->sa_sram.sa_key, ctx->aes_enc_key, AES_KEY_LEN); } switch (ctx->key_len) { case AES_KEYSIZE_128: - op.config |= CFG_AES_LEN_128; + op->config |= CFG_AES_LEN_128; break; case AES_KEYSIZE_192: - op.config |= CFG_AES_LEN_192; + op->config |= CFG_AES_LEN_192; break; case AES_KEYSIZE_256: - op.config |= CFG_AES_LEN_256; + op->config |= CFG_AES_LEN_256; break; } - op.enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | + op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | ENC_P_DST(SRAM_DATA_OUT_START); - op.enc_key_p = SRAM_DATA_KEY_P; + op->enc_key_p = SRAM_DATA_KEY_P; setup_data_in(); - op.enc_len = cpg->p.crypt_len; - memcpy(cpg->sram + SRAM_CONFIG, &op, - sizeof(struct sec_accel_config)); + op->enc_len = cpg->p.crypt_len; + memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + sizeof(struct sec_accel_sram)); /* GO */ 
mv_setup_timer(); @@ -317,30 +338,30 @@ static void mv_process_hash_current(int first_block) const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; - struct sec_accel_config op = { 0 }; + struct sec_accel_config *op = &cpg->sa_sram.op; int is_last; switch (req_ctx->op) { case COP_SHA1: default: - op.config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; break; case COP_HMAC_SHA1: - op.config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; - memcpy(cpg->sram + SRAM_HMAC_IV_IN, + op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + memcpy(cpg->sa_sram.sa_ivi, tfm_ctx->ivs, sizeof(tfm_ctx->ivs)); break; } - op.mac_src_p = + op->mac_src_p = MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)req_ctx->count); setup_data_in(); - op.mac_digest = + op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - op.mac_iv = + op->mac_iv = MAC_INNER_IV_P(SRAM_HMAC_IV_IN) | MAC_OUTER_IV_P(SRAM_HMAC_IV_OUT); @@ -349,16 +370,16 @@ static void mv_process_hash_current(int first_block) && (req_ctx->count <= MAX_HW_HASH_SIZE); if (req_ctx->first_hash) { if (is_last) - op.config |= CFG_NOT_FRAG; + op->config |= CFG_NOT_FRAG; else - op.config |= CFG_FIRST_FRAG; + op->config |= CFG_FIRST_FRAG; req_ctx->first_hash = 0; } else { if (is_last) - op.config |= CFG_LAST_FRAG; + op->config |= CFG_LAST_FRAG; else - op.config |= CFG_MID_FRAG; + op->config |= CFG_MID_FRAG; if (first_block) { writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); @@ -369,8 +390,8 @@ static void mv_process_hash_current(int first_block) } } - memcpy(cpg->sram + SRAM_CONFIG, &op, - sizeof(struct sec_accel_config)); + memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + sizeof(struct sec_accel_sram)); /* GO */ mv_setup_timer(); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 04/13] mv_cesa: split up processing callbacks 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (2 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 03/13] mv_cesa: prepare the full sram config in dram Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 05/13] add a driver for the Marvell IDMA/TDMA engines Phil Sutter ` (10 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Have a dedicated function initialising the full SRAM config, then use a minimal callback for changing only relevant parts of it. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 87 +++++++++++++++++++++++++++++++++------------ 1 files changed, 64 insertions(+), 23 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 80dcf16..ad21c72 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -63,7 +63,7 @@ struct req_progress { struct scatterlist *src_sg; struct scatterlist *dst_sg; void (*complete) (void); - void (*process) (int is_first); + void (*process) (void); /* src mostly */ int sg_src_left; @@ -267,9 +267,8 @@ static void setup_data_in(void) p->crypt_len = data_in_sram; } -static void mv_process_current_q(int first_block) +static void mv_init_crypt_config(struct ablkcipher_request *req) { - struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); struct sec_accel_config *op = &cpg->sa_sram.op; @@ -283,8 +282,6 @@ static void mv_process_current_q(int first_block) op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF); - if (!first_block) - memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); memcpy(cpg->sa_sram.sa_iv, req->info, 16); break; 
} @@ -310,9 +307,8 @@ static void mv_process_current_q(int first_block) op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | ENC_P_DST(SRAM_DATA_OUT_START); op->enc_key_p = SRAM_DATA_KEY_P; - - setup_data_in(); op->enc_len = cpg->p.crypt_len; + memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, sizeof(struct sec_accel_sram)); @@ -321,6 +317,17 @@ static void mv_process_current_q(int first_block) writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } +static void mv_update_crypt_config(void) +{ + /* update the enc_len field only */ + memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32), + &cpg->p.crypt_len, sizeof(u32)); + + /* GO */ + mv_setup_timer(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); +} + static void mv_crypto_algo_completion(void) { struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); @@ -332,9 +339,8 @@ static void mv_crypto_algo_completion(void) memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); } -static void mv_process_hash_current(int first_block) +static void mv_init_hash_config(struct ahash_request *req) { - struct ahash_request *req = ahash_request_cast(cpg->cur_req); const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; @@ -357,8 +363,6 @@ static void mv_process_hash_current(int first_block) MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)req_ctx->count); - setup_data_in(); - op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); op->mac_iv = @@ -381,13 +385,11 @@ static void mv_process_hash_current(int first_block) else op->config |= CFG_MID_FRAG; - if (first_block) { - writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); - writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B); - writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C); - writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D); - writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); - } + 
writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); + writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B); + writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C); + writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D); + writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); } memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, @@ -398,6 +400,42 @@ static void mv_process_hash_current(int first_block) writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } +static void mv_update_hash_config(void) +{ + struct ahash_request *req = ahash_request_cast(cpg->cur_req); + struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); + struct req_progress *p = &cpg->p; + struct sec_accel_config *op = &cpg->sa_sram.op; + int is_last; + + /* update only the config (for changed fragment state) and + * mac_digest (for changed frag len) fields */ + + switch (req_ctx->op) { + case COP_SHA1: + default: + op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + break; + case COP_HMAC_SHA1: + op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + break; + } + + is_last = req_ctx->last_chunk + && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes) + && (req_ctx->count <= MAX_HW_HASH_SIZE); + + op->config |= is_last ? 
CFG_LAST_FRAG : CFG_MID_FRAG; + memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32)); + + op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); + memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32)); + + /* GO */ + mv_setup_timer(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); +} + static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx, struct shash_desc *desc) { @@ -509,7 +547,8 @@ static void dequeue_complete_req(void) if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { /* process next scatter list entry */ cpg->eng_st = ENGINE_BUSY; - cpg->p.process(0); + setup_data_in(); + cpg->p.process(); } else { cpg->p.complete(); cpg->eng_st = ENGINE_IDLE; @@ -544,7 +583,7 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) memset(p, 0, sizeof(struct req_progress)); p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; - p->process = mv_process_current_q; + p->process = mv_update_crypt_config; p->copy_back = 1; p->src_sg = req->src; @@ -556,7 +595,8 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) p->sg_dst_left = req->dst->length; } - mv_process_current_q(1); + setup_data_in(); + mv_init_crypt_config(req); } static void mv_start_new_hash_req(struct ahash_request *req) @@ -585,7 +625,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) if (hw_bytes) { p->hw_nbytes = hw_bytes; p->complete = mv_hash_algo_completion; - p->process = mv_process_hash_current; + p->process = mv_update_hash_config; if (unlikely(old_extra_bytes)) { memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer, @@ -593,7 +633,8 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->crypt_len = old_extra_bytes; } - mv_process_hash_current(1); + setup_data_in(); + mv_init_hash_config(req); } else { copy_src_to_buf(p, ctx->buffer + old_extra_bytes, ctx->extra_bytes - old_extra_bytes); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ 
messages in thread
* [PATCH 05/13] add a driver for the Marvell IDMA/TDMA engines 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (3 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 04/13] mv_cesa: split up processing callbacks Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 06/13] mv_cesa: use DMA engine for data transfers Phil Sutter ` (9 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu These are DMA engines integrated into the Marvell Orion/Kirkwood SoCs, designed to offload data transfers from/to the CESA crypto engine. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- arch/arm/mach-kirkwood/common.c | 33 ++ arch/arm/mach-kirkwood/include/mach/irqs.h | 1 + arch/arm/mach-orion5x/common.c | 33 ++ arch/arm/mach-orion5x/include/mach/orion5x.h | 2 + drivers/crypto/Kconfig | 5 + drivers/crypto/Makefile | 3 +- drivers/crypto/mv_dma.c | 464 ++++++++++++++++++++++++++ drivers/crypto/mv_dma.h | 127 +++++++ 8 files changed, 667 insertions(+), 1 deletions(-) create mode 100644 drivers/crypto/mv_dma.c create mode 100644 drivers/crypto/mv_dma.h diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c index 25fb3fd..dcd1327 100644 --- a/arch/arm/mach-kirkwood/common.c +++ b/arch/arm/mach-kirkwood/common.c @@ -426,8 +426,41 @@ void __init kirkwood_uart1_init(void) /***************************************************************************** * Cryptographic Engines and Security Accelerator (CESA) ****************************************************************************/ +static struct resource kirkwood_tdma_res[] = { + { + .name = "regs deco", + .start = CRYPTO_PHYS_BASE + 0xA00, + .end = CRYPTO_PHYS_BASE + 0xA24, + .flags = IORESOURCE_MEM, + }, { + .name = "regs control and error", + .start = CRYPTO_PHYS_BASE + 0x800, + .end = CRYPTO_PHYS_BASE + 0x8CF, + .flags = IORESOURCE_MEM, + }, { + .name = 
"crypto error", + .start = IRQ_KIRKWOOD_TDMA_ERR, + .end = IRQ_KIRKWOOD_TDMA_ERR, + .flags = IORESOURCE_IRQ, + }, +}; + +static u64 mv_tdma_dma_mask = DMA_BIT_MASK(32); + +static struct platform_device kirkwood_tdma_device = { + .name = "mv_tdma", + .id = -1, + .dev = { + .dma_mask = &mv_tdma_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + .num_resources = ARRAY_SIZE(kirkwood_tdma_res), + .resource = kirkwood_tdma_res, +}; + void __init kirkwood_crypto_init(void) { + platform_device_register(&kirkwood_tdma_device); orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE, KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO); } diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h b/arch/arm/mach-kirkwood/include/mach/irqs.h index 2bf8161..a66aa3f 100644 --- a/arch/arm/mach-kirkwood/include/mach/irqs.h +++ b/arch/arm/mach-kirkwood/include/mach/irqs.h @@ -51,6 +51,7 @@ #define IRQ_KIRKWOOD_GPIO_HIGH_16_23 41 #define IRQ_KIRKWOOD_GE00_ERR 46 #define IRQ_KIRKWOOD_GE01_ERR 47 +#define IRQ_KIRKWOOD_TDMA_ERR 49 #define IRQ_KIRKWOOD_RTC 53 /* diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c index 9148b22..553ccf2 100644 --- a/arch/arm/mach-orion5x/common.c +++ b/arch/arm/mach-orion5x/common.c @@ -181,9 +181,42 @@ void __init orion5x_xor_init(void) /***************************************************************************** * Cryptographic Engines and Security Accelerator (CESA) ****************************************************************************/ +static struct resource orion_idma_res[] = { + { + .name = "regs deco", + .start = ORION5X_IDMA_PHYS_BASE + 0xA00, + .end = ORION5X_IDMA_PHYS_BASE + 0xA24, + .flags = IORESOURCE_MEM, + }, { + .name = "regs control and error", + .start = ORION5X_IDMA_PHYS_BASE + 0x800, + .end = ORION5X_IDMA_PHYS_BASE + 0x8CF, + .flags = IORESOURCE_MEM, + }, { + .name = "crypto error", + .start = IRQ_ORION5X_IDMA_ERR, + .end = IRQ_ORION5X_IDMA_ERR, + .flags = IORESOURCE_IRQ, + }, +}; + +static u64 
mv_idma_dma_mask = DMA_BIT_MASK(32); + +static struct platform_device orion_idma_device = { + .name = "mv_idma", + .id = -1, + .dev = { + .dma_mask = &mv_idma_dma_mask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, + .num_resources = ARRAY_SIZE(orion_idma_res), + .resource = orion_idma_res, +}; + static void __init orion5x_crypto_init(void) { orion5x_setup_sram_win(); + platform_device_register(&orion_idma_device); orion_crypto_init(ORION5X_CRYPTO_PHYS_BASE, ORION5X_SRAM_PHYS_BASE, SZ_8K, IRQ_ORION5X_CESA); } diff --git a/arch/arm/mach-orion5x/include/mach/orion5x.h b/arch/arm/mach-orion5x/include/mach/orion5x.h index 2745f5d..a31ac88 100644 --- a/arch/arm/mach-orion5x/include/mach/orion5x.h +++ b/arch/arm/mach-orion5x/include/mach/orion5x.h @@ -90,6 +90,8 @@ #define ORION5X_USB0_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x50000) #define ORION5X_USB0_VIRT_BASE (ORION5X_REGS_VIRT_BASE | 0x50000) +#define ORION5X_IDMA_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x60000) + #define ORION5X_XOR_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x60900) #define ORION5X_XOR_VIRT_BASE (ORION5X_REGS_VIRT_BASE | 0x60900) diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index 1092a77..3709f38 100644 --- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -159,6 +159,10 @@ config CRYPTO_GHASH_S390 It is available as of z196. 
+config CRYPTO_DEV_MV_DMA + tristate + default no + config CRYPTO_DEV_MV_CESA tristate "Marvell's Cryptographic Engine" depends on PLAT_ORION @@ -166,6 +170,7 @@ config CRYPTO_DEV_MV_CESA select CRYPTO_AES select CRYPTO_BLKCIPHER2 select CRYPTO_HASH + select CRYPTO_DEV_MV_DMA help This driver allows you to utilize the Cryptographic Engines and Security Accelerator (CESA) which can be found on the Marvell Orion diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index 0139032..cb655ad 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o n2_crypto-y := n2_core.o n2_asm.o obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o +obj-$(CONFIG_CRYPTO_DEV_MV_DMA) += mv_dma.o obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/ @@ -14,4 +15,4 @@ obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o obj-$(CONFIG_CRYPTO_DEV_TEGRA_AES) += tegra-aes.o -obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/ \ No newline at end of file +obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/ diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c new file mode 100644 index 0000000..24c5256 --- /dev/null +++ b/drivers/crypto/mv_dma.c @@ -0,0 +1,464 @@ +/* + * Support for Marvell's IDMA/TDMA engines found on Orion/Kirkwood chips, + * used exclusively by the CESA crypto accelerator. + * + * Based on unpublished code for IDMA written by Sebastian Siewior. 
+ * + * Copyright (C) 2012 Phil Sutter <phil.sutter@viprinet.com> + * License: GPLv2 + */ + +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/dmapool.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/slab.h> +#include <linux/platform_device.h> + +#include "mv_dma.h" + +#define MV_DMA "MV-DMA: " + +#define MV_DMA_INIT_POOLSIZE 16 +#define MV_DMA_ALIGN 16 + +struct mv_dma_desc { + u32 count; + u32 src; + u32 dst; + u32 next; +} __attribute__((packed)); + +struct desc_mempair { + struct mv_dma_desc *vaddr; + dma_addr_t daddr; +}; + +struct mv_dma_priv { + bool idma_registered, tdma_registered; + struct device *dev; + void __iomem *reg; + int irq; + /* protecting the dma descriptors and stuff */ + spinlock_t lock; + struct dma_pool *descpool; + struct desc_mempair *desclist; + int desclist_len; + int desc_usage; + u32 (*print_and_clear_irq)(void); +} tpg; + +#define DESC(x) (tpg.desclist[x].vaddr) +#define DESC_DMA(x) (tpg.desclist[x].daddr) + +static inline int set_poolsize(int nelem) +{ + /* need to increase size first if requested */ + if (nelem > tpg.desclist_len) { + struct desc_mempair *newmem; + int newsize = nelem * sizeof(struct desc_mempair); + + newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + tpg.desclist = newmem; + } + + /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */ + for (; tpg.desclist_len < nelem; tpg.desclist_len++) { + DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool, + GFP_KERNEL, &DESC_DMA(tpg.desclist_len)); + if (!DESC((tpg.desclist_len))) + return -ENOMEM; + } + for (; tpg.desclist_len > nelem; tpg.desclist_len--) + dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1), + DESC_DMA(tpg.desclist_len - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(tpg.desclist); + tpg.desclist = 0; + } + return 0; +} + +static inline void wait_for_dma_idle(void) +{ + while (readl(tpg.reg + DMA_CTRL) & 
DMA_CTRL_ACTIVE) + mdelay(100); +} + +static inline void switch_dma_engine(bool state) +{ + u32 val = readl(tpg.reg + DMA_CTRL); + + val |= ( state * DMA_CTRL_ENABLE); + val &= ~(!state * DMA_CTRL_ENABLE); + + writel(val, tpg.reg + DMA_CTRL); +} + +static struct mv_dma_desc *get_new_last_desc(void) +{ + if (unlikely(tpg.desc_usage == tpg.desclist_len) && + set_poolsize(tpg.desclist_len << 1)) { + printk(KERN_ERR MV_DMA "failed to increase DMA pool to %d\n", + tpg.desclist_len << 1); + return NULL; + } + + if (likely(tpg.desc_usage)) + DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage); + + return DESC(tpg.desc_usage++); +} + +static inline void mv_dma_desc_dump(void) +{ + struct mv_dma_desc *tmp; + int i; + + if (!tpg.desc_usage) { + printk(KERN_WARNING MV_DMA "DMA descriptor list is empty\n"); + return; + } + + printk(KERN_WARNING MV_DMA "DMA descriptor list:\n"); + for (i = 0; i < tpg.desc_usage; i++) { + tmp = DESC(i); + printk(KERN_WARNING MV_DMA "entry %d at 0x%x: dma addr 0x%x, " + "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i, + (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst, + tmp->count & DMA_BYTE_COUNT_MASK, !!(tmp->count & DMA_OWN_BIT), + tmp->next); + } +} + +static inline void mv_dma_reg_dump(void) +{ +#define PRINTREG(offset) \ + printk(KERN_WARNING MV_DMA "tpg.reg + " #offset " = 0x%x\n", \ + readl(tpg.reg + offset)) + + PRINTREG(DMA_CTRL); + PRINTREG(DMA_BYTE_COUNT); + PRINTREG(DMA_SRC_ADDR); + PRINTREG(DMA_DST_ADDR); + PRINTREG(DMA_NEXT_DESC); + PRINTREG(DMA_CURR_DESC); + +#undef PRINTREG +} + +static inline void mv_dma_clear_desc_reg(void) +{ + writel(0, tpg.reg + DMA_BYTE_COUNT); + writel(0, tpg.reg + DMA_SRC_ADDR); + writel(0, tpg.reg + DMA_DST_ADDR); + writel(0, tpg.reg + DMA_CURR_DESC); + writel(0, tpg.reg + DMA_NEXT_DESC); +} + +void mv_dma_clear(void) +{ + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + /* make sure engine is idle */ + wait_for_dma_idle(); + switch_dma_engine(0); + wait_for_dma_idle(); + + /* clear descriptor 
registers */ + mv_dma_clear_desc_reg(); + + tpg.desc_usage = 0; + + switch_dma_engine(1); + + /* finally free system lock again */ + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_clear); + +void mv_dma_trigger(void) +{ + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + writel(DESC_DMA(0), tpg.reg + DMA_NEXT_DESC); + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_trigger); + +void mv_dma_separator(void) +{ + struct mv_dma_desc *tmp; + + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + tmp = get_new_last_desc(); + memset(tmp, 0, sizeof(*tmp)); + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_separator); + +void mv_dma_memcpy(dma_addr_t dst, dma_addr_t src, unsigned int size) +{ + struct mv_dma_desc *tmp; + + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + tmp = get_new_last_desc(); + tmp->count = size | DMA_OWN_BIT; + tmp->src = src; + tmp->dst = dst; + tmp->next = 0; + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_memcpy); + +static u32 idma_print_and_clear_irq(void) +{ + u32 val, val2, addr; + + val = readl(tpg.reg + IDMA_INT_CAUSE); + val2 = readl(tpg.reg + IDMA_ERR_SELECT); + addr = readl(tpg.reg + IDMA_ERR_ADDR); + + if (val & IDMA_INT_MISS(0)) + printk(KERN_ERR MV_DMA "%s: address miss @%x!\n", + __func__, val2 & IDMA_INT_MISS(0) ? addr : 0); + if (val & IDMA_INT_APROT(0)) + printk(KERN_ERR MV_DMA "%s: access protection @%x!\n", + __func__, val2 & IDMA_INT_APROT(0) ? addr : 0); + if (val & IDMA_INT_WPROT(0)) + printk(KERN_ERR MV_DMA "%s: write protection @%x!\n", + __func__, val2 & IDMA_INT_WPROT(0) ? 
addr : 0); + + /* clear interrupt cause register */ + writel(0, tpg.reg + IDMA_INT_CAUSE); + + return val; +} + +static u32 tdma_print_and_clear_irq(void) +{ + u32 val; + + val = readl(tpg.reg + TDMA_ERR_CAUSE); + + if (val & TDMA_INT_MISS) + printk(KERN_ERR MV_DMA "%s: miss!\n", __func__); + if (val & TDMA_INT_DOUBLE_HIT) + printk(KERN_ERR MV_DMA "%s: double hit!\n", __func__); + if (val & TDMA_INT_BOTH_HIT) + printk(KERN_ERR MV_DMA "%s: both hit!\n", __func__); + if (val & TDMA_INT_DATA_ERROR) + printk(KERN_ERR MV_DMA "%s: data error!\n", __func__); + + /* clear error cause register */ + writel(0, tpg.reg + TDMA_ERR_CAUSE); + + return val; +} + +irqreturn_t mv_dma_int(int irq, void *priv) +{ + int handled; + + handled = (*tpg.print_and_clear_irq)(); + + if (handled) { + mv_dma_reg_dump(); + mv_dma_desc_dump(); + } + + switch_dma_engine(0); + wait_for_dma_idle(); + + /* clear descriptor registers */ + mv_dma_clear_desc_reg(); + + switch_dma_engine(1); + wait_for_dma_idle(); + + return (handled ? 
IRQ_HANDLED : IRQ_NONE); +} + +/* initialise the global tpg structure */ +static int mv_init_engine(struct platform_device *pdev, + u32 ctrl_init_val, u32 (*print_and_clear_irq)(void)) +{ + struct resource *res; + int rc; + + if (tpg.dev) { + printk(KERN_ERR MV_DMA "second DMA device?!\n"); + return -ENXIO; + } + tpg.dev = &pdev->dev; + tpg.print_and_clear_irq = print_and_clear_irq; + + /* get register start address */ + res = platform_get_resource_byname(pdev, + IORESOURCE_MEM, "regs control and error"); + if (!res) + return -ENXIO; + if (!(tpg.reg = ioremap(res->start, resource_size(res)))) + return -ENOMEM; + + /* get the IRQ */ + tpg.irq = platform_get_irq(pdev, 0); + if (tpg.irq < 0 || tpg.irq == NO_IRQ) { + rc = -ENXIO; + goto out_unmap_reg; + } + + /* initialise DMA descriptor list */ + tpg.descpool = dma_pool_create("MV_DMA Descriptor Pool", tpg.dev, + sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0); + if (!tpg.descpool) { + rc = -ENOMEM; + goto out_free_irq; + } + set_poolsize(MV_DMA_INIT_POOLSIZE); + + platform_set_drvdata(pdev, &tpg); + + spin_lock_init(&tpg.lock); + + switch_dma_engine(0); + wait_for_dma_idle(); + + /* clear descriptor registers */ + mv_dma_clear_desc_reg(); + + /* initialize control register (also enables engine) */ + writel(ctrl_init_val, tpg.reg + DMA_CTRL); + wait_for_dma_idle(); + + if (request_irq(tpg.irq, mv_dma_int, IRQF_DISABLED, + dev_name(tpg.dev), &tpg)) { + rc = -ENXIO; + goto out_free_all; + } + + return 0; + +out_free_all: + switch_dma_engine(0); + platform_set_drvdata(pdev, NULL); + set_poolsize(0); + dma_pool_destroy(tpg.descpool); +out_free_irq: + free_irq(tpg.irq, &tpg); +out_unmap_reg: + iounmap(tpg.reg); + tpg.dev = NULL; + return rc; +} + +static int mv_remove(struct platform_device *pdev) +{ + switch_dma_engine(0); + platform_set_drvdata(pdev, NULL); + set_poolsize(0); + dma_pool_destroy(tpg.descpool); + free_irq(tpg.irq, &tpg); + iounmap(tpg.reg); + tpg.dev = NULL; + return 0; +} + +static int mv_probe_tdma(struct 
platform_device *pdev) +{ + int rc; + + rc = mv_init_engine(pdev, TDMA_CTRL_INIT_VALUE, + &tdma_print_and_clear_irq); + if (rc) + return rc; + + /* have an ear for occurring errors */ + writel(TDMA_INT_ALL, tpg.reg + TDMA_ERR_MASK); + writel(0, tpg.reg + TDMA_ERR_CAUSE); + + printk(KERN_INFO MV_DMA + "TDMA engine up and running, IRQ %d\n", tpg.irq); + return 0; +} + +static int mv_probe_idma(struct platform_device *pdev) +{ + int rc; + + rc = mv_init_engine(pdev, IDMA_CTRL_INIT_VALUE, + &idma_print_and_clear_irq); + if (rc) + return rc; + + /* have an ear for occurring errors */ + writel(IDMA_INT_MISS(0) | IDMA_INT_APROT(0) | IDMA_INT_WPROT(0), + tpg.reg + IDMA_INT_MASK); + writel(0, tpg.reg + IDMA_INT_CAUSE); + + printk(KERN_INFO MV_DMA + "IDMA engine up and running, IRQ %d\n", tpg.irq); + return 0; +} + +static struct platform_driver marvell_tdma = { + .probe = mv_probe_tdma, + .remove = mv_remove, + .driver = { + .owner = THIS_MODULE, + .name = "mv_tdma", + }, +}, marvell_idma = { + .probe = mv_probe_idma, + .remove = mv_remove, + .driver = { + .owner = THIS_MODULE, + .name = "mv_idma", + }, +}; +MODULE_ALIAS("platform:mv_tdma"); +MODULE_ALIAS("platform:mv_idma"); + +static int __init mv_dma_init(void) +{ + tpg.tdma_registered = !platform_driver_register(&marvell_tdma); + tpg.idma_registered = !platform_driver_register(&marvell_idma); + return !(tpg.tdma_registered || tpg.idma_registered); +} +module_init(mv_dma_init); + +static void __exit mv_dma_exit(void) +{ + if (tpg.tdma_registered) + platform_driver_unregister(&marvell_tdma); + if (tpg.idma_registered) + platform_driver_unregister(&marvell_idma); +} +module_exit(mv_dma_exit); + +MODULE_AUTHOR("Phil Sutter <phil.sutter@viprinet.com>"); +MODULE_DESCRIPTION("Support for Marvell's IDMA/TDMA engines"); +MODULE_LICENSE("GPL"); + diff --git a/drivers/crypto/mv_dma.h b/drivers/crypto/mv_dma.h new file mode 100644 index 0000000..d0c9d0c --- /dev/null +++ b/drivers/crypto/mv_dma.h @@ -0,0 +1,127 @@ +#ifndef 
_MV_DMA_H +#define _MV_DMA_H + +/* common TDMA_CTRL/IDMA_CTRL_LOW bits */ +#define DMA_CTRL_DST_BURST(x) (x) +#define DMA_CTRL_SRC_BURST(x) (x << 6) +#define DMA_CTRL_NO_CHAIN_MODE (1 << 9) +#define DMA_CTRL_ENABLE (1 << 12) +#define DMA_CTRL_FETCH_ND (1 << 13) +#define DMA_CTRL_ACTIVE (1 << 14) + +/* TDMA_CTRL register bits */ +#define TDMA_CTRL_DST_BURST_32 DMA_CTRL_DST_BURST(3) +#define TDMA_CTRL_DST_BURST_128 DMA_CTRL_DST_BURST(4) +#define TDMA_CTRL_OUTST_RD_EN (1 << 4) +#define TDMA_CTRL_SRC_BURST_32 DMA_CTRL_SRC_BURST(3) +#define TDMA_CTRL_SRC_BURST_128 DMA_CTRL_SRC_BURST(4) +#define TDMA_CTRL_NO_BYTE_SWAP (1 << 11) + +#define TDMA_CTRL_INIT_VALUE ( \ + TDMA_CTRL_DST_BURST_128 | TDMA_CTRL_SRC_BURST_128 | \ + TDMA_CTRL_NO_BYTE_SWAP | DMA_CTRL_ENABLE \ +) + +/* IDMA_CTRL_LOW register bits */ +#define IDMA_CTRL_DST_BURST_8 DMA_CTRL_DST_BURST(0) +#define IDMA_CTRL_DST_BURST_16 DMA_CTRL_DST_BURST(1) +#define IDMA_CTRL_DST_BURST_32 DMA_CTRL_DST_BURST(3) +#define IDMA_CTRL_DST_BURST_64 DMA_CTRL_DST_BURST(7) +#define IDMA_CTRL_DST_BURST_128 DMA_CTRL_DST_BURST(4) +#define IDMA_CTRL_SRC_HOLD (1 << 3) +#define IDMA_CTRL_DST_HOLD (1 << 5) +#define IDMA_CTRL_SRC_BURST_8 DMA_CTRL_SRC_BURST(0) +#define IDMA_CTRL_SRC_BURST_16 DMA_CTRL_SRC_BURST(1) +#define IDMA_CTRL_SRC_BURST_32 DMA_CTRL_SRC_BURST(3) +#define IDMA_CTRL_SRC_BURST_64 DMA_CTRL_SRC_BURST(7) +#define IDMA_CTRL_SRC_BURST_128 DMA_CTRL_SRC_BURST(4) +#define IDMA_CTRL_INT_MODE (1 << 10) +#define IDMA_CTRL_BLOCK_MODE (1 << 11) +#define IDMA_CTRL_CLOSE_DESC (1 << 17) +#define IDMA_CTRL_ABORT (1 << 20) +#define IDMA_CTRL_SADDR_OVR(x) (x << 21) +#define IDMA_CTRL_NO_SADDR_OVR IDMA_CTRL_SADDR_OVR(0) +#define IDMA_CTRL_SADDR_OVR_1 IDMA_CTRL_SADDR_OVR(1) +#define IDMA_CTRL_SADDR_OVR_2 IDMA_CTRL_SADDR_OVR(2) +#define IDMA_CTRL_SADDR_OVR_3 IDMA_CTRL_SADDR_OVR(3) +#define IDMA_CTRL_DADDR_OVR(x) (x << 23) +#define IDMA_CTRL_NO_DADDR_OVR IDMA_CTRL_DADDR_OVR(0) +#define IDMA_CTRL_DADDR_OVR_1 IDMA_CTRL_DADDR_OVR(1) +#define 
IDMA_CTRL_DADDR_OVR_2 IDMA_CTRL_DADDR_OVR(2) +#define IDMA_CTRL_DADDR_OVR_3 IDMA_CTRL_DADDR_OVR(3) +#define IDMA_CTRL_NADDR_OVR(x) (x << 25) +#define IDMA_CTRL_NO_NADDR_OVR IDMA_CTRL_NADDR_OVR(0) +#define IDMA_CTRL_NADDR_OVR_1 IDMA_CTRL_NADDR_OVR(1) +#define IDMA_CTRL_NADDR_OVR_2 IDMA_CTRL_NADDR_OVR(2) +#define IDMA_CTRL_NADDR_OVR_3 IDMA_CTRL_NADDR_OVR(3) +#define IDMA_CTRL_DESC_MODE_16M (1 << 31) + +#define IDMA_CTRL_INIT_VALUE ( \ + IDMA_CTRL_DST_BURST_128 | IDMA_CTRL_SRC_BURST_128 | \ + IDMA_CTRL_INT_MODE | IDMA_CTRL_BLOCK_MODE | \ + DMA_CTRL_ENABLE | IDMA_CTRL_DESC_MODE_16M \ +) + +/* TDMA_ERR_CAUSE bits */ +#define TDMA_INT_MISS (1 << 0) +#define TDMA_INT_DOUBLE_HIT (1 << 1) +#define TDMA_INT_BOTH_HIT (1 << 2) +#define TDMA_INT_DATA_ERROR (1 << 3) +#define TDMA_INT_ALL 0x0f + +/* offsets of registers, starting at "regs control and error" */ +#define TDMA_BYTE_COUNT 0x00 +#define TDMA_SRC_ADDR 0x10 +#define TDMA_DST_ADDR 0x20 +#define TDMA_NEXT_DESC 0x30 +#define TDMA_CTRL 0x40 +#define TDMA_CURR_DESC 0x70 +#define TDMA_ERR_CAUSE 0xc8 +#define TDMA_ERR_MASK 0xcc + +#define IDMA_BYTE_COUNT(chan) (0x00 + (chan) * 4) +#define IDMA_SRC_ADDR(chan) (0x10 + (chan) * 4) +#define IDMA_DST_ADDR(chan) (0x20 + (chan) * 4) +#define IDMA_NEXT_DESC(chan) (0x30 + (chan) * 4) +#define IDMA_CTRL_LOW(chan) (0x40 + (chan) * 4) +#define IDMA_CURR_DESC(chan) (0x70 + (chan) * 4) +#define IDMA_CTRL_HIGH(chan) (0x80 + (chan) * 4) +#define IDMA_INT_CAUSE (0xc0) +#define IDMA_INT_MASK (0xc4) +#define IDMA_ERR_ADDR (0xc8) +#define IDMA_ERR_SELECT (0xcc) + +/* register offsets common to TDMA and IDMA channel 0 */ +#define DMA_BYTE_COUNT TDMA_BYTE_COUNT +#define DMA_SRC_ADDR TDMA_SRC_ADDR +#define DMA_DST_ADDR TDMA_DST_ADDR +#define DMA_NEXT_DESC TDMA_NEXT_DESC +#define DMA_CTRL TDMA_CTRL +#define DMA_CURR_DESC TDMA_CURR_DESC + +/* IDMA_INT_CAUSE and IDMA_INT_MASK bits */ +#define IDMA_INT_COMP(chan) ((1 << 0) << ((chan) * 8)) +#define IDMA_INT_MISS(chan) ((1 << 1) << ((chan) * 8)) +#define 
IDMA_INT_APROT(chan) ((1 << 2) << ((chan) * 8)) +#define IDMA_INT_WPROT(chan) ((1 << 3) << ((chan) * 8)) +#define IDMA_INT_OWN(chan) ((1 << 4) << ((chan) * 8)) +#define IDMA_INT_ALL(chan) (0x1f << (chan) * 8) + +/* Owner bit in DMA_BYTE_COUNT and descriptors' count field, used + * to signal input data completion in descriptor chain */ +#define DMA_OWN_BIT (1 << 31) + +/* IDMA also has a "Left Byte Count" bit, + * indicating not everything was transferred */ +#define IDMA_LEFT_BYTE_COUNT (1 << 30) + +/* filter the actual byte count value from the DMA_BYTE_COUNT field */ +#define DMA_BYTE_COUNT_MASK (~(DMA_OWN_BIT | IDMA_LEFT_BYTE_COUNT)) + +extern void mv_dma_memcpy(dma_addr_t, dma_addr_t, unsigned int); +extern void mv_dma_separator(void); +extern void mv_dma_clear(void); +extern void mv_dma_trigger(void); + + +#endif /* _MV_DMA_H */ -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 06/13] mv_cesa: use DMA engine for data transfers 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (4 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 05/13] add a driver for the Marvell IDMA/TDMA engines Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 07/13] mv_cesa: have DMA engine copy back the digest result Phil Sutter ` (8 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- arch/arm/plat-orion/common.c | 6 + drivers/crypto/mv_cesa.c | 214 +++++++++++++++++++++++++++++++++--------- 2 files changed, 175 insertions(+), 45 deletions(-) diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c index 61fd837..0c6c695 100644 --- a/arch/arm/plat-orion/common.c +++ b/arch/arm/plat-orion/common.c @@ -924,9 +924,15 @@ static struct resource orion_crypto_resources[] = { }, }; +static u64 mv_crypto_dmamask = DMA_BIT_MASK(32); + static struct platform_device orion_crypto = { .name = "mv_crypto", .id = -1, + .dev = { + .dma_mask = &mv_crypto_dmamask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, }; void __init orion_crypto_init(unsigned long mapbase, diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index ad21c72..cdbc82e 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -9,6 +9,7 @@ #include <crypto/aes.h> #include <crypto/algapi.h> #include <linux/crypto.h> +#include <linux/dma-mapping.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/kthread.h> @@ -21,11 +22,14 @@ #include <crypto/sha.h> #include "mv_cesa.h" +#include "mv_dma.h" #define MV_CESA "MV-CESA:" #define MAX_HW_HASH_SIZE 0xFFFF #define MV_CESA_EXPIRE 500 /* msec */ +static int count_sgs(struct scatterlist *, unsigned int); + /* * STM: * /---------------------------------------\ @@ -50,7 +54,6 @@ enum engine_status { * 
@src_start: offset to add to src start position (scatter list) * @crypt_len: length of current hw crypt/hash process * @hw_nbytes: total bytes to process in hw for this request - * @copy_back: whether to copy data back (crypt) or not (hash) * @sg_dst_left: bytes left dst to process in this scatter list * @dst_start: offset to add to dst start position (scatter list) * @hw_processed_bytes: number of bytes processed by hw (request). @@ -71,7 +74,6 @@ struct req_progress { int crypt_len; int hw_nbytes; /* dst mostly */ - int copy_back; int sg_dst_left; int dst_start; int hw_processed_bytes; @@ -96,8 +98,10 @@ struct sec_accel_sram { } __attribute__((packed)); struct crypto_priv { + struct device *dev; void __iomem *reg; void __iomem *sram; + u32 sram_phys; int irq; struct clk *clk; struct task_struct *queue_th; @@ -115,6 +119,7 @@ struct crypto_priv { int has_hmac_sha1; struct sec_accel_sram sa_sram; + dma_addr_t sa_sram_dma; }; static struct crypto_priv *cpg; @@ -183,6 +188,23 @@ static void mv_setup_timer(void) jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); } +static inline bool +mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) +{ + int nents = count_sgs(sg, nbytes); + + if (nbytes && dma_map_sg(cpg->dev, sg, nents, dir) != nents) + return false; + return true; +} + +static inline void +mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) +{ + if (nbytes) + dma_unmap_sg(cpg->dev, sg, count_sgs(sg, nbytes), dir); +} + static void compute_aes_dec_key(struct mv_ctx *ctx) { struct crypto_aes_ctx gen_aes_key; @@ -257,12 +279,66 @@ static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len) } } +static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int len) +{ + dma_addr_t sbuf; + int copy_len; + + while (len) { + if (!p->sg_src_left) { + /* next sg please */ + p->src_sg = sg_next(p->src_sg); + BUG_ON(!p->src_sg); + p->sg_src_left = sg_dma_len(p->src_sg); + p->src_start = 0; + } + + 
sbuf = sg_dma_address(p->src_sg) + p->src_start; + + copy_len = min(p->sg_src_left, len); + mv_dma_memcpy(dbuf, sbuf, copy_len); + + p->src_start += copy_len; + p->sg_src_left -= copy_len; + + len -= copy_len; + dbuf += copy_len; + } +} + +static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int len) +{ + dma_addr_t dbuf; + int copy_len; + + while (len) { + if (!p->sg_dst_left) { + /* next sg please */ + p->dst_sg = sg_next(p->dst_sg); + BUG_ON(!p->dst_sg); + p->sg_dst_left = sg_dma_len(p->dst_sg); + p->dst_start = 0; + } + + dbuf = sg_dma_address(p->dst_sg) + p->dst_start; + + copy_len = min(p->sg_dst_left, len); + mv_dma_memcpy(dbuf, sbuf, copy_len); + + p->dst_start += copy_len; + p->sg_dst_left -= copy_len; + + len -= copy_len; + sbuf += copy_len; + } +} + static void setup_data_in(void) { struct req_progress *p = &cpg->p; int data_in_sram = min(p->hw_nbytes - p->hw_processed_bytes, cpg->max_req_size); - copy_src_to_buf(p, cpg->sram + SRAM_DATA_IN_START + p->crypt_len, + dma_copy_src_to_buf(p, cpg->sram_phys + SRAM_DATA_IN_START + p->crypt_len, data_in_sram - p->crypt_len); p->crypt_len = data_in_sram; } @@ -309,22 +385,39 @@ static void mv_init_crypt_config(struct ablkcipher_request *req) op->enc_key_p = SRAM_DATA_KEY_P; op->enc_len = cpg->p.crypt_len; - memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); + mv_dma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + /* GO */ mv_setup_timer(); + mv_dma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void mv_update_crypt_config(void) { + struct sec_accel_config *op = &cpg->sa_sram.op; + /* update the enc_len field only */ - memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32), - &cpg->p.crypt_len, sizeof(u32)); + + op->enc_len 
= cpg->p.crypt_len; + + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32), + sizeof(u32), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), + cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32)); + + mv_dma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); /* GO */ mv_setup_timer(); + mv_dma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } @@ -333,6 +426,13 @@ static void mv_crypto_algo_completion(void) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); + if (req->src == req->dst) { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL); + } else { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); + mv_dma_unmap_sg(req->dst, req->nbytes, DMA_FROM_DEVICE); + } + if (req_ctx->op != COP_AES_CBC) return ; @@ -392,11 +492,20 @@ static void mv_init_hash_config(struct ahash_request *req) writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); } - memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram, + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); + mv_dma_separator(); + + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); + /* GO */ mv_setup_timer(); + mv_dma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } @@ -426,13 +535,26 @@ static void mv_update_hash_config(void) && (req_ctx->count <= MAX_HW_HASH_SIZE); op->config |= is_last ? 
CFG_LAST_FRAG : CFG_MID_FRAG; - memcpy(cpg->sram + SRAM_CONFIG, &op->config, sizeof(u32)); + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(u32), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, + cpg->sa_sram_dma, sizeof(u32)); op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - memcpy(cpg->sram + SRAM_CONFIG + 6 * sizeof(u32), &op->mac_digest, sizeof(u32)); + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32), + sizeof(u32), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), + cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32)); + + mv_dma_separator(); + + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); /* GO */ mv_setup_timer(); + mv_dma_trigger(); writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } @@ -506,43 +628,18 @@ static void mv_hash_algo_completion(void) } else { mv_save_digest_state(ctx); } + + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); } static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; - void *buf; cpg->p.hw_processed_bytes += cpg->p.crypt_len; - if (cpg->p.copy_back) { - int need_copy_len = cpg->p.crypt_len; - int sram_offset = 0; - do { - int dst_copy; - - if (!cpg->p.sg_dst_left) { - /* next sg please */ - cpg->p.dst_sg = sg_next(cpg->p.dst_sg); - BUG_ON(!cpg->p.dst_sg); - cpg->p.sg_dst_left = cpg->p.dst_sg->length; - cpg->p.dst_start = 0; - } - - buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start; - - dst_copy = min(need_copy_len, cpg->p.sg_dst_left); - - memcpy(buf, - cpg->sram + SRAM_DATA_OUT_START + sram_offset, - dst_copy); - sram_offset += dst_copy; - cpg->p.sg_dst_left -= dst_copy; - need_copy_len -= dst_copy; - cpg->p.dst_start += dst_copy; - } while (need_copy_len > 0); - } - cpg->p.crypt_len = 0; + mv_dma_clear(); + BUG_ON(cpg->eng_st != 
ENGINE_W_DEQUEUE); if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { /* process next scatter list entry */ @@ -584,15 +681,28 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; p->process = mv_update_crypt_config; - p->copy_back = 1; + + /* assume inplace request */ + if (req->src == req->dst) { + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL)) + return; + } else { + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) + return; + + if (!mv_dma_map_sg(req->dst, req->nbytes, DMA_FROM_DEVICE)) { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); + return; + } + } p->src_sg = req->src; p->dst_sg = req->dst; if (req->nbytes) { BUG_ON(!req->src); BUG_ON(!req->dst); - p->sg_src_left = req->src->length; - p->sg_dst_left = req->dst->length; + p->sg_src_left = sg_dma_len(req->src); + p->sg_dst_left = sg_dma_len(req->dst); } setup_data_in(); @@ -604,6 +714,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) struct req_progress *p = &cpg->p; struct mv_req_hash_ctx *ctx = ahash_request_ctx(req); int hw_bytes, old_extra_bytes, rc; + cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); hw_bytes = req->nbytes + ctx->extra_bytes; @@ -633,6 +744,11 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->crypt_len = old_extra_bytes; } + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { + printk(KERN_ERR "%s: out of memory\n", __func__); + return; + } + setup_data_in(); mv_init_hash_config(req); } else { @@ -968,14 +1084,14 @@ irqreturn_t crypto_int(int irq, void *priv) u32 val; val = readl(cpg->reg + SEC_ACCEL_INT_STATUS); - if (!(val & SEC_INT_ACCEL0_DONE)) + if (!(val & SEC_INT_ACC0_IDMA_DONE)) return IRQ_NONE; if (!del_timer(&cpg->completion_timer)) { printk(KERN_WARNING MV_CESA "got an interrupt but no pending timer?\n"); } - val &= ~SEC_INT_ACCEL0_DONE; + val &= ~SEC_INT_ACC0_IDMA_DONE; writel(val, cpg->reg + 
FPGA_INT_STATUS); writel(val, cpg->reg + SEC_ACCEL_INT_STATUS); BUG_ON(cpg->eng_st != ENGINE_BUSY); @@ -1115,6 +1231,7 @@ static int mv_probe(struct platform_device *pdev) } cp->sram_size = resource_size(res); cp->max_req_size = cp->sram_size - SRAM_CFG_SPACE; + cp->sram_phys = res->start; cp->sram = ioremap(res->start, cp->sram_size); if (!cp->sram) { ret = -ENOMEM; @@ -1130,6 +1247,7 @@ static int mv_probe(struct platform_device *pdev) platform_set_drvdata(pdev, cp); cpg = cp; + cpg->dev = &pdev->dev; cp->queue_th = kthread_run(queue_manag, cp, "mv_crypto"); if (IS_ERR(cp->queue_th)) { @@ -1149,10 +1267,14 @@ static int mv_probe(struct platform_device *pdev) clk_prepare_enable(cp->clk); writel(0, cpg->reg + SEC_ACCEL_INT_STATUS); - writel(SEC_INT_ACCEL0_DONE, cpg->reg + SEC_ACCEL_INT_MASK); - writel(SEC_CFG_STOP_DIG_ERR, cpg->reg + SEC_ACCEL_CFG); + writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); + writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | + SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG); writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0); + cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + ret = crypto_register_alg(&mv_aes_alg_ecb); if (ret) { printk(KERN_WARNING MV_CESA @@ -1211,6 +1333,8 @@ static int mv_remove(struct platform_device *pdev) crypto_unregister_ahash(&mv_hmac_sha1_alg); kthread_stop(cp->queue_th); free_irq(cp->irq, cp); + dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); memset(cp->sram, 0, cp->sram_size); iounmap(cp->sram); iounmap(cp->reg); -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 07/13] mv_cesa: have DMA engine copy back the digest result 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (5 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 06/13] mv_cesa: use DMA engine for data transfers Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 08/13] mv_cesa: fetch extra_bytes via DMA engine, too Phil Sutter ` (7 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 40 +++++++++++++++++++++++++++++----------- 1 files changed, 29 insertions(+), 11 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index cdbc82e..4b08137 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -161,8 +161,10 @@ struct mv_req_hash_ctx { int first_hash; /* marks that we don't have previous state */ int last_chunk; /* marks that this is the 'final' request */ int extra_bytes; /* unprocessed bytes in buffer */ + int digestsize; /* size of the digest */ enum hash_op op; int count_add; + dma_addr_t result_dma; }; static void mv_completion_timer_callback(unsigned long unused) @@ -499,9 +501,17 @@ static void mv_init_hash_config(struct ahash_request *req) mv_dma_separator(); - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); + if (req->result) { + req_ctx->result_dma = dma_map_single(cpg->dev, req->result, + req_ctx->digestsize, DMA_FROM_DEVICE); + mv_dma_memcpy(req_ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_dma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + 
} /* GO */ mv_setup_timer(); @@ -548,9 +558,17 @@ static void mv_update_hash_config(void) mv_dma_separator(); - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_dma_memcpy(cpg->sa_sram_dma, cpg->sram_phys + SRAM_CONFIG, 1); + if (req->result) { + req_ctx->result_dma = dma_map_single(cpg->dev, req->result, + req_ctx->digestsize, DMA_FROM_DEVICE); + mv_dma_memcpy(req_ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_dma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + } /* GO */ mv_setup_timer(); @@ -617,11 +635,10 @@ static void mv_hash_algo_completion(void) copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes); if (likely(ctx->last_chunk)) { - if (likely(ctx->count <= MAX_HW_HASH_SIZE)) { - memcpy(req->result, cpg->sram + SRAM_DIGEST_BUF, - crypto_ahash_digestsize(crypto_ahash_reqtfm - (req))); - } else { + dma_unmap_single(cpg->dev, ctx->result_dma, + ctx->digestsize, DMA_FROM_DEVICE); + + if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) { mv_save_digest_state(ctx); mv_hash_final_fallback(req); } @@ -719,6 +736,7 @@ static void mv_start_new_hash_req(struct ahash_request *req) memset(p, 0, sizeof(struct req_progress)); hw_bytes = req->nbytes + ctx->extra_bytes; old_extra_bytes = ctx->extra_bytes; + ctx->digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req)); ctx->extra_bytes = hw_bytes % SHA1_BLOCK_SIZE; if (ctx->extra_bytes != 0 -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 08/13] mv_cesa: fetch extra_bytes via DMA engine, too 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (6 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 07/13] mv_cesa: have DMA engine copy back the digest result Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now Phil Sutter ` (6 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 12 ++++++++++-- 1 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 4b08137..7dfab85 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -158,6 +158,7 @@ struct mv_req_hash_ctx { u64 count; u32 state[SHA1_DIGEST_SIZE / 4]; u8 buffer[SHA1_BLOCK_SIZE]; + dma_addr_t buffer_dma; int first_hash; /* marks that we don't have previous state */ int last_chunk; /* marks that this is the 'final' request */ int extra_bytes; /* unprocessed bytes in buffer */ @@ -638,6 +639,9 @@ static void mv_hash_algo_completion(void) dma_unmap_single(cpg->dev, ctx->result_dma, ctx->digestsize, DMA_FROM_DEVICE); + dma_unmap_single(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + if (unlikely(ctx->count > MAX_HW_HASH_SIZE)) { mv_save_digest_state(ctx); mv_hash_final_fallback(req); @@ -757,8 +761,10 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->process = mv_update_hash_config; if (unlikely(old_extra_bytes)) { - memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer, - old_extra_bytes); + dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, + ctx->buffer_dma, old_extra_bytes); p->crypt_len = old_extra_bytes; } @@ -903,6 +909,8 @@ static void 
mv_init_hash_req_ctx(struct mv_req_hash_ctx *ctx, int op, ctx->first_hash = 1; ctx->last_chunk = is_last; ctx->count_add = count_add; + ctx->buffer_dma = dma_map_single(cpg->dev, ctx->buffer, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); } static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last, -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (7 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 08/13] mv_cesa: fetch extra_bytes via DMA engine, too Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit Phil Sutter ` (5 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu This introduces a pool of four-byte DMA buffers for security accelerator config updates. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 134 ++++++++++++++++++++++++++++++++++++---------- drivers/crypto/mv_cesa.h | 1 + 2 files changed, 106 insertions(+), 29 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 7dfab85..7917d1a 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -10,6 +10,7 @@ #include <crypto/algapi.h> #include <linux/crypto.h> #include <linux/dma-mapping.h> +#include <linux/dmapool.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/kthread.h> @@ -28,6 +29,9 @@ #define MAX_HW_HASH_SIZE 0xFFFF #define MV_CESA_EXPIRE 500 /* msec */ +#define MV_DMA_INIT_POOLSIZE 16 +#define MV_DMA_ALIGN 16 + static int count_sgs(struct scatterlist *, unsigned int); /* @@ -97,6 +101,11 @@ struct sec_accel_sram { #define sa_ivo type.hash.ivo } __attribute__((packed)); +struct u32_mempair { + u32 *vaddr; + dma_addr_t daddr; +}; + struct crypto_priv { struct device *dev; void __iomem *reg; @@ -120,6 +129,11 @@ struct crypto_priv { struct sec_accel_sram sa_sram; dma_addr_t sa_sram_dma; + + struct dma_pool *u32_pool; + struct u32_mempair *u32_list; + int u32_list_len; + int u32_usage; }; static struct crypto_priv *cpg; @@ -191,6 +205,54 @@ static void mv_setup_timer(void) jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); } 
+#define U32_ITEM(x) (cpg->u32_list[x].vaddr) +#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr) + +static inline int set_u32_poolsize(int nelem) +{ + /* need to increase size first if requested */ + if (nelem > cpg->u32_list_len) { + struct u32_mempair *newmem; + int newsize = nelem * sizeof(struct u32_mempair); + + newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + cpg->u32_list = newmem; + } + + /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */ + for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) { + U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool, + GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len)); + if (!U32_ITEM((cpg->u32_list_len))) + return -ENOMEM; + } + for (; cpg->u32_list_len > nelem; cpg->u32_list_len--) + dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1), + U32_ITEM_DMA(cpg->u32_list_len - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(cpg->u32_list); + cpg->u32_list = 0; + } + return 0; +} + +static inline void mv_dma_u32_copy(dma_addr_t dst, u32 val) +{ + if (unlikely(cpg->u32_usage == cpg->u32_list_len) + && set_u32_poolsize(cpg->u32_list_len << 1)) { + printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n", + cpg->u32_list_len << 1); + return; + } + *(U32_ITEM(cpg->u32_usage)) = val; + mv_dma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32)); + cpg->u32_usage++; +} + static inline bool mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) { @@ -392,36 +454,13 @@ static void mv_init_crypt_config(struct ablkcipher_request *req) sizeof(struct sec_accel_sram), DMA_TO_DEVICE); mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); - - mv_dma_separator(); - dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); - - /* GO */ - mv_setup_timer(); - mv_dma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void 
mv_update_crypt_config(void) { - struct sec_accel_config *op = &cpg->sa_sram.op; - /* update the enc_len field only */ - - op->enc_len = cpg->p.crypt_len; - - dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32), - sizeof(u32), DMA_TO_DEVICE); - mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), - cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32)); - - mv_dma_separator(); - dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); - - /* GO */ - mv_setup_timer(); - mv_dma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); + mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), + (u32)cpg->p.crypt_len); } static void mv_crypto_algo_completion(void) @@ -660,6 +699,7 @@ static void dequeue_complete_req(void) cpg->p.crypt_len = 0; mv_dma_clear(); + cpg->u32_usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { @@ -701,7 +741,6 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) memset(p, 0, sizeof(struct req_progress)); p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; - p->process = mv_update_crypt_config; /* assume inplace request */ if (req->src == req->dst) { @@ -728,6 +767,24 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req) setup_data_in(); mv_init_crypt_config(req); + mv_dma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { + cpg->p.crypt_len = 0; + + setup_data_in(); + mv_update_crypt_config(); + mv_dma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + } + + + /* GO */ + mv_setup_timer(); + mv_dma_trigger(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void mv_start_new_hash_req(struct 
ahash_request *req) @@ -1294,18 +1351,29 @@ static int mv_probe(struct platform_device *pdev) writel(0, cpg->reg + SEC_ACCEL_INT_STATUS); writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); - writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | + writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_MP_CHAIN | SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG); writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0); cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + cpg->u32_pool = dma_pool_create("CESA U32 Item Pool", + &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0); + if (!cpg->u32_pool) { + ret = -ENOMEM; + goto err_mapping; + } + if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) { + printk(KERN_ERR MV_CESA "failed to initialise poolsize\n"); + goto err_pool; + } + ret = crypto_register_alg(&mv_aes_alg_ecb); if (ret) { printk(KERN_WARNING MV_CESA "Could not register aes-ecb driver\n"); - goto err_irq; + goto err_poolsize; } ret = crypto_register_alg(&mv_aes_alg_cbc); @@ -1332,7 +1400,13 @@ static int mv_probe(struct platform_device *pdev) return 0; err_unreg_ecb: crypto_unregister_alg(&mv_aes_alg_ecb); -err_irq: +err_poolsize: + set_u32_poolsize(0); +err_pool: + dma_pool_destroy(cpg->u32_pool); +err_mapping: + dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); free_irq(irq, cp); err_thread: kthread_stop(cp->queue_th); @@ -1361,6 +1435,8 @@ static int mv_remove(struct platform_device *pdev) free_irq(cp->irq, cp); dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + set_u32_poolsize(0); + dma_pool_destroy(cpg->u32_pool); memset(cp->sram, 0, cp->sram_size); iounmap(cp->sram); iounmap(cp->reg); diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h index 08fcb11..866c437 100644 --- a/drivers/crypto/mv_cesa.h +++ b/drivers/crypto/mv_cesa.h @@ -24,6 +24,7 @@ #define SEC_CFG_CH1_W_IDMA (1 << 8) #define 
SEC_CFG_ACT_CH0_IDMA (1 << 9) #define SEC_CFG_ACT_CH1_IDMA (1 << 10) +#define SEC_CFG_MP_CHAIN (1 << 11) #define SEC_ACCEL_STATUS 0xde0c #define SEC_ST_ACT_0 (1 << 0) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
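[Editor's note] The core of patch 9 is the pool of four-byte DMA buffers: mv_dma_u32_copy() doubles the pool via set_u32_poolsize() whenever it runs full, then hands the next preallocated u32 to the DMA engine. The sketch below is a userspace model of that bookkeeping, not driver code: plain malloc() stands in for dma_pool_alloc(), so the daddr side is faked, but the grow/shrink logic follows the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace model of the patch's u32 buffer pool: a growable list of
 * virt/"dma" address pairs.  malloc() replaces dma_pool_alloc(), so
 * daddr is just the virtual address recast. */
struct u32_mempair {
	uint32_t *vaddr;
	uintptr_t daddr;	/* stand-in for dma_addr_t */
};

static struct u32_mempair *u32_list;
static int u32_list_len, u32_usage;

/* Mirror of set_u32_poolsize(): grow the index array if needed, then
 * allocate or free items until the pool length matches nelem. */
static int set_u32_poolsize(int nelem)
{
	if (nelem > u32_list_len) {
		struct u32_mempair *newmem =
			realloc(u32_list, nelem * sizeof(*u32_list));
		if (!newmem)
			return -1;
		u32_list = newmem;
	}
	for (; u32_list_len < nelem; u32_list_len++) {
		u32_list[u32_list_len].vaddr = malloc(sizeof(uint32_t));
		if (!u32_list[u32_list_len].vaddr)
			return -1;
		u32_list[u32_list_len].daddr =
			(uintptr_t)u32_list[u32_list_len].vaddr;
	}
	for (; u32_list_len > nelem; u32_list_len--)
		free(u32_list[u32_list_len - 1].vaddr);
	/* shrinking keeps the index array, except when emptying the pool */
	if (!nelem) {
		free(u32_list);
		u32_list = NULL;
	}
	return 0;
}

/* Mirror of mv_dma_u32_copy(): double the pool when full, then use the
 * next preallocated item; returns the slot index used, -1 on failure. */
static int u32_copy(uint32_t val)
{
	if (u32_usage == u32_list_len &&
	    set_u32_poolsize(u32_list_len ? u32_list_len << 1 : 1))
		return -1;
	*u32_list[u32_usage].vaddr = val;
	return u32_usage++;
}
```

The doubling gives amortized O(1) per config update, and because items come from a pool they keep stable addresses across resizes, which is what lets already-queued DMA descriptors stay valid while the list grows.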
* [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (8 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too Phil Sutter ` (4 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Check and exit early for whether CESA can be used at all. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 61 +++++++++++++++++++++++++--------------------- 1 files changed, 33 insertions(+), 28 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 7917d1a..9c65980 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -806,35 +806,13 @@ static void mv_start_new_hash_req(struct ahash_request *req) else ctx->extra_bytes = 0; - p->src_sg = req->src; - if (req->nbytes) { - BUG_ON(!req->src); - p->sg_src_left = req->src->length; - } - - if (hw_bytes) { - p->hw_nbytes = hw_bytes; - p->complete = mv_hash_algo_completion; - p->process = mv_update_hash_config; - - if (unlikely(old_extra_bytes)) { - dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, - SHA1_BLOCK_SIZE, DMA_TO_DEVICE); - mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, - ctx->buffer_dma, old_extra_bytes); - p->crypt_len = old_extra_bytes; + if (unlikely(!hw_bytes)) { /* too little data for CESA */ + if (req->nbytes) { + p->src_sg = req->src; + p->sg_src_left = req->src->length; + copy_src_to_buf(p, ctx->buffer + old_extra_bytes, + req->nbytes); } - - if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { - printk(KERN_ERR "%s: out of memory\n", __func__); - return; - } - - setup_data_in(); - mv_init_hash_config(req); - } else { - copy_src_to_buf(p, ctx->buffer 
+ old_extra_bytes, - ctx->extra_bytes - old_extra_bytes); if (ctx->last_chunk) rc = mv_hash_final_fallback(req); else @@ -843,7 +821,34 @@ static void mv_start_new_hash_req(struct ahash_request *req) local_bh_disable(); req->base.complete(&req->base, rc); local_bh_enable(); + return; } + + if (likely(req->nbytes)) { + BUG_ON(!req->src); + + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { + printk(KERN_ERR "%s: out of memory\n", __func__); + return; + } + p->sg_src_left = sg_dma_len(req->src); + p->src_sg = req->src; + } + + p->hw_nbytes = hw_bytes; + p->complete = mv_hash_algo_completion; + p->process = mv_update_hash_config; + + if (unlikely(old_extra_bytes)) { + dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, + ctx->buffer_dma, old_extra_bytes); + p->crypt_len = old_extra_bytes; + } + + setup_data_in(); + mv_init_hash_config(req); } static int queue_manag(void *data) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (9 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 12/13] mv_cesa: drop the now unused process callback Phil Sutter ` (3 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 89 ++++++++++++++++++---------------------------- 1 files changed, 35 insertions(+), 54 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 9c65980..86b73d1 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -538,34 +538,14 @@ static void mv_init_hash_config(struct ahash_request *req) sizeof(struct sec_accel_sram), DMA_TO_DEVICE); mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, sizeof(struct sec_accel_sram)); - - mv_dma_separator(); - - if (req->result) { - req_ctx->result_dma = dma_map_single(cpg->dev, req->result, - req_ctx->digestsize, DMA_FROM_DEVICE); - mv_dma_memcpy(req_ctx->result_dma, - cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); - } else { - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_dma_memcpy(cpg->sa_sram_dma, - cpg->sram_phys + SRAM_CONFIG, 1); - } - - /* GO */ - mv_setup_timer(); - mv_dma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } -static void mv_update_hash_config(void) +static void mv_update_hash_config(struct ahash_request *req) { - struct ahash_request *req = ahash_request_cast(cpg->cur_req); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; - struct sec_accel_config *op = 
&cpg->sa_sram.op; int is_last; + u32 val; /* update only the config (for changed fragment state) and * mac_digest (for changed frag len) fields */ @@ -573,10 +553,10 @@ static void mv_update_hash_config(void) switch (req_ctx->op) { case COP_SHA1: default: - op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + val = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; break; case COP_HMAC_SHA1: - op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + val = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; break; } @@ -584,36 +564,11 @@ static void mv_update_hash_config(void) && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes) && (req_ctx->count <= MAX_HW_HASH_SIZE); - op->config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG; - dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, - sizeof(u32), DMA_TO_DEVICE); - mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, - cpg->sa_sram_dma, sizeof(u32)); - - op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 6 * sizeof(u32), - sizeof(u32), DMA_TO_DEVICE); - mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), - cpg->sa_sram_dma + 6 * sizeof(u32), sizeof(u32)); - - mv_dma_separator(); - - if (req->result) { - req_ctx->result_dma = dma_map_single(cpg->dev, req->result, - req_ctx->digestsize, DMA_FROM_DEVICE); - mv_dma_memcpy(req_ctx->result_dma, - cpg->sram_phys + SRAM_DIGEST_BUF, req_ctx->digestsize); - } else { - /* XXX: this fixes some ugly register fuckup bug in the tdma engine - * (no need to sync since the data is ignored anyway) */ - mv_dma_memcpy(cpg->sa_sram_dma, - cpg->sram_phys + SRAM_CONFIG, 1); - } + val |= is_last ? 
CFG_LAST_FRAG : CFG_MID_FRAG; + mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG, val); - /* GO */ - mv_setup_timer(); - mv_dma_trigger(); - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); + val = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); + mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), val); } static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx, @@ -837,7 +792,6 @@ static void mv_start_new_hash_req(struct ahash_request *req) p->hw_nbytes = hw_bytes; p->complete = mv_hash_algo_completion; - p->process = mv_update_hash_config; if (unlikely(old_extra_bytes)) { dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, @@ -849,6 +803,33 @@ static void mv_start_new_hash_req(struct ahash_request *req) setup_data_in(); mv_init_hash_config(req); + mv_dma_separator(); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { + cpg->p.crypt_len = 0; + + setup_data_in(); + mv_update_hash_config(req); + mv_dma_separator(); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + } + if (req->result) { + ctx->result_dma = dma_map_single(cpg->dev, req->result, + ctx->digestsize, DMA_FROM_DEVICE); + mv_dma_memcpy(ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, + ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_dma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + } + + /* GO */ + mv_setup_timer(); + mv_dma_trigger(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static int queue_manag(void *data) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
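[Editor's note] Both chaining patches share one control flow: build descriptors for every fragment up front (a full config copy for the first, a four-byte config update plus separator for each following one), then start the engine exactly once. The sketch below is a userspace model of that loop only; `SRAM_WIN` and the eager memcpy() calls are stand-ins for the SRAM data window and the queued DMA transfers, which in the driver execute only after the single trigger at the end.

```c
#include <assert.h>
#include <string.h>

#define SRAM_WIN 8	/* illustrative stand-in for the SRAM window size */

/* Model of the fragment loop in mv_start_new_hash_req(): the input is
 * fed through a fixed-size window fragment by fragment.  In the driver
 * each iteration only queues DMA descriptors plus a separator; here the
 * copies run eagerly so the result can be checked. */
static size_t run_chain(const char *src, size_t total, char *dst)
{
	char sram[SRAM_WIN];
	size_t processed = 0;
	int triggers = 0;

	while (processed < total) {
		size_t crypt_len = total - processed;

		if (crypt_len > SRAM_WIN)
			crypt_len = SRAM_WIN;
		memcpy(sram, src + processed, crypt_len);  /* dma: buf -> sram */
		memcpy(dst + processed, sram, crypt_len);  /* dma: sram -> dst */
		processed += crypt_len;                    /* mv_dma_separator() */
	}
	triggers++;	/* mv_setup_timer(); mv_dma_trigger(); writel(GO) */
	assert(triggers == 1);
	return processed;
}
```

The point of the restructuring is visible in the shape of the loop: all per-fragment work happens before the one trigger, so the hardware no longer needs a software round trip between fragments.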
* [PATCH 12/13] mv_cesa: drop the now unused process callback 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (10 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-12 17:17 ` [PATCH 13/13] mv_cesa, mv_dma: outsource common dma-pool handling code Phil Sutter ` (2 subsequent siblings) 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu And while here, simplify dequeue_complete_req() a bit. Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/mv_cesa.c | 21 ++++++--------------- 1 files changed, 6 insertions(+), 15 deletions(-) diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 86b73d1..7b2b693 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -70,7 +70,6 @@ struct req_progress { struct scatterlist *src_sg; struct scatterlist *dst_sg; void (*complete) (void); - void (*process) (void); /* src mostly */ int sg_src_left; @@ -650,25 +649,17 @@ static void mv_hash_algo_completion(void) static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; - cpg->p.hw_processed_bytes += cpg->p.crypt_len; - cpg->p.crypt_len = 0; mv_dma_clear(); cpg->u32_usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); - if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { - /* process next scatter list entry */ - cpg->eng_st = ENGINE_BUSY; - setup_data_in(); - cpg->p.process(); - } else { - cpg->p.complete(); - cpg->eng_st = ENGINE_IDLE; - local_bh_disable(); - req->complete(req, 0); - local_bh_enable(); - } + + cpg->p.complete(); + cpg->eng_st = ENGINE_IDLE; + local_bh_disable(); + req->complete(req, 0); + local_bh_enable(); } static int count_sgs(struct scatterlist *sl, unsigned int total_bytes) -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 13/13] mv_cesa, mv_dma: outsource common dma-pool handling code 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (11 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 12/13] mv_cesa: drop the now unused process callback Phil Sutter @ 2012-06-12 17:17 ` Phil Sutter 2012-06-15 1:40 ` RFC: support for MV_CESA with IDMA or TDMA cloudy.linux 2012-06-16 0:20 ` [PATCH 0/2] Fixes " Simon Baatz 14 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-12 17:17 UTC (permalink / raw) To: linux-crypto; +Cc: Herbert Xu Signed-off-by: Phil Sutter <phil.sutter@viprinet.com> --- drivers/crypto/dma_desclist.h | 79 +++++++++++++++++++++++++++++++++++ drivers/crypto/mv_cesa.c | 81 +++++++++---------------------------- drivers/crypto/mv_dma.c | 91 ++++++++++++----------------------------- 3 files changed, 125 insertions(+), 126 deletions(-) create mode 100644 drivers/crypto/dma_desclist.h diff --git a/drivers/crypto/dma_desclist.h b/drivers/crypto/dma_desclist.h new file mode 100644 index 0000000..c471ad6 --- /dev/null +++ b/drivers/crypto/dma_desclist.h @@ -0,0 +1,79 @@ +#ifndef __DMA_DESCLIST__ +#define __DMA_DESCLIST__ + +struct dma_desc { + void *virt; + dma_addr_t phys; +}; + +struct dma_desclist { + struct dma_pool *itempool; + struct dma_desc *desclist; + unsigned long length; + unsigned long usage; +}; + +#define DESCLIST_ITEM(dl, x) ((dl).desclist[(x)].virt) +#define DESCLIST_ITEM_DMA(dl, x) ((dl).desclist[(x)].phys) +#define DESCLIST_FULL(dl) ((dl).length == (dl).usage) + +static inline int +init_dma_desclist(struct dma_desclist *dl, struct device *dev, + size_t size, size_t align, size_t boundary) +{ +#define STRX(x) #x +#define STR(x) STRX(x) + dl->itempool = dma_pool_create( + "DMA Desclist Pool at "__FILE__"("STR(__LINE__)")", + dev, size, align, boundary); +#undef STR +#undef STRX + if (!dl->itempool) + return 1; + dl->desclist = NULL; + dl->length = dl->usage = 0; + return 0; +} + +static inline int 
+set_dma_desclist_size(struct dma_desclist *dl, unsigned long nelem) +{ + /* need to increase size first if requested */ + if (nelem > dl->length) { + struct dma_desc *newmem; + int newsize = nelem * sizeof(struct dma_desc); + + newmem = krealloc(dl->desclist, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + dl->desclist = newmem; + } + + /* allocate/free dma descriptors, adjusting dl->length on the go */ + for (; dl->length < nelem; dl->length++) { + DESCLIST_ITEM(*dl, dl->length) = dma_pool_alloc(dl->itempool, + GFP_KERNEL, &DESCLIST_ITEM_DMA(*dl, dl->length)); + if (!DESCLIST_ITEM(*dl, dl->length)) + return -ENOMEM; + } + for (; dl->length > nelem; dl->length--) + dma_pool_free(dl->itempool, DESCLIST_ITEM(*dl, dl->length - 1), + DESCLIST_ITEM_DMA(*dl, dl->length - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(dl->desclist); + dl->desclist = 0; + } + return 0; +} + +static inline void +fini_dma_desclist(struct dma_desclist *dl) +{ + set_dma_desclist_size(dl, 0); + dma_pool_destroy(dl->itempool); + dl->length = dl->usage = 0; +} + +#endif /* __DMA_DESCLIST__ */ diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 7b2b693..2a9fe8a 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -24,6 +24,7 @@ #include "mv_cesa.h" #include "mv_dma.h" +#include "dma_desclist.h" #define MV_CESA "MV-CESA:" #define MAX_HW_HASH_SIZE 0xFFFF @@ -100,11 +101,6 @@ struct sec_accel_sram { #define sa_ivo type.hash.ivo } __attribute__((packed)); -struct u32_mempair { - u32 *vaddr; - dma_addr_t daddr; -}; - struct crypto_priv { struct device *dev; void __iomem *reg; @@ -129,14 +125,14 @@ struct crypto_priv { struct sec_accel_sram sa_sram; dma_addr_t sa_sram_dma; - struct dma_pool *u32_pool; - struct u32_mempair *u32_list; - int u32_list_len; - int u32_usage; + struct dma_desclist desclist; }; static struct crypto_priv *cpg; +#define ITEM(x) ((u32 *)DESCLIST_ITEM(cpg->desclist, x)) +#define ITEM_DMA(x) 
DESCLIST_ITEM_DMA(cpg->desclist, x) + struct mv_ctx { u8 aes_enc_key[AES_KEY_LEN]; u32 aes_dec_key[8]; @@ -204,52 +200,17 @@ static void mv_setup_timer(void) jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); } -#define U32_ITEM(x) (cpg->u32_list[x].vaddr) -#define U32_ITEM_DMA(x) (cpg->u32_list[x].daddr) - -static inline int set_u32_poolsize(int nelem) -{ - /* need to increase size first if requested */ - if (nelem > cpg->u32_list_len) { - struct u32_mempair *newmem; - int newsize = nelem * sizeof(struct u32_mempair); - - newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL); - if (!newmem) - return -ENOMEM; - cpg->u32_list = newmem; - } - - /* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */ - for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) { - U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool, - GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len)); - if (!U32_ITEM((cpg->u32_list_len))) - return -ENOMEM; - } - for (; cpg->u32_list_len > nelem; cpg->u32_list_len--) - dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1), - U32_ITEM_DMA(cpg->u32_list_len - 1)); - - /* ignore size decreases but those to zero */ - if (!nelem) { - kfree(cpg->u32_list); - cpg->u32_list = 0; - } - return 0; -} - static inline void mv_dma_u32_copy(dma_addr_t dst, u32 val) { - if (unlikely(cpg->u32_usage == cpg->u32_list_len) - && set_u32_poolsize(cpg->u32_list_len << 1)) { - printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n", - cpg->u32_list_len << 1); + if (unlikely(DESCLIST_FULL(cpg->desclist)) && + set_dma_desclist_size(&cpg->desclist, cpg->desclist.length << 1)) { + printk(KERN_ERR MV_CESA "resizing poolsize to %lu failed\n", + cpg->desclist.length << 1); return; } - *(U32_ITEM(cpg->u32_usage)) = val; - mv_dma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32)); - cpg->u32_usage++; + *ITEM(cpg->desclist.usage) = val; + mv_dma_memcpy(dst, ITEM_DMA(cpg->desclist.usage), sizeof(u32)); + cpg->desclist.usage++; } static inline bool @@ -651,7 
+612,7 @@ static void dequeue_complete_req(void) struct crypto_async_request *req = cpg->cur_req; mv_dma_clear(); - cpg->u32_usage = 0; + cpg->desclist.usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); @@ -1335,13 +1296,12 @@ static int mv_probe(struct platform_device *pdev) cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); - cpg->u32_pool = dma_pool_create("CESA U32 Item Pool", - &pdev->dev, sizeof(u32), MV_DMA_ALIGN, 0); - if (!cpg->u32_pool) { + if (init_dma_desclist(&cpg->desclist, &pdev->dev, + sizeof(u32), MV_DMA_ALIGN, 0)) { ret = -ENOMEM; goto err_mapping; } - if (set_u32_poolsize(MV_DMA_INIT_POOLSIZE)) { + if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) { printk(KERN_ERR MV_CESA "failed to initialise poolsize\n"); goto err_pool; } @@ -1350,7 +1310,7 @@ static int mv_probe(struct platform_device *pdev) if (ret) { printk(KERN_WARNING MV_CESA "Could not register aes-ecb driver\n"); - goto err_poolsize; + goto err_pool; } ret = crypto_register_alg(&mv_aes_alg_cbc); @@ -1377,10 +1337,8 @@ static int mv_probe(struct platform_device *pdev) return 0; err_unreg_ecb: crypto_unregister_alg(&mv_aes_alg_ecb); -err_poolsize: - set_u32_poolsize(0); err_pool: - dma_pool_destroy(cpg->u32_pool); + fini_dma_desclist(&cpg->desclist); err_mapping: dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); @@ -1412,8 +1370,7 @@ static int mv_remove(struct platform_device *pdev) free_irq(cp->irq, cp); dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, sizeof(struct sec_accel_sram), DMA_TO_DEVICE); - set_u32_poolsize(0); - dma_pool_destroy(cpg->u32_pool); + fini_dma_desclist(&cpg->desclist); memset(cp->sram, 0, cp->sram_size); iounmap(cp->sram); iounmap(cp->reg); diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c index 24c5256..b84ff80 100644 --- a/drivers/crypto/mv_dma.c +++ b/drivers/crypto/mv_dma.c @@ -17,6 +17,7 @@ #include <linux/platform_device.h> #include 
"mv_dma.h" +#include "dma_desclist.h" #define MV_DMA "MV-DMA: " @@ -30,11 +31,6 @@ struct mv_dma_desc { u32 next; } __attribute__((packed)); -struct desc_mempair { - struct mv_dma_desc *vaddr; - dma_addr_t daddr; -}; - struct mv_dma_priv { bool idma_registered, tdma_registered; struct device *dev; @@ -42,47 +38,12 @@ struct mv_dma_priv { int irq; /* protecting the dma descriptors and stuff */ spinlock_t lock; - struct dma_pool *descpool; - struct desc_mempair *desclist; - int desclist_len; - int desc_usage; + struct dma_desclist desclist; u32 (*print_and_clear_irq)(void); } tpg; -#define DESC(x) (tpg.desclist[x].vaddr) -#define DESC_DMA(x) (tpg.desclist[x].daddr) - -static inline int set_poolsize(int nelem) -{ - /* need to increase size first if requested */ - if (nelem > tpg.desclist_len) { - struct desc_mempair *newmem; - int newsize = nelem * sizeof(struct desc_mempair); - - newmem = krealloc(tpg.desclist, newsize, GFP_KERNEL); - if (!newmem) - return -ENOMEM; - tpg.desclist = newmem; - } - - /* allocate/free dma descriptors, adjusting tpg.desclist_len on the go */ - for (; tpg.desclist_len < nelem; tpg.desclist_len++) { - DESC(tpg.desclist_len) = dma_pool_alloc(tpg.descpool, - GFP_KERNEL, &DESC_DMA(tpg.desclist_len)); - if (!DESC((tpg.desclist_len))) - return -ENOMEM; - } - for (; tpg.desclist_len > nelem; tpg.desclist_len--) - dma_pool_free(tpg.descpool, DESC(tpg.desclist_len - 1), - DESC_DMA(tpg.desclist_len - 1)); - - /* ignore size decreases but those to zero */ - if (!nelem) { - kfree(tpg.desclist); - tpg.desclist = 0; - } - return 0; -} +#define ITEM(x) ((struct mv_dma_desc *)DESCLIST_ITEM(tpg.desclist, x)) +#define ITEM_DMA(x) DESCLIST_ITEM_DMA(tpg.desclist, x) static inline void wait_for_dma_idle(void) { @@ -102,17 +63,18 @@ static inline void switch_dma_engine(bool state) static struct mv_dma_desc *get_new_last_desc(void) { - if (unlikely(tpg.desc_usage == tpg.desclist_len) && - set_poolsize(tpg.desclist_len << 1)) { - printk(KERN_ERR MV_DMA "failed to 
increase DMA pool to %d\n", - tpg.desclist_len << 1); + if (unlikely(DESCLIST_FULL(tpg.desclist)) && + set_dma_desclist_size(&tpg.desclist, tpg.desclist.length << 1)) { + printk(KERN_ERR MV_DMA "failed to increase DMA pool to %lu\n", + tpg.desclist.length << 1); return NULL; } - if (likely(tpg.desc_usage)) - DESC(tpg.desc_usage - 1)->next = DESC_DMA(tpg.desc_usage); + if (likely(tpg.desclist.usage)) + ITEM(tpg.desclist.usage - 1)->next = + ITEM_DMA(tpg.desclist.usage); - return DESC(tpg.desc_usage++); + return ITEM(tpg.desclist.usage++); } static inline void mv_dma_desc_dump(void) @@ -120,17 +82,17 @@ static inline void mv_dma_desc_dump(void) struct mv_dma_desc *tmp; int i; - if (!tpg.desc_usage) { + if (!tpg.desclist.usage) { printk(KERN_WARNING MV_DMA "DMA descriptor list is empty\n"); return; } printk(KERN_WARNING MV_DMA "DMA descriptor list:\n"); - for (i = 0; i < tpg.desc_usage; i++) { - tmp = DESC(i); + for (i = 0; i < tpg.desclist.usage; i++) { + tmp = ITEM(i); printk(KERN_WARNING MV_DMA "entry %d at 0x%x: dma addr 0x%x, " "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i, - (u32)tmp, DESC_DMA(i) , tmp->src, tmp->dst, + (u32)tmp, ITEM_DMA(i) , tmp->src, tmp->dst, tmp->count & DMA_BYTE_COUNT_MASK, !!(tmp->count & DMA_OWN_BIT), tmp->next); } @@ -176,7 +138,7 @@ void mv_dma_clear(void) /* clear descriptor registers */ mv_dma_clear_desc_reg(); - tpg.desc_usage = 0; + tpg.desclist.usage = 0; switch_dma_engine(1); @@ -192,7 +154,7 @@ void mv_dma_trigger(void) spin_lock(&tpg.lock); - writel(DESC_DMA(0), tpg.reg + DMA_NEXT_DESC); + writel(ITEM_DMA(0), tpg.reg + DMA_NEXT_DESC); spin_unlock(&tpg.lock); } @@ -331,13 +293,15 @@ static int mv_init_engine(struct platform_device *pdev, } /* initialise DMA descriptor list */ - tpg.descpool = dma_pool_create("MV_DMA Descriptor Pool", tpg.dev, - sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0); - if (!tpg.descpool) { + if (init_dma_desclist(&tpg.desclist, tpg.dev, + sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) { rc = -ENOMEM; 
goto out_free_irq; } - set_poolsize(MV_DMA_INIT_POOLSIZE); + if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) { + rc = -ENOMEM; + goto out_free_desclist; + } platform_set_drvdata(pdev, &tpg); @@ -364,8 +328,8 @@ static int mv_init_engine(struct platform_device *pdev, out_free_all: switch_dma_engine(0); platform_set_drvdata(pdev, NULL); - set_poolsize(0); - dma_pool_destroy(tpg.descpool); +out_free_desclist: + fini_dma_desclist(&tpg.desclist); out_free_irq: free_irq(tpg.irq, &tpg); out_unmap_reg: @@ -378,8 +342,7 @@ static int mv_remove(struct platform_device *pdev) { switch_dma_engine(0); platform_set_drvdata(pdev, NULL); - set_poolsize(0); - dma_pool_destroy(tpg.descpool); + fini_dma_desclist(&tpg.desclist); free_irq(tpg.irq, &tpg); iounmap(tpg.reg); tpg.dev = NULL; -- 1.7.3.4 ^ permalink raw reply related [flat|nested] 67+ messages in thread
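[Editor's note] What the new dma_desclist.h buys is type erasure: one virt/phys bookkeeping structure backs both users, with mv_cesa viewing items as u32 and mv_dma as struct mv_dma_desc through a per-user cast macro. The sketch below models that pattern in userspace under stated assumptions: malloc() replaces dma_pool_alloc(), the init function is simplified to a fixed element count, and only the macros mirror the header closely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* One opaque item representation serves all users. */
struct dma_desc {
	void *virt;
	uintptr_t phys;		/* stand-in for dma_addr_t */
};

struct dma_desclist {
	struct dma_desc *desclist;
	unsigned long length, usage;
};

#define DESCLIST_ITEM(dl, x)     ((dl).desclist[(x)].virt)
#define DESCLIST_ITEM_DMA(dl, x) ((dl).desclist[(x)].phys)
#define DESCLIST_FULL(dl)        ((dl).length == (dl).usage)

/* Simplified stand-in for init_dma_desclist() + set_dma_desclist_size():
 * allocate nelem items of item_size bytes up front. */
static int init_dma_desclist(struct dma_desclist *dl, size_t item_size,
			     unsigned long nelem)
{
	dl->desclist = calloc(nelem, sizeof(struct dma_desc));
	if (!dl->desclist)
		return -1;
	for (dl->length = 0; dl->length < nelem; dl->length++) {
		void *p = malloc(item_size);	/* dma_pool_alloc() */

		if (!p)
			return -1;
		DESCLIST_ITEM(*dl, dl->length) = p;
		DESCLIST_ITEM_DMA(*dl, dl->length) = (uintptr_t)p;
	}
	dl->usage = 0;
	return 0;
}

/* Each user defines its own typed view, as mv_cesa and mv_dma do: */
struct mv_dma_desc {
	uint32_t src, dst, count, next;
};
#define U32_ITEM(dl, x)  ((uint32_t *)DESCLIST_ITEM(dl, x))
#define DESC_ITEM(dl, x) ((struct mv_dma_desc *)DESCLIST_ITEM(dl, x))
```

In the kernel version the doubling-on-full growth and pool teardown sit on top of this same layout, so both drivers drop their copy-pasted resize code and keep only the one-line cast macros.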
* Re: RFC: support for MV_CESA with IDMA or TDMA 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (12 preceding siblings ...) 2012-06-12 17:17 ` [PATCH 13/13] mv_cesa, mv_dma: outsource common dma-pool handling code Phil Sutter @ 2012-06-15 1:40 ` cloudy.linux 2012-06-15 9:51 ` Phil Sutter 2012-06-16 0:20 ` [PATCH 0/2] Fixes " Simon Baatz 14 siblings, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-06-15 1:40 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, Herbert Xu On 2012-6-13 1:17, Phil Sutter wrote: > Hi, > > The following patch series adds support for the TDMA engine built into > Marvell's Kirkwood-based SoCs as well as the IDMA engine built into > Marvell's Orion-based SoCs and enhances mv_cesa.c in order to use it for > speeding up crypto operations. The hardware contains a security > accelerator, which can control DMA as well as crypto engines. It allows > for operation with minimal software intervention, which the following > patches implement: using a chain of DMA descriptors, data input, > configuration, engine startup and data output repeat fully automatically > until the whole input data has been handled. > > The point for this being RFC is lack of hardware on my side for testing > the IDMA support. I'd highly appreciate if someone with Orion hardware > could test this, preferably using the hmac_comp tool shipped with > cryptodev-linux as it does a more extensive testing (with bigger buffer > sizes at least) than tcrypt or the standard kernel-internal use cases. > > Greetings, Phil > > -- > To unsubscribe from this list: send the line "unsubscribe linux-crypto" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html I would like to have a try on those patches. But what version of kernel should I apply those patches on? Thanks. ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: RFC: support for MV_CESA with IDMA or TDMA 2012-06-15 1:40 ` RFC: support for MV_CESA with IDMA or TDMA cloudy.linux @ 2012-06-15 9:51 ` Phil Sutter 0 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-15 9:51 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, Herbert Xu Hi, On Fri, Jun 15, 2012 at 09:40:28AM +0800, cloudy.linux wrote: > I would like to have a try on those patches. But what version of kernel > should I apply those patches on? Sorry for the confusion caused. I have applied those patches to Linus' git tree, preceded by the three accepted ones of the earlier four. Yay. Long story short, please just fetch git://nwl.cc/~n0-1/linux.git and check out the 'cesa-dma' branch. It's exactly what I formatted the patches from. Greetings, Phil Phil Sutter Software Engineer -- Viprinet GmbH Mainzer Str. 43 55411 Bingen am Rhein Germany Phone/Zentrale: +49-6721-49030-0 Direct line/Durchwahl: +49-6721-49030-134 Fax: +49-6721-49030-209 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein Commercial register/Handelsregister: Amtsgericht Mainz HRB40380 CEO/Geschäftsführer: Simon Kissel ^ permalink raw reply [flat|nested] 67+ messages in thread
* [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter ` (13 preceding siblings ...) 2012-06-15 1:40 ` RFC: support for MV_CESA with IDMA or TDMA cloudy.linux @ 2012-06-16 0:20 ` Simon Baatz 2012-06-16 0:20 ` [PATCH 1/2] mv_dma: fix mv_init_engine() error case Simon Baatz ` (2 more replies) 14 siblings, 3 replies; 67+ messages in thread From: Simon Baatz @ 2012-06-16 0:20 UTC (permalink / raw) To: phil.sutter; +Cc: linux-crypto Hi Phil, thanks for providing these patches; it's great to finally see DMA support for CESA in the kernel. Additionally, the implementation seems to be fine regarding cache incoherencies (at least my test in [*] works). I have two patches for your patchset... - Fix for mv_init_engine error handling - My system locked up hard when mv_dma and mv_cesa were built as modules. mv_cesa has code to enable the crypto clock in 3.5, but mv_dma already accesses the CESA engine before. Thus, we need to enable this clock here, too. [*] http://www.spinics.net/lists/arm-kernel/msg176913.html Simon Baatz (2): mv_dma: fix mv_init_engine() error case ARM: Orion: mv_dma: Add support for clk arch/arm/mach-kirkwood/common.c | 1 + drivers/crypto/mv_dma.c | 18 +++++++++++++++--- 2 files changed, 16 insertions(+), 3 deletions(-) -- 1.7.9.5 ^ permalink raw reply [flat|nested] 67+ messages in thread
* [PATCH 1/2] mv_dma: fix mv_init_engine() error case 2012-06-16 0:20 ` [PATCH 0/2] Fixes " Simon Baatz @ 2012-06-16 0:20 ` Simon Baatz 2012-06-16 0:20 ` [PATCH 2/2] ARM: Orion: mv_dma: Add support for clk Simon Baatz 2012-06-18 13:47 ` [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA Phil Sutter 2 siblings, 0 replies; 67+ messages in thread From: Simon Baatz @ 2012-06-16 0:20 UTC (permalink / raw) To: phil.sutter; +Cc: linux-crypto Fix wrongly placed free_irq in mv_init_engine() error recovery. In fact, we can remove the respective label, since request_irq() is the last thing the function does anyway. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> --- drivers/crypto/mv_dma.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c index b84ff80..125dfee 100644 --- a/drivers/crypto/mv_dma.c +++ b/drivers/crypto/mv_dma.c @@ -296,7 +296,7 @@ static int mv_init_engine(struct platform_device *pdev, if (init_dma_desclist(&tpg.desclist, tpg.dev, sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) { rc = -ENOMEM; - goto out_free_irq; + goto out_unmap_reg; } if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) { rc = -ENOMEM; @@ -330,8 +330,6 @@ out_free_all: platform_set_drvdata(pdev, NULL); out_free_desclist: fini_dma_desclist(&tpg.desclist); -out_free_irq: - free_irq(tpg.irq, &tpg); out_unmap_reg: iounmap(tpg.reg); tpg.dev = NULL; -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* [PATCH 2/2] ARM: Orion: mv_dma: Add support for clk 2012-06-16 0:20 ` [PATCH 0/2] Fixes " Simon Baatz 2012-06-16 0:20 ` [PATCH 1/2] mv_dma: fix mv_init_engine() error case Simon Baatz @ 2012-06-16 0:20 ` Simon Baatz 2012-06-18 13:47 ` [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA Phil Sutter 2 siblings, 0 replies; 67+ messages in thread From: Simon Baatz @ 2012-06-16 0:20 UTC (permalink / raw) To: phil.sutter; +Cc: linux-crypto mv_dma needs the crypto clock. Some orion platforms support gating of the clock. If the clock exists enable/disable it as appropriate. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> --- arch/arm/mach-kirkwood/common.c | 1 + drivers/crypto/mv_dma.c | 14 ++++++++++++++ 2 files changed, 15 insertions(+) diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c index 560b920..e7bbc60 100644 --- a/arch/arm/mach-kirkwood/common.c +++ b/arch/arm/mach-kirkwood/common.c @@ -234,6 +234,7 @@ void __init kirkwood_clk_init(void) orion_clkdev_add(NULL, "orion-ehci.0", usb0); orion_clkdev_add(NULL, "orion_nand", runit); orion_clkdev_add(NULL, "mvsdio", sdio); + orion_clkdev_add(NULL, "mv_tdma", crypto); orion_clkdev_add(NULL, "mv_crypto", crypto); orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".0", xor0); orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".1", xor1); diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c index 125dfee..9fdb7be 100644 --- a/drivers/crypto/mv_dma.c +++ b/drivers/crypto/mv_dma.c @@ -13,6 +13,7 @@ #include <linux/dmapool.h> #include <linux/interrupt.h> #include <linux/module.h> +#include <linux/clk.h> #include <linux/slab.h> #include <linux/platform_device.h> @@ -36,6 +37,7 @@ struct mv_dma_priv { struct device *dev; void __iomem *reg; int irq; + struct clk *clk; /* protecting the dma descriptors and stuff */ spinlock_t lock; struct dma_desclist desclist; @@ -292,6 +294,12 @@ static int mv_init_engine(struct platform_device *pdev, goto out_unmap_reg; } + /* Not all platforms can gate the clock, so it is 
not + an error if the clock does not exist. */ + tpg.clk = clk_get(&pdev->dev, NULL); + if (!IS_ERR(tpg.clk)) + clk_prepare_enable(tpg.clk); + /* initialise DMA descriptor list */ if (init_dma_desclist(&tpg.desclist, tpg.dev, sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) { @@ -343,6 +351,12 @@ static int mv_remove(struct platform_device *pdev) fini_dma_desclist(&tpg.desclist); free_irq(tpg.irq, &tpg); iounmap(tpg.reg); + + if (!IS_ERR(tpg.clk)) { + clk_disable_unprepare(tpg.clk); + clk_put(tpg.clk); + } + tpg.dev = NULL; return 0; } -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-16 0:20 ` [PATCH 0/2] Fixes " Simon Baatz 2012-06-16 0:20 ` [PATCH 1/2] mv_dma: fix mv_init_engine() error case Simon Baatz 2012-06-16 0:20 ` [PATCH 2/2] ARM: Orion: mv_dma: Add support for clk Simon Baatz @ 2012-06-18 13:47 ` Phil Sutter 2012-06-18 20:12 ` Simon Baatz 2012-06-26 20:31 ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Simon Baatz 2 siblings, 2 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-18 13:47 UTC (permalink / raw) To: Simon Baatz; +Cc: linux-crypto, cloudy.linux Hi Simon, On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote: > thanks for providing these patches; it's great to finally see DMA > support for CESA in the kernel. Additionally, the implementation seems > to be fine regarding cache incoherencies (at least my test in [*] > works). Thanks for testing and the fixes. Could you also specify the platform you are testing on? > I have two patches for your patchset... > > - Fix for mv_init_engine error handling > > - My system locked up hard when mv_dma and mv_cesa were built as > modules. mv_cesa has code to enable the crypto clock in 3.5, but > mv_dma already accesses the CESA engine before. Thus, we need to > enable this clock here, too. I have folded them into my patch series, thanks again. I somewhat miss the orion_clkdev_add() part for orion5x platforms, but also fail to find any equivalent place in the corresponding subdirectory. So I hope it is OK like this. The updated patch series is available at git://nwl.cc/~n0-1/linux.git, branch 'cesa-dma'. My push changed history, so you have to either reset --hard to its HEAD, or rebase skipping the outdated patches. Greetings, Phil ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-18 13:47 ` [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA Phil Sutter @ 2012-06-18 20:12 ` Simon Baatz 2012-06-19 11:51 ` Phil Sutter 2012-06-20 13:31 ` cloudy.linux 2012-06-26 20:31 ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Simon Baatz 1 sibling, 2 replies; 67+ messages in thread From: Simon Baatz @ 2012-06-18 20:12 UTC (permalink / raw) To: Phil Sutter; +Cc: Simon Baatz, linux-crypto, cloudy.linux, andrew Hi Phil, On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote: > On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote: > > thanks for providing these patches; it's great to finally see DMA > > support for CESA in the kernel. Additionally, the implementation seems > > to be fine regarding cache incoherencies (at least my test in [*] > > works). > > Thanks for testing and the fixes. Could you also specify the platform > you are testing on? This is a Marvell Kirkwood MV88F6281-A1. I see one effect that I don't fully understand. Similar to the previous implementation, the system is mostly in kernel space when accessing an encrypted dm-crypt device: # cryptsetup --cipher=aes-cbc-plain --key-size=128 create c_sda2 /dev/sda2 Enter passphrase: # dd if=/dev/mapper/c_sda2 of=/dev/null bs=64k count=2048 2048+0 records in 2048+0 records out 134217728 bytes (134 MB) copied, 10.7324 s, 12.5 MB/s Doing an "mpstat 1" at the same time gives: 21:21:42 CPU %usr %nice %sys %iowait %irq %soft ... 21:21:45 all 0.00 0.00 0.00 0.00 0.00 0.00 21:21:46 all 0.00 0.00 79.00 0.00 0.00 2.00 21:21:47 all 0.00 0.00 95.00 0.00 0.00 5.00 21:21:48 all 0.00 0.00 94.00 0.00 0.00 6.00 21:21:49 all 0.00 0.00 96.00 0.00 0.00 4.00 ... 
The underlying device is a SATA drive and should not be the limit: # dd if=/dev/sda2 of=/dev/null bs=64k count=2048 2048+0 records in 2048+0 records out 134217728 bytes (134 MB) copied, 1.79804 s, 74.6 MB/s I did not dare hope the DMA implementation to be much faster than the old one, but I would have expected a rather low CPU usage using DMA. Do you have an idea where the kernel spends its time? (Am I hitting a non/only partially accelerated path here?) > > - My system locked up hard when mv_dma and mv_cesa were built as > > modules. mv_cesa has code to enable the crypto clock in 3.5, but > > mv_dma already accesses the CESA engine before. Thus, we need to > > enable this clock here, too. > > I have folded them into my patch series, thanks again. I somewhat miss > the orion_clkdev_add() part for orion5x platforms, but also fail to find > any equivalent place in the correspondent subdirectory. So I hope it is > OK like this. The change follows the original clk changes by Andrew. I don't know orion5x, but apparently, only kirkwood has such fine grained clock gates: /* Create clkdev entries for all orion platforms except kirkwood. Kirkwood has gated clocks for some of its peripherals, so creates its own clkdev entries. For all the other orion devices, create clkdev entries to the tclk. */ (from plat-orion/common.c) This is why the clock enabling code in the modules ignores the case that the clock can't be found. I think the clocks defined by plat-orion are for those drivers that need the actual TCLK rate (but there is no clock gate functionality here). - Simon ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-18 20:12 ` Simon Baatz @ 2012-06-19 11:51 ` Phil Sutter 2012-06-19 15:09 ` cloudy.linux 2012-06-20 13:31 ` cloudy.linux 1 sibling, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-06-19 11:51 UTC (permalink / raw) To: Simon Baatz; +Cc: linux-crypto, cloudy.linux, andrew Hi Simon, On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote: > On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote: > > On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote: > > > thanks for providing these patches; it's great to finally see DMA > > > support for CESA in the kernel. Additionally, the implementation seems > > > to be fine regarding cache incoherencies (at least my test in [*] > > > works). > > > > Thanks for testing and the fixes. Could you also specify the platform > > you are testing on? > > This is a Marvell Kirkwood MV88F6281-A1. OK, thanks. Just wanted to be sure it's not already the Orion test I'm hoping for. :) > I see one effect that I don't fully understand. > Similar to the previous implementation, the system is mostly in > kernel space when accessing an encrypted dm-crypt device: > > # cryptsetup --cipher=aes-cbc-plain --key-size=128 create c_sda2 /dev/sda2 > Enter passphrase: > # dd if=/dev/mapper/c_sda2 of=/dev/null bs=64k count=2048 > 2048+0 records in > 2048+0 records out > 134217728 bytes (134 MB) copied, 10.7324 s, 12.5 MB/s > > Doing an "mpstat 1" at the same time gives: > > 21:21:42 CPU %usr %nice %sys %iowait %irq %soft ... > 21:21:45 all 0.00 0.00 0.00 0.00 0.00 0.00 > 21:21:46 all 0.00 0.00 79.00 0.00 0.00 2.00 > 21:21:47 all 0.00 0.00 95.00 0.00 0.00 5.00 > 21:21:48 all 0.00 0.00 94.00 0.00 0.00 6.00 > 21:21:49 all 0.00 0.00 96.00 0.00 0.00 4.00 > ... 
> > The underlying device is a SATA drive and should not be the limit: > > # dd if=/dev/sda2 of=/dev/null bs=64k count=2048 > 2048+0 records in > 2048+0 records out > 134217728 bytes (134 MB) copied, 1.79804 s, 74.6 MB/s > > I did not dare hope the DMA implementation to be much faster than the > old one, but I would have expected a rather low CPU usage using DMA. > Do you have an idea where the kernel spends its time? (Am I hitting > a non/only partially accelerated path here?) Hmm. Though you passed bs=64k to dd, block sizes may still be the bottleneck. No idea if the parameter is really passed down to dm-crypt or if that uses the underlying device's block size anyway. I just did a short speed test on the 2.6.39.2 we're using productively: | Testing AES-128-CBC cipher: | Encrypting in chunks of 512 bytes: done. 46.19 MB in 5.00 secs: 9.24 MB/sec | Encrypting in chunks of 1024 bytes: done. 81.82 MB in 5.00 secs: 16.36 MB/sec | Encrypting in chunks of 2048 bytes: done. 124.63 MB in 5.00 secs: 24.93 MB/sec | Encrypting in chunks of 4096 bytes: done. 162.88 MB in 5.00 secs: 32.58 MB/sec | Encrypting in chunks of 8192 bytes: done. 200.47 MB in 5.00 secs: 40.09 MB/sec | Encrypting in chunks of 16384 bytes: done. 226.61 MB in 5.00 secs: 45.32 MB/sec | Encrypting in chunks of 32768 bytes: done. 242.78 MB in 5.00 secs: 48.55 MB/sec | Encrypting in chunks of 65536 bytes: done. 251.85 MB in 5.00 secs: 50.36 MB/sec | | Testing AES-256-CBC cipher: | Encrypting in chunks of 512 bytes: done. 45.15 MB in 5.00 secs: 9.03 MB/sec | Encrypting in chunks of 1024 bytes: done. 78.72 MB in 5.00 secs: 15.74 MB/sec | Encrypting in chunks of 2048 bytes: done. 117.59 MB in 5.00 secs: 23.52 MB/sec | Encrypting in chunks of 4096 bytes: done. 151.59 MB in 5.00 secs: 30.32 MB/sec | Encrypting in chunks of 8192 bytes: done. 182.95 MB in 5.00 secs: 36.59 MB/sec | Encrypting in chunks of 16384 bytes: done. 204.00 MB in 5.00 secs: 40.80 MB/sec | Encrypting in chunks of 32768 bytes: done. 
216.17 MB in 5.00 secs: 43.23 MB/sec | Encrypting in chunks of 65536 bytes: done. 223.22 MB in 5.00 secs: 44.64 MB/sec Observing top while it was running revealed that system load was decreasing with increased block sizes - ~75% at 512B, ~20% at 32kB. I fear this is a limitation we have to live with: the overhead of setting up DMA descriptors and handling the returned data is quite high, especially when compared to the time it takes the engine to encrypt 512B. I was playing around with descriptor preparation at some point (i.e. preparing the next descriptor chain while the engine is active), but without satisfactory results. Maybe I should have another look at it, especially regarding the case of small chunk sizes. OTOH this all makes sense only when used asynchronously, and I have no idea whether dm-crypt (or fellows like IPsec) makes use of that interface at all. > > > - My system locked up hard when mv_dma and mv_cesa were built as > > > modules. mv_cesa has code to enable the crypto clock in 3.5, but > > > mv_dma already accesses the CESA engine before. Thus, we need to > > > enable this clock here, too. > > > > I have folded them into my patch series, thanks again. I somewhat miss > > the orion_clkdev_add() part for orion5x platforms, but also fail to find > > any equivalent place in the correspondent subdirectory. So I hope it is > > OK like this. > > The change follows the original clk changes by Andrew. I don't know > orion5x, but apparently, only kirkwood has such fine grained clock > gates: > > /* Create clkdev entries for all orion platforms except kirkwood. > Kirkwood has gated clocks for some of its peripherals, so creates > its own clkdev entries. For all the other orion devices, create > clkdev entries to the tclk. */ > > (from plat-orion/common.c) > > This is why the clock enabling code in the modules ignores the case > that the clock can't be found.
> I think the clocks defined by > plat-orion are for those drivers that need the actual TCLK rate (but > there is no clock gate functionality here). Ah, OK. Reading helps, they say. Thanks anyway for your explanation. Greetings, Phil ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-19 11:51 ` Phil Sutter @ 2012-06-19 15:09 ` cloudy.linux 2012-06-19 17:13 ` Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-06-19 15:09 UTC (permalink / raw) To: Phil Sutter; +Cc: Simon Baatz, linux-crypto, andrew On 2012-6-19 19:51, Phil Sutter wrote: > Hi Simon, > > On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote: >> On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote: >>> On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote: >>>> thanks for providing these patches; it's great to finally see DMA >>>> support for CESA in the kernel. Additionally, the implementation seems >>>> to be fine regarding cache incoherencies (at least my test in [*] >>>> works). >>> >>> Thanks for testing and the fixes. Could you also specify the platform >>> you are testing on? >> >> This is a Marvell Kirkwood MV88F6281-A1. > > OK, thanks. Just wanted to be sure it's not already the Orion test I'm > hoping for. :) > OK, here comes the Orion test result - Linkstation Pro with 88F5182 A2. I didn't enable any debug options yet (in fact, I don't know which ones should be enabled). Hope the mv_cesa- and mv_dma-related kernel messages below could be helpful though: ... MV-DMA: IDMA engine up and running, IRQ 23 MV-DMA: idma_print_and_clear_irq: address miss @0! 
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008 MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? alg: skcipher: Test 1 failed on encryption for mv-ecb-aes 00000000: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff ... MV-CESA:completion timer expired (CESA active), cleaning up. MV-CESA:mv_completion_timer_callback: waiting for engine finishing MV-CESA:mv_completion_timer_callback: waiting for engine finishing Then the console was flooded by the "waiting for engine finishing" message and the boot couldn't finish. I'll be happy to help to debug this. Just tell me how. ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-19 15:09 ` cloudy.linux @ 2012-06-19 17:13 ` Phil Sutter 2012-06-20 1:16 ` cloudy.linux 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-06-19 17:13 UTC (permalink / raw) To: cloudy.linux; +Cc: Simon Baatz, linux-crypto, andrew [-- Attachment #1: Type: text/plain, Size: 3435 bytes --] Hi, On Tue, Jun 19, 2012 at 11:09:43PM +0800, cloudy.linux wrote: > On 2012-6-19 19:51, Phil Sutter wrote: > > Hi Simon, > > > > On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote: > >> On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote: > >>> On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote: > >>>> thanks for providing these patches; it's great to finally see DMA > >>>> support for CESA in the kernel. Additionally, the implementation seems > >>>> to be fine regarding cache incoherencies (at least my test in [*] > >>>> works). > >>> > >>> Thanks for testing and the fixes. Could you also specify the platform > >>> you are testing on? > >> > >> This is a Marvell Kirkwood MV88F6281-A1. > > > > OK, thanks. Just wanted to be sure it's not already the Orion test I'm > > hoping for. :) > > > > OK, here comes the Orion test result - Linkstation Pro with 88F5182 A2. > I didn't enable any debug option yet (I don't know what to be enabled in > fact). Hope the mv_cesa and mv_dma related kernel messages below could > be helpful though: > > ... > > MV-DMA: IDMA engine up and running, IRQ 23 > MV-DMA: idma_print_and_clear_irq: address miss @0! 
> MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 > MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010 > MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008 > MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080 > MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010 > MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000 > MV-DMA: DMA descriptor list: > MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst > 0xf2200080, count 16, own 1, next 0x79b1010 > MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst > 0xf2200000, count 80, own 1, next 0x79b1020 > MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, > count 0, own 0, next 0x79b1030 > MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst > 0x79b4000, count 16, own 1, next 0x0 > MV-CESA:got an interrupt but no pending timer? > alg: skcipher: Test 1 failed on encryption for mv-ecb-aes > 00000000: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff > ... > > MV-CESA:completion timer expired (CESA active), cleaning up. > MV-CESA:mv_completion_timer_callback: waiting for engine finishing > MV-CESA:mv_completion_timer_callback: waiting for engine finishing > > Then the console was flooded by the "waiting for engine finshing" > message and the boot can't finish. > > I'll be happy to help to debug this. Just tell me how. OK. IDMA bailing out was more or less expected, but the error path flooding the log makes me deserve the darwin award. ;) I suspect address decoding to be the real problem here (kirkwood seems not to need any setup, so I completely skipped that), at least the IDMA interrupt cause points that out. OTOH I found out that CESA wasn't exactly configured as stated in the specs, so could you please test the attached diff? (Should also sanitise the error case a bit.) In any case, thanks a lot for your time! Phil Sutter Software Engineer -- Viprinet GmbH Mainzer Str. 
43 55411 Bingen am Rhein Germany Phone/Zentrale: +49-6721-49030-0 Direct line/Durchwahl: +49-6721-49030-134 Fax: +49-6721-49030-209 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein Commercial register/Handelsregister: Amtsgericht Mainz HRB40380 CEO/Geschäftsführer: Simon Kissel [-- Attachment #2: cesa_test.diff --] [-- Type: text/plain, Size: 2001 bytes --] diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 2a9fe8a..4361dff 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -9,6 +9,7 @@ #include <crypto/aes.h> #include <crypto/algapi.h> #include <linux/crypto.h> +#include <linux/delay.h> #include <linux/dma-mapping.h> #include <linux/dmapool.h> #include <linux/interrupt.h> @@ -180,6 +181,7 @@ struct mv_req_hash_ctx { static void mv_completion_timer_callback(unsigned long unused) { int active = readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_EN_SEC_ACCL0; + int count = 5; printk(KERN_ERR MV_CESA "completion timer expired (CESA %sactive), cleaning up.\n", @@ -187,8 +189,12 @@ static void mv_completion_timer_callback(unsigned long unused) del_timer(&cpg->completion_timer); writel(SEC_CMD_DISABLE_SEC, cpg->reg + SEC_ACCEL_CMD); - while(readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_DISABLE_SEC) - printk(KERN_INFO MV_CESA "%s: waiting for engine finishing\n", __func__); + while((readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_DISABLE_SEC) && count) { + printk(KERN_INFO MV_CESA "%s: waiting for engine finishing (%d)\n", + __func__, count--); + mdelay(1000); + } + BUG_ON(!count); cpg->eng_st = ENGINE_W_DEQUEUE; wake_up_process(cpg->queue_th); } @@ -1288,9 +1294,9 @@ static int mv_probe(struct platform_device *pdev) clk_prepare_enable(cp->clk); writel(0, cpg->reg + SEC_ACCEL_INT_STATUS); - writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); - writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_MP_CHAIN | - SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG); + 
writel(SEC_INT_ACC0_IDMA_DONE | SEC_INT_ACC1_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); + writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_CH1_W_IDMA | SEC_CFG_MP_CHAIN | + SEC_CFG_ACT_CH0_IDMA | SEC_CFG_ACT_CH1_IDMA), cpg->reg + SEC_ACCEL_CFG); writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0); cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, ^ permalink raw reply related [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-19 17:13 ` Phil Sutter @ 2012-06-20 1:16 ` cloudy.linux 2012-07-16 9:32 ` Andrew Lunn 0 siblings, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-06-20 1:16 UTC (permalink / raw) To: Phil Sutter; +Cc: Simon Baatz, linux-crypto, andrew The CESA still didn't work as expected. But this time the machine managed to finish the boot. ... MV-DMA: IDMA engine up and running, IRQ 23 MV-DMA: idma_print_and_clear_irq: address miss @0! MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008 MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? alg: skcipher: Test 1 failed on encryption for mv-ecb-aes 00000000: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff ... MV-CESA:completion timer expired (CESA active), cleaning up. 
MV-CESA:mv_completion_timer_callback: waiting for engine finishing (5) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (4) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (3) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (2) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (1) alg: hash: Test 1 failed for mv-sha1 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 ata2: SATA link down (SStatus 0 SControl 300) MV-CESA:completion timer expired (CESA active), cleaning up. MV-CESA:mv_completion_timer_callback: waiting for engine finishing (5) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (4) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (3) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (2) MV-CESA:mv_completion_timer_callback: waiting for engine finishing (1) alg: hash: Test 1 failed for mv-hmac-sha1 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 ... Regards Cloudy ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-20 1:16 ` cloudy.linux @ 2012-07-16 9:32 ` Andrew Lunn 2012-07-16 13:52 ` Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: Andrew Lunn @ 2012-07-16 9:32 UTC (permalink / raw) To: cloudy.linux; +Cc: Phil Sutter, Simon Baatz, linux-crypto, andrew Hi Cloudy I've not been following this thread too closely.. Do you have any patches you want included into mainline? Thanks Andrew ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-16 9:32 ` Andrew Lunn @ 2012-07-16 13:52 ` Phil Sutter 2012-07-16 14:03 ` Andrew Lunn 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-07-16 13:52 UTC (permalink / raw) To: Andrew Lunn; +Cc: cloudy.linux, Simon Baatz, linux-crypto Hey Andrew, On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote: > I've not been following this thread too closely.. > > Do you have any patches you want included into mainline? No need to fix anything mainline, he's just testing my RFC-state DMA engine addon to MV_CESA. Current state is failing operation on IDMA-based machines due to errors in hardware configuration I have not been able to track down yet. On Kirkwood (i.e. TDMA), the only hardware I have access to, the same code runs fine. After all, I am not sure why he decided to put you in Cc in the first place? Greetings, Phil Phil Sutter Software Engineer -- VNet Europe GmbH Mainzer Str. 43 55411 Bingen am Rhein Germany Management Buy-Out at Viprinet - please read http://www.viprinet.com/en/mbo Management Buy-Out bei Viprinet - bitte lesen Sie http://www.viprinet.com/de/mbo Phone/Zentrale: +49 6721 49030-0 Direct line/Durchwahl: +49 6721 49030-134 Fax: +49 6721 49030-109 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein, Germany Commercial register/Handelsregister: Amtsgericht Mainz HRB44090 CEO/Geschäftsführer: Simon Kissel ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-16 13:52 ` Phil Sutter @ 2012-07-16 14:03 ` Andrew Lunn 2012-07-16 14:53 ` Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: Andrew Lunn @ 2012-07-16 14:03 UTC (permalink / raw) To: Phil Sutter; +Cc: Andrew Lunn, cloudy.linux, Simon Baatz, linux-crypto On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote: > Hey Andrew, > > On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote: > > I've not been following this thread too closely.. > > > > Do you have any patches you want included into mainline? > > No need to fix anything mainline O.K. I thought there was a problem with user space using it, some flushes missing somewhere? Or VM mapping problem? Has that been fixed? Thanks Andrew ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-16 14:03 ` Andrew Lunn @ 2012-07-16 14:53 ` Phil Sutter 2012-07-16 17:32 ` Simon Baatz 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-07-16 14:53 UTC (permalink / raw) To: Andrew Lunn; +Cc: cloudy.linux, Simon Baatz, linux-crypto On Mon, Jul 16, 2012 at 04:03:44PM +0200, Andrew Lunn wrote: > On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote: > > Hey Andrew, > > > > On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote: > > > I've not been following this thread too closely.. > > > > > > Do you have any patches you want included into mainline? > > > > No need to fix anything mainline > > O.K. I thought there was a problem with user space using it, some > flushes missing somewhere? Or VM mapping problem? Has that been fixed? Hmm, there was some discussion about an issue like that on this list at the end of February/beginning of March, which was resolved then. On the other hand there is an unanswered mail from Cloudy dated 20 April about a failing kernel hash test. Maybe he can elaborate on this? Greetings, Phil ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-16 14:53 ` Phil Sutter @ 2012-07-16 17:32 ` Simon Baatz 2012-07-16 17:59 ` Andrew Lunn 0 siblings, 1 reply; 67+ messages in thread From: Simon Baatz @ 2012-07-16 17:32 UTC (permalink / raw) To: Phil Sutter, Andrew Lunn; +Cc: cloudy.linux, Simon Baatz, linux-crypto Hi Andrew, Phil, On Mon, Jul 16, 2012 at 04:53:18PM +0200, Phil Sutter wrote: > On Mon, Jul 16, 2012 at 04:03:44PM +0200, Andrew Lunn wrote: > > On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote: > > > Hey Andrew, > > > > > > On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote: > > > > I've not been following this thread too closely.. > > > > > > > > Do you have any patches you want included into mainline? > > > > > > No need to fix anything mainline > > > > O.K. I thought there was a problem with user space using it, some > > flushes missing somewhere? Or VM mapping problem? Has that been fixed? > > Hmm, there was some discussion about an issue like that in this list at > end of February/beginning of March, which was resolved then. On the > other hand there is an unanswered mail from Cloudy at 20. April about a > failing kernel hash test. Maybe he can elaborate on this? > I think the problem is not in mv_cesa but in flush_kernel_dcache_page(). I have proposed a fix here: http://www.spinics.net/lists/arm-kernel/msg176913.html There was a little bit of discussion on the patch, but it has not been picked up yet. - Simon ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-16 17:32 ` Simon Baatz @ 2012-07-16 17:59 ` Andrew Lunn 0 siblings, 0 replies; 67+ messages in thread From: Andrew Lunn @ 2012-07-16 17:59 UTC (permalink / raw) To: Simon Baatz; +Cc: Phil Sutter, Andrew Lunn, cloudy.linux, linux-crypto > I think the problem is not in mv_cesa but in > flush_kernel_dcache_page(). I have proposed a fix here: > > http://www.spinics.net/lists/arm-kernel/msg176913.html > > There was a little bit of discussion on the patch, but it has not > been picked up yet. Hi Simon, This is core code, not an area I feel comfortable with. I suggest you repost it, CC'ing Catalin and Russell. Andrew ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-18 20:12 ` Simon Baatz 2012-06-19 11:51 ` Phil Sutter @ 2012-06-20 13:31 ` cloudy.linux 2012-06-20 15:41 ` Phil Sutter 1 sibling, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-06-20 13:31 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz Hi Phil, On 2012-6-19 4:12, Simon Baatz wrote: > I see one effect that I don't fully understand. > Similar to the previous implementation, the system is mostly in > kernel space when accessing an encrypted dm-crypt device: Today I also compiled the patched 3.5.0-rc3 for another NAS box with an MV88F6282-Rev-A0 (LS-WVL). I noticed that while the CESA engine was in use, the interrupt count of mv_crypto kept rising, but the interrupt count of mv_tdma stayed at zero.

$ cat /proc/interrupts
            CPU0
  1:  31296  orion_irq  orion_tick
  5:      2  orion_irq  mv_xor.0
  6:      2  orion_irq  mv_xor.1
  7:      2  orion_irq  mv_xor.2
  8:      2  orion_irq  mv_xor.3
 11:  23763  orion_irq  eth0
 19:      0  orion_irq  ehci_hcd:usb1
 21:   4696  orion_irq  sata_mv
 22:  64907  orion_irq  mv_crypto
 33:    432  orion_irq  serial
 46:     51  orion_irq  mv643xx_eth
 49:      0  orion_irq  mv_tdma
 53:      0  orion_irq  rtc-mv
107:      0  -          GPIO fan alarm
109:      0  -          function
110:      0  -          power-on
111:      0  -          power-auto
Err:      0

Is this normal? Regards Cloudy ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-20 13:31 ` cloudy.linux @ 2012-06-20 15:41 ` Phil Sutter 2012-06-25 13:40 ` Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-06-20 15:41 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz Hi Cloudy, On Wed, Jun 20, 2012 at 09:31:10PM +0800, cloudy.linux wrote: > On 2012-6-19 4:12, Simon Baatz wrote: > > I see one effect that I don't fully understand. > > Similar to the previous implementation, the system is mostly in > > kernel space when accessing an encrypted dm-crypt device: > > Today I also compiled the patched 3.5.0-rc3 for another NAS box with > MV88F6282-Rev-A0 (LS-WVL), I noticed one thing that when the CESA engine > was used, the interrupt number of mv_crypto kept rising, but the > interrupt number of mv_tdma was always zero. Yes, that is exactly how it should be: the DMA engine is configured to run "attached" to CESA, meaning that when CESA is triggered from mv_cesa.c, it first enables the DMA engine. Using a special descriptor in the chain, the DMA engine knows when to stop and signals CESA again so it can start the crypto operation. Afterwards, CESA triggers the DMA engine again to copy back the results (more specifically: to process the remaining descriptors in the chain after the special one). After a descriptor whose next-descriptor field is zero has been handled, CESA is signaled again, which in turn generates the interrupt to notify the software. So no DMA interrupt is needed, and no software interaction between data copying and the crypto operation, of course. :) Greetings, Phil PS: I am currently working on the address decoding problem and will get back to you in a few days when I have something to test. So stay tuned! ^ permalink raw reply [flat|nested] 67+ messages in thread
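The chained operation described above can be sketched in plain C. This is only an illustration of the control flow, not the real TDMA descriptor format: the struct layout, field names, and the walk function are hypothetical stand-ins. The idea is that copy-in descriptors are followed by a marker descriptor that hands control to CESA, then by copy-out descriptors, and a NULL next pointer terminates the chain (at which point CESA raises the completion interrupt):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a TDMA descriptor chain. In this sketch a
 * zero byte count marks the "hand over to CESA" descriptor, and
 * next == NULL ends the chain. */
struct tdma_desc {
	uint32_t byte_count;     /* 0 = trigger CESA instead of copying */
	uint32_t src, dst;       /* bus addresses (dummy values here) */
	struct tdma_desc *next;  /* NULL terminates the chain */
};

/* Walk the chain the way the engine would: count the plain copy
 * descriptors and record how often CESA would be signalled. */
static int walk_chain(const struct tdma_desc *d, int *cesa_triggers)
{
	int copies = 0;

	*cesa_triggers = 0;
	for (; d; d = d->next) {
		if (d->byte_count == 0)
			(*cesa_triggers)++;  /* signal the crypto engine */
		else
			copies++;            /* ordinary DMA copy */
	}
	return copies;
}
```

A single request then chains up as copy-in, CESA trigger, copy-out, terminator; the driver only hears from the hardware once the last descriptor has been handled.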
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-20 15:41 ` Phil Sutter @ 2012-06-25 13:40 ` Phil Sutter 2012-06-25 14:25 ` cloudy.linux 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-06-25 13:40 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz Hi, On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote: > PS: I am currently working at the address decoding problem, will get > back to in a few days when I have something to test. So stay tuned! I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with code setting the decoding windows. I hope this fixes the issues on orion. I decided not to publish the changes regarding the second DMA channel for now, as this seems to be support for a second crypto session (handled consecutively, so no real improvement) which is not supported anyway. Greetings, Phil ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-25 13:40 ` Phil Sutter @ 2012-06-25 14:25 ` cloudy.linux 2012-06-25 14:36 ` Phil Sutter 2012-06-25 16:05 ` cloudy.linux 0 siblings, 2 replies; 67+ messages in thread From: cloudy.linux @ 2012-06-25 14:25 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz On 2012-6-25 21:40, Phil Sutter wrote: > Hi, > > On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote: >> PS: I am currently working at the address decoding problem, will get >> back to in a few days when I have something to test. So stay tuned! > > I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with > code setting the decoding windows. I hope this fixes the issues on > orion. I decided not to publish the changes regarding the second DMA > channel for now, as this seems to be support for a second crypto session > (handled consecutively, so no real improvement) which is not supported > anyway. > > Greetings, Phil > > > Phil Sutter > Software Engineer > Thanks Phil. I'm cloning your git now but the speed is really slow. Last time I tried this I had to cancel after hours of downloading (at only about 20% progress). So the previous tests were actually done with 3.5-rc3 (I tried Linus' current linux git, but ran into a compile problem), of course with your patch and Simon's. Could you provide a diff based on your last round of patches (a diff against the unpatched kernel would also be fine, I think)? In the meantime, I'm still trying with a cloning speed of 5KiB/s ... Regards Cloudy ^ permalink raw reply [flat|nested] 67+ messages in thread
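For what it's worth, a shallow single-branch clone usually cuts such a transfer to a fraction of the full history. The sketch below only demonstrates the flags against a throwaway local repository; the remote URL git://nwl.cc/~n0-1/linux.git and the cesa-dma branch name are taken from the mails above:

```shell
# Against the real remote, the equivalent would be:
#   git clone --depth 1 --branch cesa-dma git://nwl.cc/~n0-1/linux.git
# Demonstrated here with a local throwaway repository:
set -e
tmp=$(mktemp -d)
git init -q "$tmp/src"
git -C "$tmp/src" -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m one
git -C "$tmp/src" -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m two
# file:// (unlike a plain path) goes through the transport layer,
# so --depth works the same way it would over git://
git clone -q --depth 1 "file://$tmp/src" "$tmp/dst"
# a shallow clone carries only the most recent commit:
count=$(git -C "$tmp/dst" rev-list --count HEAD)
echo "$count"
```

Note that shallow clones only help if the server allows them; if not, fetching a single branch (`git clone --single-branch`) still avoids downloading unrelated refs.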
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-25 14:25 ` cloudy.linux @ 2012-06-25 14:36 ` Phil Sutter 0 siblings, 0 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-25 14:36 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz [-- Attachment #1: Type: text/plain, Size: 2048 bytes --] Hi, On Mon, Jun 25, 2012 at 10:25:01PM +0800, cloudy.linux wrote: > On 2012-6-25 21:40, Phil Sutter wrote: > > Hi, > > > > On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote: > >> PS: I am currently working at the address decoding problem, will get > >> back to in a few days when I have something to test. So stay tuned! > > > > I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with > > code setting the decoding windows. I hope this fixes the issues on > > orion. I decided not to publish the changes regarding the second DMA > > channel for now, as this seems to be support for a second crypto session > > (handled consecutively, so no real improvement) which is not supported > > anyway. > > > > Greetings, Phil > > > > > > Phil Sutter > > Software Engineer > > > > Thanks Phil. I'm cloning your git now but the speed is really slow. Last > time I tried to do this but had to cancel after hours of downloading (at > only about 20% progress). So the previous tests were actually done with > 3.5-rc3 (I tried the up-to-date Linus' linux-git, but met compiling > problem), of course with your patch and Simon's. Could you provide a > diff based on your last round patch (diff to the not patched kernel > should also be good, I think)? > > In the mean time, I'm still trying with a cloning speed of 5KiB/s ... Ugh, that's horrible. No idea what's going wrong there, and no access to the management interface right now. In the meantime, please refer to the attached patch. It is based on 94fa83c in Linus' git but should apply cleanly to its current HEAD, too.
Greetings, Phil [-- Attachment #2: mv_dma_full.diff --] [-- Type: text/plain, Size: 55019 bytes --] diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c index 25fb3fd..a011b93 100644 --- a/arch/arm/mach-kirkwood/common.c +++ b/arch/arm/mach-kirkwood/common.c @@ -232,6 +232,7 @@ void __init kirkwood_clk_init(void) orion_clkdev_add(NULL, "orion-ehci.0", usb0); orion_clkdev_add(NULL, "orion_nand", runit); orion_clkdev_add(NULL, "mvsdio", sdio); + orion_clkdev_add(NULL, "mv_tdma", crypto); orion_clkdev_add(NULL, "mv_crypto", crypto); orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".0", xor0); orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".1", xor1); @@ -426,8 +427,41 @@ void __init kirkwood_uart1_init(void) /***************************************************************************** * Cryptographic Engines and Security Accelerator (CESA) ****************************************************************************/ +static struct resource kirkwood_tdma_res[] = { + { + .name = "regs deco", + .start = CRYPTO_PHYS_BASE + 0xA00, + .end = CRYPTO_PHYS_BASE + 0xA24, + .flags = IORESOURCE_MEM, + }, { + .name = "regs control and error", + .start = CRYPTO_PHYS_BASE + 0x800, + .end = CRYPTO_PHYS_BASE + 0x8CF, + .flags = IORESOURCE_MEM, + }, { + .name = "crypto error", + .start = IRQ_KIRKWOOD_TDMA_ERR, + .end = IRQ_KIRKWOOD_TDMA_ERR, + .flags = IORESOURCE_IRQ, + }, +}; + +static u64 mv_tdma_dma_mask = DMA_BIT_MASK(32); + +static struct platform_device kirkwood_tdma_device = { + .name = "mv_tdma", + .id = -1, + .dev = { + .dma_mask = &mv_tdma_dma_mask, + .coherent_dma_mask = 
DMA_BIT_MASK(32), + }, + .num_resources = ARRAY_SIZE(kirkwood_tdma_res), + .resource = kirkwood_tdma_res, +}; + void __init kirkwood_crypto_init(void) { + platform_device_register(&kirkwood_tdma_device); orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE, KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO); } diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h b/arch/arm/mach-kirkwood/include/mach/irqs.h index 2bf8161..a66aa3f 100644 --- a/arch/arm/mach-kirkwood/include/mach/irqs.h +++ b/arch/arm/mach-kirkwood/include/mach/irqs.h @@ -51,6 +51,7 @@ #define IRQ_KIRKWOOD_GPIO_HIGH_16_23 41 #define IRQ_KIRKWOOD_GE00_ERR 46 #define IRQ_KIRKWOOD_GE01_ERR 47 +#define IRQ_KIRKWOOD_TDMA_ERR 49 #define IRQ_KIRKWOOD_RTC 53 /* diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c index 9148b22..4734231 100644 --- a/arch/arm/mach-orion5x/common.c +++ b/arch/arm/mach-orion5x/common.c @@ -181,9 +181,49 @@ void __init orion5x_xor_init(void) /***************************************************************************** * Cryptographic Engines and Security Accelerator (CESA) ****************************************************************************/ +static struct resource orion_idma_res[] = { + { + .name = "regs deco", + .start = ORION5X_IDMA_PHYS_BASE + 0xA00, + .end = ORION5X_IDMA_PHYS_BASE + 0xA24, + .flags = IORESOURCE_MEM, + }, { + .name = "regs control and error", + .start = ORION5X_IDMA_PHYS_BASE + 0x800, + .end = ORION5X_IDMA_PHYS_BASE + 0x8CF, + .flags = IORESOURCE_MEM, + }, { + .name = "crypto error", + .start = IRQ_ORION5X_IDMA_ERR, + .end = IRQ_ORION5X_IDMA_ERR, + .flags = IORESOURCE_IRQ, + }, +}; + +static u64 mv_idma_dma_mask = DMA_BIT_MASK(32); + +static struct mv_dma_pdata mv_idma_pdata = { + .sram_target_id = TARGET_SRAM, + .sram_attr = 0, + .sram_base = ORION5X_SRAM_PHYS_BASE, +}; + +static struct platform_device orion_idma_device = { + .name = "mv_idma", + .id = -1, + .dev = { + .dma_mask = &mv_idma_dma_mask, + .coherent_dma_mask = 
DMA_BIT_MASK(32), + .platform_data = &mv_idma_pdata, + }, + .num_resources = ARRAY_SIZE(orion_idma_res), + .resource = orion_idma_res, +}; + static void __init orion5x_crypto_init(void) { orion5x_setup_sram_win(); + platform_device_register(&orion_idma_device); orion_crypto_init(ORION5X_CRYPTO_PHYS_BASE, ORION5X_SRAM_PHYS_BASE, SZ_8K, IRQ_ORION5X_CESA); } diff --git a/arch/arm/mach-orion5x/include/mach/orion5x.h b/arch/arm/mach-orion5x/include/mach/orion5x.h index 2745f5d..a31ac88 100644 --- a/arch/arm/mach-orion5x/include/mach/orion5x.h +++ b/arch/arm/mach-orion5x/include/mach/orion5x.h @@ -90,6 +90,8 @@ #define ORION5X_USB0_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x50000) #define ORION5X_USB0_VIRT_BASE (ORION5X_REGS_VIRT_BASE | 0x50000) +#define ORION5X_IDMA_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x60000) + #define ORION5X_XOR_PHYS_BASE (ORION5X_REGS_PHYS_BASE | 0x60900) #define ORION5X_XOR_VIRT_BASE (ORION5X_REGS_VIRT_BASE | 0x60900) diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c index 61fd837..0c6c695 100644 --- a/arch/arm/plat-orion/common.c +++ b/arch/arm/plat-orion/common.c @@ -924,9 +924,15 @@ static struct resource orion_crypto_resources[] = { }, }; +static u64 mv_crypto_dmamask = DMA_BIT_MASK(32); + static struct platform_device orion_crypto = { .name = "mv_crypto", .id = -1, + .dev = { + .dma_mask = &mv_crypto_dmamask, + .coherent_dma_mask = DMA_BIT_MASK(32), + }, }; void __init orion_crypto_init(unsigned long mapbase, diff --git a/arch/arm/plat-orion/include/plat/mv_dma.h b/arch/arm/plat-orion/include/plat/mv_dma.h new file mode 100644 index 0000000..e4e72bb --- /dev/null +++ b/arch/arm/plat-orion/include/plat/mv_dma.h @@ -0,0 +1,15 @@ +/* + * arch/arm/plat-orion/include/plat/mv_dma.h + * + * Marvell IDMA/TDMA platform device data definition file. 
+ */ +#ifndef __PLAT_MV_DMA_H +#define __PLAT_MV_DMA_H + +struct mv_dma_pdata { + unsigned int sram_target_id; + unsigned int sram_attr; + unsigned int sram_base; +}; + +#endif /* __PLAT_MV_DMA_H */ diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig index 1092a77..3709f38 100644 --- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -159,6 +159,10 @@ config CRYPTO_GHASH_S390 It is available as of z196. +config CRYPTO_DEV_MV_DMA + tristate + default no + config CRYPTO_DEV_MV_CESA tristate "Marvell's Cryptographic Engine" depends on PLAT_ORION @@ -166,6 +170,7 @@ config CRYPTO_DEV_MV_CESA select CRYPTO_AES select CRYPTO_BLKCIPHER2 select CRYPTO_HASH + select CRYPTO_DEV_MV_DMA help This driver allows you to utilize the Cryptographic Engines and Security Accelerator (CESA) which can be found on the Marvell Orion diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile index 0139032..cb655ad 100644 --- a/drivers/crypto/Makefile +++ b/drivers/crypto/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o n2_crypto-y := n2_core.o n2_asm.o obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o +obj-$(CONFIG_CRYPTO_DEV_MV_DMA) += mv_dma.o obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/ @@ -14,4 +15,4 @@ obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o obj-$(CONFIG_CRYPTO_DEV_TEGRA_AES) += tegra-aes.o -obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/ \ No newline at end of file +obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/ diff --git a/drivers/crypto/dma_desclist.h b/drivers/crypto/dma_desclist.h new file mode 100644 index 0000000..c471ad6 --- /dev/null +++ b/drivers/crypto/dma_desclist.h @@ -0,0 +1,79 @@ +#ifndef __DMA_DESCLIST__ +#define __DMA_DESCLIST__ + +struct dma_desc { + void *virt; + dma_addr_t phys; +}; + +struct 
dma_desclist { + struct dma_pool *itempool; + struct dma_desc *desclist; + unsigned long length; + unsigned long usage; +}; + +#define DESCLIST_ITEM(dl, x) ((dl).desclist[(x)].virt) +#define DESCLIST_ITEM_DMA(dl, x) ((dl).desclist[(x)].phys) +#define DESCLIST_FULL(dl) ((dl).length == (dl).usage) + +static inline int +init_dma_desclist(struct dma_desclist *dl, struct device *dev, + size_t size, size_t align, size_t boundary) +{ +#define STRX(x) #x +#define STR(x) STRX(x) + dl->itempool = dma_pool_create( + "DMA Desclist Pool at "__FILE__"("STR(__LINE__)")", + dev, size, align, boundary); +#undef STR +#undef STRX + if (!dl->itempool) + return 1; + dl->desclist = NULL; + dl->length = dl->usage = 0; + return 0; +} + +static inline int +set_dma_desclist_size(struct dma_desclist *dl, unsigned long nelem) +{ + /* need to increase size first if requested */ + if (nelem > dl->length) { + struct dma_desc *newmem; + int newsize = nelem * sizeof(struct dma_desc); + + newmem = krealloc(dl->desclist, newsize, GFP_KERNEL); + if (!newmem) + return -ENOMEM; + dl->desclist = newmem; + } + + /* allocate/free dma descriptors, adjusting dl->length on the go */ + for (; dl->length < nelem; dl->length++) { + DESCLIST_ITEM(*dl, dl->length) = dma_pool_alloc(dl->itempool, + GFP_KERNEL, &DESCLIST_ITEM_DMA(*dl, dl->length)); + if (!DESCLIST_ITEM(*dl, dl->length)) + return -ENOMEM; + } + for (; dl->length > nelem; dl->length--) + dma_pool_free(dl->itempool, DESCLIST_ITEM(*dl, dl->length - 1), + DESCLIST_ITEM_DMA(*dl, dl->length - 1)); + + /* ignore size decreases but those to zero */ + if (!nelem) { + kfree(dl->desclist); + dl->desclist = 0; + } + return 0; +} + +static inline void +fini_dma_desclist(struct dma_desclist *dl) +{ + set_dma_desclist_size(dl, 0); + dma_pool_destroy(dl->itempool); + dl->length = dl->usage = 0; +} + +#endif /* __DMA_DESCLIST__ */ diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index 1cc6b3f..b75fdf5 100644 --- a/drivers/crypto/mv_cesa.c +++ 
b/drivers/crypto/mv_cesa.c @@ -9,6 +9,9 @@ #include <crypto/aes.h> #include <crypto/algapi.h> #include <linux/crypto.h> +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/dmapool.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/kthread.h> @@ -21,9 +24,17 @@ #include <crypto/sha.h> #include "mv_cesa.h" +#include "mv_dma.h" +#include "dma_desclist.h" #define MV_CESA "MV-CESA:" #define MAX_HW_HASH_SIZE 0xFFFF +#define MV_CESA_EXPIRE 500 /* msec */ + +#define MV_DMA_INIT_POOLSIZE 16 +#define MV_DMA_ALIGN 16 + +static int count_sgs(struct scatterlist *, unsigned int); /* * STM: @@ -43,13 +54,12 @@ enum engine_status { /** * struct req_progress - used for every crypt request - * @src_sg_it: sg iterator for src - * @dst_sg_it: sg iterator for dst + * @src_sg: sg list for src + * @dst_sg: sg list for dst * @sg_src_left: bytes left in src to process (scatter list) * @src_start: offset to add to src start position (scatter list) * @crypt_len: length of current hw crypt/hash process * @hw_nbytes: total bytes to process in hw for this request - * @copy_back: whether to copy data back (crypt) or not (hash) * @sg_dst_left: bytes left dst to process in this scatter list * @dst_start: offset to add to dst start position (scatter list) * @hw_processed_bytes: number of bytes processed by hw (request). @@ -59,10 +69,9 @@ enum engine_status { * track of progress within current scatterlist. 
*/ struct req_progress { - struct sg_mapping_iter src_sg_it; - struct sg_mapping_iter dst_sg_it; + struct scatterlist *src_sg; + struct scatterlist *dst_sg; void (*complete) (void); - void (*process) (int is_first); /* src mostly */ int sg_src_left; @@ -70,15 +79,34 @@ struct req_progress { int crypt_len; int hw_nbytes; /* dst mostly */ - int copy_back; int sg_dst_left; int dst_start; int hw_processed_bytes; }; +struct sec_accel_sram { + struct sec_accel_config op; + union { + struct { + u32 key[8]; + u32 iv[4]; + } crypt; + struct { + u32 ivi[5]; + u32 ivo[5]; + } hash; + } type; +#define sa_key type.crypt.key +#define sa_iv type.crypt.iv +#define sa_ivi type.hash.ivi +#define sa_ivo type.hash.ivo +} __attribute__((packed)); + struct crypto_priv { + struct device *dev; void __iomem *reg; void __iomem *sram; + u32 sram_phys; int irq; struct clk *clk; struct task_struct *queue_th; @@ -87,16 +115,25 @@ struct crypto_priv { spinlock_t lock; struct crypto_queue queue; enum engine_status eng_st; + struct timer_list completion_timer; struct crypto_async_request *cur_req; struct req_progress p; int max_req_size; int sram_size; int has_sha1; int has_hmac_sha1; + + struct sec_accel_sram sa_sram; + dma_addr_t sa_sram_dma; + + struct dma_desclist desclist; }; static struct crypto_priv *cpg; +#define ITEM(x) ((u32 *)DESCLIST_ITEM(cpg->desclist, x)) +#define ITEM_DMA(x) DESCLIST_ITEM_DMA(cpg->desclist, x) + struct mv_ctx { u8 aes_enc_key[AES_KEY_LEN]; u32 aes_dec_key[8]; @@ -131,13 +168,75 @@ struct mv_req_hash_ctx { u64 count; u32 state[SHA1_DIGEST_SIZE / 4]; u8 buffer[SHA1_BLOCK_SIZE]; + dma_addr_t buffer_dma; int first_hash; /* marks that we don't have previous state */ int last_chunk; /* marks that this is the 'final' request */ int extra_bytes; /* unprocessed bytes in buffer */ + int digestsize; /* size of the digest */ enum hash_op op; int count_add; + dma_addr_t result_dma; }; +static void mv_completion_timer_callback(unsigned long unused) +{ + int active = 
readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_EN_SEC_ACCL0; + int count = 10; + + printk(KERN_ERR MV_CESA + "completion timer expired (CESA %sactive), cleaning up.\n", + active ? "" : "in"); + + del_timer(&cpg->completion_timer); + writel(SEC_CMD_DISABLE_SEC, cpg->reg + SEC_ACCEL_CMD); + while((readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_DISABLE_SEC) && count--) { + mdelay(100); + } + if (count < 0) { + printk(KERN_ERR MV_CESA + "%s: engine reset timed out!\n", __func__); + } + cpg->eng_st = ENGINE_W_DEQUEUE; + wake_up_process(cpg->queue_th); +} + +static void mv_setup_timer(void) +{ + setup_timer(&cpg->completion_timer, &mv_completion_timer_callback, 0); + mod_timer(&cpg->completion_timer, + jiffies + msecs_to_jiffies(MV_CESA_EXPIRE)); +} + +static inline void mv_dma_u32_copy(dma_addr_t dst, u32 val) +{ + if (unlikely(DESCLIST_FULL(cpg->desclist)) && + set_dma_desclist_size(&cpg->desclist, cpg->desclist.length << 1)) { + printk(KERN_ERR MV_CESA "resizing poolsize to %lu failed\n", + cpg->desclist.length << 1); + return; + } + *ITEM(cpg->desclist.usage) = val; + mv_dma_memcpy(dst, ITEM_DMA(cpg->desclist.usage), sizeof(u32)); + cpg->desclist.usage++; +} + +static inline bool +mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) +{ + int nents = count_sgs(sg, nbytes); + + if (nbytes && dma_map_sg(cpg->dev, sg, nents, dir) != nents) + return false; + return true; +} + +static inline void +mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir) +{ + if (nbytes) + dma_unmap_sg(cpg->dev, sg, count_sgs(sg, nbytes), dir); +} + static void compute_aes_dec_key(struct mv_ctx *ctx) { struct crypto_aes_ctx gen_aes_key; @@ -187,19 +286,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher *cipher, const u8 *key, static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len) { - int ret; void *sbuf; int copy_len; while (len) { if (!p->sg_src_left) { - ret = sg_miter_next(&p->src_sg_it); - BUG_ON(!ret); - p->sg_src_left = 
p->src_sg_it.length; + /* next sg please */ + p->src_sg = sg_next(p->src_sg); + BUG_ON(!p->src_sg); + p->sg_src_left = p->src_sg->length; p->src_start = 0; } - sbuf = p->src_sg_it.addr + p->src_start; + sbuf = sg_virt(p->src_sg) + p->src_start; copy_len = min(p->sg_src_left, len); memcpy(dbuf, sbuf, copy_len); @@ -212,73 +311,123 @@ static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len) } } +static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int len) +{ + dma_addr_t sbuf; + int copy_len; + + while (len) { + if (!p->sg_src_left) { + /* next sg please */ + p->src_sg = sg_next(p->src_sg); + BUG_ON(!p->src_sg); + p->sg_src_left = sg_dma_len(p->src_sg); + p->src_start = 0; + } + + sbuf = sg_dma_address(p->src_sg) + p->src_start; + + copy_len = min(p->sg_src_left, len); + mv_dma_memcpy(dbuf, sbuf, copy_len); + + p->src_start += copy_len; + p->sg_src_left -= copy_len; + + len -= copy_len; + dbuf += copy_len; + } +} + +static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int len) +{ + dma_addr_t dbuf; + int copy_len; + + while (len) { + if (!p->sg_dst_left) { + /* next sg please */ + p->dst_sg = sg_next(p->dst_sg); + BUG_ON(!p->dst_sg); + p->sg_dst_left = sg_dma_len(p->dst_sg); + p->dst_start = 0; + } + + dbuf = sg_dma_address(p->dst_sg) + p->dst_start; + + copy_len = min(p->sg_dst_left, len); + mv_dma_memcpy(dbuf, sbuf, copy_len); + + p->dst_start += copy_len; + p->sg_dst_left -= copy_len; + + len -= copy_len; + sbuf += copy_len; + } +} + static void setup_data_in(void) { struct req_progress *p = &cpg->p; int data_in_sram = min(p->hw_nbytes - p->hw_processed_bytes, cpg->max_req_size); - copy_src_to_buf(p, cpg->sram + SRAM_DATA_IN_START + p->crypt_len, + dma_copy_src_to_buf(p, cpg->sram_phys + SRAM_DATA_IN_START + p->crypt_len, data_in_sram - p->crypt_len); p->crypt_len = data_in_sram; } -static void mv_process_current_q(int first_block) +static void mv_init_crypt_config(struct ablkcipher_request *req) { - struct 
ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); - struct sec_accel_config op; + struct sec_accel_config *op = &cpg->sa_sram.op; switch (req_ctx->op) { case COP_AES_ECB: - op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB; + op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB; break; case COP_AES_CBC: default: - op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; - op.enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | + op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC; + op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) | ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF); - if (first_block) - memcpy(cpg->sram + SRAM_DATA_IV, req->info, 16); + memcpy(cpg->sa_sram.sa_iv, req->info, 16); break; } if (req_ctx->decrypt) { - op.config |= CFG_DIR_DEC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, - AES_KEY_LEN); + op->config |= CFG_DIR_DEC; + memcpy(cpg->sa_sram.sa_key, ctx->aes_dec_key, AES_KEY_LEN); } else { - op.config |= CFG_DIR_ENC; - memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, - AES_KEY_LEN); + op->config |= CFG_DIR_ENC; + memcpy(cpg->sa_sram.sa_key, ctx->aes_enc_key, AES_KEY_LEN); } switch (ctx->key_len) { case AES_KEYSIZE_128: - op.config |= CFG_AES_LEN_128; + op->config |= CFG_AES_LEN_128; break; case AES_KEYSIZE_192: - op.config |= CFG_AES_LEN_192; + op->config |= CFG_AES_LEN_192; break; case AES_KEYSIZE_256: - op.config |= CFG_AES_LEN_256; + op->config |= CFG_AES_LEN_256; break; } - op.enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | + op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) | ENC_P_DST(SRAM_DATA_OUT_START); - op.enc_key_p = SRAM_DATA_KEY_P; + op->enc_key_p = SRAM_DATA_KEY_P; + op->enc_len = cpg->p.crypt_len; - setup_data_in(); - op.enc_len = cpg->p.crypt_len; - memcpy(cpg->sram + SRAM_CONFIG, &op, - sizeof(struct sec_accel_config)); - - /* GO */ - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); + 
dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram)); +} - /* - * XXX: add timer if the interrupt does not occur for some mystery - * reason - */ +static void mv_update_crypt_config(void) +{ + /* update the enc_len field only */ + mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32), + (u32)cpg->p.crypt_len); } static void mv_crypto_algo_completion(void) @@ -286,8 +435,12 @@ static void mv_crypto_algo_completion(void) struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req); struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req); - sg_miter_stop(&cpg->p.src_sg_it); - sg_miter_stop(&cpg->p.dst_sg_it); + if (req->src == req->dst) { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL); + } else { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); + mv_dma_unmap_sg(req->dst, req->nbytes, DMA_FROM_DEVICE); + } if (req_ctx->op != COP_AES_CBC) return ; @@ -295,37 +448,33 @@ static void mv_crypto_algo_completion(void) memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16); } -static void mv_process_hash_current(int first_block) +static void mv_init_hash_config(struct ahash_request *req) { - struct ahash_request *req = ahash_request_cast(cpg->cur_req); const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm); struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); struct req_progress *p = &cpg->p; - struct sec_accel_config op = { 0 }; + struct sec_accel_config *op = &cpg->sa_sram.op; int is_last; switch (req_ctx->op) { case COP_SHA1: default: - op.config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; break; case COP_HMAC_SHA1: - op.config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; - memcpy(cpg->sram + SRAM_HMAC_IV_IN, + op->config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + memcpy(cpg->sa_sram.sa_ivi, tfm_ctx->ivs, sizeof(tfm_ctx->ivs)); break; } - 
op.mac_src_p = - MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32) - req_ctx-> - count); - - setup_data_in(); + op->mac_src_p = + MAC_SRC_DATA_P(SRAM_DATA_IN_START) | + MAC_SRC_TOTAL_LEN((u32)req_ctx->count); - op.mac_digest = + op->mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); - op.mac_iv = + op->mac_iv = MAC_INNER_IV_P(SRAM_HMAC_IV_IN) | MAC_OUTER_IV_P(SRAM_HMAC_IV_OUT); @@ -334,35 +483,59 @@ static void mv_process_hash_current(int first_block) && (req_ctx->count <= MAX_HW_HASH_SIZE); if (req_ctx->first_hash) { if (is_last) - op.config |= CFG_NOT_FRAG; + op->config |= CFG_NOT_FRAG; else - op.config |= CFG_FIRST_FRAG; + op->config |= CFG_FIRST_FRAG; req_ctx->first_hash = 0; } else { if (is_last) - op.config |= CFG_LAST_FRAG; + op->config |= CFG_LAST_FRAG; else - op.config |= CFG_MID_FRAG; - - if (first_block) { - writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); - writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B); - writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C); - writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D); - writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); - } + op->config |= CFG_MID_FRAG; + + writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A); + writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B); + writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C); + writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D); + writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E); } - memcpy(cpg->sram + SRAM_CONFIG, &op, sizeof(struct sec_accel_config)); + dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram)); +} - /* GO */ - writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); +static void mv_update_hash_config(struct ahash_request *req) +{ + struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req); + 
struct req_progress *p = &cpg->p; + int is_last; + u32 val; + + /* update only the config (for changed fragment state) and + * mac_digest (for changed frag len) fields */ - /* - * XXX: add timer if the interrupt does not occur for some mystery - * reason - */ + switch (req_ctx->op) { + case COP_SHA1: + default: + val = CFG_OP_MAC_ONLY | CFG_MACM_SHA1; + break; + case COP_HMAC_SHA1: + val = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1; + break; + } + + is_last = req_ctx->last_chunk + && (p->hw_processed_bytes + p->crypt_len >= p->hw_nbytes) + && (req_ctx->count <= MAX_HW_HASH_SIZE); + + val |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG; + mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG, val); + + val = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len); + mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 6 * sizeof(u32), val); } static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx, @@ -406,6 +579,15 @@ out: return rc; } +static void mv_save_digest_state(struct mv_req_hash_ctx *ctx) +{ + ctx->state[0] = readl(cpg->reg + DIGEST_INITIAL_VAL_A); + ctx->state[1] = readl(cpg->reg + DIGEST_INITIAL_VAL_B); + ctx->state[2] = readl(cpg->reg + DIGEST_INITIAL_VAL_C); + ctx->state[3] = readl(cpg->reg + DIGEST_INITIAL_VAL_D); + ctx->state[4] = readl(cpg->reg + DIGEST_INITIAL_VAL_E); +} + static void mv_hash_algo_completion(void) { struct ahash_request *req = ahash_request_cast(cpg->cur_req); @@ -413,72 +595,39 @@ static void mv_hash_algo_completion(void) if (ctx->extra_bytes) copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes); - sg_miter_stop(&cpg->p.src_sg_it); if (likely(ctx->last_chunk)) { - if (likely(ctx->count <= MAX_HW_HASH_SIZE)) { - memcpy(req->result, cpg->sram + SRAM_DIGEST_BUF, - crypto_ahash_digestsize(crypto_ahash_reqtfm - (req))); - } else + dma_unmap_single(cpg->dev, ctx->result_dma, + ctx->digestsize, DMA_FROM_DEVICE); + + dma_unmap_single(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + + if (unlikely(ctx->count > 
MAX_HW_HASH_SIZE)) { + mv_save_digest_state(ctx); mv_hash_final_fallback(req); + } } else { - ctx->state[0] = readl(cpg->reg + DIGEST_INITIAL_VAL_A); - ctx->state[1] = readl(cpg->reg + DIGEST_INITIAL_VAL_B); - ctx->state[2] = readl(cpg->reg + DIGEST_INITIAL_VAL_C); - ctx->state[3] = readl(cpg->reg + DIGEST_INITIAL_VAL_D); - ctx->state[4] = readl(cpg->reg + DIGEST_INITIAL_VAL_E); + mv_save_digest_state(ctx); } + + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); } static void dequeue_complete_req(void) { struct crypto_async_request *req = cpg->cur_req; - void *buf; - int ret; - cpg->p.hw_processed_bytes += cpg->p.crypt_len; - if (cpg->p.copy_back) { - int need_copy_len = cpg->p.crypt_len; - int sram_offset = 0; - do { - int dst_copy; - - if (!cpg->p.sg_dst_left) { - ret = sg_miter_next(&cpg->p.dst_sg_it); - BUG_ON(!ret); - cpg->p.sg_dst_left = cpg->p.dst_sg_it.length; - cpg->p.dst_start = 0; - } - buf = cpg->p.dst_sg_it.addr; - buf += cpg->p.dst_start; - - dst_copy = min(need_copy_len, cpg->p.sg_dst_left); - - memcpy(buf, - cpg->sram + SRAM_DATA_OUT_START + sram_offset, - dst_copy); - sram_offset += dst_copy; - cpg->p.sg_dst_left -= dst_copy; - need_copy_len -= dst_copy; - cpg->p.dst_start += dst_copy; - } while (need_copy_len > 0); - } - - cpg->p.crypt_len = 0; + mv_dma_clear(); + cpg->desclist.usage = 0; BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE); - if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { - /* process next scatter list entry */ - cpg->eng_st = ENGINE_BUSY; - cpg->p.process(0); - } else { - cpg->p.complete(); - cpg->eng_st = ENGINE_IDLE; - local_bh_disable(); - req->complete(req, 0); - local_bh_enable(); - } + + cpg->p.complete(); + cpg->eng_st = ENGINE_IDLE; + local_bh_disable(); + req->complete(req, 0); + local_bh_enable(); } static int count_sgs(struct scatterlist *sl, unsigned int total_bytes) @@ -501,33 +650,68 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes) static void mv_start_new_crypt_req(struct ablkcipher_request 
*req) { struct req_progress *p = &cpg->p; - int num_sgs; cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); p->hw_nbytes = req->nbytes; p->complete = mv_crypto_algo_completion; - p->process = mv_process_current_q; - p->copy_back = 1; - num_sgs = count_sgs(req->src, req->nbytes); - sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG); + /* assume inplace request */ + if (req->src == req->dst) { + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_BIDIRECTIONAL)) + return; + } else { + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) + return; - num_sgs = count_sgs(req->dst, req->nbytes); - sg_miter_start(&p->dst_sg_it, req->dst, num_sgs, SG_MITER_TO_SG); + if (!mv_dma_map_sg(req->dst, req->nbytes, DMA_FROM_DEVICE)) { + mv_dma_unmap_sg(req->src, req->nbytes, DMA_TO_DEVICE); + return; + } + } + + p->src_sg = req->src; + p->dst_sg = req->dst; + if (req->nbytes) { + BUG_ON(!req->src); + BUG_ON(!req->dst); + p->sg_src_left = sg_dma_len(req->src); + p->sg_dst_left = sg_dma_len(req->dst); + } - mv_process_current_q(1); + setup_data_in(); + mv_init_crypt_config(req); + mv_dma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { + cpg->p.crypt_len = 0; + + setup_data_in(); + mv_update_crypt_config(); + mv_dma_separator(); + dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + } + + + /* GO */ + mv_setup_timer(); + mv_dma_trigger(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static void mv_start_new_hash_req(struct ahash_request *req) { struct req_progress *p = &cpg->p; struct mv_req_hash_ctx *ctx = ahash_request_ctx(req); - int num_sgs, hw_bytes, old_extra_bytes, rc; + int hw_bytes, old_extra_bytes, rc; + cpg->cur_req = &req->base; memset(p, 0, sizeof(struct req_progress)); hw_bytes = 
req->nbytes + ctx->extra_bytes; old_extra_bytes = ctx->extra_bytes; + ctx->digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req)); ctx->extra_bytes = hw_bytes % SHA1_BLOCK_SIZE; if (ctx->extra_bytes != 0 @@ -536,25 +720,13 @@ static void mv_start_new_hash_req(struct ahash_request *req) else ctx->extra_bytes = 0; - num_sgs = count_sgs(req->src, req->nbytes); - sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG); - - if (hw_bytes) { - p->hw_nbytes = hw_bytes; - p->complete = mv_hash_algo_completion; - p->process = mv_process_hash_current; - - if (unlikely(old_extra_bytes)) { - memcpy(cpg->sram + SRAM_DATA_IN_START, ctx->buffer, - old_extra_bytes); - p->crypt_len = old_extra_bytes; + if (unlikely(!hw_bytes)) { /* too little data for CESA */ + if (req->nbytes) { + p->src_sg = req->src; + p->sg_src_left = req->src->length; + copy_src_to_buf(p, ctx->buffer + old_extra_bytes, + req->nbytes); } - - mv_process_hash_current(1); - } else { - copy_src_to_buf(p, ctx->buffer + old_extra_bytes, - ctx->extra_bytes - old_extra_bytes); - sg_miter_stop(&p->src_sg_it); if (ctx->last_chunk) rc = mv_hash_final_fallback(req); else @@ -563,7 +735,60 @@ static void mv_start_new_hash_req(struct ahash_request *req) local_bh_disable(); req->base.complete(&req->base, rc); local_bh_enable(); + return; + } + + if (likely(req->nbytes)) { + BUG_ON(!req->src); + + if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) { + printk(KERN_ERR "%s: out of memory\n", __func__); + return; + } + p->sg_src_left = sg_dma_len(req->src); + p->src_sg = req->src; } + + p->hw_nbytes = hw_bytes; + p->complete = mv_hash_algo_completion; + + if (unlikely(old_extra_bytes)) { + dma_sync_single_for_device(cpg->dev, ctx->buffer_dma, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); + mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START, + ctx->buffer_dma, old_extra_bytes); + p->crypt_len = old_extra_bytes; + } + + setup_data_in(); + mv_init_hash_config(req); + mv_dma_separator(); + cpg->p.hw_processed_bytes += 
cpg->p.crypt_len; + while (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) { + cpg->p.crypt_len = 0; + + setup_data_in(); + mv_update_hash_config(req); + mv_dma_separator(); + cpg->p.hw_processed_bytes += cpg->p.crypt_len; + } + if (req->result) { + ctx->result_dma = dma_map_single(cpg->dev, req->result, + ctx->digestsize, DMA_FROM_DEVICE); + mv_dma_memcpy(ctx->result_dma, + cpg->sram_phys + SRAM_DIGEST_BUF, + ctx->digestsize); + } else { + /* XXX: this fixes some ugly register fuckup bug in the tdma engine + * (no need to sync since the data is ignored anyway) */ + mv_dma_memcpy(cpg->sa_sram_dma, + cpg->sram_phys + SRAM_CONFIG, 1); + } + + /* GO */ + mv_setup_timer(); + mv_dma_trigger(); + writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD); } static int queue_manag(void *data) @@ -686,6 +911,8 @@ static void mv_init_hash_req_ctx(struct mv_req_hash_ctx *ctx, int op, ctx->first_hash = 1; ctx->last_chunk = is_last; ctx->count_add = count_add; + ctx->buffer_dma = dma_map_single(cpg->dev, ctx->buffer, + SHA1_BLOCK_SIZE, DMA_TO_DEVICE); } static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last, @@ -885,10 +1112,14 @@ irqreturn_t crypto_int(int irq, void *priv) u32 val; val = readl(cpg->reg + SEC_ACCEL_INT_STATUS); - if (!(val & SEC_INT_ACCEL0_DONE)) + if (!(val & SEC_INT_ACC0_IDMA_DONE)) return IRQ_NONE; - val &= ~SEC_INT_ACCEL0_DONE; + if (!del_timer(&cpg->completion_timer)) { + printk(KERN_WARNING MV_CESA + "got an interrupt but no pending timer?\n"); + } + val &= ~SEC_INT_ACC0_IDMA_DONE; writel(val, cpg->reg + FPGA_INT_STATUS); writel(val, cpg->reg + SEC_ACCEL_INT_STATUS); BUG_ON(cpg->eng_st != ENGINE_BUSY); @@ -1028,6 +1259,7 @@ static int mv_probe(struct platform_device *pdev) } cp->sram_size = resource_size(res); cp->max_req_size = cp->sram_size - SRAM_CFG_SPACE; + cp->sram_phys = res->start; cp->sram = ioremap(res->start, cp->sram_size); if (!cp->sram) { ret = -ENOMEM; @@ -1043,6 +1275,7 @@ static int mv_probe(struct platform_device *pdev) 
platform_set_drvdata(pdev, cp); cpg = cp; + cpg->dev = &pdev->dev; cp->queue_th = kthread_run(queue_manag, cp, "mv_crypto"); if (IS_ERR(cp->queue_th)) { @@ -1061,15 +1294,30 @@ static int mv_probe(struct platform_device *pdev) if (!IS_ERR(cp->clk)) clk_prepare_enable(cp->clk); - writel(SEC_INT_ACCEL0_DONE, cpg->reg + SEC_ACCEL_INT_MASK); - writel(SEC_CFG_STOP_DIG_ERR, cpg->reg + SEC_ACCEL_CFG); + writel(0, cpg->reg + SEC_ACCEL_INT_STATUS); + writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK); + writel((SEC_CFG_STOP_DIG_ERR | SEC_CFG_CH0_W_IDMA | SEC_CFG_MP_CHAIN | + SEC_CFG_ACT_CH0_IDMA), cpg->reg + SEC_ACCEL_CFG); writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0); + cp->sa_sram_dma = dma_map_single(&pdev->dev, &cp->sa_sram, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + + if (init_dma_desclist(&cpg->desclist, &pdev->dev, + sizeof(u32), MV_DMA_ALIGN, 0)) { + ret = -ENOMEM; + goto err_mapping; + } + if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) { + printk(KERN_ERR MV_CESA "failed to initialise poolsize\n"); + goto err_pool; + } + ret = crypto_register_alg(&mv_aes_alg_ecb); if (ret) { printk(KERN_WARNING MV_CESA "Could not register aes-ecb driver\n"); - goto err_irq; + goto err_pool; } ret = crypto_register_alg(&mv_aes_alg_cbc); @@ -1096,7 +1344,11 @@ static int mv_probe(struct platform_device *pdev) return 0; err_unreg_ecb: crypto_unregister_alg(&mv_aes_alg_ecb); -err_irq: +err_pool: + fini_dma_desclist(&cpg->desclist); +err_mapping: + dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); free_irq(irq, cp); err_thread: kthread_stop(cp->queue_th); @@ -1123,6 +1375,9 @@ static int mv_remove(struct platform_device *pdev) crypto_unregister_ahash(&mv_hmac_sha1_alg); kthread_stop(cp->queue_th); free_irq(cp->irq, cp); + dma_unmap_single(&pdev->dev, cpg->sa_sram_dma, + sizeof(struct sec_accel_sram), DMA_TO_DEVICE); + fini_dma_desclist(&cpg->desclist); memset(cp->sram, 0, cp->sram_size); 
iounmap(cp->sram); iounmap(cp->reg); diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h index 08fcb11..866c437 100644 --- a/drivers/crypto/mv_cesa.h +++ b/drivers/crypto/mv_cesa.h @@ -24,6 +24,7 @@ #define SEC_CFG_CH1_W_IDMA (1 << 8) #define SEC_CFG_ACT_CH0_IDMA (1 << 9) #define SEC_CFG_ACT_CH1_IDMA (1 << 10) +#define SEC_CFG_MP_CHAIN (1 << 11) #define SEC_ACCEL_STATUS 0xde0c #define SEC_ST_ACT_0 (1 << 0) diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c new file mode 100644 index 0000000..dd1ce02 --- /dev/null +++ b/drivers/crypto/mv_dma.c @@ -0,0 +1,520 @@ +/* + * Support for Marvell's IDMA/TDMA engines found on Orion/Kirkwood chips, + * used exclusively by the CESA crypto accelerator. + * + * Based on unpublished code for IDMA written by Sebastian Siewior. + * + * Copyright (C) 2012 Phil Sutter <phil.sutter@viprinet.com> + * License: GPLv2 + */ + +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/dmapool.h> +#include <linux/interrupt.h> +#include <linux/module.h> +#include <linux/clk.h> +#include <linux/slab.h> +#include <linux/platform_device.h> +#include <linux/mbus.h> +#include <plat/mv_dma.h> + +#include "mv_dma.h" +#include "dma_desclist.h" + +#define MV_DMA "MV-DMA: " + +#define MV_DMA_INIT_POOLSIZE 16 +#define MV_DMA_ALIGN 16 + +struct mv_dma_desc { + u32 count; + u32 src; + u32 dst; + u32 next; +} __attribute__((packed)); + +struct mv_dma_priv { + bool idma_registered, tdma_registered; + struct device *dev; + void __iomem *reg; + int irq; + struct clk *clk; + /* protecting the dma descriptors and stuff */ + spinlock_t lock; + struct dma_desclist desclist; + u32 (*print_and_clear_irq)(void); +} tpg; + +#define ITEM(x) ((struct mv_dma_desc *)DESCLIST_ITEM(tpg.desclist, x)) +#define ITEM_DMA(x) DESCLIST_ITEM_DMA(tpg.desclist, x) + +typedef u32 (*print_and_clear_irq)(void); +typedef void (*deco_win_setter)(void __iomem *, int, int, int, int, int); + + +static inline void wait_for_dma_idle(void) +{ + while 
(readl(tpg.reg + DMA_CTRL) & DMA_CTRL_ACTIVE) + mdelay(100); +} + +static inline void switch_dma_engine(bool state) +{ + u32 val = readl(tpg.reg + DMA_CTRL); + + val |= ( state * DMA_CTRL_ENABLE); + val &= ~(!state * DMA_CTRL_ENABLE); + + writel(val, tpg.reg + DMA_CTRL); +} + +static struct mv_dma_desc *get_new_last_desc(void) +{ + if (unlikely(DESCLIST_FULL(tpg.desclist)) && + set_dma_desclist_size(&tpg.desclist, tpg.desclist.length << 1)) { + printk(KERN_ERR MV_DMA "failed to increase DMA pool to %lu\n", + tpg.desclist.length << 1); + return NULL; + } + + if (likely(tpg.desclist.usage)) + ITEM(tpg.desclist.usage - 1)->next = + ITEM_DMA(tpg.desclist.usage); + + return ITEM(tpg.desclist.usage++); +} + +static inline void mv_dma_desc_dump(void) +{ + struct mv_dma_desc *tmp; + int i; + + if (!tpg.desclist.usage) { + printk(KERN_WARNING MV_DMA "DMA descriptor list is empty\n"); + return; + } + + printk(KERN_WARNING MV_DMA "DMA descriptor list:\n"); + for (i = 0; i < tpg.desclist.usage; i++) { + tmp = ITEM(i); + printk(KERN_WARNING MV_DMA "entry %d at 0x%x: dma addr 0x%x, " + "src 0x%x, dst 0x%x, count %u, own %d, next 0x%x", i, + (u32)tmp, ITEM_DMA(i) , tmp->src, tmp->dst, + tmp->count & DMA_BYTE_COUNT_MASK, !!(tmp->count & DMA_OWN_BIT), + tmp->next); + } +} + +static inline void mv_dma_reg_dump(void) +{ +#define PRINTREG(offset) \ + printk(KERN_WARNING MV_DMA "tpg.reg + " #offset " = 0x%x\n", \ + readl(tpg.reg + offset)) + + PRINTREG(DMA_CTRL); + PRINTREG(DMA_BYTE_COUNT); + PRINTREG(DMA_SRC_ADDR); + PRINTREG(DMA_DST_ADDR); + PRINTREG(DMA_NEXT_DESC); + PRINTREG(DMA_CURR_DESC); + +#undef PRINTREG +} + +static inline void mv_dma_clear_desc_reg(void) +{ + writel(0, tpg.reg + DMA_BYTE_COUNT); + writel(0, tpg.reg + DMA_SRC_ADDR); + writel(0, tpg.reg + DMA_DST_ADDR); + writel(0, tpg.reg + DMA_CURR_DESC); + writel(0, tpg.reg + DMA_NEXT_DESC); +} + +void mv_dma_clear(void) +{ + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + /* make sure engine is idle */ + 
wait_for_dma_idle(); + switch_dma_engine(0); + wait_for_dma_idle(); + + /* clear descriptor registers */ + mv_dma_clear_desc_reg(); + + tpg.desclist.usage = 0; + + switch_dma_engine(1); + + /* finally free system lock again */ + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_clear); + +void mv_dma_trigger(void) +{ + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + writel(ITEM_DMA(0), tpg.reg + DMA_NEXT_DESC); + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_trigger); + +void mv_dma_separator(void) +{ + struct mv_dma_desc *tmp; + + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + tmp = get_new_last_desc(); + memset(tmp, 0, sizeof(*tmp)); + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_separator); + +void mv_dma_memcpy(dma_addr_t dst, dma_addr_t src, unsigned int size) +{ + struct mv_dma_desc *tmp; + + if (!tpg.dev) + return; + + spin_lock(&tpg.lock); + + tmp = get_new_last_desc(); + tmp->count = size | DMA_OWN_BIT; + tmp->src = src; + tmp->dst = dst; + tmp->next = 0; + + spin_unlock(&tpg.lock); +} +EXPORT_SYMBOL_GPL(mv_dma_memcpy); + +static u32 idma_print_and_clear_irq(void) +{ + u32 val, val2, addr; + + val = readl(tpg.reg + IDMA_INT_CAUSE); + val2 = readl(tpg.reg + IDMA_ERR_SELECT); + addr = readl(tpg.reg + IDMA_ERR_ADDR); + + if (val & IDMA_INT_MISS(0)) + printk(KERN_ERR MV_DMA "%s: address miss @%x!\n", + __func__, val2 & IDMA_INT_MISS(0) ? addr : 0); + if (val & IDMA_INT_APROT(0)) + printk(KERN_ERR MV_DMA "%s: access protection @%x!\n", + __func__, val2 & IDMA_INT_APROT(0) ? addr : 0); + if (val & IDMA_INT_WPROT(0)) + printk(KERN_ERR MV_DMA "%s: write protection @%x!\n", + __func__, val2 & IDMA_INT_WPROT(0) ? 
addr : 0); + + /* clear interrupt cause register */ + writel(0, tpg.reg + IDMA_INT_CAUSE); + + return val; +} + +static u32 tdma_print_and_clear_irq(void) +{ + u32 val; + + val = readl(tpg.reg + TDMA_ERR_CAUSE); + + if (val & TDMA_INT_MISS) + printk(KERN_ERR MV_DMA "%s: miss!\n", __func__); + if (val & TDMA_INT_DOUBLE_HIT) + printk(KERN_ERR MV_DMA "%s: double hit!\n", __func__); + if (val & TDMA_INT_BOTH_HIT) + printk(KERN_ERR MV_DMA "%s: both hit!\n", __func__); + if (val & TDMA_INT_DATA_ERROR) + printk(KERN_ERR MV_DMA "%s: data error!\n", __func__); + + /* clear error cause register */ + writel(0, tpg.reg + TDMA_ERR_CAUSE); + + return val; +} + +irqreturn_t mv_dma_int(int irq, void *priv) +{ + int handled; + + handled = (*tpg.print_and_clear_irq)(); + + if (handled) { + mv_dma_reg_dump(); + mv_dma_desc_dump(); + } + + switch_dma_engine(0); + wait_for_dma_idle(); + + /* clear descriptor registers */ + mv_dma_clear_desc_reg(); + + switch_dma_engine(1); + wait_for_dma_idle(); + + return (handled ? 
IRQ_HANDLED : IRQ_NONE); +} + +static void tdma_set_deco_win(void __iomem *regs, int chan, + int target, int attr, int base, int size) +{ + u32 val; + + writel(DMA_DECO_ADDR_MASK(base), regs + TDMA_DECO_BAR(chan)); + + val = TDMA_WCR_ENABLE; + val |= TDMA_WCR_TARGET(target); + val |= TDMA_WCR_ATTR(attr); + val |= DMA_DECO_SIZE_MASK(size); + writel(val, regs + TDMA_DECO_WCR(chan)); +} + +static void idma_set_deco_win(void __iomem *regs, int chan, + int target, int attr, int base, int size) +{ + u32 val; + + /* setup window parameters */ + val = IDMA_BAR_TARGET(target); + val |= IDMA_BAR_ATTR(attr); + val |= DMA_DECO_ADDR_MASK(base); + writel(val, regs + IDMA_DECO_BAR(chan)); + + /* window size goes to a separate register */ + writel(DMA_DECO_SIZE_MASK(size), regs + IDMA_DECO_SIZE(chan)); + + /* set the channel to enabled */ + val = readl(regs + IDMA_DECO_ENABLE); + val &= ~(1 << chan); + writel(val, regs + IDMA_DECO_ENABLE); + + /* allow RW access from all other windows */ + writel(0xffff, regs + IDMA_DECO_PROT(chan)); +} + +static void setup_mbus_windows(void __iomem *regs, struct mv_dma_pdata *pdata, + deco_win_setter win_setter) +{ + int chan; + const struct mbus_dram_target_info *dram; + + dram = mv_mbus_dram_info(); + for (chan = 0; chan < dram->num_cs; chan++) { + const struct mbus_dram_window *cs = &dram->cs[chan]; + + (*win_setter)(regs, chan, dram->mbus_dram_target_id, + cs->mbus_attr, cs->base, cs->size); + } + if (pdata) { + /* Need to add a decoding window for SRAM access. + * This is needed only on IDMA, since every address + * is looked up. But not allowed on TDMA, since it + * errors if source and dest are in different windows. + * + * Size is in 64k granularity, max SRAM size is 8k - + * so a single "unit" easily suffices. 
+ */ + (*win_setter)(regs, chan, pdata->sram_target_id, + pdata->sram_attr, pdata->sram_base, 1 << 16); + } +} + +/* initialise the global tpg structure */ +static int mv_init_engine(struct platform_device *pdev, u32 ctrl_init_val, + print_and_clear_irq pc_irq, deco_win_setter win_setter) +{ + struct resource *res; + void __iomem *deco; + int rc; + + if (tpg.dev) { + printk(KERN_ERR MV_DMA "second DMA device?!\n"); + return -ENXIO; + } + tpg.dev = &pdev->dev; + tpg.print_and_clear_irq = pc_irq; + + /* setup address decoding */ + res = platform_get_resource_byname(pdev, + IORESOURCE_MEM, "regs deco"); + if (!res) + return -ENXIO; + if (!(deco = ioremap(res->start, resource_size(res)))) + return -ENOMEM; + setup_mbus_windows(deco, pdev->dev.platform_data, win_setter); + iounmap(deco); + + /* get register start address */ + res = platform_get_resource_byname(pdev, + IORESOURCE_MEM, "regs control and error"); + if (!res) + return -ENXIO; + if (!(tpg.reg = ioremap(res->start, resource_size(res)))) + return -ENOMEM; + + /* get the IRQ */ + tpg.irq = platform_get_irq(pdev, 0); + if (tpg.irq < 0 || tpg.irq == NO_IRQ) { + rc = -ENXIO; + goto out_unmap_reg; + } + + /* Not all platforms can gate the clock, so it is not + an error if the clock does not exists. 
*/ + tpg.clk = clk_get(&pdev->dev, NULL); + if (!IS_ERR(tpg.clk)) + clk_prepare_enable(tpg.clk); + + /* initialise DMA descriptor list */ + if (init_dma_desclist(&tpg.desclist, tpg.dev, + sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) { + rc = -ENOMEM; + goto out_unmap_reg; + } + if (set_dma_desclist_size(&tpg.desclist, MV_DMA_INIT_POOLSIZE)) { + rc = -ENOMEM; + goto out_free_desclist; + } + + platform_set_drvdata(pdev, &tpg); + + spin_lock_init(&tpg.lock); + + switch_dma_engine(0); + wait_for_dma_idle(); + + /* clear descriptor registers */ + mv_dma_clear_desc_reg(); + + /* initialize control register (also enables engine) */ + writel(ctrl_init_val, tpg.reg + DMA_CTRL); + wait_for_dma_idle(); + + if (request_irq(tpg.irq, mv_dma_int, IRQF_DISABLED, + dev_name(tpg.dev), &tpg)) { + rc = -ENXIO; + goto out_free_all; + } + + return 0; + +out_free_all: + switch_dma_engine(0); + platform_set_drvdata(pdev, NULL); +out_free_desclist: + fini_dma_desclist(&tpg.desclist); +out_unmap_reg: + iounmap(tpg.reg); + tpg.dev = NULL; + return rc; +} + +static int mv_remove(struct platform_device *pdev) +{ + switch_dma_engine(0); + platform_set_drvdata(pdev, NULL); + fini_dma_desclist(&tpg.desclist); + free_irq(tpg.irq, &tpg); + iounmap(tpg.reg); + + if (!IS_ERR(tpg.clk)) { + clk_disable_unprepare(tpg.clk); + clk_put(tpg.clk); + } + + tpg.dev = NULL; + return 0; +} + +static int mv_probe_tdma(struct platform_device *pdev) +{ + int rc; + + rc = mv_init_engine(pdev, TDMA_CTRL_INIT_VALUE, + &tdma_print_and_clear_irq, &tdma_set_deco_win); + if (rc) + return rc; + + /* have an ear for occurring errors */ + writel(TDMA_INT_ALL, tpg.reg + TDMA_ERR_MASK); + writel(0, tpg.reg + TDMA_ERR_CAUSE); + + printk(KERN_INFO MV_DMA + "TDMA engine up and running, IRQ %d\n", tpg.irq); + return 0; +} + +static int mv_probe_idma(struct platform_device *pdev) +{ + int rc; + + rc = mv_init_engine(pdev, IDMA_CTRL_INIT_VALUE, + &idma_print_and_clear_irq, &idma_set_deco_win); + if (rc) + return rc; + + /* have an 
ear for occurring errors */ + writel(IDMA_INT_MISS(0) | IDMA_INT_APROT(0) | IDMA_INT_WPROT(0), + tpg.reg + IDMA_INT_MASK); + writel(0, tpg.reg + IDMA_INT_CAUSE); + + printk(KERN_INFO MV_DMA + "IDMA engine up and running, IRQ %d\n", tpg.irq); + return 0; +} + +static struct platform_driver marvell_tdma = { + .probe = mv_probe_tdma, + .remove = mv_remove, + .driver = { + .owner = THIS_MODULE, + .name = "mv_tdma", + }, +}, marvell_idma = { + .probe = mv_probe_idma, + .remove = mv_remove, + .driver = { + .owner = THIS_MODULE, + .name = "mv_idma", + }, +}; +MODULE_ALIAS("platform:mv_tdma"); +MODULE_ALIAS("platform:mv_idma"); + +static int __init mv_dma_init(void) +{ + tpg.tdma_registered = !platform_driver_register(&marvell_tdma); + tpg.idma_registered = !platform_driver_register(&marvell_idma); + return !(tpg.tdma_registered || tpg.idma_registered); +} +module_init(mv_dma_init); + +static void __exit mv_dma_exit(void) +{ + if (tpg.tdma_registered) + platform_driver_unregister(&marvell_tdma); + if (tpg.idma_registered) + platform_driver_unregister(&marvell_idma); +} +module_exit(mv_dma_exit); + +MODULE_AUTHOR("Phil Sutter <phil.sutter@viprinet.com>"); +MODULE_DESCRIPTION("Support for Marvell's IDMA/TDMA engines"); +MODULE_LICENSE("GPL"); + diff --git a/drivers/crypto/mv_dma.h b/drivers/crypto/mv_dma.h new file mode 100644 index 0000000..1d8d5df --- /dev/null +++ b/drivers/crypto/mv_dma.h @@ -0,0 +1,150 @@ +#ifndef _MV_DMA_H +#define _MV_DMA_H + +/* common TDMA_CTRL/IDMA_CTRL_LOW bits */ +#define DMA_CTRL_DST_BURST(x) (x) +#define DMA_CTRL_SRC_BURST(x) (x << 6) +#define DMA_CTRL_NO_CHAIN_MODE (1 << 9) +#define DMA_CTRL_ENABLE (1 << 12) +#define DMA_CTRL_FETCH_ND (1 << 13) +#define DMA_CTRL_ACTIVE (1 << 14) + +/* TDMA_CTRL register bits */ +#define TDMA_CTRL_DST_BURST_32 DMA_CTRL_DST_BURST(3) +#define TDMA_CTRL_DST_BURST_128 DMA_CTRL_DST_BURST(4) +#define TDMA_CTRL_OUTST_RD_EN (1 << 4) +#define TDMA_CTRL_SRC_BURST_32 DMA_CTRL_SRC_BURST(3) +#define TDMA_CTRL_SRC_BURST_128 
DMA_CTRL_SRC_BURST(4) +#define TDMA_CTRL_NO_BYTE_SWAP (1 << 11) + +#define TDMA_CTRL_INIT_VALUE ( \ + TDMA_CTRL_DST_BURST_128 | TDMA_CTRL_SRC_BURST_128 | \ + TDMA_CTRL_NO_BYTE_SWAP | DMA_CTRL_ENABLE \ +) + +/* IDMA_CTRL_LOW register bits */ +#define IDMA_CTRL_DST_BURST_8 DMA_CTRL_DST_BURST(0) +#define IDMA_CTRL_DST_BURST_16 DMA_CTRL_DST_BURST(1) +#define IDMA_CTRL_DST_BURST_32 DMA_CTRL_DST_BURST(3) +#define IDMA_CTRL_DST_BURST_64 DMA_CTRL_DST_BURST(7) +#define IDMA_CTRL_DST_BURST_128 DMA_CTRL_DST_BURST(4) +#define IDMA_CTRL_SRC_HOLD (1 << 3) +#define IDMA_CTRL_DST_HOLD (1 << 5) +#define IDMA_CTRL_SRC_BURST_8 DMA_CTRL_SRC_BURST(0) +#define IDMA_CTRL_SRC_BURST_16 DMA_CTRL_SRC_BURST(1) +#define IDMA_CTRL_SRC_BURST_32 DMA_CTRL_SRC_BURST(3) +#define IDMA_CTRL_SRC_BURST_64 DMA_CTRL_SRC_BURST(7) +#define IDMA_CTRL_SRC_BURST_128 DMA_CTRL_SRC_BURST(4) +#define IDMA_CTRL_INT_MODE (1 << 10) +#define IDMA_CTRL_BLOCK_MODE (1 << 11) +#define IDMA_CTRL_CLOSE_DESC (1 << 17) +#define IDMA_CTRL_ABORT (1 << 20) +#define IDMA_CTRL_SADDR_OVR(x) (x << 21) +#define IDMA_CTRL_NO_SADDR_OVR IDMA_CTRL_SADDR_OVR(0) +#define IDMA_CTRL_SADDR_OVR_1 IDMA_CTRL_SADDR_OVR(1) +#define IDMA_CTRL_SADDR_OVR_2 IDMA_CTRL_SADDR_OVR(2) +#define IDMA_CTRL_SADDR_OVR_3 IDMA_CTRL_SADDR_OVR(3) +#define IDMA_CTRL_DADDR_OVR(x) (x << 23) +#define IDMA_CTRL_NO_DADDR_OVR IDMA_CTRL_DADDR_OVR(0) +#define IDMA_CTRL_DADDR_OVR_1 IDMA_CTRL_DADDR_OVR(1) +#define IDMA_CTRL_DADDR_OVR_2 IDMA_CTRL_DADDR_OVR(2) +#define IDMA_CTRL_DADDR_OVR_3 IDMA_CTRL_DADDR_OVR(3) +#define IDMA_CTRL_NADDR_OVR(x) (x << 25) +#define IDMA_CTRL_NO_NADDR_OVR IDMA_CTRL_NADDR_OVR(0) +#define IDMA_CTRL_NADDR_OVR_1 IDMA_CTRL_NADDR_OVR(1) +#define IDMA_CTRL_NADDR_OVR_2 IDMA_CTRL_NADDR_OVR(2) +#define IDMA_CTRL_NADDR_OVR_3 IDMA_CTRL_NADDR_OVR(3) +#define IDMA_CTRL_DESC_MODE_16M (1 << 31) + +#define IDMA_CTRL_INIT_VALUE ( \ + IDMA_CTRL_DST_BURST_128 | IDMA_CTRL_SRC_BURST_128 | \ + IDMA_CTRL_INT_MODE | IDMA_CTRL_BLOCK_MODE | \ + DMA_CTRL_ENABLE | 
IDMA_CTRL_DESC_MODE_16M \ +) + +/* TDMA_ERR_CAUSE bits */ +#define TDMA_INT_MISS (1 << 0) +#define TDMA_INT_DOUBLE_HIT (1 << 1) +#define TDMA_INT_BOTH_HIT (1 << 2) +#define TDMA_INT_DATA_ERROR (1 << 3) +#define TDMA_INT_ALL 0x0f + +/* address decoding registers, starting at "regs deco" */ +#define TDMA_DECO_BAR(chan) (0x00 + (chan) * 8) +#define TDMA_DECO_WCR(chan) (0x04 + (chan) * 8) + +#define IDMA_DECO_BAR(chan) TDMA_DECO_BAR(chan) +#define IDMA_DECO_SIZE(chan) (0x04 + (chan) * 8) +#define IDMA_DECO_REMAP(chan) (0x60 + (chan) * 4) +#define IDMA_DECO_PROT(chan) (0x70 + (chan) * 4) +#define IDMA_DECO_ENABLE 0x80 /* bit field, zero enables */ + +/* decoding address and size masks */ +#define DMA_DECO_ADDR_MASK(x) ((x) & 0xffff0000) +#define DMA_DECO_SIZE_MASK(x) DMA_DECO_ADDR_MASK((x) - 1) + +/* TDMA_DECO_WCR fields */ +#define TDMA_WCR_ENABLE 0x01 +#define TDMA_WCR_TARGET(x) (((x) & 0x0f) << 4) +#define TDMA_WCR_ATTR(x) (((x) & 0xff) << 8) + +/* IDMA_DECO_BAR fields */ +#define IDMA_BAR_TARGET(x) ((x) & 0x0f) +#define IDMA_BAR_ATTR(x) (((x) & 0xff) << 8) + +/* offsets of registers, starting at "regs control and error" */ +#define TDMA_BYTE_COUNT 0x00 +#define TDMA_SRC_ADDR 0x10 +#define TDMA_DST_ADDR 0x20 +#define TDMA_NEXT_DESC 0x30 +#define TDMA_CTRL 0x40 +#define TDMA_CURR_DESC 0x70 +#define TDMA_ERR_CAUSE 0xc8 +#define TDMA_ERR_MASK 0xcc + +#define IDMA_BYTE_COUNT(chan) (0x00 + (chan) * 4) +#define IDMA_SRC_ADDR(chan) (0x10 + (chan) * 4) +#define IDMA_DST_ADDR(chan) (0x20 + (chan) * 4) +#define IDMA_NEXT_DESC(chan) (0x30 + (chan) * 4) +#define IDMA_CTRL_LOW(chan) (0x40 + (chan) * 4) +#define IDMA_CURR_DESC(chan) (0x70 + (chan) * 4) +#define IDMA_CTRL_HIGH(chan) (0x80 + (chan) * 4) +#define IDMA_INT_CAUSE (0xc0) +#define IDMA_INT_MASK (0xc4) +#define IDMA_ERR_ADDR (0xc8) +#define IDMA_ERR_SELECT (0xcc) + +/* register offsets common to TDMA and IDMA channel 0 */ +#define DMA_BYTE_COUNT TDMA_BYTE_COUNT +#define DMA_SRC_ADDR TDMA_SRC_ADDR +#define DMA_DST_ADDR 
TDMA_DST_ADDR +#define DMA_NEXT_DESC TDMA_NEXT_DESC +#define DMA_CTRL TDMA_CTRL +#define DMA_CURR_DESC TDMA_CURR_DESC + +/* IDMA_INT_CAUSE and IDMA_INT_MASK bits */ +#define IDMA_INT_COMP(chan) ((1 << 0) << ((chan) * 8)) +#define IDMA_INT_MISS(chan) ((1 << 1) << ((chan) * 8)) +#define IDMA_INT_APROT(chan) ((1 << 2) << ((chan) * 8)) +#define IDMA_INT_WPROT(chan) ((1 << 3) << ((chan) * 8)) +#define IDMA_INT_OWN(chan) ((1 << 4) << ((chan) * 8)) +#define IDMA_INT_ALL(chan) (0x1f << (chan) * 8) + +/* Owner bit in DMA_BYTE_COUNT and descriptors' count field, used + * to signal input data completion in descriptor chain */ +#define DMA_OWN_BIT (1 << 31) + +/* IDMA also has a "Left Byte Count" bit, + * indicating not everything was transfered */ +#define IDMA_LEFT_BYTE_COUNT (1 << 30) + +/* filter the actual byte count value from the DMA_BYTE_COUNT field */ +#define DMA_BYTE_COUNT_MASK (~(DMA_OWN_BIT | IDMA_LEFT_BYTE_COUNT)) + +extern void mv_dma_memcpy(dma_addr_t, dma_addr_t, unsigned int); +extern void mv_dma_separator(void); +extern void mv_dma_clear(void); +extern void mv_dma_trigger(void); + + +#endif /* _MV_DMA_H */ ^ permalink raw reply related [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-25 14:25 ` cloudy.linux 2012-06-25 14:36 ` Phil Sutter @ 2012-06-25 16:05 ` cloudy.linux 2012-06-25 21:59 ` Phil Sutter 1 sibling, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-06-25 16:05 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz On 2012-6-25 22:25, cloudy.linux wrote: > On 2012-6-25 21:40, Phil Sutter wrote: >> Hi, >> >> On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote: >>> PS: I am currently working at the address decoding problem, will get >>> back to in a few days when I have something to test. So stay tuned! >> >> I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with >> code setting the decoding windows. I hope this fixes the issues on >> orion. I decided not to publish the changes regarding the second DMA >> channel for now, as this seems to be support for a second crypto session >> (handled consecutively, so no real improvement) which is not supported >> anyway. >> >> Greetings, Phil >> >> >> Phil Sutter >> Software Engineer >> > > Thanks Phil. I'm cloning your git now but the speed is really slow. Last > time I tried to do this but had to cancel after hours of downloading (at > only about 20% progress). So the previous tests were actually done with > 3.5-rc3 (I tried the up-to-date Linus' linux-git, but met compiling > problem), of course with your patch and Simon's. Could you provide a > diff based on your last round patch (diff to the not patched kernel > should also be good, I think)? > > In the mean time, I'm still trying with a cloning speed of 5KiB/s ... > > Regards > Cloudy Hi Phil This time the machine can't finish the boot again and the console was flooded by the message like below: ... MV-DMA: IDMA engine up and running, IRQ 23 MV-DMA: idma_print_and_clear_irq: address miss @0! 
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? MV-DMA: idma_print_and_clear_irq: address miss @0! MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? MV-DMA: idma_print_and_clear_irq: address miss @0! 
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? MV-DMA: idma_print_and_clear_irq: address miss @0! MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 Also, I had to make some modifications to the arch/arm/mach-orion5x/common.c to let it compile successfully: 1 Add including of mv_dma.h 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c, so I think the clean solution should be modify the addr-map.h? Anyway, as a quick solution the source finally got compiled) Regards Cloudy ^ permalink raw reply [flat|nested] 67+ messages in thread
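For reference when reading these dumps: each printed entry occupies 16 bytes (note the descriptor DMA addresses advancing in 0x10 steps). A hypothetical C rendering of that four-word layout, with field names chosen for illustration rather than taken from the driver:

```c
#include <stdint.h>

/* Hypothetical 16-byte IDMA/TDMA descriptor matching the dump columns:
 * count (owner bit in bit 31), src, dst, next.  A next pointer of 0
 * terminates the chain, as in "entry 3 ... next 0x0" above. */
struct mv_dma_desc {
	uint32_t byte_count; /* bits 0..29: byte count, bit 31: owner */
	uint32_t src_addr;   /* DMA address of source buffer */
	uint32_t dst_addr;   /* DMA address of destination buffer */
	uint32_t next_desc;  /* DMA address of the next descriptor */
};
```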
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-25 16:05 ` cloudy.linux @ 2012-06-25 21:59 ` Phil Sutter 2012-06-26 11:24 ` cloudy.linux 2012-06-30 7:35 ` cloudy.linux 0 siblings, 2 replies; 67+ messages in thread From: Phil Sutter @ 2012-06-25 21:59 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz [-- Attachment #1: Type: text/plain, Size: 1703 bytes --] Hi, On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote: > This time the machine can't finish the boot again and the console was > flooded by the message like below: Oh well. I decided to drop that BUG_ON() again, since I saw it once being triggered while in interrupt context. But since the error is non-recoverable anyway, I guess it may stay there as well. > Also, I had to make some modifications to the > arch/arm/mach-orion5x/common.c to let it compile successfully: > 1 Add including of mv_dma.h > 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c, > so I think the clean solution should be modify the addr-map.h? Anyway, > as a quick solution the source finally got compiled) Hmm, yeah. Test-compiling for the platform one is writing code for is still a good idea. But it's even worse than that: according to the specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU. Please apply the attached patch on top of the one I sent earlier, without your modifications (the necessary parts are contained in it). Also, I've added some log output to the decode window setter, so we see what's going on there. Anyway, thanks a lot for your help so far! I hope the next try shows some progress at least. Greetings, Phil Phil Sutter Software Engineer -- Viprinet GmbH Mainzer Str.
43 55411 Bingen am Rhein Germany Phone/Zentrale: +49-6721-49030-0 Direct line/Durchwahl: +49-6721-49030-134 Fax: +49-6721-49030-209 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein Commercial register/Handelsregister: Amtsgericht Mainz HRB40380 CEO/Geschäftsführer: Simon Kissel [-- Attachment #2: mv_dma_fixup.diff --] [-- Type: text/plain, Size: 2202 bytes --] diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c index 4734231..45450af 100644 --- a/arch/arm/mach-orion5x/common.c +++ b/arch/arm/mach-orion5x/common.c @@ -35,6 +35,7 @@ #include <plat/time.h> #include <plat/common.h> #include <plat/addr-map.h> +#include <plat/mv_dma.h> #include "common.h" /***************************************************************************** @@ -203,7 +204,7 @@ static struct resource orion_idma_res[] = { static u64 mv_idma_dma_mask = DMA_BIT_MASK(32); static struct mv_dma_pdata mv_idma_pdata = { - .sram_target_id = TARGET_SRAM, + .sram_target_id = 5, .sram_attr = 0, .sram_base = ORION5X_SRAM_PHYS_BASE, }; diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c index b75fdf5..4f48c63 100644 --- a/drivers/crypto/mv_cesa.c +++ b/drivers/crypto/mv_cesa.c @@ -195,6 +195,7 @@ static void mv_completion_timer_callback(unsigned long unused) if (count < 0) { printk(KERN_ERR MV_CESA "%s: engine reset timed out!\n", __func__); + BUG(); } cpg->eng_st = ENGINE_W_DEQUEUE; wake_up_process(cpg->queue_th); diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c index dd1ce02..176976e 100644 --- a/drivers/crypto/mv_dma.c +++ b/drivers/crypto/mv_dma.c @@ -318,6 +318,8 @@ static void setup_mbus_windows(void __iomem *regs, struct mv_dma_pdata *pdata, for (chan = 0; chan < dram->num_cs; chan++) { const struct mbus_dram_window *cs = &dram->cs[chan]; + printk(KERN_INFO MV_DMA "window at bar%d: target %d, attr %d, base %x, size %x\n", + chan, dram->mbus_dram_target_id, cs->mbus_attr, cs->base, cs->size); 
(*win_setter)(regs, chan, dram->mbus_dram_target_id, cs->mbus_attr, cs->base, cs->size); } @@ -330,6 +332,8 @@ static void setup_mbus_windows(void __iomem *regs, struct mv_dma_pdata *pdata, * Size is in 64k granularity, max SRAM size is 8k - * so a single "unit" easily suffices. */ + printk(KERN_INFO MV_DMA "window at bar%d: target %d, attr %d, base %x, size %x\n", + chan, pdata->sram_target_id, pdata->sram_attr, pdata->sram_base, 1 << 16); (*win_setter)(regs, chan, pdata->sram_target_id, pdata->sram_attr, pdata->sram_base, 1 << 16); } ^ permalink raw reply related [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-25 21:59 ` Phil Sutter @ 2012-06-26 11:24 ` cloudy.linux 2012-06-30 7:35 ` cloudy.linux 1 sibling, 0 replies; 67+ messages in thread From: cloudy.linux @ 2012-06-26 11:24 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz On 2012-6-26 5:59, Phil Sutter wrote: > Hi, > > On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote: >> This time the machine can't finish the boot again and the console was >> flooded by the message like below: > > Oh well. I decided to drop that BUG_ON() again, since I saw it once > being triggered while in interrupt context. But since the error is > non-recovering anyway, I guess it may stay there as well. > >> Also, I had to make some modifications to the >> arch/arm/mach-orion5x/common.c to let it compile successfully: >> 1 Add including of mv_dma.h >> 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c, >> so I think the clean solution should be modify the addr-map.h? Anyway, >> as a quick solution the source finally got compiled) > > Hmm, yeah. Test-compiling for the platform one is writing code for is > still a good idea. But it's even worse than that: according to the > specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU. > > Please apply the attached patch on top of the one I sent earlier, > without your modifications (the necessary parts are contained in it). > Also, I've added some log output to the decode window setter, so we see > what's going on there. > > Anyway, thanks a lot for your help so far! I hope next try shows some > progress at least. > > Greetings, Phil > > > Phil Sutter > Software Engineer > Kernel message after applying the latest patch: MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000 MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000 MV-DMA: IDMA engine up and running, IRQ 23 MV-DMA: idma_print_and_clear_irq: address miss @0! 
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? ------------[ cut here ]------------ kernel BUG at drivers/crypto/mv_cesa.c:1126! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 Not tainted (3.5.0-rc4+ #2) pc : [<c01dfcd0>] lr : [<c0015c20>] psr: 20000093 sp : c79b9e58 ip : c79b9da8 fp : c79b9e6c r10: c02f4184 r9 : c0308342 r8 : 0000001c r7 : 00000000 r6 : 00000000 r5 : c03149a4 r4 : 00000002 r3 : c799c200 r2 : 0000de20 r1 : fdd90000 r0 : fdd90000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: a005317f Table: 00004000 DAC: 00000017 Process mv_crypto (pid: 276, stack limit = 0xc79b8270) Stack: (0xc79b9e58 to 0xc79ba000) 9e40: c79afc40 0000001c 9e60: c79b9ea4 c79b9e70 c0047aa8 c01dfc48 c788929c 4bfb2bc9 00000000 c02f4184 9e80: 0000001c 00000000 c79b9f4c c79bbe18 c02eec18 c02f10a0 c79b9ebc c79b9ea8 9ea0: c0047c58 c0047a64 00022000 c02f4184 c79b9ed4 c79b9ec0 c0049f60 c0047c38 9ec0: c0049ed8 c0301758 c79b9ee4 c79b9ed8 c00473e4 c0049ee8 c79b9f04 c79b9ee8 9ee0: c000985c c00473c4 c01debf4 c025dfdc a0000013 fdd20200 c79b9f14 c79b9f08 9f00: c0008170 c0009834 c79b9f6c c79b9f18 c0008c14 c0008170 00000000 00000001 9f20: c79b9f60 c78897a0 c03149a4 c799c200 c7936cc0 c79ac780 c79bbe18 c02eec18 9f40: c02f10a0 c79b9f6c c79b9f70 c79b9f60 c01debf4 c025dfdc 
a0000013 ffffffff 9f60: c79b9fbc c79b9f70 c01debf4 c025dfd0 00000000 c79b9f80 c025dd54 c0035418 9f80: c78897a0 c7827de8 c79b8000 c02eec18 00000013 c7827de8 c799c200 c01dea10 9fa0: 00000013 00000000 00000000 00000000 c79b9ff4 c79b9fc0 c002de90 c01dea20 9fc0: c7827de8 00000000 c799c200 00000000 c79b9fd0 c79b9fd0 00000000 c7827de8 9fe0: c002de00 c001877c 00000000 c79b9ff8 c001877c c002de10 01e6e7fe 01e6e7ff Backtrace: Function entered at [<c01dfc38>] from [<c0047aa8>] r5:0000001c r4:c79afc40 Function entered at [<c0047a54>] from [<c0047c58>] Function entered at [<c0047c28>] from [<c0049f60>] r4:c02f4184 r3:00022000 Function entered at [<c0049ed8>] from [<c00473e4>] r4:c0301758 r3:c0049ed8 Function entered at [<c00473b4>] from [<c000985c>] Function entered at [<c0009824>] from [<c0008170>] r6:fdd20200 r5:a0000013 r4:c025dfdc r3:c01debf4 Function entered at [<c0008160>] from [<c0008c14>] Exception stack(0xc79b9f18 to 0xc79b9f60) 9f00: 00000000 00000001 9f20: c79b9f60 c78897a0 c03149a4 c799c200 c7936cc0 c79ac780 c79bbe18 c02eec18 9f40: c02f10a0 c79b9f6c c79b9f70 c79b9f60 c01debf4 c025dfdc a0000013 ffffffff Function entered at [<c025dfc0>] from [<c01debf4>] Function entered at [<c01dea10>] from [<c002de90>] Function entered at [<c002de00>] from [<c001877c>] r6:c001877c r5:c002de00 r4:c7827de8 Code: e89da830 e59f000c eb01ec50 eaffffe9 (e7f001f2) MV-DMA: idma_print_and_clear_irq: address miss @0! 
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-25 21:59 ` Phil Sutter 2012-06-26 11:24 ` cloudy.linux @ 2012-06-30 7:35 ` cloudy.linux 2012-07-06 15:30 ` Phil Sutter 1 sibling, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-06-30 7:35 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz [-- Attachment #1: Type: text/plain, Size: 1732 bytes --] Hi Phil Although I have no idea what's wrong, I looked in the functional errata (again), and I found what's attached (the doc I got from the Internet was a protected PDF, which is why I had to use a screen capture). Is this relevant? Or maybe you have already addressed this in the code (I can just read some simple C code)? Regards Cloudy On 2012-6-26 5:59, Phil Sutter wrote: > Hi, > > On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote: >> This time the machine can't finish the boot again and the console was >> flooded by the message like below: > > Oh well. I decided to drop that BUG_ON() again, since I saw it once > being triggered while in interrupt context. But since the error is > non-recovering anyway, I guess it may stay there as well. > >> Also, I had to make some modifications to the >> arch/arm/mach-orion5x/common.c to let it compile successfully: >> 1 Add including of mv_dma.h >> 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c, >> so I think the clean solution should be modify the addr-map.h? Anyway, >> as a quick solution the source finally got compiled) > > Hmm, yeah. Test-compiling for the platform one is writing code for is > still a good idea. But it's even worse than that: according to the > specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU. > > Please apply the attached patch on top of the one I sent earlier, > without your modifications (the necessary parts are contained in it). > Also, I've added some log output to the decode window setter, so we see > what's going on there.
> > Anyway, thanks a lot for your help so far! I hope next try shows some > progress at least. > > Greetings, Phil > > > Phil Sutter > Software Engineer > [-- Attachment #2: gl-cesa-110.PNG --] [-- Type: image/png, Size: 62921 bytes --] ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-06-30 7:35 ` cloudy.linux @ 2012-07-06 15:30 ` Phil Sutter 2012-07-08 5:38 ` cloudy.linux 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-07-06 15:30 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz Hi Cloudy, On Sat, Jun 30, 2012 at 03:35:48PM +0800, cloudy.linux wrote: > Although I had no idea about what's wrong, I looked in the functional > errata (again), And I found what's attached (The doc I got from Internet > was a protected PDF, that's why I had to use screen capture). > Is this relevant? Or maybe you have already addressed this in the code > (I can just read some simple C code)? To me, this doesn't read like a real problem, just a guideline for doing things. From the output you sent me in your previous mail, I'd rather suspect fetching the first descriptor to be faulty: the next descriptor pointer contains the first descriptor's DMA address, all other fields are zero (this is the situation when triggering the engine, as on Kirkwood all I have to do is fill in the first descriptor's address and TDMA does the rest) and IDMA triggers an address miss interrupt at address 0x0. So probably IDMA starts up and tries to look up decoding windows for the still-zero source and destination addresses. According to the specs, when using the next descriptor field for fetching the first descriptor one also has to set the FETCH_ND field in the DMA_CTRL register, also for TDMA. Though, on my hardware the only working configuration is the implemented one, i.e. without FETCH_ND being set. I have implemented a separate approach just for IDMA, which instead of just writing the first descriptor's address to NEXT_DESC does: 1. clear the CTRL_ENABLE bit 2. fill NEXT_DESC 3. set CTRL_ENABLE along with FETCH_ND hopefully this is the way to go on Orion.
Getting it right for TDMA was just a matter of trial and error. My public git got a few updates, including the code described above. Would be great if you could give it a try. Greetings, Phil Phil Sutter Software Engineer -- VNet Europe GmbH Mainzer Str. 43 55411 Bingen am Rhein Germany Management Buy-Out at Viprinet - please read http://www.viprinet.com/en/mbo Management Buy-Out bei Viprinet - bitte lesen Sie http://www.viprinet.com/de/mbo Phone/Zentrale: +49 6721 49030-0 Direct line/Durchwahl: +49 6721 49030-134 Fax: +49 6721 49030-109 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein, Germany Commercial register/Handelsregister: Amtsgericht Mainz HRB44090 CEO/Geschäftsführer: Simon Kissel ^ permalink raw reply [flat|nested] 67+ messages in thread
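The three-step IDMA startup sequence described in the mail above can be sketched as follows. This is a hypothetical userspace rendering against a simulated register file; the register offsets and control-bit positions are placeholders, not values from the specs or the driver:

```c
#include <stdint.h>

/* Placeholder register offsets and control bits, for illustration only. */
#define DMA_CTRL       0x0
#define DMA_NEXT_DESC  0x4
#define CTRL_ENABLE    (1u << 12)
#define CTRL_FETCH_ND  (1u << 13)

/* Simulated two-register file standing in for the memory-mapped engine. */
static uint32_t regs[2];

static void dma_write(unsigned int off, uint32_t val) { regs[off / 4] = val; }
static uint32_t dma_read(unsigned int off) { return regs[off / 4]; }

/* IDMA start via descriptor fetch, following the mail's recipe:
 * 1. clear CTRL_ENABLE, 2. fill NEXT_DESC with the first descriptor's
 * DMA address, 3. set CTRL_ENABLE together with FETCH_ND. */
static void idma_start_chain(uint32_t first_desc_dma_addr)
{
	dma_write(DMA_CTRL, dma_read(DMA_CTRL) & ~CTRL_ENABLE);
	dma_write(DMA_NEXT_DESC, first_desc_dma_addr);
	dma_write(DMA_CTRL, dma_read(DMA_CTRL) | CTRL_ENABLE | CTRL_FETCH_ND);
}
```

On TDMA, by contrast, the mail reports that simply writing the first descriptor's address to NEXT_DESC with the engine enabled (and FETCH_ND clear) is the only working configuration.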
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-06 15:30 ` Phil Sutter @ 2012-07-08 5:38 ` cloudy.linux 2012-07-09 12:54 ` Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-07-08 5:38 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz On 2012-7-6 23:30, Phil Sutter wrote: > Hi Cloudy, > > On Sat, Jun 30, 2012 at 03:35:48PM +0800, cloudy.linux wrote: >> Although I had no idea about what's wrong, I looked in the functional >> errata (again), And I found what's attached (The doc I got from Internet >> was a protected PDF, that's why I had to use screen capture). >> Is this relevant? Or maybe you have already addressed this in the code >> (I can just read some simple C code)? > > To me, doesn't read like a real problem, just a guideline for doing > things. From the output you sent me in your previous mail, I'd rather > suspect fetching the first descriptor to be faulty: the next descriptor > pointer contains the first descriptor's DMA address, all other fields > are zero (this is the situation when triggering the engine, as on > kirkwood all I have to do is fill the first descriptor's address in and > TDMA does the rest) and IDMA triggers an address miss interrupt at > address 0x0. So probably IDMA starts up and tries to look up decoding > windows for he up the still zero source and destination addresses. > > According to the specs, when using the next descriptor field for > fetching the first descriptor one also has to set the FETCH_ND field in > DMA_CTRL register, also for TDMA. Though, on my hardware the only > working configuration is the implemented one, i.e. without FETCH_ND > being set. > > I have implemented a separate approach just for IDMA, which instead of > just writing the first descriptor's address to NEXT_DESC does: > 1. clear CTRL_ENABLE bit > 2. fill NEXT_DESC > 3. set CTRL_ENABLE along with FETCH_ND > hopefully this is the way to go on Orion. 
Since Marvell's BSP doesn't > implement *DMA attached to CESA, I have nowhere to look this up. Getting > it right for TDMA was just a matter of trial and error. > > My public git got a few updates, including the code described above. > Would be great if you could give it a try. > > Greetings, Phil > > > > Phil Sutter > Software Engineer > Hi Newest result. Still couldn't boot up. This time the source was cloned from your git repository. MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000 MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000 MV-DMA: IDMA engine up and running, IRQ 23 MV-DMA: idma_print_and_clear_irq: address miss @0! MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? ------------[ cut here ]------------ kernel BUG at drivers/crypto/mv_cesa.c:1126! 
Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 Not tainted (3.5.0-rc2+ #3) pc : [<c01df8e0>] lr : [<c0015810>] psr: 20000093 sp : c79b9e58 ip : c79b9da8 fp : c79b9e6c r10: c02f2164 r9 : c0306322 r8 : 0000001c r7 : 00000000 r6 : 00000000 r5 : c0312988 r4 : 00000002 r3 : c799c200 r2 : 0000de20 r1 : fdd90000 r0 : fdd90000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: a005317f Table: 00004000 DAC: 00000017 Process mv_crypto (pid: 276, stack limit = 0xc79b8270) Stack: (0xc79b9e58 to 0xc79ba000) 9e40: c79afc40 0000001c 9e60: c79b9ea4 c79b9e70 c0047694 c01df858 c788929c 4bf287d9 00000000 c02f2164 9e80: 0000001c 00000000 c79b9f4c c79bbe18 c02ecc18 c02ef0a0 c79b9ebc c79b9ea8 9ea0: c0047844 c0047650 00022000 c02f2164 c79b9ed4 c79b9ec0 c0049b4c c0047824 9ec0: c0049ac4 c02ff738 c79b9ee4 c79b9ed8 c0046fd0 c0049ad4 c79b9f04 c79b9ee8 9ee0: c000985c c0046fb0 c01de804 c025db80 a0000013 fdd20200 c79b9f14 c79b9f08 9f00: c0008170 c0009834 c79b9f6c c79b9f18 c0008c14 c0008170 00000000 00000001 9f20: c79b9f60 0000de00 c0312988 c799c200 c7936cc0 c79ac780 c79bbe18 c02ecc18 9f40: c02ef0a0 c79b9f6c c79b9f70 c79b9f60 c01de804 c025db80 a0000013 ffffffff 9f60: c79b9fbc c79b9f70 c01de804 c025db80 00000000 c79b9f80 c025d904 c0035000 9f80: c78897a0 c7827de8 c79b8000 c02ecc18 00000013 c7827de8 c799c200 c01de620 9fa0: 00000013 00000000 00000000 00000000 c79b9ff4 c79b9fc0 c002da78 c01de630 9fc0: c7827de8 00000000 c799c200 00000000 c79b9fd0 c79b9fd0 00000000 c7827de8 9fe0: c002d9e8 c0018354 00000000 c79b9ff8 c0018354 c002d9f8 01e6e7fe 01e6e7ff Backtrace: Function entered at [<c01df848>] from [<c0047694>] r5:0000001c r4:c79afc40 Function entered at [<c0047640>] from [<c0047844>] Function entered at [<c0047814>] from [<c0049b4c>] r4:c02f2164 r3:00022000 Function entered at [<c0049ac4>] from [<c0046fd0>] r4:c02ff738 r3:c0049ac4 Function entered at [<c0046fa0>] from [<c000985c>] Function entered at [<c0009824>] from [<c0008170>] r6:fdd20200 r5:a0000013 
r4:c025db80 r3:c01de804 Function entered at [<c0008160>] from [<c0008c14>] Exception stack(0xc79b9f18 to 0xc79b9f60) 9f00: 00000000 00000001 9f20: c79b9f60 0000de00 c0312988 c799c200 c7936cc0 c79ac780 c79bbe18 c02ecc18 9f40: c02ef0a0 c79b9f6c c79b9f70 c79b9f60 c01de804 c025db80 a0000013 ffffffff Function entered at [<c025db70>] from [<c01de804>] Function entered at [<c01de620>] from [<c002da78>] Function entered at [<c002d9e8>] from [<c0018354>] r6:c0018354 r5:c002d9e8 r4:c7827de8 Code: e89da830 e59f000c eb01ec39 eaffffe9 (e7f001f2) MV-DMA: idma_print_and_clear_irq: address miss @0! MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-DMA: idma_print_and_clear_irq: address miss @0! 
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 Regards ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-08 5:38 ` cloudy.linux @ 2012-07-09 12:54 ` Phil Sutter 2012-07-31 12:12 ` cloudy.linux 0 siblings, 1 reply; 67+ messages in thread From: Phil Sutter @ 2012-07-09 12:54 UTC (permalink / raw) To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz Hi, On Sun, Jul 08, 2012 at 01:38:47PM +0800, cloudy.linux wrote: > Newest result. Still couldn't boot up. This time the source was cloned > from your git repository. > > MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000 > MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000 > MV-DMA: IDMA engine up and running, IRQ 23 > MV-DMA: idma_print_and_clear_irq: address miss @0! > MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 > MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 > MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 > MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 > MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000 > MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 > MV-DMA: DMA descriptor list: > MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst > 0xf2200080, count 16, own 1, next 0x79b1010 > MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst > 0xf2200000, count 80, own 1, next 0x79b1020 > MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, > count 0, own 0, next 0x79b1030 > MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst > 0x79b4000, count 16, own 1, next 0x0 > MV-CESA:got an interrupt but no pending timer? Sucks. What's making me wonder here is, address decoding of address 0x0 actually shouldn't fail, since window 0 includes this address. For now, I have pushed two new commits to my public git, adding more debugging output for decoding window logic and interrupt case as well as decoding window permission fix and changing from FETCH_ND to programming the first DMA descriptor's values manually. In the long term, I probably should try to get access to some appropriate hardware myself. 
This is more of a quiz game than actual bug tracking. Greetings, Phil Phil Sutter Software Engineer -- VNet Europe GmbH Mainzer Str. 43 55411 Bingen am Rhein Germany Management Buy-Out at Viprinet - please read http://www.viprinet.com/en/mbo Management Buy-Out bei Viprinet - bitte lesen Sie http://www.viprinet.com/de/mbo Phone/Zentrale: +49 6721 49030-0 Direct line/Durchwahl: +49 6721 49030-134 Fax: +49 6721 49030-109 phil.sutter@viprinet.com http://www.viprinet.com Registered office/Sitz der Gesellschaft: Bingen am Rhein, Germany Commercial register/Handelsregister: Amtsgericht Mainz HRB44090 CEO/Geschäftsführer: Simon Kissel ^ permalink raw reply [flat|nested] 67+ messages in thread
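Phil's point that address 0x0 should match window 0 can be checked mechanically. A small standalone sketch of the containment test (the helper is hypothetical, not the driver's actual window-matching code), using the base/size values from the boot log:

```c
#include <stdint.h>

/* Does addr fall inside a decoding window [base, base + size)?
 * 64-bit arithmetic avoids overflow for windows ending at 4 GiB. */
static int window_contains(uint32_t base, uint32_t size, uint32_t addr)
{
	uint64_t end = (uint64_t)base + size;
	return addr >= base && addr < end;
}
```

With window 0 at base 0x0, size 0x8000000 (as logged), address 0x0 is indeed covered, which is why the "address miss @0" points at the descriptor fetch rather than at the window programming itself.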
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA 2012-07-09 12:54 ` Phil Sutter @ 2012-07-31 12:12 ` cloudy.linux 2012-10-23 17:11 ` Phil Sutter 0 siblings, 1 reply; 67+ messages in thread From: cloudy.linux @ 2012-07-31 12:12 UTC (permalink / raw) To: Phil Sutter; +Cc: linux-crypto, andrew, Simon Baatz Hi Phil On 2012-7-9 20:54, Phil Sutter wrote: > Hi, > > On Sun, Jul 08, 2012 at 01:38:47PM +0800, cloudy.linux wrote: >> Newest result. Still couldn't boot up. This time the source was cloned >> from your git repository. >> >> MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000 >> MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000 >> MV-DMA: IDMA engine up and running, IRQ 23 >> MV-DMA: idma_print_and_clear_irq: address miss @0! >> MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 >> MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0 >> MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0 >> MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0 >> MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000 >> MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 >> MV-DMA: DMA descriptor list: >> MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst >> 0xf2200080, count 16, own 1, next 0x79b1010 >> MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst >> 0xf2200000, count 80, own 1, next 0x79b1020 >> MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, >> count 0, own 0, next 0x79b1030 >> MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst >> 0x79b4000, count 16, own 1, next 0x0 >> MV-CESA:got an interrupt but no pending timer? > > Sucks. What's making me wonder here is, address decoding of address 0x0 > actually shouldn't fail, since window 0 includes this address. > > For now, I have pushed two new commits to my public git, adding more > debugging output for decoding window logic and interrupt case as well as > decoding window permission fix and changing from FETCH_ND to programming > the first DMA descriptor's values manually. 
> > In the long term, I probably should try to get access to some > appropriate hardware myself. This is rather a quiz game than actual bug > tracking. > > Greetings, Phil > > > Phil Sutter > Software Engineer > Sorry for taking so long to try the latest code. I just came back from a vacation and needed several days to catch up on sleep. The latest console output: MV-DMA: window at bar0: target 0, attr 14, base 0, size 8000000 MV-DMA: idma_set_deco_win: win(0): BAR 0x7ff0000, size 0x0, enable 0xffff, prot 0xc031295c MV-DMA: window at bar1: target 5, attr 0, base f2200000, size 10000 MV-DMA: idma_set_deco_win: win(1): BAR 0x0, size 0x0, enable 0xffff, prot 0xc031295c MV-DMA: IDMA engine up and running, IRQ 23 MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x79b4000 MV-DMA: idma_print_and_clear_irq: address miss @0! MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x80000010 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4000 MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0 MV-DMA: DMA descriptor list: MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0 MV-CESA:got an interrupt but no pending timer? ------------[ cut here ]------------ kernel BUG at drivers/crypto/mv_cesa.c:1126!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0    Not tainted  (3.5.0-rc2+ #4)
pc : [<c01df9d8>]    lr : [<c0015810>]    psr: 20000093
sp : c79b9e68  ip : c79b9db8  fp : c79b9e7c
r10: c02f2164  r9 : c0306322  r8 : 0000001c
r7 : 00000000  r6 : 00000000  r5 : c0312988  r4 : 00000002
r3 : c799c200  r2 : 0000de20  r1 : fdd90000  r0 : fdd90000
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: a005317f  Table: 00004000  DAC: 00000017
Process mv_crypto (pid: 276, stack limit = 0xc79b8270)
Stack: (0xc79b9e68 to 0xc79ba000)
9e60:                   c79afc40 0000001c c79b9eb4 c79b9e80 c0047694 c01df950
9e80: c7824540 00000001 c79b9eac c02f2164 0000001c 00000000 c79b9f5c c79bbe18
9ea0: c02ecc18 c02ef0a0 c79b9ecc c79b9eb8 c0047844 c0047650 00022000 c02f2164
9ec0: c79b9ee4 c79b9ed0 c0049b4c c0047824 c0049ac4 c02ff738 c79b9ef4 c79b9ee8
9ee0: c0046fd0 c0049ad4 c79b9f14 c79b9ef8 c000985c c0046fb0 c01dc330 c01dee08
9f00: a0000013 fdd20200 c79b9f24 c79b9f18 c0008170 c0009834 c79b9fbc c79b9f28
9f20: c0008c14 c0008170 0000009b 00000001 fdd90000 0000de00 c0312988 c799c200
9f40: c7936cc0 c79ac780 c79bbe18 c02ecc18 c02ef0a0 c79b9fbc 00000010 c79b9f70
9f60: c01dc330 c01dee08 a0000013 ffffffff 00000000 c79b9f80 c025d9fc c0035000
9f80: c78897a0 c7827de8 c79b8000 c02ecc18 00000013 c7827de8 c799c200 c01de718
9fa0: 00000013 00000000 00000000 00000000 c79b9ff4 c79b9fc0 c002da78 c01de728
9fc0: c7827de8 00000000 c799c200 00000000 c79b9fd0 c79b9fd0 00000000 c7827de8
9fe0: c002d9e8 c0018354 00000000 c79b9ff8 c0018354 c002d9f8 01e6e7fe 01e6e7ff
Backtrace:
Function entered at [<c01df940>] from [<c0047694>]
 r5:0000001c r4:c79afc40
Function entered at [<c0047640>] from [<c0047844>]
Function entered at [<c0047814>] from [<c0049b4c>]
 r4:c02f2164 r3:00022000
Function entered at [<c0049ac4>] from [<c0046fd0>]
 r4:c02ff738 r3:c0049ac4
Function entered at [<c0046fa0>] from [<c000985c>]
Function entered at [<c0009824>] from [<c0008170>]
 r6:fdd20200 r5:a0000013 r4:c01dee08 r3:c01dc330
Function entered at [<c0008160>] from [<c0008c14>]
Exception stack(0xc79b9f28 to 0xc79b9f70)
9f20:                   0000009b 00000001 fdd90000 0000de00 c0312988 c799c200
9f40: c7936cc0 c79ac780 c79bbe18 c02ecc18 c02ef0a0 c79b9fbc 00000010 c79b9f70
9f60: c01dc330 c01dee08 a0000013 ffffffff
Function entered at [<c01de718>] from [<c002da78>]
Function entered at [<c002d9e8>] from [<c0018354>]
 r6:c0018354 r5:c002d9e8 r4:c7827de8
Code: e89da830 e59f000c eb01ec39 eaffffe9 (e7f001f2)
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
MV-DMA: DMA descriptor list:
MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 0xf2200080, count 16, own 1, next 0x79b1010
MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 0xf2200000, count 80, own 1, next 0x79b1020
MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, count 0, own 0, next 0x79b1030
MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 0x79b4000, count 16, own 1, next 0x0
MV-DMA: idma_print_and_clear_irq: cause 0x3, select 0x1, addr 0x0
MV-DMA: idma_print_and_clear_irq: address miss @0!
MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x0
MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
...

Best Regards
Cloudy

^ permalink raw reply	[flat|nested] 67+ messages in thread
* Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA
  2012-07-31 12:12           ` cloudy.linux
@ 2012-10-23 17:11             ` Phil Sutter
  0 siblings, 0 replies; 67+ messages in thread
From: Phil Sutter @ 2012-10-23 17:11 UTC (permalink / raw)
  To: cloudy.linux; +Cc: linux-crypto, andrew, Simon Baatz

Hey,

On Tue, Jul 31, 2012 at 08:12:02PM +0800, cloudy.linux wrote:
> Sorry for taking such a long time to try the latest code. I just came
> back from a vacation and needed several days to catch up on sleep.

My apologies for the roughly three-month lag. Somehow I had totally
forgotten about your mail in my inbox and only recently found it again.
Luckily, I received testing hardware from a colleague a few days ago on
which I can reproduce the problems at hand. So for now, I can do the
testing on my own. Thanks a lot for yours!

I'll get back to you (probably via linux-crypto) as soon as I have some
useful progress. It could be that I have to make bigger changes to the
code flow for Orion, as the IDMA seems to lack the "Enhanced Software
Flow" functionality (as the Kirkwood datasheet calls it) that my
current code relies upon. We will see.

Best wishes, Phil

^ permalink raw reply	[flat|nested] 67+ messages in thread
* [PATCH 0/1] MV_CESA with DMA: Clk init fixes
  2012-06-18 13:47       ` [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA Phil Sutter
  2012-06-18 20:12         ` Simon Baatz
@ 2012-06-26 20:31         ` Simon Baatz
  2012-06-26 20:31           ` [PATCH 1/1] mv_dma: mv_cesa: fixes for clock init Simon Baatz
  2012-07-06 15:05           ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Phil Sutter
  1 sibling, 2 replies; 67+ messages in thread
From: Simon Baatz @ 2012-06-26 20:31 UTC (permalink / raw)
  To: phil.sutter; +Cc: linux-crypto, cloudy.linux

Hi Phil,

I just found the time to test your updates. Alas, the mv_dma module
hangs at boot again. The culprit seems to be setup_mbus_windows(),
which is called before the clock is turned on but accesses the DMA
engine.

I shifted the clock init code a bit and, while doing so, fixed some
error-case handling for mv_dma and mv_cesa. See the proposed patch in
the next mail.

- Simon

Simon Baatz (1):
  mv_dma: mv_cesa: fixes for clock init

 drivers/crypto/mv_cesa.c |    7 ++++++-
 drivers/crypto/mv_dma.c  |   44 +++++++++++++++++++++++++++++---------------
 2 files changed, 35 insertions(+), 16 deletions(-)

--
1.7.9.5

^ permalink raw reply	[flat|nested] 67+ messages in thread
* [PATCH 1/1] mv_dma: mv_cesa: fixes for clock init
  2012-06-26 20:31         ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Simon Baatz
@ 2012-06-26 20:31           ` Simon Baatz
  2012-07-06 15:05           ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Phil Sutter
  1 sibling, 0 replies; 67+ messages in thread
From: Simon Baatz @ 2012-06-26 20:31 UTC (permalink / raw)
  To: phil.sutter; +Cc: linux-crypto, cloudy.linux

mv_dma tries to access CESA engine registers before the CESA clock is
enabled. Shift the clock enable code to the proper position.
Additionally, both mv_dma and mv_cesa did not disable the clock if
something went wrong during init.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
---
 drivers/crypto/mv_cesa.c |    7 ++++++-
 drivers/crypto/mv_dma.c  |   44 +++++++++++++++++++++++++++++---------------
 2 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index b75fdf5..aa05567 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -1308,7 +1308,8 @@ static int mv_probe(struct platform_device *pdev)
 		ret = -ENOMEM;
 		goto err_mapping;
 	}
-	if (set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE)) {
+	ret = set_dma_desclist_size(&cpg->desclist, MV_DMA_INIT_POOLSIZE);
+	if (ret) {
 		printk(KERN_ERR MV_CESA "failed to initialise poolsize\n");
 		goto err_pool;
 	}
@@ -1350,6 +1351,10 @@ err_mapping:
 	dma_unmap_single(&pdev->dev, cpg->sa_sram_dma,
 			 sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
 	free_irq(irq, cp);
+	if (!IS_ERR(cp->clk)) {
+		clk_disable_unprepare(cp->clk);
+		clk_put(cp->clk);
+	}
 err_thread:
 	kthread_stop(cp->queue_th);
 err_unmap_sram:
diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
index dd1ce02..9440fbc 100644
--- a/drivers/crypto/mv_dma.c
+++ b/drivers/crypto/mv_dma.c
@@ -350,23 +350,39 @@ static int mv_init_engine(struct platform_device *pdev, u32 ctrl_init_val,
 	tpg.dev = &pdev->dev;
 	tpg.print_and_clear_irq = pc_irq;
 
+	/* Not all platforms can gate the clock, so it is not
+	   an error if the clock does not exists. */
+	tpg.clk = clk_get(&pdev->dev, NULL);
+	if (!IS_ERR(tpg.clk))
+		clk_prepare_enable(tpg.clk);
+
 	/* setup address decoding */
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "regs deco");
-	if (!res)
-		return -ENXIO;
-	if (!(deco = ioremap(res->start, resource_size(res))))
-		return -ENOMEM;
+	if (!res) {
+		rc = -ENXIO;
+		goto out_disable_clk;
+	}
+	deco = ioremap(res->start, resource_size(res));
+	if (!deco) {
+		rc = -ENOMEM;
+		goto out_disable_clk;
+	}
 	setup_mbus_windows(deco, pdev->dev.platform_data, win_setter);
 	iounmap(deco);
 
 	/* get register start address */
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
 			"regs control and error");
-	if (!res)
-		return -ENXIO;
-	if (!(tpg.reg = ioremap(res->start, resource_size(res))))
-		return -ENOMEM;
+	if (!res) {
+		rc = -ENXIO;
+		goto out_disable_clk;
+	}
+	tpg.reg = ioremap(res->start, resource_size(res));
+	if (!tpg.reg) {
+		rc = -ENOMEM;
+		goto out_disable_clk;
+	}
 
 	/* get the IRQ */
 	tpg.irq = platform_get_irq(pdev, 0);
@@ -375,12 +391,6 @@ static int mv_init_engine(struct platform_device *pdev, u32 ctrl_init_val,
 		goto out_unmap_reg;
 	}
 
-	/* Not all platforms can gate the clock, so it is not
-	   an error if the clock does not exists. */
-	tpg.clk = clk_get(&pdev->dev, NULL);
-	if (!IS_ERR(tpg.clk))
-		clk_prepare_enable(tpg.clk);
-
 	/* initialise DMA descriptor list */
 	if (init_dma_desclist(&tpg.desclist, tpg.dev,
 			sizeof(struct mv_dma_desc), MV_DMA_ALIGN, 0)) {
@@ -421,6 +431,11 @@ out_free_desclist:
 	fini_dma_desclist(&tpg.desclist);
 out_unmap_reg:
 	iounmap(tpg.reg);
+out_disable_clk:
+	if (!IS_ERR(tpg.clk)) {
+		clk_disable_unprepare(tpg.clk);
+		clk_put(tpg.clk);
+	}
 	tpg.dev = NULL;
 	return rc;
 }
@@ -517,4 +532,3 @@ module_exit(mv_dma_exit);
 MODULE_AUTHOR("Phil Sutter <phil.sutter@viprinet.com>");
 MODULE_DESCRIPTION("Support for Marvell's IDMA/TDMA engines");
 MODULE_LICENSE("GPL");
-

-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 67+ messages in thread
* Re: [PATCH 0/1] MV_CESA with DMA: Clk init fixes
  2012-06-26 20:31         ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Simon Baatz
  2012-06-26 20:31           ` [PATCH 1/1] mv_dma: mv_cesa: fixes for clock init Simon Baatz
@ 2012-07-06 15:05           ` Phil Sutter
  1 sibling, 0 replies; 67+ messages in thread
From: Phil Sutter @ 2012-07-06 15:05 UTC (permalink / raw)
  To: Simon Baatz; +Cc: linux-crypto, cloudy.linux

Hi Simon,

On Tue, Jun 26, 2012 at 10:31:51PM +0200, Simon Baatz wrote:
> I just found the time to test your updates. Alas, the mv_dma module
> hangs at boot again. The culprit seems to be setup_mbus_windows(),
> which is called before the clock is turned on but accesses the DMA
> engine.
>
> I shifted the clock init code a bit and while doing so, fixed some
> error case handling for mv_dma and mv_cesa. See proposed patch in next
> mail.

I applied that to my public git, thanks a lot!

Greetings, Phil

Phil Sutter
Software Engineer

-- 
VNet Europe GmbH
Mainzer Str. 43
55411 Bingen am Rhein
Germany

Management Buy-Out at Viprinet - please read
http://www.viprinet.com/en/mbo
Management Buy-Out bei Viprinet - bitte lesen Sie
http://www.viprinet.com/de/mbo

Phone/Zentrale: +49 6721 49030-0
Direct line/Durchwahl: +49 6721 49030-134
Fax: +49 6721 49030-109

phil.sutter@viprinet.com
http://www.viprinet.com

Registered office/Sitz der Gesellschaft: Bingen am Rhein, Germany
Commercial register/Handelsregister: Amtsgericht Mainz HRB44090
CEO/Geschäftsführer: Simon Kissel

^ permalink raw reply	[flat|nested] 67+ messages in thread
end of thread, other threads: [~2012-10-23 17:18 UTC | newest]

Thread overview: 67+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)

2012-05-25 16:08 RFC: support for MV_CESA with TDMA Phil Sutter
2012-05-25 16:08 ` [PATCH 01/13] mv_cesa: do not use scatterlist iterators Phil Sutter
2012-05-25 16:08 ` [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon Phil Sutter
2012-05-25 16:08 ` [PATCH 03/13] mv_cesa: prepare the full sram config in dram Phil Sutter
2012-05-25 16:08 ` [PATCH 04/13] mv_cesa: split up processing callbacks Phil Sutter
2012-05-25 16:08 ` [PATCH 05/13] add a driver for the Marvell TDMA engine Phil Sutter
2012-05-25 16:08 ` [PATCH 06/13] mv_cesa: use TDMA engine for data transfers Phil Sutter
2012-05-25 16:08 ` [PATCH 07/13] mv_cesa: have TDMA copy back the digest result Phil Sutter
2012-05-25 16:08 ` [PATCH 08/13] mv_cesa: fetch extra_bytes via TDMA engine, too Phil Sutter
2012-05-25 16:08 ` [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now Phil Sutter
2012-05-25 16:08 ` [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit Phil Sutter
2012-05-25 16:08 ` [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too Phil Sutter
2012-05-25 16:08 ` [PATCH 12/13] mv_cesa: drop the now unused process callback Phil Sutter
2012-05-25 16:08 ` [PATCH 13/13] mv_cesa, mv_tdma: outsource common dma-pool handling code Phil Sutter
2012-05-27 14:03 ` RFC: support for MV_CESA with TDMA cloudy.linux
2012-05-29 11:34 ` Phil Sutter
2012-06-12 10:04 ` Herbert Xu
2012-06-12 10:24 ` Phil Sutter
2012-06-12 11:39 ` Herbert Xu
2012-06-12 17:17 ` RFC: support for MV_CESA with IDMA or TDMA Phil Sutter
2012-06-12 17:17 ` [PATCH 01/13] mv_cesa: do not use scatterlist iterators Phil Sutter
2012-06-12 17:17 ` [PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon Phil Sutter
2012-06-12 17:17 ` [PATCH 03/13] mv_cesa: prepare the full sram config in dram Phil Sutter
2012-06-12 17:17 ` [PATCH 04/13] mv_cesa: split up processing callbacks Phil Sutter
2012-06-12 17:17 ` [PATCH 05/13] add a driver for the Marvell IDMA/TDMA engines Phil Sutter
2012-06-12 17:17 ` [PATCH 06/13] mv_cesa: use DMA engine for data transfers Phil Sutter
2012-06-12 17:17 ` [PATCH 07/13] mv_cesa: have DMA engine copy back the digest result Phil Sutter
2012-06-12 17:17 ` [PATCH 08/13] mv_cesa: fetch extra_bytes via DMA engine, too Phil Sutter
2012-06-12 17:17 ` [PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now Phil Sutter
2012-06-12 17:17 ` [PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit Phil Sutter
2012-06-12 17:17 ` [PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too Phil Sutter
2012-06-12 17:17 ` [PATCH 12/13] mv_cesa: drop the now unused process callback Phil Sutter
2012-06-12 17:17 ` [PATCH 13/13] mv_cesa, mv_dma: outsource common dma-pool handling code Phil Sutter
2012-06-15  1:40 ` RFC: support for MV_CESA with IDMA or TDMA cloudy.linux
2012-06-15  9:51 ` Phil Sutter
2012-06-16  0:20 ` [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA Simon Baatz
2012-06-16  0:20 ` [PATCH 1/2] mv_dma: fix mv_init_engine() error case Simon Baatz
2012-06-16  0:20 ` [PATCH 2/2] ARM: Orion: mv_dma: Add support for clk Simon Baatz
2012-06-18 13:47 ` [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA Phil Sutter
2012-06-18 20:12 ` Simon Baatz
2012-06-19 11:51 ` Phil Sutter
2012-06-19 15:09 ` cloudy.linux
2012-06-19 17:13 ` Phil Sutter
2012-06-20  1:16 ` cloudy.linux
2012-07-16  9:32 ` Andrew Lunn
2012-07-16 13:52 ` Phil Sutter
2012-07-16 14:03 ` Andrew Lunn
2012-07-16 14:53 ` Phil Sutter
2012-07-16 17:32 ` Simon Baatz
2012-07-16 17:59 ` Andrew Lunn
2012-06-20 13:31 ` cloudy.linux
2012-06-20 15:41 ` Phil Sutter
2012-06-25 13:40 ` Phil Sutter
2012-06-25 14:25 ` cloudy.linux
2012-06-25 14:36 ` Phil Sutter
2012-06-25 16:05 ` cloudy.linux
2012-06-25 21:59 ` Phil Sutter
2012-06-26 11:24 ` cloudy.linux
2012-06-30  7:35 ` cloudy.linux
2012-07-06 15:30 ` Phil Sutter
2012-07-08  5:38 ` cloudy.linux
2012-07-09 12:54 ` Phil Sutter
2012-07-31 12:12 ` cloudy.linux
2012-10-23 17:11 ` Phil Sutter
2012-06-26 20:31 ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Simon Baatz
2012-06-26 20:31 ` [PATCH 1/1] mv_dma: mv_cesa: fixes for clock init Simon Baatz
2012-07-06 15:05 ` [PATCH 0/1] MV_CESA with DMA: Clk init fixes Phil Sutter