public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4 v2]crypto: octeontx2: Fix hang and address alignment issues
@ 2025-05-20 13:07 Bharat Bhushan
  2025-05-20 13:07 ` [PATCH 1/4 v2] crypto: octeontx2: add timeout for load_fvc completion poll Bharat Bhushan
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-20 13:07 UTC (permalink / raw)
  To: bbrezillon, schalla, herbert, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel
  Cc: Bharat Bhushan

First patch of the series fixes possible infinite loop.

Remaining three patches fixes address alignment issue observed
after "9382bc44b5f5 arm64: allow kmalloc() caches aligned to the
       smaller cache_line_size()"

First 3 patches applies to Linux version 6.5 onwards.
Patch-4 applies to Linux version 6.8 onwards

v1->v2:
 - Fixed memory padding size calculation as per review comment 

Bharat Bhushan (4):
  crypto: octeontx2: add timeout for load_fvc completion poll
  crypto: octeontx2: Fix address alignment issue on ucode loading
  crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2
  crypto: octeontx2: Fix address alignment on CN10KB and CN10KA-B0

 .../marvell/octeontx2/otx2_cpt_reqmgr.h       | 123 +++++++++++++-----
 .../marvell/octeontx2/otx2_cptpf_ucode.c      |  46 +++++--
 2 files changed, 125 insertions(+), 44 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4 v2] crypto: octeontx2: add timeout for load_fvc completion poll
  2025-05-20 13:07 [PATCH 0/4 v2]crypto: octeontx2: Fix hang and address alignment issues Bharat Bhushan
@ 2025-05-20 13:07 ` Bharat Bhushan
  2025-05-20 13:07 ` [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading Bharat Bhushan
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-20 13:07 UTC (permalink / raw)
  To: bbrezillon, schalla, herbert, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel
  Cc: Bharat Bhushan, stable

Adds timeout to exit from possible infinite loop, which polls
on CPT instruction(load_fvc) completion.

Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Cc: <stable@vger.kernel.org> #v6.5+
---
v1->v2:
 - No Change

 .../crypto/marvell/octeontx2/otx2_cptpf_ucode.c  | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c b/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
index 78367849c3d5..9095dea2748d 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
+++ b/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
@@ -1494,6 +1494,7 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
 	dma_addr_t rptr_baddr;
 	struct pci_dev *pdev;
 	u32 len, compl_rlen;
+	int timeout = 10000;
 	int ret, etype;
 	void *rptr;
 
@@ -1554,16 +1555,27 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
 							 etype);
 		otx2_cpt_fill_inst(&inst, &iq_cmd, rptr_baddr);
 		lfs->ops->send_cmd(&inst, 1, &cptpf->lfs.lf[0]);
+		timeout = 10000;
 
 		while (lfs->ops->cpt_get_compcode(result) ==
-						OTX2_CPT_COMPLETION_CODE_INIT)
+						OTX2_CPT_COMPLETION_CODE_INIT) {
 			cpu_relax();
+			udelay(1);
+			timeout--;
+			if (!timeout) {
+				ret = -ENODEV;
+				cptpf->is_eng_caps_discovered = false;
+				dev_warn(&pdev->dev, "Timeout on CPT load_fvc completion poll\n");
+				goto error_no_response;
+			}
+		}
 
 		cptpf->eng_caps[etype].u = be64_to_cpup(rptr);
 	}
-	dma_unmap_single(&pdev->dev, rptr_baddr, len, DMA_BIDIRECTIONAL);
 	cptpf->is_eng_caps_discovered = true;
 
+error_no_response:
+	dma_unmap_single(&pdev->dev, rptr_baddr, len, DMA_BIDIRECTIONAL);
 free_result:
 	kfree(result);
 lf_cleanup:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading
  2025-05-20 13:07 [PATCH 0/4 v2]crypto: octeontx2: Fix hang and address alignment issues Bharat Bhushan
  2025-05-20 13:07 ` [PATCH 1/4 v2] crypto: octeontx2: add timeout for load_fvc completion poll Bharat Bhushan
@ 2025-05-20 13:07 ` Bharat Bhushan
  2025-05-20 23:02   ` Herbert Xu
  2025-05-20 13:07 ` [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2 Bharat Bhushan
  2025-05-20 13:07 ` [PATCH 4/4 v2] crypto: octeontx2: Fix address alignment on CN10KB and CN10KA-B0 Bharat Bhushan
  3 siblings, 1 reply; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-20 13:07 UTC (permalink / raw)
  To: bbrezillon, schalla, herbert, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel
  Cc: Bharat Bhushan, stable

octeontx2 crypto driver allocates memory using kmalloc/kzalloc,
and uses this memory for dma (does dma_map_single()). It assumes
that kmalloc/kzalloc will return 128-byte aligned address. But
kmalloc/kzalloc returns 8-byte aligned address after below changes:
  "9382bc44b5f5 arm64: allow kmalloc() caches aligned to the
   smaller cache_line_size()"

Completion address should be 32-Byte alignment when loading
microcode.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Cc: <stable@vger.kernel.org> #v6.5+
---
v1->v2:
 - No Change

 .../marvell/octeontx2/otx2_cptpf_ucode.c      | 30 +++++++++++--------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c b/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
index 9095dea2748d..3e8357c0ecb2 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
+++ b/drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
@@ -1491,12 +1491,13 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
 	union otx2_cpt_opcode opcode;
 	union otx2_cpt_res_s *result;
 	union otx2_cpt_inst_s inst;
+	dma_addr_t result_baddr;
 	dma_addr_t rptr_baddr;
 	struct pci_dev *pdev;
-	u32 len, compl_rlen;
 	int timeout = 10000;
 	int ret, etype;
 	void *rptr;
+	u32 len;
 
 	/*
 	 * We don't get capabilities if it was already done
@@ -1519,22 +1520,27 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
 	if (ret)
 		goto delete_grps;
 
-	compl_rlen = ALIGN(sizeof(union otx2_cpt_res_s), OTX2_CPT_DMA_MINALIGN);
-	len = compl_rlen + LOADFVC_RLEN;
+	len = LOADFVC_RLEN + sizeof(union otx2_cpt_res_s) +
+	       OTX2_CPT_RES_ADDR_ALIGN;
 
-	result = kzalloc(len, GFP_KERNEL);
-	if (!result) {
+	rptr = kzalloc(len, GFP_KERNEL);
+	if (!rptr) {
 		ret = -ENOMEM;
 		goto lf_cleanup;
 	}
-	rptr_baddr = dma_map_single(&pdev->dev, (void *)result, len,
+
+	rptr_baddr = dma_map_single(&pdev->dev, rptr, len,
 				    DMA_BIDIRECTIONAL);
 	if (dma_mapping_error(&pdev->dev, rptr_baddr)) {
 		dev_err(&pdev->dev, "DMA mapping failed\n");
 		ret = -EFAULT;
-		goto free_result;
+		goto free_rptr;
 	}
-	rptr = (u8 *)result + compl_rlen;
+
+	result = (union otx2_cpt_res_s *)PTR_ALIGN(rptr + LOADFVC_RLEN,
+						   OTX2_CPT_RES_ADDR_ALIGN);
+	result_baddr = ALIGN(rptr_baddr + LOADFVC_RLEN,
+			     OTX2_CPT_RES_ADDR_ALIGN);
 
 	/* Fill in the command */
 	opcode.s.major = LOADFVC_MAJOR_OP;
@@ -1546,14 +1552,14 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
 	/* 64-bit swap for microcode data reads, not needed for addresses */
 	cpu_to_be64s(&iq_cmd.cmd.u);
 	iq_cmd.dptr = 0;
-	iq_cmd.rptr = rptr_baddr + compl_rlen;
+	iq_cmd.rptr = rptr_baddr;
 	iq_cmd.cptr.u = 0;
 
 	for (etype = 1; etype < OTX2_CPT_MAX_ENG_TYPES; etype++) {
 		result->s.compcode = OTX2_CPT_COMPLETION_CODE_INIT;
 		iq_cmd.cptr.s.grp = otx2_cpt_get_eng_grp(&cptpf->eng_grps,
 							 etype);
-		otx2_cpt_fill_inst(&inst, &iq_cmd, rptr_baddr);
+		otx2_cpt_fill_inst(&inst, &iq_cmd, result_baddr);
 		lfs->ops->send_cmd(&inst, 1, &cptpf->lfs.lf[0]);
 		timeout = 10000;
 
@@ -1576,8 +1582,8 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
 
 error_no_response:
 	dma_unmap_single(&pdev->dev, rptr_baddr, len, DMA_BIDIRECTIONAL);
-free_result:
-	kfree(result);
+free_rptr:
+	kfree(rptr);
 lf_cleanup:
 	otx2_cptlf_shutdown(lfs);
 delete_grps:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2
  2025-05-20 13:07 [PATCH 0/4 v2]crypto: octeontx2: Fix hang and address alignment issues Bharat Bhushan
  2025-05-20 13:07 ` [PATCH 1/4 v2] crypto: octeontx2: add timeout for load_fvc completion poll Bharat Bhushan
  2025-05-20 13:07 ` [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading Bharat Bhushan
@ 2025-05-20 13:07 ` Bharat Bhushan
  2025-05-20 23:04   ` Herbert Xu
  2025-05-20 13:07 ` [PATCH 4/4 v2] crypto: octeontx2: Fix address alignment on CN10KB and CN10KA-B0 Bharat Bhushan
  3 siblings, 1 reply; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-20 13:07 UTC (permalink / raw)
  To: bbrezillon, schalla, herbert, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel
  Cc: Bharat Bhushan, stable

octeontx2 crypto driver allocates memory using kmalloc/kzalloc,
and uses this memory for dma (does dma_map_single()). It assumes
that kmalloc/kzalloc will return 128-byte aligned address. But
kmalloc/kzalloc returns 8-byte aligned address after below changes:
  "9382bc44b5f5 arm64: allow kmalloc() caches aligned to the
   smaller cache_line_size()

Memory allocated are used for following purpose:
 - Input data or scatter list address - 8-Byte alignment
 - Output data or gather list address - 8-Byte alignment
 - Completion address - 32-Byte alignment.

This patch ensures all addresses are aligned as mentioned above.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Cc: <stable@vger.kernel.org> #v6.5+
---
v1->v2:
 - Fixed memory padding size calculation as per review comment 

 .../marvell/octeontx2/otx2_cpt_reqmgr.h       | 64 ++++++++++++++-----
 1 file changed, 49 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h b/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
index e27e849b01df..bb4e067ae826 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
+++ b/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
@@ -34,6 +34,9 @@
 #define SG_COMP_2    2
 #define SG_COMP_1    1
 
+#define OTX2_CPT_DPTR_RPTR_ALIGN	8
+#define OTX2_CPT_RES_ADDR_ALIGN		32
+
 union otx2_cpt_opcode {
 	u16 flags;
 	struct {
@@ -417,10 +420,9 @@ static inline struct otx2_cpt_inst_info *
 otx2_sg_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 		    gfp_t gfp)
 {
-	int align = OTX2_CPT_DMA_MINALIGN;
 	struct otx2_cpt_inst_info *info;
-	u32 dlen, align_dlen, info_len;
-	u16 g_sz_bytes, s_sz_bytes;
+	u32 dlen, info_len;
+	u16 g_len, s_len;
 	u32 total_mem_len;
 
 	if (unlikely(req->in_cnt > OTX2_CPT_MAX_SG_IN_CNT ||
@@ -429,22 +431,52 @@ otx2_sg_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 		return NULL;
 	}
 
-	g_sz_bytes = ((req->in_cnt + 3) / 4) *
-		      sizeof(struct otx2_cpt_sglist_component);
-	s_sz_bytes = ((req->out_cnt + 3) / 4) *
-		      sizeof(struct otx2_cpt_sglist_component);
+	/* Allocate memory to meet below alignment requirement:
+	 *  ----------------------------------
+	 * |    struct otx2_cpt_inst_info     |
+	 * |    (No alignment required)       |
+	 * |     -----------------------------|
+	 * |    | padding for 8B alignment    |
+	 * |----------------------------------|
+	 * |    SG List Gather/Input memory   |
+	 * |    Length = multiple of 32Bytes  |
+	 * |    Alignment = 8Byte             |
+	 * |----------------------------------|
+	 * |    SG List Scatter/Output memory |
+	 * |    Length = multiple of 32Bytes  |
+	 * |    Alignment = 8Byte             |
+	 * |    (padding for below alignment) |
+	 * |     -----------------------------|
+	 * |    | padding for 32B alignment   |
+	 * |----------------------------------|
+	 * |    Result response memory        |
+	 *  ----------------------------------
+	 */
 
-	dlen = g_sz_bytes + s_sz_bytes + SG_LIST_HDR_SIZE;
-	align_dlen = ALIGN(dlen, align);
-	info_len = ALIGN(sizeof(*info), align);
-	total_mem_len = align_dlen + info_len + sizeof(union otx2_cpt_res_s);
+	info_len = sizeof(*info);
+
+	g_len = ((req->in_cnt + 3) / 4) *
+		 sizeof(struct otx2_cpt_sglist_component);
+	s_len = ((req->out_cnt + 3) / 4) *
+		 sizeof(struct otx2_cpt_sglist_component);
+
+	dlen = g_len + s_len + SG_LIST_HDR_SIZE;
+
+	/* Allocate extra memory for SG and response address alignment */
+	total_mem_len = ALIGN(info_len, OTX2_CPT_DPTR_RPTR_ALIGN) + dlen;
+	total_mem_len = ALIGN(total_mem_len, OTX2_CPT_DPTR_RPTR_ALIGN);
+	total_mem_len += (OTX2_CPT_RES_ADDR_ALIGN - 1) &
+			  ~(OTX2_CPT_DPTR_RPTR_ALIGN - 1);
+	total_mem_len += sizeof(union otx2_cpt_res_s);
 
 	info = kzalloc(total_mem_len, gfp);
 	if (unlikely(!info))
 		return NULL;
 
 	info->dlen = dlen;
-	info->in_buffer = (u8 *)info + info_len;
+	info->in_buffer = PTR_ALIGN((u8 *)info + info_len,
+				    OTX2_CPT_DPTR_RPTR_ALIGN);
+	info->out_buffer = info->in_buffer + SG_LIST_HDR_SIZE + g_len;
 
 	((u16 *)info->in_buffer)[0] = req->out_cnt;
 	((u16 *)info->in_buffer)[1] = req->in_cnt;
@@ -460,7 +492,7 @@ otx2_sg_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 	}
 
 	if (setup_sgio_components(pdev, req->out, req->out_cnt,
-				  &info->in_buffer[8 + g_sz_bytes])) {
+				  info->out_buffer)) {
 		dev_err(&pdev->dev, "Failed to setup scatter list\n");
 		goto destroy_info;
 	}
@@ -476,8 +508,10 @@ otx2_sg_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 	 * Get buffer for union otx2_cpt_res_s response
 	 * structure and its physical address
 	 */
-	info->completion_addr = info->in_buffer + align_dlen;
-	info->comp_baddr = info->dptr_baddr + align_dlen;
+	info->completion_addr = PTR_ALIGN((info->in_buffer + dlen),
+					  OTX2_CPT_RES_ADDR_ALIGN);
+	info->comp_baddr = ALIGN((info->dptr_baddr + dlen),
+				 OTX2_CPT_RES_ADDR_ALIGN);
 
 	return info;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4 v2] crypto: octeontx2: Fix address alignment on CN10KB and CN10KA-B0
  2025-05-20 13:07 [PATCH 0/4 v2]crypto: octeontx2: Fix hang and address alignment issues Bharat Bhushan
                   ` (2 preceding siblings ...)
  2025-05-20 13:07 ` [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2 Bharat Bhushan
@ 2025-05-20 13:07 ` Bharat Bhushan
  3 siblings, 0 replies; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-20 13:07 UTC (permalink / raw)
  To: bbrezillon, schalla, herbert, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel
  Cc: Bharat Bhushan, stable

octeontx2 crypto driver allocates memory using kmalloc/kzalloc,
and uses this memory for dma (does dma_map_single()). It assumes
that kmalloc/kzalloc will return 128-byte aligned address. But
kmalloc/kzalloc returns 8-byte aligned address after below changes:
  "9382bc44b5f5 arm64: allow kmalloc() caches aligned to the
   smaller cache_line_size()

Memory allocated are used for following purpose:
 - Input data or scatter list address - 8-Byte alignment
 - Output data or gather list address - 8-Byte alignment
 - Completion address - 32-Byte alignment.

This patch ensures all addresses are aligned as mentioned above.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Cc: <stable@vger.kernel.org> #v6.8+
---
v1->v2:
 - Fixed memory padding size calculation as per review comment 

 .../marvell/octeontx2/otx2_cpt_reqmgr.h       | 59 ++++++++++++++-----
 1 file changed, 44 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h b/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
index bb4e067ae826..766fa63fb075 100644
--- a/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
+++ b/drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
@@ -350,22 +350,47 @@ static inline struct otx2_cpt_inst_info *
 cn10k_sgv2_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 		       gfp_t gfp)
 {
-	u32 dlen = 0, g_len, sg_len, info_len;
-	int align = OTX2_CPT_DMA_MINALIGN;
+	u32 dlen = 0, g_len, s_len, sg_len, info_len;
 	struct otx2_cpt_inst_info *info;
-	u16 g_sz_bytes, s_sz_bytes;
 	u32 total_mem_len;
 	int i;
 
-	g_sz_bytes = ((req->in_cnt + 2) / 3) *
-		      sizeof(struct cn10kb_cpt_sglist_component);
-	s_sz_bytes = ((req->out_cnt + 2) / 3) *
-		      sizeof(struct cn10kb_cpt_sglist_component);
+	/* Allocate memory to meet below alignment requirement:
+	 *  ----------------------------------
+	 * |    struct otx2_cpt_inst_info     |
+	 * |    (No alignment required)       |
+	 * |     -----------------------------|
+	 * |    | padding for 8B alignment    |
+	 * |----------------------------------|
+	 * |    SG List Gather/Input memory   |
+	 * |    Length = multiple of 32Bytes  |
+	 * |    Alignment = 8Byte             |
+	 * |----------------------------------|
+	 * |    SG List Scatter/Output memory |
+	 * |    Length = multiple of 32Bytes  |
+	 * |    Alignment = 8Byte             |
+	 * |    (padding for below alignment) |
+	 * |     -----------------------------|
+	 * |    | padding for 32B alignment   |
+	 * |----------------------------------|
+	 * |    Result response memory        |
+	 *  ----------------------------------
+	 */
 
-	g_len = ALIGN(g_sz_bytes, align);
-	sg_len = ALIGN(g_len + s_sz_bytes, align);
-	info_len = ALIGN(sizeof(*info), align);
-	total_mem_len = sg_len + info_len + sizeof(union otx2_cpt_res_s);
+	info_len = sizeof(*info);
+
+	g_len = ((req->in_cnt + 2) / 3) *
+		 sizeof(struct cn10kb_cpt_sglist_component);
+	s_len = ((req->out_cnt + 2) / 3) *
+		 sizeof(struct cn10kb_cpt_sglist_component);
+	sg_len = g_len + s_len;
+
+	/* Allocate extra memory for SG and response address alignment */
+	total_mem_len = ALIGN(info_len, OTX2_CPT_DPTR_RPTR_ALIGN) + sg_len;
+	total_mem_len = ALIGN(total_mem_len, OTX2_CPT_DPTR_RPTR_ALIGN);
+	total_mem_len += (OTX2_CPT_RES_ADDR_ALIGN - 1) &
+			  ~(OTX2_CPT_DPTR_RPTR_ALIGN - 1);
+	total_mem_len += sizeof(union otx2_cpt_res_s);
 
 	info = kzalloc(total_mem_len, gfp);
 	if (unlikely(!info))
@@ -375,7 +400,9 @@ cn10k_sgv2_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 		dlen += req->in[i].size;
 
 	info->dlen = dlen;
-	info->in_buffer = (u8 *)info + info_len;
+	info->in_buffer = PTR_ALIGN((u8 *)info + info_len,
+				    OTX2_CPT_DPTR_RPTR_ALIGN);
+	info->out_buffer = info->in_buffer + g_len;
 	info->gthr_sz = req->in_cnt;
 	info->sctr_sz = req->out_cnt;
 
@@ -387,7 +414,7 @@ cn10k_sgv2_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 	}
 
 	if (sgv2io_components_setup(pdev, req->out, req->out_cnt,
-				    &info->in_buffer[g_len])) {
+				    info->out_buffer)) {
 		dev_err(&pdev->dev, "Failed to setup scatter list\n");
 		goto destroy_info;
 	}
@@ -404,8 +431,10 @@ cn10k_sgv2_info_create(struct pci_dev *pdev, struct otx2_cpt_req_info *req,
 	 * Get buffer for union otx2_cpt_res_s response
 	 * structure and its physical address
 	 */
-	info->completion_addr = info->in_buffer + sg_len;
-	info->comp_baddr = info->dptr_baddr + sg_len;
+	info->completion_addr = PTR_ALIGN((info->in_buffer + sg_len),
+					  OTX2_CPT_RES_ADDR_ALIGN);
+	info->comp_baddr = ALIGN((info->dptr_baddr + sg_len),
+				 OTX2_CPT_RES_ADDR_ALIGN);
 
 	return info;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading
  2025-05-20 13:07 ` [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading Bharat Bhushan
@ 2025-05-20 23:02   ` Herbert Xu
  2025-05-21  5:37     ` Bharat Bhushan
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2025-05-20 23:02 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: bbrezillon, schalla, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel, stable

On Tue, May 20, 2025 at 06:37:35PM +0530, Bharat Bhushan wrote:
>
> @@ -1519,22 +1520,27 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
>  	if (ret)
>  		goto delete_grps;
>  
> -	compl_rlen = ALIGN(sizeof(union otx2_cpt_res_s), OTX2_CPT_DMA_MINALIGN);
> -	len = compl_rlen + LOADFVC_RLEN;
> +	len = LOADFVC_RLEN + sizeof(union otx2_cpt_res_s) +
> +	       OTX2_CPT_RES_ADDR_ALIGN;
>  
> -	result = kzalloc(len, GFP_KERNEL);
> -	if (!result) {
> +	rptr = kzalloc(len, GFP_KERNEL);
> +	if (!rptr) {
>  		ret = -ENOMEM;
>  		goto lf_cleanup;
>  	}
> -	rptr_baddr = dma_map_single(&pdev->dev, (void *)result, len,
> +
> +	rptr_baddr = dma_map_single(&pdev->dev, rptr, len,
>  				    DMA_BIDIRECTIONAL);

After this change rptr is still unaligned.  However, you appear
to be doing bidirectional DMA to rptr, so it should be aligned
to ARCH_DMA_MINALIGN or you risk corrupting the surrounding
memory.

Only TO_DEVICE DMA addresses can be unaligned.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2
  2025-05-20 13:07 ` [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2 Bharat Bhushan
@ 2025-05-20 23:04   ` Herbert Xu
  2025-05-21  5:38     ` Bharat Bhushan
  0 siblings, 1 reply; 9+ messages in thread
From: Herbert Xu @ 2025-05-20 23:04 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: bbrezillon, schalla, davem, giovanni.cabiddu, linux,
	bharatb.linux, linux-crypto, linux-kernel, stable

On Tue, May 20, 2025 at 06:37:36PM +0530, Bharat Bhushan wrote:
>
> +	info->in_buffer = PTR_ALIGN((u8 *)info + info_len,
> +				    OTX2_CPT_DPTR_RPTR_ALIGN);

Any address that's used for bidirectional or from-device DMA
needs to be aligned to ARCH_DMA_MINALIGN.

Sorry I missed this during the first round.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading
  2025-05-20 23:02   ` Herbert Xu
@ 2025-05-21  5:37     ` Bharat Bhushan
  0 siblings, 0 replies; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-21  5:37 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Bharat Bhushan, bbrezillon, schalla, davem, giovanni.cabiddu,
	linux, linux-crypto, linux-kernel, stable

On Wed, May 21, 2025 at 4:32 AM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, May 20, 2025 at 06:37:35PM +0530, Bharat Bhushan wrote:
> >
> > @@ -1519,22 +1520,27 @@ int otx2_cpt_discover_eng_capabilities(struct otx2_cptpf_dev *cptpf)
> >       if (ret)
> >               goto delete_grps;
> >
> > -     compl_rlen = ALIGN(sizeof(union otx2_cpt_res_s), OTX2_CPT_DMA_MINALIGN);
> > -     len = compl_rlen + LOADFVC_RLEN;
> > +     len = LOADFVC_RLEN + sizeof(union otx2_cpt_res_s) +
> > +            OTX2_CPT_RES_ADDR_ALIGN;
> >
> > -     result = kzalloc(len, GFP_KERNEL);
> > -     if (!result) {
> > +     rptr = kzalloc(len, GFP_KERNEL);
> > +     if (!rptr) {
> >               ret = -ENOMEM;
> >               goto lf_cleanup;
> >       }
> > -     rptr_baddr = dma_map_single(&pdev->dev, (void *)result, len,
> > +
> > +     rptr_baddr = dma_map_single(&pdev->dev, rptr, len,
> >                                   DMA_BIDIRECTIONAL);
>
> After this change rptr is still unaligned.  However, you appear
> to be doing bidirectional DMA to rptr, so it should be aligned
> to ARCH_DMA_MINALIGN or you risk corrupting the surrounding
> memory.

Yes, rptr was not aligned as ARCH_KMALLOC_MINALIGN and rptr alignment are same.
But as per the second part of the comment, rptr must be aligned to
ARCH_DMA_MINALIGN.
So will change total memory allocation and rptr and result_address
alignment accordingly.

Thanks
-Bharat
>
> Only TO_DEVICE DMA addresses can be unaligned.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2
  2025-05-20 23:04   ` Herbert Xu
@ 2025-05-21  5:38     ` Bharat Bhushan
  0 siblings, 0 replies; 9+ messages in thread
From: Bharat Bhushan @ 2025-05-21  5:38 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Bharat Bhushan, bbrezillon, schalla, davem, giovanni.cabiddu,
	linux, linux-crypto, linux-kernel, stable

On Wed, May 21, 2025 at 4:34 AM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, May 20, 2025 at 06:37:36PM +0530, Bharat Bhushan wrote:
> >
> > +     info->in_buffer = PTR_ALIGN((u8 *)info + info_len,
> > +                                 OTX2_CPT_DPTR_RPTR_ALIGN);
>
> Any address that's used for bidirectional or from-device DMA
> needs to be aligned to ARCH_DMA_MINALIGN.
>
> Sorry I missed this during the first round.

Will change in the next version.

Thanks
-Bharat

>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-05-21  5:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-20 13:07 [PATCH 0/4 v2]crypto: octeontx2: Fix hang and address alignment issues Bharat Bhushan
2025-05-20 13:07 ` [PATCH 1/4 v2] crypto: octeontx2: add timeout for load_fvc completion poll Bharat Bhushan
2025-05-20 13:07 ` [PATCH 2/4 v2] crypto: octeontx2: Fix address alignment issue on ucode loading Bharat Bhushan
2025-05-20 23:02   ` Herbert Xu
2025-05-21  5:37     ` Bharat Bhushan
2025-05-20 13:07 ` [PATCH 3/4 v2] crypto: octeontx2: Fix address alignment on CN10K A0/A1 and OcteonTX2 Bharat Bhushan
2025-05-20 23:04   ` Herbert Xu
2025-05-21  5:38     ` Bharat Bhushan
2025-05-20 13:07 ` [PATCH 4/4 v2] crypto: octeontx2: Fix address alignment on CN10KB and CN10KA-B0 Bharat Bhushan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox