DPDK-dev Archive on lore.kernel.org
* [PATCH] net/crc: cleanup code in net_crc_sse.c implementation
From: Shreesh Adiga @ 2026-06-12  2:51 UTC
  To: Bruce Richardson, Konstantin Ananyev, Jasvinder Singh; +Cc: dev

Special handling for lengths between 16 and 31 bytes is not required,
as the main path already handles them correctly. Since these cases were
annotated with the unlikely() branch hint anyway, it is simpler to let
the main path handle them.

Remove the partial_bytes label, since nothing jumps to it anymore, and
replace the open-coded folding in that block with the existing
crcr32_folding_round() inline function for better code reuse.

Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>
---
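Note: a minimal sketch of why lengths 16 to 31 need no special case.
It is a plain C model of the control flow only, not the SSE code. The
enum names and helper functions are hypothetical, and the loop bounds
are an assumption: the main folding loop is taken to start at n = 16
and to consume one 16-byte block per round while at least 16 more bytes
remain. Any wider fold that runs ahead of it is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the code paths; these are not DPDK identifiers. */
enum crc_path {
	PATH_SHORT_BUFFER,   /* 0..15 bytes: zero-padded stack buffer */
	PATH_REDUCTION_ONLY, /* first block goes straight to the reduction */
	PATH_PARTIAL_ONLY,   /* first block plus partial tail, no full round */
	PATH_FOLD_LOOP,      /* at least one full 16-byte folding round */
};

/* Model of the old dispatch: explicit special cases for data_len < 32. */
static enum crc_path old_dispatch(uint32_t data_len)
{
	if (data_len < 32) {
		if (data_len == 16)
			return PATH_REDUCTION_ONLY;
		if (data_len < 16)
			return PATH_SHORT_BUFFER;
		return PATH_PARTIAL_ONLY; /* 17 to 31 bytes */
	}
	return PATH_FOLD_LOOP;
}

/* Model of the new dispatch: only data_len < 16 is special. */
static enum crc_path new_dispatch(uint32_t data_len)
{
	uint32_t n, rounds = 0;

	if (data_len < 16)
		return PATH_SHORT_BUFFER;

	/* Assumed loop bounds of the main folding loop (see note above). */
	for (n = 16; (n + 16) <= data_len; n += 16)
		rounds++;

	if (rounds > 0)
		return PATH_FOLD_LOOP;

	/* n is still 16 here; the tail block runs only when n < data_len. */
	return (n < data_len) ? PATH_PARTIAL_ONLY : PATH_REDUCTION_ONLY;
}

/* Returns 1 when both models agree for every length in [0, limit). */
static int dispatch_agrees(uint32_t limit)
{
	for (uint32_t len = 0; len < limit; len++)
		if (old_dispatch(len) != new_dispatch(len))
			return 0;
	return 1;
}
```

Under those assumptions the two models agree for every length: 16
bytes falls through to the reduction with zero loop rounds, and 17 to
31 bytes run only the partial-bytes block.
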
 lib/net/net_crc_sse.c | 53 ++++++++++++++-----------------------------
 1 file changed, 17 insertions(+), 36 deletions(-)
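
Note on the folding replacement: the sketch below models
_mm_clmulepi64_si128() with a portable carry-less multiply and compares
the removed open-coded sequence against crcr32_folding_round(b, k, a).
The body of folding_round_model() is an assumption about how the
existing helper is written, since the helper is not part of this diff.
struct u128 and the other helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit value as two 64-bit halves; a stand-in for __m128i. */
struct u128 { uint64_t lo, hi; };

static struct u128 xor128(struct u128 x, struct u128 y)
{
	return (struct u128){ x.lo ^ y.lo, x.hi ^ y.hi };
}

/* Portable carry-less 64x64 -> 128 bit multiply. */
static struct u128 clmul64(uint64_t a, uint64_t b)
{
	struct u128 r = { 0, 0 };

	for (int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i != 0)
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}

/* Model of _mm_clmulepi64_si128: imm bit 0 picks a's half, bit 4 picks b's. */
static struct u128 clmul_sel(struct u128 a, struct u128 b, int imm)
{
	uint64_t x = (imm & 0x01) ? a.hi : a.lo;
	uint64_t y = (imm & 0x10) ? b.hi : b.lo;

	return clmul64(x, y);
}

/* The open-coded sequence removed from the partial-bytes block. */
static struct u128 fold_open_coded(struct u128 a, struct u128 b, struct u128 k)
{
	struct u128 temp = clmul_sel(a, k, 0x01);
	struct u128 fold = clmul_sel(a, k, 0x10);

	fold = xor128(fold, temp);
	return xor128(fold, b);
}

/* Assumed shape of crcr32_folding_round(data_block, precomp, fold). */
static struct u128 folding_round_model(struct u128 data_block,
				       struct u128 precomp, struct u128 fold)
{
	struct u128 tmp0 = clmul_sel(fold, precomp, 0x01);
	struct u128 tmp1 = clmul_sel(fold, precomp, 0x10);

	return xor128(tmp1, xor128(data_block, tmp0));
}

static int folds_agree(struct u128 a, struct u128 b, struct u128 k)
{
	struct u128 x = fold_open_coded(a, b, k);
	struct u128 y = folding_round_model(b, k, a); /* argument order as in the diff */

	return x.lo == y.lo && x.hi == y.hi;
}

/* Returns 1 when both forms agree on every combination of the samples. */
static int folds_agree_samples(void)
{
	const struct u128 v[] = {
		{ 0, 0 },
		{ 1, 0 },
		{ 0x0123456789abcdefULL, 0xfedcba9876543210ULL },
		{ 0xffffffffffffffffULL, 0xffffffffffffffffULL },
	};
	const int count = (int)(sizeof(v) / sizeof(v[0]));

	for (int i = 0; i < count; i++)
		for (int j = 0; j < count; j++)
			for (int m = 0; m < count; m++)
				if (!folds_agree(v[i], v[j], v[m]))
					return 0;
	return 1;
}
```

If the helper has that shape, the two forms differ only in the order
of the XORs, so passing b as the data block and a as the fold value
gives the same result.
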

diff --git a/lib/net/net_crc_sse.c b/lib/net/net_crc_sse.c
index dfef8ecc59..e30f8544fc 100644
--- a/lib/net/net_crc_sse.c
+++ b/lib/net/net_crc_sse.c
@@ -182,39 +182,24 @@ crc32_eth_calc_pclmulqdq(
 		goto single_fold_loop;
 	}
 
-	if (unlikely(data_len < 32)) {
-		if (unlikely(data_len == 16)) {
-			/* 16 bytes */
-			fold = _mm_loadu_si128((const __m128i *)data);
-			fold = _mm_xor_si128(fold, temp);
-			goto reduction_128_64;
-		}
+	if (unlikely(data_len < 16)) {
+		/* 0 to 15 bytes */
+		alignas(16) uint8_t buffer[16];
 
-		if (unlikely(data_len < 16)) {
-			/* 0 to 15 bytes */
-			alignas(16) uint8_t buffer[16];
-
-			memset(buffer, 0, sizeof(buffer));
-			memcpy(buffer, data, data_len);
-
-			fold = _mm_load_si128((const __m128i *)buffer);
-			fold = _mm_xor_si128(fold, temp);
-			if (unlikely(data_len < 4)) {
-				fold = xmm_shift_left(fold, 8 - data_len);
-				goto barret_reduction;
-			}
-			fold = xmm_shift_left(fold, 16 - data_len);
-			goto reduction_128_64;
-		}
-		/* 17 to 31 bytes */
-		fold = _mm_loadu_si128((const __m128i *)data);
+		memset(buffer, 0, sizeof(buffer));
+		memcpy(buffer, data, data_len);
+
+		fold = _mm_load_si128((const __m128i *)buffer);
 		fold = _mm_xor_si128(fold, temp);
-		n = 16;
-		k = params->rk3_rk4;
-		goto partial_bytes;
+		if (unlikely(data_len < 4)) {
+			fold = xmm_shift_left(fold, 8 - data_len);
+			goto barret_reduction;
+		}
+		fold = xmm_shift_left(fold, 16 - data_len);
+		goto reduction_128_64;
 	}
 
-	/** At least 32 bytes in the buffer */
+	/** At least 16 bytes in the buffer */
 	/** Apply CRC initial value */
 	fold = _mm_loadu_si128((const __m128i *)data);
 	fold = _mm_xor_si128(fold, temp);
@@ -229,7 +214,7 @@ crc32_eth_calc_pclmulqdq(
 		fold = crcr32_folding_round(temp, k, fold);
 	}
 
-partial_bytes:
+	/** Partial bytes - process last <16 bytes */
 	if (likely(n < data_len)) {
 
 		__m128i last16, a, b;
@@ -244,12 +229,8 @@ crc32_eth_calc_pclmulqdq(
 		b = _mm_shuffle_epi8(fold, temp);
 		b = _mm_blendv_epi8(b, last16, temp);
 
-		/* k = rk1 & rk2 */
-		temp = _mm_clmulepi64_si128(a, k, 0x01);
-		fold = _mm_clmulepi64_si128(a, k, 0x10);
-
-		fold = _mm_xor_si128(fold, temp);
-		fold = _mm_xor_si128(fold, b);
+		/* k = rk3 & rk4 */
+		fold = crcr32_folding_round(b, k, a);
 	}
 
 	/** Reduction 128 -> 32 Assumes: fold holds 128bit folded data */
-- 
2.53.0



* Re: [PATCH] net/crc: cleanup code in net_crc_sse.c implementation
From: Stephen Hemminger @ 2026-06-15 21:31 UTC
  To: Shreesh Adiga; +Cc: Bruce Richardson, Konstantin Ananyev, Jasvinder Singh, dev

On Fri, 12 Jun 2026 08:21:35 +0530
Shreesh Adiga <16567adigashreesh@gmail.com> wrote:

> Special handling for lengths between 16 and 31 bytes is not required,
> as the main path already handles them correctly. Since these cases were
> annotated with the unlikely() branch hint anyway, it is simpler to let
> the main path handle them.
> 
> Remove the partial_bytes label, since nothing jumps to it anymore, and
> replace the open-coded folding in that block with the existing
> crcr32_folding_round() inline function for better code reuse.
> 
> Signed-off-by: Shreesh Adiga <16567adigashreesh@gmail.com>
> ---

Looks good, applied to net-next
