From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Subject: Re: [PATCH v4 1/5] efd: new Elastic Flow Distributor
 library
Date: Mon, 16 Jan 2017 09:55:48 +0530
Message-ID: <20170116042547.GA6781@localhost.localdomain>
References: <1484259360-198276-1-git-send-email-pablo.de.lara.guarch@intel.com>
 <1484481875-126335-1-git-send-email-pablo.de.lara.guarch@intel.com>
 <1484481875-126335-2-git-send-email-pablo.de.lara.guarch@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: <dev@dpdk.org>, Byron Marohn <byron.marohn@intel.com>, Saikrishna
 Edupuganti <saikrishna.edupuganti@intel.com>
To: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Return-path: <dev-bounces@dpdk.org>
Received: from NAM03-CO1-obe.outbound.protection.outlook.com
 (mail-co1nam03on0061.outbound.protection.outlook.com [104.47.40.61])
 by dpdk.org (Postfix) with ESMTP id 86943FA5F
 for <dev@dpdk.org>; Mon, 16 Jan 2017 05:26:14 +0100 (CET)
Content-Disposition: inline
In-Reply-To: <1484481875-126335-2-git-send-email-pablo.de.lara.guarch@intel.com>
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On Sun, Jan 15, 2017 at 12:04:31PM +0000, Pablo de Lara wrote:
> Elastic Flow Distributor (EFD) is a distributor library that uses
> perfect hashing to determine a target/value for a given incoming flow key.
> It has the following advantages:
> 
> - First, because it uses perfect hashing, it does not store
>   the key itself and hence lookup performance is not dependent
>   on the key size.
> 
> - Second, the target/value can be any arbitrary value hence
>   the system designer and/or operator can better optimize service rates
>   and inter-cluster network traffic locating.
> 
> - Third, since the storage requirement is much smaller than a hash-based
>   flow table (i.e. better fit for CPU cache), EFD can scale to
>   millions of flow keys.
>   Finally, with current optimized library implementation performance
>   is fully scalable with number of CPU cores.
> 
> Signed-off-by: Byron Marohn <byron.marohn@intel.com>
> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
> Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
> Acked-by: Christian Maciocco <christian.maciocco@intel.com>
> ---
> +#if (RTE_EFD_VALUE_NUM_BITS == 8 || RTE_EFD_VALUE_NUM_BITS == 16 || \
> +	RTE_EFD_VALUE_NUM_BITS == 24 || RTE_EFD_VALUE_NUM_BITS == 32)
> +#define EFD_LOAD_SI128(val) _mm_load_si128(val)
> +#else
> +#define EFD_LOAD_SI128(val) _mm_lddqu_si128(val)
> +#endif
> +
> +static inline efd_value_t
> +efd_lookup_internal(const struct efd_online_group_entry * const group,
> +		const uint32_t hash_val_a, const uint32_t hash_val_b,
> +		enum rte_efd_compare_function cmp_fn)
> +{
> +	efd_value_t value = 0;
> +	uint32_t i;
> +
> +	switch (cmp_fn) {
> +#ifdef RTE_MACHINE_CPUFLAG_AVX2
> +	case RTE_HASH_COMPARE_AVX2:
> +
> +		i = 0;
> +		__m256i vhash_val_a = _mm256_set1_epi32(hash_val_a);
> +		__m256i vhash_val_b = _mm256_set1_epi32(hash_val_b);
> +

Could you please abstract and move SIMD specific code to another file like other
libraries(example: lib_acl) to enable smooth integration with neon and altivec
SIMD implementations in future.

> +		for (; i < RTE_EFD_VALUE_NUM_BITS; i += 8) {
> +			__m256i vhash_idx =
> +					_mm256_cvtepu16_epi32(EFD_LOAD_SI128(
> +					(__m128i const *) &group->hash_idx[i]));
> +			__m256i vlookup_table = _mm256_cvtepu16_epi32(
> +					EFD_LOAD_SI128((__m128i const *)
> +					&group->lookup_table[i]));
> +			__m256i vhash = _mm256_add_epi32(vhash_val_a,
> +					_mm256_mullo_epi32(vhash_idx, vhash_val_b));
> +			__m256i vbucket_idx = _mm256_srli_epi32(vhash,
> +					EFD_LOOKUPTBL_SHIFT);
> +			__m256i vresult = _mm256_srlv_epi32(vlookup_table,
> +					vbucket_idx);
> +
> +			value |= (_mm256_movemask_ps(
> +				(__m256) _mm256_slli_epi32(vresult, 31))
> +				& ((1 << (RTE_EFD_VALUE_NUM_BITS - i)) - 1)) << i;
> +		}
> +		break;
> +#endif