From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jerin Jacob Subject: Re: [PATCH v4 1/5] efd: new Elastic Flow Distributor library Date: Mon, 16 Jan 2017 09:55:48 +0530 Message-ID: <20170116042547.GA6781@localhost.localdomain> References: <1484259360-198276-1-git-send-email-pablo.de.lara.guarch@intel.com> <1484481875-126335-1-git-send-email-pablo.de.lara.guarch@intel.com> <1484481875-126335-2-git-send-email-pablo.de.lara.guarch@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: , Byron Marohn , Saikrishna Edupuganti To: Pablo de Lara Return-path: Received: from NAM03-CO1-obe.outbound.protection.outlook.com (mail-co1nam03on0061.outbound.protection.outlook.com [104.47.40.61]) by dpdk.org (Postfix) with ESMTP id 86943FA5F for ; Mon, 16 Jan 2017 05:26:14 +0100 (CET) Content-Disposition: inline In-Reply-To: <1484481875-126335-2-git-send-email-pablo.de.lara.guarch@intel.com> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Sun, Jan 15, 2017 at 12:04:31PM +0000, Pablo de Lara wrote: > Elastic Flow Distributor (EFD) is a distributor library that uses > perfect hashing to determine a target/value for a given incoming flow key. > It has the following advantages: > > - First, because it uses perfect hashing, it does not store > the key itself and hence lookup performance is not dependent > on the key size. > > - Second, the target/value can be any arbitrary value hence > the system designer and/or operator can better optimize service rates > and inter-cluster network traffic locating. > > - Third, since the storage requirement is much smaller than a hash-based > flow table (i.e. better fit for CPU cache), EFD can scale to > millions of flow keys. > Finally, with current optimized library implementation performance > is fully scalable with number of CPU cores. > > Signed-off-by: Byron Marohn > Signed-off-by: Pablo de Lara > Signed-off-by: Saikrishna Edupuganti > Acked-by: Christian Maciocco > --- > +#if (RTE_EFD_VALUE_NUM_BITS == 8 || RTE_EFD_VALUE_NUM_BITS == 16 || \ > + RTE_EFD_VALUE_NUM_BITS == 24 || RTE_EFD_VALUE_NUM_BITS == 32) > +#define EFD_LOAD_SI128(val) _mm_load_si128(val) > +#else > +#define EFD_LOAD_SI128(val) _mm_lddqu_si128(val) > +#endif > + > +static inline efd_value_t > +efd_lookup_internal(const struct efd_online_group_entry * const group, > + const uint32_t hash_val_a, const uint32_t hash_val_b, > + enum rte_efd_compare_function cmp_fn) > +{ > + efd_value_t value = 0; > + uint32_t i; > + > + switch (cmp_fn) { > +#ifdef RTE_MACHINE_CPUFLAG_AVX2 > + case RTE_HASH_COMPARE_AVX2: > + > + i = 0; > + __m256i vhash_val_a = _mm256_set1_epi32(hash_val_a); > + __m256i vhash_val_b = _mm256_set1_epi32(hash_val_b); > + Could you please abstract and move SIMD specific code to another file like other libraries(example: lib_acl) to enable smooth integration with neon and altivec SIMD implementations in future. > + for (; i < RTE_EFD_VALUE_NUM_BITS; i += 8) { > + __m256i vhash_idx = > + _mm256_cvtepu16_epi32(EFD_LOAD_SI128( > + (__m128i const *) &group->hash_idx[i])); > + __m256i vlookup_table = _mm256_cvtepu16_epi32( > + EFD_LOAD_SI128((__m128i const *) > + &group->lookup_table[i])); > + __m256i vhash = _mm256_add_epi32(vhash_val_a, > + _mm256_mullo_epi32(vhash_idx, vhash_val_b)); > + __m256i vbucket_idx = _mm256_srli_epi32(vhash, > + EFD_LOOKUPTBL_SHIFT); > + __m256i vresult = _mm256_srlv_epi32(vlookup_table, > + vbucket_idx); > + > + value |= (_mm256_movemask_ps( > + (__m256) _mm256_slli_epi32(vresult, 31)) > + & ((1 << (RTE_EFD_VALUE_NUM_BITS - i)) - 1)) << i; > + } > + break; > +#endif