From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yerden Zhumabekov
Subject: Re: [PATCH v2 0/4] rte_hash_crc reworked to be platform-independent
Date: Mon, 17 Nov 2014 17:54:21 +0600
Message-ID: <5469E1ED.4040109@sts.kz>
References: <1409724351-23786-1-git-send-email-e_zhumabekov@sts.kz> <1416160760-16087-1-git-send-email-e_zhumabekov@sts.kz> <20141117113110.GB17886@hmsreliant.think-freely.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
To: Neil Horman, "dev-VfR2kkLFssw@public.gmane.org"
In-Reply-To: <20141117113110.GB17886-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org
Sender: "dev"

17.11.2014 17:31, Neil Horman wrote:
> On Sun, Nov 16, 2014 at 11:59:16PM +0600, Yerden Zhumabekov wrote:
>> This is a rework of my previous patches improving the performance of rte_hash_crc. In addition, this revision brings a fallback mechanism to ensure that the CRC32 hash is calculated regardless of CPU hardware support (i.e. SSE4.2 intrinsics).
>>
>> Summary of changes:
>> * added CRC32 software implementation, which is used as a fallback in case SSE4.2 is not available, or if SSE4.2 is intentionally disabled.
>> * added rte_hash_crc_set_alg() function to control availability of SSE4.2.
>> * added rte_hash_crc_8byte() function to calculate CRC32 on 8-byte operand.
>> * reworked rte_hash_crc() function which leverages both versions of CRC32 hash calculation functions with 4 and 8-byte operands.
>>
>> Patches were tested on machines both with and without SSE4.2 support. Software implementation seems to be about 15 times slower than the SSE4.2-enabled one. Of course, they return identical results.
>>
>> Yerden Zhumabekov (4):
>>   hash: add software CRC32 implementation
>>   hash: add new rte_hash_crc_8byte call
>>   hash: add fallback to software CRC32 implementation
>>   hash: rte_hash_crc() slices data into 8-byte pieces
>>
>>  lib/librte_hash/rte_hash_crc.h |  212 ++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 202 insertions(+), 10 deletions(-)
>>
>> --
>> 1.7.9.5
>>
>>
> Functionally this all looks great, but I think you want to add a 5th patch to
> the series in which you remove the ifdef SSE4.2 bits from test_hash_perf, since
> this makes rte_hash_crc usable in all cases.  Not sure if you would rather just
> ditch rte_hash_jhash altogether, or make testing it a command line runtime
> option

Meanwhile, I've borrowed some of Intel's code (BSD licensed) for the CRC32 software algorithm. It runs 4 times faster, at the cost of 2K of memory for additional lookup tables. I'd like to include it as well.

As for test_hash_perf, I'll look at it.

Should I just send the new series as 'v3'? Any approval/disapproval for the current series?

--
Sincerely,

Yerden Zhumabekov
State Technical Service
Astana, KZ
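
For reference, the software fallback discussed above can be pictured roughly as follows. This is only a minimal sketch, not the code from the patch series or from Intel: it assumes the CRC32C (Castagnoli) polynomial that the SSE4.2 crc32 instruction uses, a single 256-entry lookup table, and byte-at-a-time processing. The names crc32c_table_init() and crc32c_sw() are hypothetical and do not appear in rte_hash_crc.h.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC32C (Castagnoli) polynomial 0x1EDC6F41 in bit-reflected form. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Hypothetical single lookup table: 256 entries * 4 bytes = 1 KB. */
static uint32_t crc32c_table[256];

/* Fill the table once before the first call to crc32c_sw(). */
static void crc32c_table_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ CRC32C_POLY_REFLECTED : c >> 1;
        crc32c_table[i] = c;
    }
}

/*
 * Byte-at-a-time software CRC32C update. Like the hardware instruction,
 * it applies no initial or final inversion itself: the caller supplies
 * init_val and gets the raw updated value back.
 */
static uint32_t crc32c_sw(const void *data, size_t len, uint32_t init_val)
{
    const uint8_t *p = data;
    uint32_t crc = init_val;

    while (len--)
        crc = crc32c_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc;
}
```

With the conventional 0xFFFFFFFF initial value and final inversion, crc32c_sw("123456789", 9, 0xFFFFFFFF) ^ 0xFFFFFFFF gives the standard CRC32C check value 0xE3069283. Faster variants trade memory for speed by using several such tables and consuming more than one byte per step.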