From mboxrd@z Thu Jan 1 00:00:00 1970
From: gowrishankar
Subject: Re: [PATCH v2 0/6] enable lpm, acl and other missing libraries in ppc64le
Date: Mon, 11 Jul 2016 15:07:22 +0530
Message-ID: <578368D2.4050401@linux.vnet.ibm.com>
References: <1468137084-5983-1-git-send-email-gowrishankar.m@linux.vnet.ibm.com> <000101d1db51$f6955230$e3bff690$@linux.vnet.ibm.com>
In-Reply-To: <000101d1db51$f6955230$e3bff690$@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
To: Chao Zhu, dev@dpdk.org
Cc: "'Bruce Richardson'", "'Konstantin Ananyev'", "'Thomas Monjalon'", "'Cristian Dumitrescu'", "'Pradeep'"
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi Chao,

On Monday 11 July 2016 02:25 PM, Chao Zhu wrote:
> Gowrishankar,
>
> Nice patches! Do you have some function test result? I need some time to
> verify the patches.

Please find below the lpm and acl unit test results ("Test OK" at the end of
each test).

# ./app/test
< EAL/PMD logs>
APP: HPET is not enabled, using TSC as default timer
RTE>>lpm_autotest
No.
routes = 1076806
Route distribution per prefix width:
DEPTH    QUANTITY (PERCENT)
---------------------------
01              0 (0.00)
02              0 (0.00)
03              1 (0.00)
04              0 (0.00)
05              3 (0.00)
06              2 (0.00)
07              4 (0.00)
08            201 (0.02)
09             37 (0.00)
10             55 (0.01)
11             97 (0.01)
12            381 (0.04)
13            775 (0.07)
14           2104 (0.20)
15           3712 (0.34)
16          69319 (6.44)
17          12983 (1.21)
18          23667 (2.20)
19          69068 (6.41)
20          62354 (5.79)
21          48531 (4.51)
22          72355 (6.72)
23          85427 (7.93)
24         583900 (54.23)
25           2654 (0.25)
26           5650 (0.52)
27           6467 (0.60)
28           7127 (0.66)
29          12936 (1.20)
30           5999 (0.56)
31             13 (0.00)
32            984 (0.09)

Unique added entries = 1039948
Used table 24 entries = 11343198 (67.6107%)
64 byte Cache entries used = 360735 (23087040 bytes)
Average LPM Add: 110820 cycles
Average LPM Lookup: 34.5 cycles (fails = 19.3%)
BULK LPM Lookup: 31.5 cycles (fails = 19.3%)
LPM LookupX4: 29.5 cycles (fails = 19.3%)
Average LPM Delete: 63841.6 cycles
Test OK
RTE>>acl_autotest
ACL: allocation of 25166728 bytes on socket 33 for ACL_acl_ctx failed
ACL: rte_acl_add_rules(acl_ctx): rule #1 is invalid
ACL: rte_acl_ipv4vlan_add_rules: rule #1 is invalid
ACL: rte_acl_ipv4vlan_add_rules: rule #1 is invalid
ACL: rte_acl_ipv4vlan_add_rules: rule #1 is invalid
ACL: rte_acl_ipv4vlan_add_rules: rule #1 is invalid
ACL: rte_acl_add_rules(acl_ctx): rule #1 is invalid
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 0/0
quad nodes/vectors/bytes used: 0/0/0
DFA nodes/group64/bytes used: 1/4/4104
match nodes/bytes used: 1/128
total: 6432 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 2
memory consumed: 8388615
ACL: trie 0: number of rules: 16, indexes: 1
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 22/176
quad nodes/vectors/bytes used: 30/104/832
DFA nodes/group64/bytes used: 6/19/11784
match nodes/bytes used: 6/768
total: 15760 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 64
memory consumed: 8388615
ACL: trie 0: number of rules: 6000, indexes: 4
acl context @0x3efded3b3400
  socket_id=-1
  alg=5
  max_rules=196608
  rule_size=128
  num_rules=0
  num_categories=0
  num_tries=0
acl context @0x3efded3b3400
  socket_id=-1
  alg=5
  max_rules=196608
  rule_size=128
  num_rules=0
  num_categories=0
  num_tries=0
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 974/7792
quad nodes/vectors/bytes used: 816/3211/25688
DFA nodes/group64/bytes used: 137/289/150024
match nodes/bytes used: 1181/151168
total: 336880 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 3108
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 974/7792
quad nodes/vectors/bytes used: 816/3211/25688
DFA nodes/group64/bytes used: 137/289/150024
match nodes/bytes used: 1181/151168
total: 336880 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 3108
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 0
memory consumed: 8388615
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 974/7792
quad nodes/vectors/bytes used: 816/3211/25688
DFA nodes/group64/bytes used: 137/289/150024
match nodes/bytes used: 1181/151168
total: 336880 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 3108
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 974/7792
quad nodes/vectors/bytes used: 816/3211/25688
DFA nodes/group64/bytes used: 137/289/150024
match nodes/bytes used: 1181/151168
total: 336880 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 3108
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 0/0
quad nodes/vectors/bytes used: 0/0/0
DFA nodes/group64/bytes used: 1/4/4104
match nodes/bytes used: 1/128
total: 6432 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 2
memory consumed: 8388615
ACL: trie 0: number of rules: 1, indexes: 1
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 0
memory consumed: 8388615
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 3/24
quad nodes/vectors/bytes used: 3/10/80
DFA nodes/group64/bytes used: 1/4/4104
match nodes/bytes used: 2/256
total: 6672 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 9
memory consumed: 8388615
ACL: trie 0: number of rules: 2, indexes: 2
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 0
memory consumed: 8388615
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 3/24
quad nodes/vectors/bytes used: 3/11/88
DFA nodes/group64/bytes used: 1/4/4104
match nodes/bytes used: 3/384
total: 6800 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 10
memory consumed: 8388615
ACL: trie 0: number of rules: 3, indexes: 2
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 0
memory consumed: 8388615
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 3/24
quad nodes/vectors/bytes used: 3/11/88
DFA nodes/group64/bytes used: 1/4/4104
match nodes/bytes used: 3/384
total: 6800 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 10
memory consumed: 8388615
ACL: trie 0: number of rules: 4, indexes: 2
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 0
memory consumed: 8388615
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 3/24
quad nodes/vectors/bytes used: 3/11/88
DFA nodes/group64/bytes used: 1/4/4104
match nodes/bytes used: 3/384
total: 6800 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 10
memory consumed: 8388615
ACL: trie 0: number of rules: 5, indexes: 2
running test_convert_rules(acl_ipv4vlan_tuple)
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 974/7792
quad nodes/vectors/bytes used: 816/3211/25688
DFA nodes/group64/bytes used: 137/289/150024
match nodes/bytes used: 1181/151168
total: 336880 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 3108
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 9136/73088
quad nodes/vectors/bytes used: 13258/55366/442928
DFA nodes/group64/bytes used: 2242/4493/2302472
match nodes/bytes used: 25011/3201408
total: 6022096 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 16384
nodes created: 49647
memory consumed: 100663380
ACL: trie 0: number of rules: 22, indexes: 5
ACL: trie 1: number of rules: 5, indexes: 5
running test_convert_rules(acl_ipv4vlan_tuple, RTE_ACL_FIELD_TYPE_BITMASK type for IPv4)
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 974/7792
quad nodes/vectors/bytes used: 816/3211/25688
DFA nodes/group64/bytes used: 137/289/150024
match nodes/bytes used: 1181/151168
total: 336880 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 3108
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 9136/73088
quad nodes/vectors/bytes used: 13258/55366/442928
DFA nodes/group64/bytes used: 2242/4493/2302472
match nodes/bytes used: 25011/3201408
total: 6022096 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 16384
nodes created: 49647
memory consumed: 100663380
ACL: trie 0: number of rules: 22, indexes: 5
ACL: trie 1: number of rules: 5, indexes: 5
running test_convert_rules(acl_ipv4vlan_tuple, RTE_ACL_FIELD_TYPE_RANGE type for IPv4)
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 997/7976
quad nodes/vectors/bytes used: 1052/4198/33584
DFA nodes/group64/bytes used: 195/405/209416
match nodes/bytes used: 1917/245376
total: 498560 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 4161
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 12, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 9400/75200
quad nodes/vectors/bytes used: 13549/56210/449680
DFA nodes/group64/bytes used: 2603/5215/2672136
match nodes/bytes used: 26504/3392512
total: 6591728 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 16384
nodes created: 52056
memory consumed: 100663380
ACL: trie 0: number of rules: 22, indexes: 5
ACL: trie 1: number of rules: 5, indexes: 5
running test_convert_rules(acl_ipv4vlan_tuple: swap VLAN and PORTs order)
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 446/3568
quad nodes/vectors/bytes used: 600/1854/14832
DFA nodes/group64/bytes used: 530/924/475144
match nodes/bytes used: 544/69632
total: 565376 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 2120
memory consumed: 8388615
ACL: trie 0: number of rules: 15, indexes: 4
ACL: trie 1: number of rules: 8, indexes: 5
ACL: trie 2: number of rules: 4, indexes: 2
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 1071/8568
quad nodes/vectors/bytes used: 2328/7050/56400
DFA nodes/group64/bytes used: 3526/6266/3210248
match nodes/bytes used: 5235/670080
total: 3947504 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 16384
nodes created: 12160
memory consumed: 58720305
ACL: trie 0: number of rules: 23, indexes: 5
ACL: trie 1: number of rules: 4, indexes: 2
running test_convert_rules(acl_ipv4vlan_tuple: swap SRC and DST IPv4 order)
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 1040/8320
quad nodes/vectors/bytes used: 1319/5494/43952
DFA nodes/group64/bytes used: 162/331/171528
match nodes/bytes used: 2270/290560
total: 516560 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 2048
nodes created: 4791
memory consumed: 16777230
ACL: trie 0: number of rules: 17, indexes: 4
ACL: trie 1: number of rules: 10, indexes: 5
ACL: Gen phase for ACL "acl_ctx":
runtime memory footprint on socket -1:
single nodes/bytes used: 8822/70576
quad nodes/vectors/bytes used: 12788/53374/426992
DFA nodes/group64/bytes used: 2175/4359/2233864
match nodes/bytes used: 24100/3084800
total: 5818432 bytes
max limit: 18446744073709551615 bytes
ACL: Build phase for ACL "acl_ctx":
node limit for tree split: 16384
nodes created: 47885
memory consumed: 100663380
ACL: trie 0: number of rules: 22, indexes: 5
ACL: trie 1: number of rules: 5, indexes: 5
Test OK
RTE>>

Thanks,
Gowrishankar

> -----Original Message-----
> From: Gowrishankar [mailto:gowrishankar.m@linux.vnet.ibm.com]
> Sent: 10 July 2016 15:51
> To: dev@dpdk.org
> Cc: Chao Zhu; Bruce Richardson; Konstantin Ananyev; Thomas Monjalon;
> Cristian Dumitrescu; Pradeep; gowrishankar
> Subject: [PATCH v2 0/6] enable lpm, acl and other missing libraries in
> ppc64le
>
> From: gowrishankar
>
> This patchset enables LPM, ACL and other few missing libs in ppc64le and
> also address few patches in related examples (ip_pipeline and l3fwd).
>
> Test report:
> LPM and ACL unit tests verified as in patch set v1.
> Same results as before observed.
>
> v2 changes:
> - enabling libs in config included as part of lib changes itself.
>
> gowrishankar (6):
>   lpm: add altivec intrinsics for dpdk lpm on ppc_64
>   acl: add altivec intrinsics for dpdk acl on ppc_64
>   ip_pipeline: fix lcore mapping for varying SMT threads as in ppc64
>   table: cache align rte_bucket_4_8
>   sched: enable sched library for ppc64le
>   l3fwd: add altivec support for em_hash_key
>
>  app/test-acl/main.c                         |   4 +
>  app/test/test_xmmt_ops.h                    |  16 +
>  config/defconfig_ppc_64-power8-linuxapp-gcc |   7 -
>  examples/ip_pipeline/cpu_core_map.c         |  12 +-
>  examples/ip_pipeline/init.c                 |   4 +
>  examples/l3fwd/l3fwd_em.c                   |   8 +
>  lib/librte_acl/Makefile                     |   2 +
>  lib/librte_acl/acl.h                        |   4 +
>  lib/librte_acl/acl_run.h                    |   2 +
>  lib/librte_acl/acl_run_altivec.c            |  47 +++
>  lib/librte_acl/acl_run_altivec.h            | 328 +++++++++++++++++++++
>  lib/librte_acl/rte_acl.c                    |  13 +
>  lib/librte_acl/rte_acl.h                    |   1 +
>  .../common/include/arch/ppc_64/rte_vect.h   |  60 ++++
>  lib/librte_lpm/Makefile                     |   2 +
>  lib/librte_lpm/rte_lpm.h                    |   2 +
>  lib/librte_lpm/rte_lpm_altivec.h            | 154 ++++++++++
>  lib/librte_table/rte_table_hash_key8.c      |   2 +-
>  18 files changed, 649 insertions(+), 19 deletions(-)
>  create mode 100644 lib/librte_acl/acl_run_altivec.c
>  create mode 100644 lib/librte_acl/acl_run_altivec.h
>  create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_vect.h
>  create mode 100644 lib/librte_lpm/rte_lpm_altivec.h
>
> --
> 1.9.1