From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joyce Kong
Subject: [PATCH v2 0/3] reimplement rwlock and add relevant perf test case
Date: Tue, 15 Jan 2019 21:12:56 +0800
Message-ID: <1547557979-153169-1-git-send-email-joyce.kong@arm.com>
Cc: thomas@monjalon.net, jerinj@marvell.com, hemant.agrawal@nxp.com,
 bruce.richardson@intel.com, chaozhu@linux.vnet.ibm.com,
 honnappa.nagarahalli@arm.com, nd@arm.com
To: dev@dpdk.org
Return-path:
Received: from foss.arm.com (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70])
 by dpdk.org (Postfix) with ESMTP id 3C7235A44
 for ; Tue, 15 Jan 2019 14:13:09 +0100 (CET)
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

v2: Rebase and modify the rwlock test case to address the comments in v1.

v1: Reimplement rwlock with __atomic builtins, and add a rwlock perf test
on all available cores to benchmark the improvement.

We tested the patches on three arm64 platforms. ThunderX2 gained 20%
performance, Qualcomm gained 36% and the 4-Cortex-A72 Marvell MACCHIATObin
gained 19.6%.

Below is the detailed test result on ThunderX2:

*** rwlock_autotest without __atomic builtins ***
Rwlock Perf Test on 128 cores...
Core [0] count = 281
Core [1] count = 252
Core [2] count = 290
Core [3] count = 259
Core [4] count = 287
...
Core [209] count = 3
Core [210] count = 31
Core [211] count = 120
Total count = 18537

*** rwlock_autotest with __atomic builtins ***
Rwlock Perf Test on 128 cores...
Core [0] count = 346
Core [1] count = 355
Core [2] count = 259
Core [3] count = 285
Core [4] count = 320
...
Core [209] count = 2
Core [210] count = 23
Core [211] count = 63
Total count = 22194

Gavin Hu (1):
  rwlock: reimplement with __atomic builtins

Joyce Kong (2):
  test/rwlock: add perf test case
  test/rwlock: amortize the cost of getting time

 lib/librte_eal/common/include/generic/rte_rwlock.h | 16 ++---
 test/test/test_rwlock.c                            | 75 ++++++++++++++++++++++
 2 files changed, 83 insertions(+), 8 deletions(-)

-- 
2.7.4
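
For readers who want a feel for what "reimplement with __atomic builtins"
can look like, here is a minimal sketch of a spinning reader-writer lock
built on the GCC/Clang __atomic builtins. It is not the code from this
series: the sketch_* names and the counter encoding (0 unlocked, positive
for the number of readers, -1 for a writer) are assumptions made for
illustration only, and it carries no pause hint or fairness guarantee.

```c
#include <stdint.h>

/* Hypothetical type for illustration; not the rte_rwlock_t from the patch.
 * cnt == 0: unlocked, cnt > 0: number of readers, cnt == -1: writer. */
typedef struct {
	int32_t cnt;
} sketch_rwlock_t;

static inline void
sketch_rwlock_init(sketch_rwlock_t *l)
{
	__atomic_store_n(&l->cnt, 0, __ATOMIC_RELAXED);
}

/* One attempt to take the lock for reading.
 * Returns 1 on success, 0 if a writer holds it or the CAS lost a race. */
static inline int
sketch_read_trylock(sketch_rwlock_t *l)
{
	int32_t x = __atomic_load_n(&l->cnt, __ATOMIC_RELAXED);

	if (x < 0)
		return 0;
	/* Acquire on success so the critical section cannot move above it. */
	return __atomic_compare_exchange_n(&l->cnt, &x, x + 1, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

static inline void
sketch_read_lock(sketch_rwlock_t *l)
{
	while (!sketch_read_trylock(l))
		; /* spin until no writer holds the lock and the CAS wins */
}

static inline void
sketch_read_unlock(sketch_rwlock_t *l)
{
	/* Release so the critical section cannot move below the unlock. */
	__atomic_fetch_sub(&l->cnt, 1, __ATOMIC_RELEASE);
}

/* One attempt to take the lock for writing: only succeeds from 0. */
static inline int
sketch_write_trylock(sketch_rwlock_t *l)
{
	int32_t expected = 0;

	return __atomic_compare_exchange_n(&l->cnt, &expected, -1, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

static inline void
sketch_write_lock(sketch_rwlock_t *l)
{
	while (!sketch_write_trylock(l))
		; /* spin until there are no readers and no writer */
}

static inline void
sketch_write_unlock(sketch_rwlock_t *l)
{
	__atomic_store_n(&l->cnt, 0, __ATOMIC_RELEASE);
}
```

The sketch pairs acquire ordering on lock with release ordering on unlock
and keeps the polling load relaxed, which is the kind of explicit ordering
choice the __atomic builtins make possible.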