From: Jia He
Subject: [PATCH v5 1/1] ring: guarantee load/load order in enqueue and dequeue
Date: Fri, 10 Nov 2017 01:51:09 +0000
Message-ID: <1510278669-8489-2-git-send-email-hejianet@gmail.com>
References: <1510118764-29697-1-git-send-email-hejianet@gmail.com>
 <1510278669-8489-1-git-send-email-hejianet@gmail.com>
In-Reply-To: <1510278669-8489-1-git-send-email-hejianet@gmail.com>
To: jerin.jacob@caviumnetworks.com, dev@dpdk.org, olivier.matz@6wind.com
Cc: konstantin.ananyev@intel.com, bruce.richardson@intel.com,
 jianbo.liu@arm.com, hemant.agrawal@nxp.com, Jia He, Jia He,
 jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com

We observed an rte panic in mbuf_autotest on our Qualcomm arm64 server.

In __rte_ring_move_cons_head()
...
        do {
                /* Restore n as it may change every loop */
                n = max;

                *old_head = r->cons.head;                 // 1st load
                const uint32_t prod_tail = r->prod.tail;  // 2nd load

cpu1(producer)          cpu2(consumer)          cpu3(consumer)
                        load r->prod.tail
in enqueue:
load r->cons.tail
load r->prod.head
store r->prod.tail
                                                load r->cons.head
                                                load r->prod.tail
                                                ...
                                                store r->cons.{head,tail}
                        load r->cons.head

On weakly ordered architectures (powerpc, arm), the 2nd load can be
reordered before the 1st load, which makes *entries bigger than intended.
This nasty reordering breaks enqueue/dequeue: r->cons.head ends up bigger
than prod_tail, *entries becomes very large, and the consumer moves
forward incorrectly.

After this patch, even with the context switches above, the old cons.head
is recalculated after rte_atomic32_cmpset() fails, so no race condition
is left.

There is no such issue on x86, because x86 has a strong memory ordering
model. But rte_smp_rmb() has no runtime performance impact on x86, so keep
the same code without architecture-specific concerns.

Signed-off-by: Jia He
Signed-off-by: jie2.liu@hxt-semitech.com
Signed-off-by: bing.zhao@hxt-semitech.com
---
 lib/librte_ring/rte_ring.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 5e9b3b7..3e8085a 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -409,6 +409,11 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
 		n = max;
 
 		*old_head = r->prod.head;
+
+		/* add rmb barrier to avoid load/load reorder in weak
+		 * memory model. It is noop on x86 */
+		rte_smp_rmb();
+
 		const uint32_t cons_tail = r->cons.tail;
 		/*
 		 *  The subtraction is done between two unsigned 32bits value
@@ -517,6 +522,11 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
 		n = max;
 
 		*old_head = r->cons.head;
+
+		/* add rmb barrier to avoid load/load reorder in weak
+		 * memory model. It is noop on x86 */
+		rte_smp_rmb();
+
 		const uint32_t prod_tail = r->prod.tail;
 		/* The subtraction is done between two unsigned 32bits value
 		 * (the result is always modulo 32 bits even if we have
--
2.7.4
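
To make the ordering requirement concrete, here is a minimal, self-contained
C11 sketch of the consumer head move. It is an illustration only, not the
DPDK code: atomic_thread_fence(memory_order_acquire) stands in for
rte_smp_rmb(), atomic_compare_exchange_weak_explicit() stands in for
rte_atomic32_cmpset(), and the names (struct ring, move_cons_head) are made
up for this example. Without the fence between the two loads, a weakly
ordered CPU may read prod_tail before cons_head and compute a bogus entry
count.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct ring {
        _Atomic uint32_t cons_head;
        _Atomic uint32_t cons_tail;
        _Atomic uint32_t prod_head;
        _Atomic uint32_t prod_tail;
};

/* Try to move cons_head forward by up to n entries;
 * return the number of entries actually claimed. */
static uint32_t
move_cons_head(struct ring *r, uint32_t n)
{
        uint32_t old_head, new_head, entries;
        const uint32_t max = n;
        bool ok;

        do {
                n = max;

                /* 1st load */
                old_head = atomic_load_explicit(&r->cons_head,
                                                memory_order_relaxed);

                /* Read barrier between the two loads: keeps the prod_tail
                 * load from being observed before the cons_head load on
                 * weakly ordered CPUs (same role as rte_smp_rmb()). */
                atomic_thread_fence(memory_order_acquire);

                /* 2nd load */
                uint32_t prod_tail = atomic_load_explicit(&r->prod_tail,
                                                          memory_order_relaxed);

                /* Unsigned subtraction is correct even across wrap-around. */
                entries = prod_tail - old_head;
                if (n > entries)
                        n = entries;
                if (n == 0)
                        return 0;

                new_head = old_head + n;

                /* On failure (another consumer moved cons_head first),
                 * loop and recompute from the fresh cons_head. */
                ok = atomic_compare_exchange_weak_explicit(&r->cons_head,
                                                           &old_head, new_head,
                                                           memory_order_relaxed,
                                                           memory_order_relaxed);
        } while (!ok);

        return n;
}

As in the patch, the barrier sits between the head load and the opposing
tail load; the retry loop then guarantees that any stale head observed
before a lost compare-and-set is re-read on the next iteration.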