From mboxrd@z Thu Jan 1 00:00:00 1970 From: Thomas Monjalon Subject: Re: [dpdk-stable] [PATCH v6] ring: guarantee load/load order in enqueue and dequeue Date: Sun, 12 Nov 2017 18:51:37 +0100 Message-ID: <3554910.3QoyqmBB3R@xps> References: <1510278669-8489-1-git-send-email-hejianet@gmail.com> <1510284642-7442-1-git-send-email-hejianet@gmail.com> <1510284642-7442-2-git-send-email-hejianet@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7Bit Cc: stable@dpdk.org, jerin.jacob@caviumnetworks.com, dev@dpdk.org, olivier.matz@6wind.com, konstantin.ananyev@intel.com, bruce.richardson@intel.com, jianbo.liu@arm.com, hemant.agrawal@nxp.com, jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com To: Jia He , Jia He Return-path: In-Reply-To: <1510284642-7442-2-git-send-email-hejianet@gmail.com> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" 10/11/2017 04:30, Jia He: > We watched a rte panic of mbuf_autotest in our qualcomm arm64 server > (Amberwing). > > Root cause: > In __rte_ring_move_cons_head() > ... > do { > /* Restore n as it may change every loop */ > n = max; > > *old_head = r->cons.head; //1st load > const uint32_t prod_tail = r->prod.tail; //2nd load > > In weak memory order architectures(powerpc,arm), the 2nd load might be > reodered before the 1st load, that makes *entries is bigger than we wanted. > This nasty reording messed enque/deque up. > > cpu1(producer) cpu2(consumer) cpu3(consumer) > load r->prod.tail > in enqueue: > load r->cons.tail > load r->prod.head > > store r->prod.tail > > load r->cons.head > load r->prod.tail > ... > store r->cons.{head,tail} > load r->cons.head > > Then, r->cons.head will be bigger than prod_tail, then make *entries very > big and the consumer will go forward incorrectly. > > After this patch, the old cons.head will be recaculated after failure of > rte_atomic32_cmpset > > There is no such issue on X86, because X86 is strong memory order model. > But rte_smp_rmb() doesn't have impact on runtime performance on X86, so > keep the same code without architectures specific concerns. > > Signed-off-by: Jia He > Signed-off-by: jie2.liu@hxt-semitech.com > Signed-off-by: bing.zhao@hxt-semitech.com > Acked-by: Jerin Jacob > Acked-by: Jianbo Liu > Cc: stable@dpdk.org Applied, thanks