From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerin Jacob
Subject: Re: [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue
Date: Thu, 12 Oct 2017 22:53:12 +0530
Message-ID: <20171012172311.GA8524@jerin>
References: <20171010095636.4507-1-hejianet@gmail.com> <20171012155350.j34ddtivxzd27pag@platinum> <2601191342CEEE43887BDE71AB9772585FAA859F@IRSMSX103.ger.corp.intel.com>
Cc: Olivier MATZ , Jia He , "dev@dpdk.org" , "jia.he@hxt-semitech.com" , "jie2.liu@hxt-semitech.com" , "bing.zhao@hxt-semitech.com"
To: "Ananyev, Konstantin"
In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAA859F@IRSMSX103.ger.corp.intel.com>
List-Id: DPDK patches and discussions

-----Original Message-----
> Date: Thu, 12 Oct 2017 17:05:50 +0000
> From: "Ananyev, Konstantin"
> To: Olivier MATZ , Jia He
> CC: "dev@dpdk.org", "jia.he@hxt-semitech.com", "jie2.liu@hxt-semitech.com",
>  "bing.zhao@hxt-semitech.com", "jerin.jacob@caviumnetworks.com"
> Subject: RE: [PATCH] ring: guarantee ordering of cons/prod loading when
>  doing enqueue/dequeue
>
> Hi guys,
>
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> > Sent: Thursday, October 12, 2017 4:54 PM
> > To: Jia He
> > Cc: dev@dpdk.org; jia.he@hxt-semitech.com; jie2.liu@hxt-semitech.com;
> >  bing.zhao@hxt-semitech.com; Ananyev, Konstantin; jerin.jacob@caviumnetworks.com
> > Subject: Re: [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue
> >
> > Hi,
> >
> > On Tue, Oct 10, 2017 at 05:56:36PM +0800, Jia He wrote:
> > > Before this patch:
> > > In __rte_ring_move_cons_head()
> > > ...
> > >         do {
> > >                 /* Restore n as it may change every loop */
> > >                 n = max;
> > >
> > >                 *old_head = r->cons.head;                //1st load
> > >                 const uint32_t prod_tail = r->prod.tail; //2nd load
> > >
> > > On weakly ordered architectures (powerpc, arm), the 2nd load might be
> > > reordered before the 1st load, which makes *entries bigger than we wanted.
> > > This nasty reordering messes enqueue/dequeue up.
> > >
> > > cpu1(producer)          cpu2(consumer)          cpu3(consumer)
> > >                                                 load r->prod.tail
> > > in enqueue:
> > > load r->cons.tail
> > > load r->prod.head
> > >
> > > store r->prod.tail
> > >
> > >                         load r->cons.head
> > >                         load r->prod.tail
> > >                         ...
> > >                         store r->cons.{head,tail}
> > >                                                 load r->cons.head
> > >
> > > THEN, r->cons.head will be bigger than the stale prod_tail, and the
> > > unsigned subtraction wraps around, making *entries very big.
> > >
> > > After this patch, the old cons.head will be recalculated after a failure
> > > of rte_atomic32_cmpset.
> > >
> > > There is no such issue on x86 CPUs, because x86 has a strong memory
> > > ordering model.
> > >
> > > Signed-off-by: Jia He
> > > Signed-off-by: jia.he@hxt-semitech.com
> > > Signed-off-by: jie2.liu@hxt-semitech.com
> > > Signed-off-by: bing.zhao@hxt-semitech.com
> > >
> > > ---
> > >  lib/librte_ring/rte_ring.h | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > index 5e9b3b7..15c72e2 100644
> > > --- a/lib/librte_ring/rte_ring.h
> > > +++ b/lib/librte_ring/rte_ring.h
> > > @@ -409,6 +409,10 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> > >  		n = max;
> > >
> > >  		*old_head = r->prod.head;
> > > +
> > > +		/* load of cons.tail can't be reordered before load of prod.head */
> > > +		rte_smp_rmb();
> > > +
> > >  		const uint32_t cons_tail = r->cons.tail;
> > >  		/*
> > >  		 * The subtraction is done between two unsigned 32bits value
> > > @@ -517,6 +521,10 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> > >  		n = max;
> > >
> > >  		*old_head = r->cons.head;
> > > +
> > > +		/* load of prod.tail can't be reordered before load of cons.head */
> > > +		rte_smp_rmb();
> > > +
> > >  		const uint32_t prod_tail = r->prod.tail;
> > >  		/* The subtraction is done between two unsigned 32bits value
> > >  		 * (the result is always modulo 32 bits even if we have
> > > --
> > > 2.7.4
> >
> > The explanation convinces me.
> >
> > However, since it's in a critical path, it would be good to have other
> > opinions. This patch reminds me of that discussion, which was also related
> > to memory barriers, but in another place:
> > http://dpdk.org/ml/archives/dev/2016-July/043765.html
> > It led to this patch: http://dpdk.org/browse/dpdk/commit/?id=ecc7d10e448e
> > which was finally reverted: http://dpdk.org/browse/dpdk/commit/?id=c3acd92746c3
> >
> > Konstantin, Jerin, do you have any comment?
>
> For IA, as rte_smp_rmb() is just a compiler barrier, that patch shouldn't make
> any difference, but I can't see how read reordering would screw things up here...
> Probably it's just me, and the arm or ppc guys could explain what the problem
> would be if, let's say, cons.tail is read before prod.head in
> __rte_ring_move_prod_head().
> I wonder, is there a simple test-case to reproduce that problem (on arm or ppc)?
> Probably a new test-case for the rte_ring autotest is needed, or is it possible
> to reproduce it with an existing one?

Along the same lines, Jia He, jie2.liu, bing.zhao: is this patch based on code
review, or did you see this issue on an actual arm/ppc target? arm64 will see a
performance impact from this change.

> Konstantin