From mboxrd@z Thu Jan 1 00:00:00 1970 From: Olivier MATZ Subject: Re: [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue Date: Tue, 29 Mar 2016 17:29:12 +0200 Message-ID: <56FA9F48.3030509@6wind.com> References: <1458229783-15547-1-git-send-email-l@nofutznetworks.com> <56F51DCB.5020502@6wind.com> <20160329085443.GA17800@bricha3-MOBL3> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: dev@dpdk.org, Thomas Monjalon To: Bruce Richardson , Lazaros Koromilas Return-path: Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id 6D3255323 for ; Tue, 29 Mar 2016 17:29:23 +0200 (CEST) In-Reply-To: <20160329085443.GA17800@bricha3-MOBL3> List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi, On 03/29/2016 10:54 AM, Bruce Richardson wrote: > On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: >> Hi Olivier, >> >> We could have two threads (running on different cores in the general >> case) that both succeed the cmpset operation. In the dequeue path, >> when n == 0, then cons_next == cons_head, and cmpset will always >> succeed. Now, if they both see an old r->cons.tail value from a >> previous dequeue, they can get stuck in the while > > Hi, > > I don't see how threads reading an "old r->cons.tail" value is even possible. > The head and tail pointers on the ring are marked in the code as volatile, so > all reads and writes to those values are always done from memory and not cached > in registers. No deadlock should be possible on that while loop, unless a > process crashes in the middle of a ring operation. Each thread which updates > the head pointer from x to y, is responsible for updating the tail pointer in > a similar manner. The loop ensures the tail updates are in the same order as the > head updates. > > If you believe deadlock is possible, can you outline the sequence of operations > which would lead to such a state, because I cannot see how it could occur without > a crash inside one of the threads. I think the deadlock Lazaros describes could occur in the following condition: current ring state r->prod.head = 0 r->prod.tail = 0 core 0 core 1 ==================================================================== enqueue 0 object cmpset(&r->prod.head, 0, 0) core 0 is interrupted here enqueue 1 object cmpset(&r->prod.head, 0, 1) copy the objects in box 0 while (r->prod.tail != prod_head)) r->prod.tail = prod_next copy 0 object (-> nothing to do) while (r->prod.tail != prod_head)) I think this issue is indeed fixed by Lazaros' patch (I missed it in previous review). However, I don't think this deadlock could happen once we avoided the (n == 0) case. Olivier