From mboxrd@z Thu Jan 1 00:00:00 1970 From: Lazaros Koromilas Subject: Re: [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue Date: Mon, 28 Mar 2016 18:48:07 +0300 Message-ID: References: <1458229783-15547-1-git-send-email-l@nofutznetworks.com> <56F51DCB.5020502@6wind.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Cc: dev@dpdk.org, Thomas Monjalon To: Olivier Matz Return-path: Received: from mail-io0-f178.google.com (mail-io0-f178.google.com [209.85.223.178]) by dpdk.org (Postfix) with ESMTP id 60854532C for ; Mon, 28 Mar 2016 17:48:08 +0200 (CEST) Received: by mail-io0-f178.google.com with SMTP id g185so25369505ioa.2 for ; Mon, 28 Mar 2016 08:48:08 -0700 (PDT) In-Reply-To: <56F51DCB.5020502@6wind.com> List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Olivier, We could have two threads (running on different cores in the general case) that both succeed the cmpset operation. In the dequeue path, when n == 0, then cons_next == cons_head, and cmpset will always succeed. Now, if they both see an old r->cons.tail value from a previous dequeue, they can get stuck in the while (unlikely(r->cons.tail != cons_head)) loop. I tried, however, to reproduce (without the patch) and it seems that there is still a window for deadlock. I'm pasting some debug output below that shows two processes' state. It's two receivers doing interleaved mc_dequeue(32)/mc_dequeue(0), and one sender doing mp_enqueue(32) on the same ring. gdb --args ./ring-test -l0 --proc-type=primary gdb --args ./ring-test -l1 --proc-type=secondary gdb --args ./ring-test -l2 --proc-type=secondary -- tx This is what I would usually see, process 0 and 1 both stuck at the same state: 663 while (unlikely(r->cons.tail != cons_head)) { (gdb) p n $1 = 0 (gdb) p r->cons.tail $2 = 576416 (gdb) p cons_head $3 = 576448 (gdb) p cons_next $4 = 576448 But now I managed to get the two processes stuck at this state too. process 0: 663 while (unlikely(r->cons.tail != cons_head)) { (gdb) p n $1 = 32 (gdb) p r->cons.tail $2 = 254348832 (gdb) p cons_head $3 = 254348864 (gdb) p cons_next $4 = 254348896 proccess 1: 663 while (unlikely(r->cons.tail != cons_head)) { (gdb) p n $1 = 32 (gdb) p r->cons.tail $2 = 254348832 (gdb) p cons_head $3 = 254348896 (gdb) p cons_next $4 = 254348928 I haven't been able to trigger this with the patch so far, but it should be possible. Lazaros. On Fri, Mar 25, 2016 at 1:15 PM, Olivier Matz wrote: > Hi Lazaros, > > On 03/17/2016 04:49 PM, Lazaros Koromilas wrote: >> Issuing a zero objects dequeue with a single consumer has no effect. >> Doing so with multiple consumers, can get more than one thread to succeed >> the compare-and-set operation and observe starvation or even deadlock in >> the while loop that checks for preceding dequeues. The problematic piece >> of code when n = 0: >> >> cons_next = cons_head + n; >> success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); >> >> The same is possible on the enqueue path. > > Just a question about this patch (that has been applied). Thomas > retitled the commit from your log message: > > ring: fix deadlock in zero object multi enqueue or dequeue > http://dpdk.org/browse/dpdk/commit/?id=d0979646166e > > I think this patch does not fix a deadlock, or did I miss something? > > As explained in the following links, the ring may not perform well > if several threads running on the same cpu use it: > > http://dpdk.org/ml/archives/dev/2013-November/000714.html > http://www.dpdk.org/ml/archives/dev/2014-January/001070.html > http://www.dpdk.org/ml/archives/dev/2014-January/001162.html > http://dpdk.org/ml/archives/dev/2015-July/020659.html > > A deadlock could occur if the threads running on the same core > have different priority. > > Regards, > Olivier