From mboxrd@z Thu Jan  1 00:00:00 1970
From: Olivier MATZ <olivier.matz@6wind.com>
Subject: Re: [PATCH v2] ring: check for zero objects mc dequeue /
 mp enqueue
Date: Tue, 29 Mar 2016 17:29:12 +0200
Message-ID: <56FA9F48.3030509@6wind.com>
References: <1458229783-15547-1-git-send-email-l@nofutznetworks.com>
 <56F51DCB.5020502@6wind.com>
 <CAHPNE8jmWLQAjZV71ryhH01SnJbGA=ZuZD+z3vR-hWUjEbpd+Q@mail.gmail.com>
 <20160329085443.GA17800@bricha3-MOBL3>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: dev@dpdk.org, Thomas Monjalon <thomas.monjalon@6wind.com>
To: Bruce Richardson <bruce.richardson@intel.com>,
 Lazaros Koromilas <l@nofutznetworks.com>
Return-path: <dev-bounces@dpdk.org>
Received: from proxy.6wind.com (host.76.145.23.62.rev.coltfrance.com
 [62.23.145.76]) by dpdk.org (Postfix) with ESMTP id 6D3255323
 for <dev@dpdk.org>; Tue, 29 Mar 2016 17:29:23 +0200 (CEST)
In-Reply-To: <20160329085443.GA17800@bricha3-MOBL3>
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

Hi,


On 03/29/2016 10:54 AM, Bruce Richardson wrote:
> On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote:
>> Hi Olivier,
>>
>> We could have two threads (running on different cores in the general
>> case) that both succeed the cmpset operation. In the dequeue path,
>> when n == 0, then cons_next == cons_head, and cmpset will always
>> succeed. Now, if they both see an old r->cons.tail value from a
>> previous dequeue, they can get stuck in the while
>
> Hi,
>
> I don't see how threads reading an "old r->cons.tail" value is even possible.
> The head and tail pointers on the ring are marked in the code as volatile, so
> all reads and writes to those values are always done from memory and not cached
> in registers. No deadlock should be possible on that while loop, unless a
> process crashes in the middle of a ring operation. Each thread which updates
> the head pointer from x to y, is responsible for updating the tail pointer in
> a similar manner. The loop ensures the tail updates are in the same order as the
> head updates.
>
> If you believe deadlock is possible, can you outline the sequence of operations
> which would lead to such a state, because I cannot see how it could occur without
> a crash inside one of the threads.

I think the deadlock Lazaros describes could occur in the following
condition:

current ring state
  r->prod.head = 0
  r->prod.tail = 0

core 0                               core 1
====================================================================
enqueue 0 object
  cmpset(&r->prod.head, 0, 0)
  core 0 is interrupted here
                                      enqueue 1 object
                                       cmpset(&r->prod.head, 0, 1)
                                       copy the objects in box 0
                                       while (r->prod.tail != prod_head))
                                       r->prod.tail = prod_next
  copy 0 object (-> nothing to do)
  while (r->prod.tail != prod_head))
     <loop forever>


I think this issue is indeed fixed by Lazaros' patch (I missed it
in previous review). However, I don't think this deadlock could
happen once we avoided the (n == 0) case.


Olivier