From: Maxime Coquelin
Subject: Re: pthread_barrier_deadlock in -rc1
Date: Wed, 2 May 2018 11:41:47 +0200
Cc: dev@dpdk.org, Anatoly Burakov, Thomas Monjalon
To: "Tan, Jianfeng", Olivier Matz

On 05/02/2018 11:32 AM, Tan, Jianfeng wrote:
> Hi Maxime and Olivier,
>
> [...]
>>>> Below patch can fix another strange sigsegv issue in my VM. Please check
>>>> if it works for you. I doubt it's use-after-free problem which could
>>>> lead to different issues in different env. Please have a try.
>>>>
>>>>
>>>> diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
>>>> index de69452..d91b67d 100644
>>>> --- a/lib/librte_eal/common/eal_common_thread.c
>>>> +++ b/lib/librte_eal/common/eal_common_thread.c
>>>> @@ -205,6 +205,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
>>>>                   goto fail;
>>>>
>>>>           pthread_barrier_wait(&params->configured);
>>>> +       pthread_barrier_destroy(&params->configured);
>>> Thanks Jianfeng, that fixes my issue.
>>> For correctness, I wonder whether we should check pthread_barrier_wait
>>> return, and only call destroy() if PTHREAD_BARRIER_SERIAL_THREAD?
>>> And so also do the same thing in rte_thread_init().
>>>
>>> What do you think?
>>> Thanks,
>>> Maxime
>>
>> Thanks for the update. I also have a patch that replaces the barrier by
>> a lock which could also work, but if Jianfeng's one fixes the issue, I
>> think it is better.
>>
>> About the PTHREAD_BARRIER_SERIAL_THREAD, not sure it will change
>> something:
>>
>>         Upon successful completion, the pthread_barrier_wait() function
>>         shall return PTHREAD_BARRIER_SERIAL_THREAD for a single
>>         (arbitrary) thread synchronized at the barrier and zero for each
>>         of the other threads. Otherwise, an error number shall be
>>         returned to indicate the error.
>>
>> I understand that it will ensure that only one barrier will return
>> PTHREAD_BARRIER_SERIAL_THREAD, but not necessarily the last one. So
>> if destroy() is called in the parent thread, it should be the same, no?
>>
>> By the way, there is also a small memory leak that was introduced by
>> the previous patch, maybe you can add the fix too:
>>
>> -       if (ret != 0)
>> +       if (ret != 0) {
>> +               free(params);
>>                  return ret;
>> +       }
>
> How about: the thread who gets PTHREAD_BARRIER_SERIAL_THREAD returned,
> is responsible for the destroy and free(params)?

I agree with your suggestion.

Thanks,
Maxime

> Thanks,
> Jianfeng