From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Tan, Jianfeng"
Subject: Re: pthread_barrier_deadlock in -rc1
Date: Wed, 2 May 2018 17:32:30 +0800
Message-ID: <7afa9235-cc14-a05f-7f85-87d8a40d447e@intel.com>
References: <20180403130439.11151-1-olivier.matz@6wind.com>
 <20180424144651.13145-1-olivier.matz@6wind.com>
 <4256B2F0-EF9D-4B22-AC1A-D440C002360A@6wind.com>
 <39d5baf8-2bad-6df8-0419-a06c65d41475@redhat.com>
 <2d828aa1-482f-7f19-1909-c3ca4599c9b2@intel.com>
 <393a2f7e-ed20-fa28-0b07-aa3374593d5a@redhat.com>
 <20180502092011.5nxl5nbka6zfi4hb@neon>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: dev@dpdk.org, Anatoly Burakov, Thomas Monjalon
To: Olivier Matz, Maxime Coquelin
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id 0FF7EDED;
 Wed, 2 May 2018 11:32:32 +0200 (CEST)
In-Reply-To: <20180502092011.5nxl5nbka6zfi4hb@neon>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi Maxime and Olivier,

[...]

>>> Below patch can fix another strange sigsegv issue in my VM. Please check
>>> if it works for you. I suspect it's a use-after-free problem, which could
>>> lead to different issues in different environments. Please have a try.
>>>
>>>
>>> diff --git a/lib/librte_eal/common/eal_common_thread.c
>>> b/lib/librte_eal/common/eal_common_thread.c
>>> index de69452..d91b67d 100644
>>> --- a/lib/librte_eal/common/eal_common_thread.c
>>> +++ b/lib/librte_eal/common/eal_common_thread.c
>>> @@ -205,6 +205,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char
>>> *name,
>>> 		goto fail;
>>>
>>> 	pthread_barrier_wait(&params->configured);
>>> +	pthread_barrier_destroy(&params->configured);

>> Thanks Jianfeng, that fixes my issue.
>> For correctness, I wonder whether we should check pthread_barrier_wait
>> return, and only call destroy() if PTHREAD_BARRIER_SERIAL_THREAD?
>> And so also do the same thing in rte_thread_init().
>>
>> What do you think?
>> Thanks,
>> Maxime
>
> Thanks for the update. I also have a patch that replaces the barrier by
> a lock which could also work, but if Jianfeng's one fixes the issue, I
> think it is better.
>
> About the PTHREAD_BARRIER_SERIAL_THREAD, not sure it will change
> something:
>
>    Upon successful completion, the pthread_barrier_wait() function
>    shall return PTHREAD_BARRIER_SERIAL_THREAD for a single
>    (arbitrary) thread synchronized at the barrier and zero for each
>    of the other threads. Otherwise, an error number shall be
>    returned to indicate the error.
>
> I understand that it will ensure that only one barrier will return
> PTHREAD_BARRIER_SERIAL_THREAD, but not necessarily the last one. So
> if destroy() is called in the parent thread, it should be the same, no?
>
> By the way, there is also a small memory leak that was introduced by
> the previous patch, maybe you can add the fix too:
>
> -	if (ret != 0)
> +	if (ret != 0) {
> +		free(params);
> 		return ret;
> +	}

How about: the thread that gets PTHREAD_BARRIER_SERIAL_THREAD returned
is responsible for the destroy and free(params)?

Thanks,
Jianfeng