From mboxrd@z Thu Jan 1 00:00:00 1970 From: Thomas Monjalon Subject: Re: [PATCH 18.11] malloc: fix deadlock when using malloc stats Date: Fri, 21 Dec 2018 13:09:23 +0100 Message-ID: <8322983.QTS3aZyW4V@xps> References: <884a355bde19652d57c253c6b36036571c4f46ee.1543926108.git.anatoly.burakov@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7Bit Cc: dev@dpdk.org, stable@dpdk.org, ferruh.yigit@intel.com, bruce.richardson@intel.com, arybchenko@solarflare.com, olivier.matz@6wind.com To: Anatoly Burakov Return-path: In-Reply-To: <884a355bde19652d57c253c6b36036571c4f46ee.1543926108.git.anatoly.burakov@intel.com> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" 04/12/2018 13:22, Anatoly Burakov: > Currently, malloc statistics and external heap creation code > use memory hotplug lock as a way to synchronize accesses to > heaps (as in, locking the hotplug lock to prevent list of heaps > from changing under our feet). At the same time, malloc > statistics code will also lock the heap because it needs to > access heap data and does not want any other thread to allocate > anything from that heap. > > In such scheme, it is possible to enter a deadlock with the > following sequence of events: > > thread 1 thread 2 > rte_malloc() > rte_malloc_dump_stats() > take heap lock > take hotplug lock > failed to allocate, > attempt to take > hotplug lock > attempt to take heap lock > > Neither thread will be able to continue, as both of them are > waiting for the other one to drop the lock. Adding an > additional lock will require an ABI change, so instead of > that, make malloc statistics calls thread-unsafe with > respect to creating/destroying heaps. > > Fixes: 72cf92b31855 ("malloc: index heaps using heap ID rather than NUMA node") > Cc: stable@dpdk.org > > Signed-off-by: Anatoly Burakov > --- > > Notes: > IMO this is the best we can do for 18.11 without breaking ABI. > For 19.02, we can introduce a new global heap lock (something > i should've done in the first place...), so this patch is > not applicable to 19.02. For 19.02, we can fix this properly > by introducing another lock and breaking the EAL ABI. > > Not sure where to put docs update, feedback welcome. This patch is also changing the API, because functions become not thread-safe. I think you should note it in the release notes. About 19.02, do we want to take this patch (with release notes updated)?