From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 1 Jun 2016 18:45:30 +0200
From: Gilles Chanteperdrix
Message-ID: <20160601164530.GE14103@hermes.click-hack.org>
References: <574EFA37.5030607@mitre.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <574EFA37.5030607@mitre.org>
Subject: Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
List-Id: Discussions about the Xenomai project
To: Jeffrey Melville
Cc: "xenomai@xenomai.org"

On Wed, Jun 01, 2016 at 11:07:35AM -0400, Jeffrey Melville wrote:
> Hi,
>
> Setup: Xenomai 2.6.4 (actually 2.6 git rev 4f349cf0553, with a99426
> cherry-picked) with kernel 3.14.17 on a Zynq and the POSIX skin using
> the ipipe patches included with the specified git rev)
>
> We've noticed that SIGDEBUG_RESCNT_IMBALANCE is generated when a
> (Xenomai) mutex is taken recursively by an NRT thread. The snippet at
> the bottom of this message will reproduce the issue. I omitted most of
> the error-checking for brevity.
>
> A couple previous threads have discussed slightly similar problems, but
> I never saw final resolutions:
> http://www.xenomai.org/pipermail/xenomai/2012-January/025278.html
> http://www.xenomai.org/pipermail/xenomai/2014-October/031919.html
>
> As far as "why are we doing this?", the problem area occurs in a test
> suite where some tests have to run as NRT threads because they don't
> have to run real-time and will get killed by the watchdog if they run as
> RT threads. Removing the Xenomai wrappers would also be complicated for
> reasons that are outside of the scope of this email.

Yeah well, this is an issue that has been known and fixed for so long
that I forgot we knew it:
https://git.xenomai.org/xenomai-3.git/commit/?id=79f0dd1cdc408b22afe301fa03805349a4a9f151

I will try and backport this change to 2.6 in 2.6.5.
The change should be easy to backport since it was made prior to most
of the cleanup of the mutex and condvar code. In 3.x, the kernel does
not handle the recursion count of recursive mutexes at all; this makes
things much simpler, but the code is very different.

-- 
Gilles.
https://click-hack.org
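
To illustrate the remark about 3.x: when the kernel does not track the recursion count, one possible arrangement is to keep the owner and the nesting depth on the caller's side, so that only the outermost lock and unlock reach the underlying lock. The sketch below is not Xenomai code. It assumes a plain pthread mutex as a stand-in for the kernel-level object, and every name in it (struct rec_mutex, rec_mutex_init, rec_mutex_lock, rec_mutex_unlock, self_id) is hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical illustration only: none of these names come from Xenomai. */
struct rec_mutex {
    pthread_mutex_t lock;   /* plain non-recursive lock, stand-in for the kernel object */
    _Atomic(void *) owner;  /* identity of the owning thread, NULL when free */
    unsigned int count;     /* nesting depth, only touched by the current owner */
};

/* The address of a thread-local variable differs for each live thread. */
static _Thread_local char self_anchor;

static void *self_id(void)
{
    return &self_anchor;
}

int rec_mutex_init(struct rec_mutex *m)
{
    atomic_init(&m->owner, NULL);
    m->count = 0;
    return pthread_mutex_init(&m->lock, NULL);
}

int rec_mutex_lock(struct rec_mutex *m)
{
    int ret;

    /* Nested acquisition: bump the local count, never touch the underlying lock. */
    if (atomic_load(&m->owner) == self_id()) {
        m->count++;
        return 0;
    }

    /* Outermost acquisition: this is the only path that can block. */
    ret = pthread_mutex_lock(&m->lock);
    if (ret)
        return ret;

    atomic_store(&m->owner, self_id());
    m->count = 1;
    return 0;
}

int rec_mutex_unlock(struct rec_mutex *m)
{
    /* Only the owner may unlock. */
    if (atomic_load(&m->owner) != self_id())
        return EPERM;

    /* Inner unlock: just unwind one nesting level. */
    if (--m->count > 0)
        return 0;

    /* Outermost unlock: clear ownership, then release the underlying lock. */
    atomic_store(&m->owner, NULL);
    return pthread_mutex_unlock(&m->lock);
}
```

With this arrangement, a thread that calls rec_mutex_lock() twice and rec_mutex_unlock() twice takes and releases the underlying lock exactly once, so the lower layer never needs to know the nesting depth.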