* [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
@ 2016-06-01 15:07 Jeffrey Melville
  2016-06-01 15:45 ` Gilles Chanteperdrix
  2016-06-01 16:45 ` Gilles Chanteperdrix
  0 siblings, 2 replies; 7+ messages in thread
From: Jeffrey Melville @ 2016-06-01 15:07 UTC
  To: xenomai@xenomai.org

Hi,

Setup: Xenomai 2.6.4 (actually 2.6 git rev 4f349cf0553, with a99426
cherry-picked), kernel 3.14.17 on a Zynq, and the POSIX skin, using
the I-pipe patches included with the specified git rev.

We've noticed that SIGDEBUG_RESCNT_IMBALANCE is generated when a
(Xenomai) mutex is taken recursively by an NRT thread. The snippet at
the bottom of this message will reproduce the issue. I omitted most of
the error-checking for brevity.

A couple previous threads have discussed slightly similar problems, but
I never saw final resolutions:
http://www.xenomai.org/pipermail/xenomai/2012-January/025278.html
http://www.xenomai.org/pipermail/xenomai/2014-October/031919.html

As far as "why are we doing this?", the problem area occurs in a test
suite where some tests have to run as NRT threads because they don't
have to run real-time and will get killed by the watchdog if they run as
RT threads. Removing the Xenomai wrappers would also be complicated for
reasons that are outside of the scope of this email.

Thanks,
Jeff

---
#include <stdio.h>
#include <pthread.h>

int main(void)
{
    pthread_mutex_t mutex;
    pthread_mutexattr_t mutex_attr;

    /* Create a recursive mutex (no priority inheritance in this version). */
    pthread_mutexattr_init(&mutex_attr);
    pthread_mutexattr_settype(&mutex_attr, PTHREAD_MUTEX_RECURSIVE);
    if (pthread_mutex_init(&mutex, &mutex_attr) != 0)
    {
        printf("Failed to initialize mutex.\n");
        return 1;
    }
    pthread_mutexattr_destroy(&mutex_attr);

    /* Take the mutex twice from the same thread, then release it twice.
     * This nested sequence is what raises SIGDEBUG_RESCNT_IMBALANCE on
     * the setup described above. */
    pthread_mutex_lock(&mutex);
    pthread_mutex_lock(&mutex);
    pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);

    pthread_mutex_destroy(&mutex);
    return 0;
}
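
To see which SIGDEBUG reason is being reported while running a test like
the one above, a handler can record the reason code. The following is only
a minimal sketch using plain POSIX calls. It assumes that Xenomai delivers
SIGDEBUG as SIGXCPU and passes the reason in si_value.sival_int; both
points should be checked against the Xenomai headers in use. DEBUG_SIGNAL,
last_debug_reason, debug_signal_handler and install_debug_signal_handler
are hypothetical names introduced for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Assumption: Xenomai reports SIGDEBUG as SIGXCPU. Real code should use the
 * SIGDEBUG macro from the Xenomai headers instead of this stand-in. */
#define DEBUG_SIGNAL SIGXCPU

/* Last reason code seen by the handler, or -1 if none was received yet. */
static volatile sig_atomic_t last_debug_reason = -1;

static void debug_signal_handler(int sig, siginfo_t *si, void *context)
{
    (void)sig;
    (void)context;
    /* Assumption: the reason code travels in si_value.sival_int. */
    last_debug_reason = si->si_value.sival_int;
}

/* Install the handler; returns 0 on success, -1 on error (errno is set). */
static int install_debug_signal_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = debug_signal_handler;
    sa.sa_flags = SA_SIGINFO; /* request the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    return sigaction(DEBUG_SIGNAL, &sa, NULL);
}
```

Calling install_debug_signal_handler() before the lock/unlock sequence and
comparing last_debug_reason against SIGDEBUG_RESCNT_IMBALANCE afterwards
would show whether that specific reason was raised. The handler only stores
an integer, which keeps it async-signal-safe.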



* Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
  2016-06-01 15:07 [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex Jeffrey Melville
@ 2016-06-01 15:45 ` Gilles Chanteperdrix
  2016-06-01 16:48   ` Jeffrey Melville
  2016-06-01 16:45 ` Gilles Chanteperdrix
  1 sibling, 1 reply; 7+ messages in thread
From: Gilles Chanteperdrix @ 2016-06-01 15:45 UTC
  To: Jeffrey Melville; +Cc: xenomai@xenomai.org

On Wed, Jun 01, 2016 at 11:07:35AM -0400, Jeffrey Melville wrote:
> Hi,
> 
> Setup: Xenomai 2.6.4 (actually 2.6 git rev 4f349cf0553, with a99426
> cherry-picked), kernel 3.14.17 on a Zynq, and the POSIX skin, using
> the I-pipe patches included with the specified git rev.
> 
> We've noticed that SIGDEBUG_RESCNT_IMBALANCE is generated when a
> (Xenomai) mutex is taken recursively by an NRT thread. The snippet at
> the bottom of this message will reproduce the issue. I omitted most of
> the error-checking for brevity.
> 
> A couple previous threads have discussed slightly similar problems, but
> I never saw final resolutions:
> http://www.xenomai.org/pipermail/xenomai/2012-January/025278.html

This issue is unrelated: setting a thread priority while holding a
mutex is clearly something we consider you should not be doing.

> http://www.xenomai.org/pipermail/xenomai/2014-October/031919.html

In this case, the mail asked the user to provide a test case, and
the user never provided one, it seems.

> 
> As far as "why are we doing this?", the problem area occurs in a test
> suite where some tests have to run as NRT threads because they don't
> have to run real-time and will get killed by the watchdog if they run as
> RT threads. Removing the Xenomai wrappers would also be complicated for
> reasons that are outside of the scope of this email.

Now we have a test case, it seems. However, a Xenomai mutex used by an
NRT thread only makes sense if the mutex is shared with an RT thread
(otherwise you could use a plain Linux mutex). In that context, it
makes little sense not to enable priority inheritance on the mutex.
So, the question is: do you have the same problem if you enable
priority inheritance?

Also, could you check the function return values, to make sure that
you did not miss any error?

Regards.

-- 
					    Gilles.
https://click-hack.org



* Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
  2016-06-01 15:07 [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex Jeffrey Melville
  2016-06-01 15:45 ` Gilles Chanteperdrix
@ 2016-06-01 16:45 ` Gilles Chanteperdrix
  2016-06-01 16:58   ` Jeffrey Melville
  1 sibling, 1 reply; 7+ messages in thread
From: Gilles Chanteperdrix @ 2016-06-01 16:45 UTC
  To: Jeffrey Melville; +Cc: xenomai@xenomai.org

On Wed, Jun 01, 2016 at 11:07:35AM -0400, Jeffrey Melville wrote:
> We've noticed that SIGDEBUG_RESCNT_IMBALANCE is generated when a
> (Xenomai) mutex is taken recursively by an NRT thread. The snippet at
> the bottom of this message will reproduce the issue. I omitted most of
> the error-checking for brevity.

Yeah well, this is an issue that has been known and fixed for so
long that I forgot we knew it:

https://git.xenomai.org/xenomai-3.git/commit/?id=79f0dd1cdc408b22afe301fa03805349a4a9f151

I will try to backport this change to 2.6 for 2.6.5. The change
should be easy to backport since it was made prior to most of the
cleanup of mutexes and condvars. In 3.x, the kernel does not handle
the recursion count of recursive mutexes at all; this makes things
much simpler, but the code is very different.

-- 
					    Gilles.
https://click-hack.org
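
As an illustration of the point about the recursion count not being
handled by the kernel, here is a minimal sketch of a lock whose nesting
depth is tracked entirely in user space, so the primitives underneath never
see a nested acquisition. This is not the Xenomai 3.x implementation and is
not meant for real-time use; it only shows the general idea with plain POSIX
calls. The names rec_lock_t, rec_lock_init, rec_lock_acquire and
rec_lock_release are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

typedef struct {
    pthread_mutex_t guard;    /* protects owner and depth */
    pthread_cond_t released;  /* signalled when depth drops back to 0 */
    pthread_t owner;          /* meaningful only while depth > 0 */
    unsigned depth;           /* nesting count, kept in user space */
} rec_lock_t;

static int rec_lock_init(rec_lock_t *l)
{
    int err = pthread_mutex_init(&l->guard, NULL);
    if (err != 0)
        return err;
    err = pthread_cond_init(&l->released, NULL);
    if (err != 0) {
        pthread_mutex_destroy(&l->guard);
        return err;
    }
    l->depth = 0;
    return 0;
}

static int rec_lock_acquire(rec_lock_t *l)
{
    pthread_t self = pthread_self();

    pthread_mutex_lock(&l->guard);
    if (l->depth > 0 && pthread_equal(l->owner, self)) {
        l->depth++;               /* nested acquisition: just count it */
    } else {
        while (l->depth > 0)      /* held by another thread: wait */
            pthread_cond_wait(&l->released, &l->guard);
        l->owner = self;
        l->depth = 1;
    }
    pthread_mutex_unlock(&l->guard);
    return 0;
}

static int rec_lock_release(rec_lock_t *l)
{
    int err = 0;

    pthread_mutex_lock(&l->guard);
    if (l->depth == 0 || !pthread_equal(l->owner, pthread_self()))
        err = EPERM;              /* caller does not hold the lock */
    else if (--l->depth == 0)
        pthread_cond_signal(&l->released);
    pthread_mutex_unlock(&l->guard);
    return err;
}
```

With this layout, a nested acquire/release pair only increments and
decrements depth, and ownership changes hands only on the outermost
acquire and release.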



* Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
  2016-06-01 15:45 ` Gilles Chanteperdrix
@ 2016-06-01 16:48   ` Jeffrey Melville
  0 siblings, 0 replies; 7+ messages in thread
From: Jeffrey Melville @ 2016-06-01 16:48 UTC
  To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org

On 6/1/2016 11:45 AM, Gilles Chanteperdrix wrote:
> On Wed, Jun 01, 2016 at 11:07:35AM -0400, Jeffrey Melville wrote:
>> Hi,
>>
>> Setup: Xenomai 2.6.4 (actually 2.6 git rev 4f349cf0553, with a99426
>> cherry-picked), kernel 3.14.17 on a Zynq, and the POSIX skin, using
>> the I-pipe patches included with the specified git rev.
>>
>> We've noticed that SIGDEBUG_RESCNT_IMBALANCE is generated when a
>> (Xenomai) mutex is taken recursively by an NRT thread. The snippet at
>> the bottom of this message will reproduce the issue. I omitted most of
>> the error-checking for brevity.
>>
>> A couple previous threads have discussed slightly similar problems, but
>> I never saw final resolutions:
>> http://www.xenomai.org/pipermail/xenomai/2012-January/025278.html
> 
> This issue is unrelated: setting a thread priority while holding a
> mutex is clearly something we consider you should not be doing.
> 
>> http://www.xenomai.org/pipermail/xenomai/2014-October/031919.html
> 
> In this case, the mail asked the user to provide a test case, and
> the user never provided one, it seems.
> 
>>
>> As far as "why are we doing this?", the problem area occurs in a test
>> suite where some tests have to run as NRT threads because they don't
>> have to run real-time and will get killed by the watchdog if they run as
>> RT threads. Removing the Xenomai wrappers would also be complicated for
>> reasons that are outside of the scope of this email.
> 
> Now we have a test case, it seems. However, a Xenomai mutex used by an
> NRT thread only makes sense if the mutex is shared with an RT thread
> (otherwise you could use a plain Linux mutex). In that context, it
> makes little sense not to enable priority inheritance on the mutex.
> So, the question is: do you have the same problem if you enable
> priority inheritance?
>
Yes, the original test case included priority inheritance but I took it
out to provide the shortest possible test case. The same problem exists
either way (see updated test case).

FWIW, there are no issues for a single lock/unlock, regardless of
whether or not the mutex was created with the recursive type.

Agreed that in isolation this example makes little sense. In operational
usage the offending object executes within an RT thread but some (not
all) of our test cases have to exercise it in an NRT thread. Reverting
to plain Linux mutexes for these cases introduces functional coupling /
logistical issues we'd prefer to avoid.

> Also, could you check the function return values, to make sure that
> you did not miss any error?
> 
I updated the test case (without much attention paid to style) to check
return values and use priority inheritance, though the testcase remains
single threaded. I'll link since it is now longer:
http://pastebin.com/Hvamh9rA

> Regards.
> 
Cheers,
Jeff



* Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
  2016-06-01 16:45 ` Gilles Chanteperdrix
@ 2016-06-01 16:58   ` Jeffrey Melville
  2016-06-16 13:26     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 7+ messages in thread
From: Jeffrey Melville @ 2016-06-01 16:58 UTC
  To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org

On 6/1/2016 12:45 PM, Gilles Chanteperdrix wrote:
> Yeah well, this is an issue that has been known and fixed for so
> long that I forgot we knew it:
> 
> https://git.xenomai.org/xenomai-3.git/commit/?id=79f0dd1cdc408b22afe301fa03805349a4a9f151
> 
> I will try to backport this change to 2.6 for 2.6.5. The change
> should be easy to backport since it was made prior to most of the
> cleanup of mutexes and condvars. In 3.x, the kernel does not handle
> the recursion count of recursive mutexes at all; this makes things
> much simpler, but the code is very different.
> 
Ok thanks. You can disregard my other email then. I'll keep an eye out
for the fix and we will avoid that test case for the time being.

At some point after 2.6.5 I know we'll have to migrate to 3.x...

Jeff



* Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
  2016-06-01 16:58   ` Jeffrey Melville
@ 2016-06-16 13:26     ` Gilles Chanteperdrix
  2016-06-17 19:30       ` Jeffrey Melville
  0 siblings, 1 reply; 7+ messages in thread
From: Gilles Chanteperdrix @ 2016-06-16 13:26 UTC
  To: Jeffrey Melville; +Cc: xenomai@xenomai.org

On Wed, Jun 01, 2016 at 12:58:20PM -0400, Jeffrey Melville wrote:
> On 6/1/2016 12:45 PM, Gilles Chanteperdrix wrote:
> > Yeah well, this is an issue that has been known and fixed for so
> > long that I forgot we knew it:
> > 
> > https://git.xenomai.org/xenomai-3.git/commit/?id=79f0dd1cdc408b22afe301fa03805349a4a9f151
> > 
> > I will try to backport this change to 2.6 for 2.6.5. The change
> > should be easy to backport since it was made prior to most of the
> > cleanup of mutexes and condvars. In 3.x, the kernel does not handle
> > the recursion count of recursive mutexes at all; this makes things
> > much simpler, but the code is very different.
> > 
> Ok thanks. You can disregard my other email then. I'll keep an eye out
> for the fix and we will avoid that test case for the time being.
> 
> At some point after 2.6.5 I know we'll have to migrate to 3.x...

Hi,

I have pushed a fix for this issue:
https://git.xenomai.org/xenomai-2.6.git/commit/?id=8047147aff9dee9529f5561ecd7afc29c48d14db

Could you test it on your side?

Regards.

-- 
					    Gilles.
https://click-hack.org



* Re: [Xenomai] SIGDEBUG_RESCNT_IMBALANCE with recursive mutex
  2016-06-16 13:26     ` Gilles Chanteperdrix
@ 2016-06-17 19:30       ` Jeffrey Melville
  0 siblings, 0 replies; 7+ messages in thread
From: Jeffrey Melville @ 2016-06-17 19:30 UTC
  To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org

On 6/16/2016 9:26 AM, Gilles Chanteperdrix wrote:
> On Wed, Jun 01, 2016 at 12:58:20PM -0400, Jeffrey Melville wrote:
>> On 6/1/2016 12:45 PM, Gilles Chanteperdrix wrote:
>>> Yeah well, this is an issue that has been known and fixed for so
>>> long that I forgot we knew it:
>>>
>>> https://git.xenomai.org/xenomai-3.git/commit/?id=79f0dd1cdc408b22afe301fa03805349a4a9f151
>>>
>>> I will try to backport this change to 2.6 for 2.6.5. The change
>>> should be easy to backport since it was made prior to most of the
>>> cleanup of mutexes and condvars. In 3.x, the kernel does not handle
>>> the recursion count of recursive mutexes at all; this makes things
>>> much simpler, but the code is very different.
>>>
>> Ok thanks. You can disregard my other email then. I'll keep an eye out
>> for the fix and we will avoid that test case for the time being.
>>
>> At some point after 2.6.5 I know we'll have to migrate to 3.x...
> 
> Hi,
> 
> I have pushed a fix for this issue:
> https://git.xenomai.org/xenomai-2.6.git/commit/?id=8047147aff9dee9529f5561ecd7afc29c48d14db
> 
> Could you test it on your side?
> 
> Regards.
> 
Gilles,

I confirmed that the isolated test case no longer causes SIGDEBUG on our
system after rebuilding with the fix. I don't have many miles on the new
build otherwise but will let you know if I see anything strange elsewhere.

Thanks for the quick turnaround with the fix.

Cheers,
Jeff


