From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <53346CF7.2020805@xenomai.org> Date: Thu, 27 Mar 2014 19:24:55 +0100 From: Philippe Gerum MIME-Version: 1.0 References: <53316C4C.6000708@sigmatek.at> <53335D9A.9040902@xenomai.org> <5333F22F.70805@xenomai.org> <53345143.7000009@xenomai.org> <53346835.4030202@web.de> In-Reply-To: <53346835.4030202@web.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Xenomai] Rescnt imbalance in rtdm_mutex_timedlock List-Id: Discussions about the Xenomai project List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka , Henri Roosen Cc: "Xenomai@xenomai.org" On 03/27/2014 07:04 PM, Jan Kiszka wrote: > On 2014-03-27 17:26, Philippe Gerum wrote: >> On 03/27/2014 04:02 PM, Henri Roosen wrote: >>> >>> >>> >>> On Thu, Mar 27, 2014 at 10:41 AM, Philippe Gerum >> > wrote: >>> >>> On 03/27/2014 12:07 AM, Gilles Chanteperdrix wrote: >>> >>> On 03/26/2014 05:49 PM, Henri Roosen wrote: >>> >>> On Tue, Mar 25, 2014 at 12:45 PM, Erwin Pranz >>> >> >__wrote: >>> >>> Hello, >>> >>> I am using rtdm_mutex_lock/rtdm_mutex___unlock in a rtdm >>> driver and I get a >>> signal SIGDEBUG_MIGRATE_PRIOINV when I call the driver >>> from a priority-0 >>> thread.The signal is sent in xnsynch_release_thread when >>> rescnt is 0 on a >>> XNOTHER thread, which means that rescnt is not balanced. >>> >>> I think the reason why rescnt is not balanced is in >>> rtdm_mutex_timedlock: >>> when the owner of the mutex is NULL, the owner is >>> changed to the calling >>> thread, but rescnt is not increased. >>> >>> I suggest the following fix in rtdm_mutex_timedlock: >>> - else if >>> (likely(xnsynch_owner(&mutex->__synch_base) == NULL)) >>> + else if >>> (likely(xnsynch_owner(&mutex->__synch_base) == NULL)) { >>> >>> xnsynch_set_owner(&mutex->__synch_base, curr_thread); >>> + if (xnthread_test_state(curr___thread, >>> XNOTHER)) >>> + xnthread_inc_rescnt(curr___thread); >>> + } >>> >>> >>> Hi Gilles, >>> >>> Could you maybe elaborate on this? It seems the resource >>> counting is not in >>> balance. It would be great to know if this the correct way >>> to fix it. >>> >>> The problem should be easily reproducible by having just one >>> shadowed >>> priority 0 thread that somehow calls rtdm_mutex_lock >>> directly followed by >>> rtdm_mutex_unlock. This will trigger the code in >>> ksrc/nucleus/synch.c that >>> was added by commit-486afd15, which sends >>> SIGDEBUG_MIGRATE_PRIOINV on >>> Xenomai v2.6.2.1. It shouldn't send any signal in that case, >>> as nothing is >>> wrong. >>> >>> Note we are on Xenomai v2.6.2.1. Xenomai v2.6.3 sends (the >>> better name for >>> it) SIGDEBUG_RESCNT_IMBALANCE. >>> >>> >>> To be completely frank, I expected Jan to answer this mail as this >>> concerns RTDM. Anyway, I had a look into it, and: >>> - the RTDM skin is not alone in this case, several skins >>> implement the >>> same pattern (if owner == NULL, set owner, then return); >>> - adding calls to xnthread_inc_rescnt everywhere does not seem >>> right, I >>> would prefer to move this inside some xnsynch_ service. The best >>> candidate I found for that is xnsynch_set_owner, though I am not >>> sure if >>> this will work, especially with fastsynchs. >>> >>> >>> >>> Indeed this wouldn't. I suspect the skins mentioned did not have >>> access to xnsynch_acquire(), and were not updated when the >>> auto-relax feature was introduced for XNOTHER threads. >>> >>> We could provide a helper for fast grabbing a mutex which has no >>> fast locking support shared with userland, saving a call to >>> xnsynch_acquire() in the non-contended case, e.g. >>> >>> diff --git a/include/nucleus/synch.h b/include/nucleus/synch.h >>> index 7deae21..87456ea 100644 >>> --- a/include/nucleus/synch.h >>> +++ b/include/nucleus/synch.h >>> @@ -192,6 +192,22 @@ static inline void xnsynch_set_owner(struct >>> xnsynch *synch, >>> synch->owner = thread; >>> } >>> >>> +static inline int xnsynch_fast_grab(struct xnsynch *synch, >>> + struct xnthread *thread) >>> +{ >>> + XENO_BUGON(NUCLEUS, xnsynch_fastlock_p(synch)); >>> + >>> + if (likely(xnsynch_owner(synch) != NULL)) >>> + return 0; >>> + >>> + xnsynch_set_owner(synch, thread); >>> + >>> + if (xnthread_test_state(thread, XNOTHER)) >>> + xnthread_inc_rescnt(thread); >>> + >>> + return 1; >>> +} >>> + >>> static inline void xnsynch_register_cleanup(__struct xnsynch *synch, >>> void (*handler)(struct >>> xnsynch *)) >>> { >>> diff --git a/ksrc/skins/rtdm/drvlib.c b/ksrc/skins/rtdm/drvlib.c >>> index 391d45f..338005d 100644 >>> --- a/ksrc/skins/rtdm/drvlib.c >>> +++ b/ksrc/skins/rtdm/drvlib.c >>> @@ -1538,9 +1538,7 @@ int rtdm_mutex_timedlock(rtdm___mutex_t >>> *mutex, nanosecs_rel_t timeout, >>> if (unlikely(xnsynch_test_flags(&__mutex->synch_base, >>> RTDM_SYNCH_DELETED))) >>> err = -EIDRM; >>> >>> - else if (likely(xnsynch_owner(&mutex->__synch_base) == NULL)) >>> - xnsynch_set_owner(&mutex->__synch_base, curr_thread); >>> - else { >>> + else if (!xnsynch_fast_grab(&mutex->__synch_base, >>> curr_thread)) { >>> /* Redefinition to clarify XENO_ASSERT output */ >>> #define mutex_owner xnsynch_owner(&mutex->synch___base) >>> XENO_ASSERT(RTDM, mutex_owner != curr_thread, >>> >>> The VxWorks and VRTX skins would have to use it as well. >>> >>> >>> Thanks for the replies! We'll give this patch a try for the RTDM case. >>> >> >> Minor points, but a compiler hint above was inverted and eventually useless, and the naming not descriptive enough. Take #2: >> >> diff --git a/include/nucleus/synch.h b/include/nucleus/synch.h >> index 7deae21..09ebdbf 100644 >> --- a/include/nucleus/synch.h >> +++ b/include/nucleus/synch.h >> @@ -192,6 +192,22 @@ static inline void xnsynch_set_owner(struct xnsynch *synch, >> synch->owner = thread; >> } >> >> +static inline int xnsynch_try_grab(struct xnsynch *synch, >> + struct xnthread *thread) >> +{ >> + XENO_BUGON(NUCLEUS, xnsynch_fastlock_p(synch)); >> + >> + if (xnsynch_owner(synch) != NULL) > > if (unlikely(...)) - locks are generally unlikely to be contended > (unless something is suboptimally designed at the user site). > I'm unsure we really need to take a chance at giving such hint. Modern CPUs tend to have branch prediction right. -- Philippe.