From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@siemens.com>,
Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/kernel: Fix locking for xnthread info manipulations
Date: Tue, 30 Jun 2015 20:55:35 +0200 [thread overview]
Message-ID: <5592E627.3020701@xenomai.org> (raw)
In-Reply-To: <5592E29E.4030701@siemens.com>
On 06/30/2015 08:40 PM, Jan Kiszka wrote:
> On 2015-06-27 15:05, Philippe Gerum wrote:
>> On 06/27/2015 12:59 PM, Gilles Chanteperdrix wrote:
>>> On Sat, Jun 27, 2015 at 12:50:35PM +0200, Philippe Gerum wrote:
>>>> On 06/27/2015 12:03 PM, Gilles Chanteperdrix wrote:
>>>>> On Fri, Jun 26, 2015 at 10:17:21PM +0200, Jan Kiszka wrote:
>>>>>> On 2015-06-26 18:01, Gilles Chanteperdrix wrote:
>>>>>>> On Fri, Jun 26, 2015 at 04:12:44PM +0200, git repository hosting wrote:
>>>>>>>> Module: xenomai-jki
>>>>>>>> Branch: for-forge
>>>>>>>> Commit: 67a64e164b737759cc171d3b04b05770e1450cb6
>>>>>>>> URL: http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=67a64e164b737759cc171d3b04b05770e1450cb6
>>>>>>>>
>>>>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>> Date: Fri Jun 26 15:11:42 2015 +0200
>>>>>>>>
>>>>>>>> cobalt/kernel: Fix locking for xnthread info manipulations
>>>>>>>>
>>>>>>>> nklock must be held when manipulating bits of xnthread::info. Not all
>>>>>>>> callsites of xnthread_set/clear_info follow this rule so far, directly
>>>>>>>> or indirectly, fix them (and possibly some other races along this).
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> include/cobalt/kernel/rtdm/driver.h | 8 ++++----
>>>>>>>> kernel/cobalt/posix/syscall.c | 5 +++++
>>>>>>>> kernel/cobalt/synch.c | 2 ++
>>>>>>>> kernel/cobalt/thread.c | 5 +++++
>>>>>>>> 4 files changed, 16 insertions(+), 4 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/include/cobalt/kernel/rtdm/driver.h b/include/cobalt/kernel/rtdm/driver.h
>>>>>>>> index c14198b..de476ca 100644
>>>>>>>> --- a/include/cobalt/kernel/rtdm/driver.h
>>>>>>>> +++ b/include/cobalt/kernel/rtdm/driver.h
>>>>>>>> @@ -553,7 +553,7 @@ static inline void rtdm_lock_get(rtdm_lock_t *lock)
>>>>>>>> {
>>>>>>>> XENO_BUG_ON(COBALT, !spltest());
>>>>>>>> spin_lock(lock);
>>>>>>>> - __xnsched_lock();
>>>>>>>> + xnsched_lock();
>>>>>>>> }
>>>>>>>>
>>>>>>>> /**
>>>>>>>> @@ -566,7 +566,7 @@ static inline void rtdm_lock_get(rtdm_lock_t *lock)
>>>>>>>> static inline void rtdm_lock_put(rtdm_lock_t *lock)
>>>>>>>> {
>>>>>>>> spin_unlock(lock);
>>>>>>>> - __xnsched_unlock();
>>>>>>>> + xnsched_unlock();
>>>>>>>> }
>>>>>>>>
>>>>>>>> /**
>>>>>>>> @@ -584,7 +584,7 @@ static inline rtdm_lockctx_t __rtdm_lock_get_irqsave(rtdm_lock_t *lock)
>>>>>>>>
>>>>>>>> context = ipipe_test_and_stall_head();
>>>>>>>> spin_lock(lock);
>>>>>>>> - __xnsched_lock();
>>>>>>>> + xnsched_lock();
>>>>>>>>
>>>>>>>> return context;
>>>>>>>> }
>>>>>>>> @@ -603,7 +603,7 @@ static inline
>>>>>>>> void rtdm_lock_put_irqrestore(rtdm_lock_t *lock, rtdm_lockctx_t context)
>>>>>>>> {
>>>>>>>> spin_unlock(lock);
>>>>>>>> - __xnsched_unlock();
>>>>>>>> + xnsched_unlock();
>>>>>>>> ipipe_restore_head(context);
>>>>>>>> }
>>>>>>>
>>>>>>> These changes are not without risk: is not there a risk of deadlock
>>>>>>> if one thread calls rtdm_lock_get without holding the nklock and
>>>>>>> another calls it while holding it?
>>>>>>
>>>>>> That could happen, although the "normal" pattern does not involve
>>>>>> explicit nklock acquisition if nesting is used, rather rtdm_lock before
>>>>>> calling nklock-acquiring services. Not a reasonable ordering either, but
>>>>>> common practice now.
>>>>>>
>>>>>> We could also try to rework the whole scheduler lock so that it avoids
>>>>>> unsafe thread fields, ideally converting it to something more
>>>>>> lightweight as Linux' preempt_disable/enable().
>>>>>>
>>>>>> BTW, xnsched has those two types of flags fields: status requires
>>>>>> nklock, lflags must only be locally modified, thus are fine with
>>>>>> interrupt disabling as protection.
>>>>>
>>>>> the sched lock uses lflags. The only point where we need to know
>>>>> that a thread had the scheduler lock, is when it is switched in
>>>>> after having voluntarily suspended. So, maybe a solution would be to
>>>>> set that state bit only when a thread with the scheduler lock
>>>>> voluntarily suspends. That would make the scheduler lock quite
>>>>> lightweight (refcount + scheduler lflags bit).
>>>>>
>>>>
>>>> - XNINLOCK is overkill, it is only tested for skipping the resched
>>>> procedure as an optimization from xnsched_run(). As we may access
>>>> sched->lflags in that situation, we may also access
>>>> sched->curr->lock_count to figure this out.
>>>>
>>>> - XNLOCK and xnthread->lock_count are redundant when combined to track
>>>> the lock state internally. This is part of what makes xnsched_lock() and
>>>> friends uselessly complex (the potential for deadlocks you mentioned
>>>> about the RTDM locking changes is definitely a concern, dropping the
>>>> requirement for grabbing the nklock there is very desirable in that
>>>> sense). However, we still need XNLOCK as part of the ABI to convey the
>>>> state bit information between kernel and userland. But, referring to
>>>> curr->lock_count should be enough for implementing that logic in the
>>>> scheduler code.
>>>>
>>>> - there are only two places where XNLOCK is accessed within the state
>>>> word of a potentially non-current thread, i.e: from
>>>> __xnthread_set_schedparam(), and xnthread_set_mode(). But XNLOCK is only
>>>> tested there, and testing thread->lock_count would work exactly the same
>>>> way.
>>>>
>>>> So, yes, I agree that such code direly needs simplification.
>>>
>>> Ok. I was thinking about using XNINLOCK in place of XNLOCK, but
>>> using lock_count may be even simpler.
>>>
>>> The nklock seems to also be needed for handling the XNLBALERT bit.
>>>
>>
>> Correct, but I'm looking for a way to move this elsewhere.
>>
>
> BTW, let me know how we should proceed with this issue. Are you looking
> into the scheduler lock? I could split off the RTDM thing from the rest
> if those bits are fine.
>
Yes, I'm dropping XNLOCK as a runtime bit, keeping it only for ABI
purpose between lib/cobalt and pthread_setmode_np basically. I'll push
the changes tomorrow. Next step will be to get rid of nklock for
xnsched_lock() entirely, still needs a bit more thought though.
--
Philippe.
next prev parent reply other threads:[~2015-06-30 18:55 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <E1Z8UN2-0004n4-Gc@sd-51317.xenomai.org>
2015-06-26 16:01 ` [Xenomai] [Xenomai-git] Jan Kiszka : cobalt/kernel: Fix locking for xnthread info manipulations Gilles Chanteperdrix
2015-06-26 20:17 ` Jan Kiszka
2015-06-27 10:03 ` Gilles Chanteperdrix
2015-06-27 10:50 ` Philippe Gerum
2015-06-27 10:59 ` Gilles Chanteperdrix
2015-06-27 13:05 ` Philippe Gerum
2015-06-30 18:40 ` Jan Kiszka
2015-06-30 18:55 ` Philippe Gerum [this message]
[not found] <E1ZG585-0002oE-L7@sd-51317.xenomai.org>
2015-07-17 13:13 ` Philippe Gerum
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5592E627.3020701@xenomai.org \
--to=rpm@xenomai.org \
--cc=gilles.chanteperdrix@xenomai.org \
--cc=jan.kiszka@siemens.com \
--cc=xenomai@xenomai.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.