Re: [Xenomai] Target frozen with rtcan_virt

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
To: Philippe Gerum <rpm@xenomai.org>
Cc: Thierry Bultel <thierry.bultel@wanadoo.fr>,
	Jan Kiszka <jan.kiszka@web.de>, Wolfgang Grandegger <wg@denx.de>,
	xenomai@xenomai.org
Subject: Re: [Xenomai] Target frozen with rtcan_virt
Date: Wed, 10 Oct 2012 16:52:15 +0200	[thread overview]
Message-ID: <50758B9F.3060209@xenomai.org> (raw)
In-Reply-To: <50757EBD.7080506@xenomai.org>

On 10/10/2012 03:57 PM, Philippe Gerum wrote:
> On 10/10/2012 03:19 PM, Gilles Chanteperdrix wrote:
>> On 10/10/2012 03:18 PM, Philippe Gerum wrote:
>>> On 10/10/2012 03:16 PM, Gilles Chanteperdrix wrote:
>>>> On 10/10/2012 03:09 PM, Philippe Gerum wrote:
>>>>> On 10/10/2012 03:01 PM, Gilles Chanteperdrix wrote:
>>>>>> On 10/10/2012 02:57 PM, Philippe Gerum wrote:
>>>>>>> On 10/10/2012 02:55 PM, Gilles Chanteperdrix wrote:
>>>>>>>> On 10/10/2012 02:41 PM, Philippe Gerum wrote:
>>>>>>>>> On 10/10/2012 02:33 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 10/10/2012 02:09 PM, Philippe Gerum wrote:
>>>>>>>>>>> On 10/10/2012 01:30 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>>>> On 10/10/2012 12:54 PM, Philippe Gerum wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/10/2012 12:25 PM, Jan Kiszka wrote:
>>>>>>>>>>>>>> On 2012-10-10 12:07, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>> On 10/10/2012 12:04 PM, Jan Kiszka wrote:
>>>>>>>>>>>>>>>> On 2012-10-10 11:23, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>>>> On 10/10/2012 11:01 AM, Jan Kiszka wrote:
>>>>>>>>>>>>>>>>>> On 2012-10-10 10:58, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>>>>>> On 10/10/2012 10:10 AM, Jan Kiszka wrote:
>>>>>>>>>>>>>>>>>>>> On 2012-10-10 10:04, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>>>>>>>> On 10/10/2012 09:56 AM, Jan Kiszka wrote:
>>>>>>>>>>>>>>>>>>>>>> On 2012-10-10 09:51, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>>>>>>>>>> On 10/10/2012 09:38 AM, Thierry Bultel wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hi Gilles,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Many thanks,
>>>>>>>>>>>>>>>>>>>>>>>> The first patch does not work, the second does.
>>>>>>>>>>>>>>>>>>>>>>>> I think the reason for 1st patch why is that in rtcan_virt, we have
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> rtdm_lock_get_irqsave(&rtcan_recv_list_lock, lock_ctx);
>>>>>>>>>>>>>>>>>>>>>>>> rtdm_lock_get(&rtcan_socket_lock);
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>>>>>>>> --->        rtcan_rcv(rx_dev, &skb);
>>>>>>>>>>>>>>>>>>>>>>>> ....
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> rtdm_lock_put(&rtcan_socket_lock);
>>>>>>>>>>>>>>>>>>>>>>>> rtdm_lock_put_irqrestore(&rtcan_recv_list_lock, lock_ctx);
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> and rtcan_rcv->rtcan_rcv_deliver->rtdm_sem_up(&sock->recv_sem);
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> thus the same re-scheduling stuff with interrupts locked.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Are you not not afraid of side effects with the second patch, 
>>>>>>>>>>>>>>>>>>>>>>>> since you change the overall behaviour ?
>>>>>>>>>>>>>>>>>>>>>>>> Won't you prefer a only locally modified rtcan_virt ?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> We should ask Jan's opinion. In any case, if we adopt the second patch,
>>>>>>>>>>>>>>>>>>>>>>> we might want to try and reduce the overhead of xnpod_unlock_sched.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> We were signaling the semaphore while holding a spin lock? That's a
>>>>>>>>>>>>>>>>>>>>>> clear bug. Your patch is aligning rtcan to the pattern we are also using
>>>>>>>>>>>>>>>>>>>>>> in RTnet. We just need to make sure (haven't looked at the full context
>>>>>>>>>>>>>>>>>>>>>> yet) that sock remains valid even after dropping the lock(s).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The second patch idea was to lock the scheduler while spinlocks are
>>>>>>>>>>>>>>>>>>>>> held, so that posting a semaphore while holding a spin lock is no longer
>>>>>>>>>>>>>>>>>>>>> a bug.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sounds a bit hacky,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Well, that is what the linux kernel does.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>  but I think we have this pattern
>>>>>>>>>>>>>>>>>>>> (RTDM_EXECUTE_ATOMICALLY)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY is a bit of a misnomer, if you do:
>>>>>>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY(foo(); rtdm_sem_up(); bar());
>>>>>>>>>>>>>>>>>>> foo() and bar() are not executed atomically if sem_up wakes up another
>>>>>>>>>>>>>>>>>>> thread.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So, I do not see how RTDM_EXECUTE_ATOMICALLY solves the issue we are
>>>>>>>>>>>>>>>>>>> talking about.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY holds the nucleus lock across the encapsulated
>>>>>>>>>>>>>>>>>> code, executing it atomically as rescheduling is postponed until the end
>>>>>>>>>>>>>>>>>> of the block.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Err... no. Absolutely not.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Err... absolutely right.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The good news is: we don't need to worry about such kind of locking. In
>>>>>>>>>>>>>>>> rtcan_raw_recvmsg, the socket is locked via the RTDM context as we are
>>>>>>>>>>>>>>>> in a handler. So it won't disappear when we drop the lock, and your
>>>>>>>>>>>>>>>> first patch is fine.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Which one? The first one does not seem to work because the rtdm locks
>>>>>>>>>>>>>>> seem to be nested. The second one would probably need to find a way to
>>>>>>>>>>>>>>> reduce the overhead of xnpod_unlock_sched(). What can be done, however,
>>>>>>>>>>>>>>> is adding a call to xnpod_lock_sched()/xnpod_unlock_sched() in
>>>>>>>>>>>>>>> RTDM_EXECUTE_ATOMICALLY.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Oh, I'm seeing the locking forest in rtcan now. I suppose rtcan wasn't
>>>>>>>>>>>>>> used much on SMP so far. That looks indeed unresolvable without a
>>>>>>>>>>>>>> semantical change to rtdm_lock/unlock.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But then we really need something as light-weight as preempt_enable/disable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> This is not as lightweight as it might be given that we pair a flag and
>>>>>>>>>>>>> a counter to achieve this (which saves one data reference in
>>>>>>>>>>>>> xnpod_schedule() though), but this is a start:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://git.xenomai.org/?p=xenomai-2.6.git;a=commit;h=aed4dfce9967e45ef7e8a8da4b6c90267ea81497
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, I'm setting __xnpod_lock_sched() and __xnpod_unlock_sched() in stone
>>>>>>>>>>>>> in the nucleus API to manipulate the sched locking counter from a
>>>>>>>>>>>>> context where the nucleus lock is already held, so that RTDM can rely on
>>>>>>>>>>>>> this for RTDM_EXECUTE_ATOMICALLY().
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The problem of xnpod_unlock_sched from my point of view is this section
>>>>>>>>>>>> of code:
>>>>>>>>>>>>
>>>>>>>>>>>>  		xnsched_set_self_resched(curr->sched);
>>>>>>>>>>>>  		xnpod_schedule();
>>>>>>>>>>>>
>>>>>>>>>>>> It means that we will go to the full blown __xnpod_schedule when
>>>>>>>>>>>> unlocking the top-most spinlock.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess what I meant is that we should have a scheduler bit that is
>>>>>>>>>>>> simply tested in xnpod_schedule, but then we loose the ability for the
>>>>>>>>>>>> threads with the scheduler locked to suspend themselves. So, maybe we
>>>>>>>>>>>> should define another xnpod_lock/unlock pair, maybe something like
>>>>>>>>>>>> xnpod_preempt_disablle()/xnpod_preempt_enabled().
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> xnpod_lock_sched() really means xnpod_preempt_disable() in RTOS parlance.
>>>>>>>>>>
>>>>>>>>>> Yes, but the reason why we have to go to __xnpod_schedule in
>>>>>>>>>> xnpod_unlock_sched() is to allow thread to be suspended with the
>>>>>>>>>> scheduler locked (you added that at some point for the vxworks emulator,
>>>>>>>>>> if I remember correctly). But without that need, we can put the XNLOCK
>>>>>>>>>> bit in the sched->status structure, and simply test that bit with all
>>>>>>>>>> the others in xnpod_schedule(), and forget about the need to call
>>>>>>>>>> xnsched_set_self_resched() in xnpod_unlock_sched(). That is all I meant.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You may have to reschedule after an IRQ has interrupted the lock owner
>>>>>>>>> inside the locked section, in which case you have to make sure to
>>>>>>>>> reschedule when dropping the sched lock.
>>>>>>>>
>>>>>>>> Well, in that case the resched bit will have been set already by the irq
>>>>>>>> handler calling xnpod_resume_thread, or other service, you do not have
>>>>>>>> to force it.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, and this is why you have to call __xnpod_schedule() regardless.
>>>>>>>
>>>>>> no, xnpod_schedule is enough, it will only call __xnpod_schedule() if
>>>>>> the resched bit is set.
>>>>>>
>>>>>
>>>>> We are not discussing about the same issue, I'm afraid. We can't
>>>>> optimize the schedlock path via a scheduler flag since we have to care
>>>>> about lock nesting. Since the only sane option there is to hold such
>>>>> counter in the current thread context, the optimization is 100% in the
>>>>> way we access this information.
>>>>>
>>>>
>>>> Again, if we do not have to allow threads to suspend while holding the
>>>> scheduler lock, the preempt_count also can be in the sched structure.
>>>>
>>>
>>> Sure, but this point is moot. We just _can't_ do that.
>>>
>> Except if we invent another scheduler lock service, simply for the
>> purpose of spinlocks where suspending the spinlock holder does not make
>> sense anyway.
>>
> 
> Oh, well, ok. Ack.
> 
> Let's just make sure that we can fold the whole thing into a single set
> of services at some point in 3.x though.
> 
> The core issue is that we should not even have to expose a scheduler
> locking service to userland, but emulating traditional RTOS APIs
> requires that. What a braindamage counter-feature for a real-time system
> when one thinks of it.
> 

With a parameter to xnpod_lock_sched(), but we would have to pass a
parameter to xnpod_unlock_sched() too...

-- 
					    Gilles.

next prev parent reply	other threads:[~2012-10-10 14:52 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <505EB99F.7050705@wanadoo.fr>
2012-09-23  7:42 ` [Xenomai] Target frozen with rtcan_virt Thierry Bultel
2012-09-23 13:23   ` Gilles Chanteperdrix
2012-09-23 14:57     ` Thierry Bultel
2012-09-23 14:59       ` Gilles Chanteperdrix
2012-10-09 12:45     ` Thierry Bultel
2012-10-09 21:34       ` Gilles Chanteperdrix
2012-10-09 21:41         ` Gilles Chanteperdrix
2012-10-09 21:51           ` Gilles Chanteperdrix
2012-10-10  7:38             ` Thierry Bultel
2012-10-10  7:51               ` Gilles Chanteperdrix
2012-10-10  7:56                 ` Jan Kiszka
2012-10-10  8:04                   ` Gilles Chanteperdrix
2012-10-10  8:10                     ` Jan Kiszka
2012-10-10  8:58                       ` Gilles Chanteperdrix
2012-10-10  9:01                         ` Jan Kiszka
2012-10-10  9:23                           ` Gilles Chanteperdrix
2012-10-10 10:04                             ` Jan Kiszka
2012-10-10 10:07                               ` Gilles Chanteperdrix
2012-10-10 10:25                                 ` Jan Kiszka
2012-10-10 10:54                                   ` Philippe Gerum
2012-10-10 11:30                                     ` Gilles Chanteperdrix
2012-10-10 12:09                                       ` Philippe Gerum
2012-10-10 12:33                                         ` Gilles Chanteperdrix
2012-10-10 12:41                                           ` Philippe Gerum
2012-10-10 12:43                                             ` Philippe Gerum
2012-10-10 12:45                                               ` Philippe Gerum
2012-10-10 12:55                                             ` Gilles Chanteperdrix
2012-10-10 12:57                                               ` Philippe Gerum
2012-10-10 13:01                                                 ` Gilles Chanteperdrix
2012-10-10 13:09                                                   ` Philippe Gerum
2012-10-10 13:16                                                     ` Gilles Chanteperdrix
2012-10-10 13:18                                                       ` Philippe Gerum
2012-10-10 13:19                                                         ` Gilles Chanteperdrix
2012-10-10 13:57                                                           ` Philippe Gerum
2012-10-10 14:52                                                             ` Gilles Chanteperdrix [this message]
2012-10-11  8:05                                                             ` Gilles Chanteperdrix
2012-11-05 13:33                                                               ` Thierry Bultel
2012-11-05 18:46                                                                 ` Gilles Chanteperdrix
2012-11-06  7:22                                                                   ` Thierry Bultel
2012-11-06 11:59                                                                   ` Philippe Gerum
2012-11-06 14:49                                                                     ` Gilles Chanteperdrix
2012-10-10 13:17                                                     ` Philippe Gerum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50758B9F.3060209@xenomai.org \
    --to=gilles.chanteperdrix@xenomai.org \
    --cc=jan.kiszka@web.de \
    --cc=rpm@xenomai.org \
    --cc=thierry.bultel@wanadoo.fr \
    --cc=wg@denx.de \
    --cc=xenomai@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.