All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
@ 2016-02-10 18:41 Charles Kiorpes
  2016-02-10 20:21 ` Philippe Gerum
  0 siblings, 1 reply; 10+ messages in thread
From: Charles Kiorpes @ 2016-02-10 18:41 UTC (permalink / raw)
  To: xenomai

Hello,
I am having some difficulty getting two processes to signal each other on
an AM572x EVM.
I have two small sample applications:

- app1 : creates an event "MYTEST" using rt_event_create(), clears the
event, and then rt_event_wait()s for TM_INFINITE.  When the event is
signaled, app1 rt_event_clear()s the event and then loops back to the
rt_event_wait()

- app2 : binds to the "MYTEST" event, sleeps for one second,
rt_event_signal()s the event, and then loops back to the one second sleep

I am using the Mercury core, built natively, configured with:
xenomai-3.0.1/configure \
--with-core=mercury \
--enable-debug \
--enable-pshared \
--disable-registry \
--enable-smp \
--enable-condvar-workaround

I launch both applications in different terminals in this order:
# sudo ./app1 --session="Test"
# sudo ./app2 --session="Test"

The following behavior results:
- app1 claims to have successfully created the event
- app2 claims to have successfully bound to the event
- app2 loops correctly, signalling the event once per second
- app1 never wakes up from the first rt_event_wait()

I am running the 4.1.15-ti-rt kernel with full preemption.

On an x86 virtual machine with a vanilla 4.2.0-27-generic kernel, the
applications perform as expected (app1 wakes up when app2 signals the
event).

Any advice as to where to look for further debugging would be greatly
appreciated.

Thanks,
Charles

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-10 18:41 [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core Charles Kiorpes
@ 2016-02-10 20:21 ` Philippe Gerum
  2016-02-11 12:57   ` Charles Kiorpes
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2016-02-10 20:21 UTC (permalink / raw)
  To: Charles Kiorpes, xenomai

On 02/10/2016 07:41 PM, Charles Kiorpes wrote:
> Hello,
> I am having some difficulty getting two processes to signal each other on
> an AM572x EVM.
> I have two small sample applications:
> 
> - app1 : creates an event "MYTEST" using rt_event_create(), clears the
> event, and then rt_event_wait()s for TM_INFINITE.  When the event is
> signaled, app1 rt_event_clear()s the event and then loops back to the
> rt_event_wait()
> 
> - app2 : binds to the "MYTEST" event, sleeps for one second,
> rt_event_signal()s the event, and then loops back to the one second sleep
> 
> I am using the Mercury core, built natively, configured with:
> xenomai-3.0.1/configure \
> --with-core=mercury \
> --enable-debug \
> --enable-pshared \
> --disable-registry \
> --enable-smp \
> --enable-condvar-workaround
> 
> I launch both applications in different terminals in this order:
> # sudo ./app1 --session="Test"
> # sudo ./app2 --session="Test"
> 
> The following behavior results:
> - app1 claims to have successfully created the event
> - app2 claims to have successfully bound to the event
> - app2 loops correctly, signalling the event once per second
> - app1 never wakes up from the first rt_event_wait()
> 
> I am running the 4.1.15-ti-rt kernel with full preemption.
> 
> On an x86 virtual machine with a vanilla 4.2.0-27-generic kernel, the
> applications perform as expected (app1 wakes up when app2 signals the
> event).
> 
> Any advice as to where to look for further debugging would be greatly
> appreciated.
> 

Over Mercury, the event synchronization is based on a common
mutex+condvar pair obtained from the underlying *libc basically. The
rest of the implementation for this mechanism is shared with the Cobalt
side, except the workaround for the PI breakage the glibc suffers, which
is Mercury-only.

- does omitting --enable-condvar-workaround change the behavior?
- does this issue also happen when both threads belong to the same
process (i.e. is it a session-based processing issue?)
- does the code check for each and every return code, particularly the
return status of rt_event_signal()? Attaching your self-contained test
case may help finding out where the problem stands.
- after enabling the registry (--enable-registry), is the reported event
value consistent when looking at
/var/run/xenomai/<user>/Test/alchemy/events/<event_name>?

Also, you have a test suite available for Alchemy, which can run on
either Mercury/Cobalt cores unmodified. There is a test exercising
events there.

e.g.:

$ cd lib/alchemy/testsuite
$ make XENO_CONFIG=/your/xenomai/install/bin/xeno-config test

NOTE: The "test" rule in the Makefile assumes that the current shell
user is sudoer.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-10 20:21 ` Philippe Gerum
@ 2016-02-11 12:57   ` Charles Kiorpes
  2016-02-12 10:43     ` Philippe Gerum
  2016-02-12 10:55     ` Philippe Gerum
  0 siblings, 2 replies; 10+ messages in thread
From: Charles Kiorpes @ 2016-02-11 12:57 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On Wed, Feb 10, 2016 at 3:21 PM, Philippe Gerum <rpm@xenomai.org> wrote:

> On 02/10/2016 07:41 PM, Charles Kiorpes wrote:
> > Hello,
> > I am having some difficulty getting two processes to signal each other on
> > an AM572x EVM.
> > I have two small sample applications:
> >
> > - app1 : creates an event "MYTEST" using rt_event_create(), clears the
> > event, and then rt_event_wait()s for TM_INFINITE.  When the event is
> > signaled, app1 rt_event_clear()s the event and then loops back to the
> > rt_event_wait()
> >
> > - app2 : binds to the "MYTEST" event, sleeps for one second,
> > rt_event_signal()s the event, and then loops back to the one second sleep
> >
> > I am using the Mercury core, built natively, configured with:
> > xenomai-3.0.1/configure \
> > --with-core=mercury \
> > --enable-debug \
> > --enable-pshared \
> > --disable-registry \
> > --enable-smp \
> > --enable-condvar-workaround
> >
> > I launch both applications in different terminals in this order:
> > # sudo ./app1 --session="Test"
> > # sudo ./app2 --session="Test"
> >
> > The following behavior results:
> > - app1 claims to have successfully created the event
> > - app2 claims to have successfully bound to the event
> > - app2 loops correctly, signalling the event once per second
> > - app1 never wakes up from the first rt_event_wait()
> >
> > I am running the 4.1.15-ti-rt kernel with full preemption.
> >
> > On an x86 virtual machine with a vanilla 4.2.0-27-generic kernel, the
> > applications perform as expected (app1 wakes up when app2 signals the
> > event).
> >
> > Any advice as to where to look for further debugging would be greatly
> > appreciated.
> >
>
> Over Mercury, the event synchronization is based on a common
> mutex+condvar pair obtained from the underlying *libc basically. The
> rest of the implementation for this mechanism is shared with the Cobalt
> side, except the workaround for the PI breakage the glibc suffers, which
> is Mercury-only.
>
> - does omitting --enable-condvar-workaround change the behavior?
>

Omitting --enable-condvar-workaround has no effect on the behavior of this
test app.


> - does this issue also happen when both threads belong to the same
> process (i.e. is it a session-based processing issue?)
>

I am unsure, I will construct a test case for this and reply with more info.


> - does the code check for each and every return code, particularly the
> return status of rt_event_signal()? Attaching your self-contained test
> case may help finding out where the problem stands.
>

The code checks the return value of all alchemy API calls and
printfs+returns if an error is encountered.
I have attached app1.c and app2.c for review.


> - after enabling the registry (--enable-registry), is the reported event
> value consistent when looking at
> /var/run/xenomai/<user>/Test/alchemy/events/<event_name>?
>
>
The event shows the correct signaled value under
/var/run/xenomai/root/Test/<app1_pid>/alchemy/events/MYTEST.
There are no extra events present, for example under app2's pid.


> Also, you have a test suite available for Alchemy, which can run on
> either Mercury/Cobalt cores unmodified. There is a test exercising
> events there.
>
> e.g.:
>
> $ cd lib/alchemy/testsuite
> $ make XENO_CONFIG=/your/xenomai/install/bin/xeno-config test
>
> NOTE: The "test" rule in the Makefile assumes that the current shell
> user is sudoer.
>
>
I attempted to run several tests:  'task-1', 'event-1', and 'mutex-1'.
Each of these hung indefinitely.  A gdb trace indicated that they were
hanging on __libc_do_syscall() within __pthread_cond_wait() within
threadobj_cond_wait().

I have attached the full backtrace from mutex-1 as mutex-1_bt.txt

I will be trying combinations of kernels and configuration options
throughout the course of the day.


> --
> Philippe.


Thanks,

-Charles
-------------- next part --------------
(gdb) bt full
#0  __libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:43
No locals.
#1  0xb6f67204 in __pthread_cond_wait (cond=cond@entry=0xb6cb0194, mutex=mutex@entry=0xb6cb009c)
    at pthread_cond_wait.c:178
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 3
        _a3 = 3
        _a1 = -1228209768
        _v1tmp = -1228210020
        _a4tmp = 0
        _a4 = 0
        _v1 = -1228210020
        _name = 240
        __op = 11
        futex_val = 3
        buffer = {__routine = 0xb6f66f1d <__condvar_cleanup>, __arg = 0xbefff9e8, __canceltype = -1090520544, 
          __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6cb0194, mutex = 0xb6cb009c, bc_seq = 0}
        err = <optimized out>
        pshared = 128
        pi_flag = 1
        val = <optimized out>
        seq = 1
#2  0xb6fa0f8c in threadobj_cond_wait (cond=cond@entry=0xb6cb0194, lock=lock@entry=0xb6cb009c)
    at ../../../xenomai-3.0.1/lib/copperplate/threadobj.c:959
        ret = <optimized out>
#3  0xb6fa0fe4 in wait_on_barrier (thobj=thobj@entry=0xb6cb0094, mask=mask@entry=16)
    at ../../../xenomai-3.0.1/lib/copperplate/threadobj.c:1210
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-1228232340, -1228210028, 69993, 
                -1090520152, 0, 0, -1224740864, 0, 688021297, 554153222, 0 <repeats 16 times>, -1225393808, 
                -1224796768, 5, 0, 0, -1224757760, 0, -1224758200, -1, -1225356317, -1225151504, -1224798208, 
                -1090520212, -1224800400, -1224806400, -1225019392, 110988, 0, 1, -1224740864, 0, 
                -1224861195, -1224800400, 0, -1224806400, -1225019392, 111420, 0, 1, 1, 0, 0, 0, 0, 
                -1225030060, -1228232340, -1228210028, 69993}, __mask_was_saved = 0}}, __pad = {0xbefffbe8, 
            0x0, 0xb6cb0094, 0x11169 <task_b>}}
        __not_first_call = <optimized out>
        status = <optimized out>
#4  0xb6fa15d6 in threadobj_start (thobj=thobj@entry=0xb6cb0094)
    at ../../../xenomai-3.0.1/lib/copperplate/threadobj.c:1247
        ret = 0
        oldstate = 1
#5  0xb6fc2ee8 in rt_task_start (task=<optimized out>, entry=0x11169 <task_b>, arg=0x0)
    at ../../../xenomai-3.0.1/lib/alchemy/task.c:634
        tcb = 0xb6caff6c
        ret = 69993
#6  0x00011562 in main (argc=1, argv=0x22538) at mutex-1.c:130
        ret = 0

-------------- next part --------------
A non-text attachment was scrubbed...
Name: app1.c
Type: text/x-csrc
Size: 1076 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20160211/5df88846/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: app2.c
Type: text/x-csrc
Size: 759 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20160211/5df88846/attachment-0001.c>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-11 12:57   ` Charles Kiorpes
@ 2016-02-12 10:43     ` Philippe Gerum
  2016-02-12 14:08       ` Charles Kiorpes
  2016-02-12 10:55     ` Philippe Gerum
  1 sibling, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2016-02-12 10:43 UTC (permalink / raw)
  To: Charles Kiorpes; +Cc: xenomai

On 02/11/2016 01:57 PM, Charles Kiorpes wrote:
> 
> I attempted to run several tests:  'task-1', 'event-1', and 'mutex-1'. 
> Each of these hung indefinitely.  A gdb trace indicated that they were
> hanging on __libc_do_syscall() within __pthread_cond_wait() within
> threadobj_cond_wait().
> 
> I have attached the full backtrace from mutex-1 as mutex-1_bt.txt
> 

Ok, if the test suite does not pass, something is badly wrong, so we
should investigate that hang issue before anything else.

The backtrace reveals that copperplate cannot handshake with a newly
spawned task, this is the purpose of the wait_on_barrier() call over the
context of rt_task_start(). That barrier should be signaled by a call to
threadobj_notify_entry() from the internal trampoline code of the
emerging thread (task_entry() in alchemy/task.c).

- maybe task_prologue_2() (alchemy/task.c) which is called earlier hangs
indefinitely, and therefore prevents threadobj_notify_entry() from running?

- maybe the new thread does not even start for some reason, are we sure
task_entry() is reached (e.g. do we hit a breakpoint there?)

Could you inspect the current thread list under gdb when the program hangs?

Also, I would recommend to enable full debugging for now
(--enable-debug=full) to get accurate line information, assuming the
issue should still show up with a non-optimized code. Hopefully.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-11 12:57   ` Charles Kiorpes
  2016-02-12 10:43     ` Philippe Gerum
@ 2016-02-12 10:55     ` Philippe Gerum
  1 sibling, 0 replies; 10+ messages in thread
From: Philippe Gerum @ 2016-02-12 10:55 UTC (permalink / raw)
  To: Charles Kiorpes; +Cc: xenomai

On 02/11/2016 01:57 PM, Charles Kiorpes wrote:
> will be trying combinations of kernels and configuration options
> throughout the course of the day.
>  

The embedded-specific options in the kernel configuration
(CONFIG_EMBEDDED) should be checked, particularly CONFIG_FUTEX.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-12 10:43     ` Philippe Gerum
@ 2016-02-12 14:08       ` Charles Kiorpes
  2016-02-12 15:25         ` Philippe Gerum
  0 siblings, 1 reply; 10+ messages in thread
From: Charles Kiorpes @ 2016-02-12 14:08 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On Fri, Feb 12, 2016 at 5:43 AM, Philippe Gerum <rpm@xenomai.org> wrote:

> On 02/11/2016 01:57 PM, Charles Kiorpes wrote:
> >
> > I attempted to run several tests:  'task-1', 'event-1', and 'mutex-1'.
> > Each of these hung indefinitely.  A gdb trace indicated that they were
> > hanging on __libc_do_syscall() within __pthread_cond_wait() within
> > threadobj_cond_wait().
> >
> > I have attached the full backtrace from mutex-1 as mutex-1_bt.txt
> >
>
> Ok, if the test suite does not pass, something is badly wrong, so we
> should investigate that hang issue before anything else.
>
> The backtrace reveals that copperplate cannot handshake with a newly
> spawned task, this is the purpose of the wait_on_barrier() call over the
> context of rt_task_start(). That barrier should be signaled by a call to
> threadobj_notify_entry() from the internal trampoline code of the
> emerging thread (task_entry() in alchemy/task.c).
>
> - maybe task_prologue_2() (alchemy/task.c) which is called earlier hangs
> indefinitely, and therefore prevents threadobj_notify_entry() from running?
>
> - maybe the new thread does not even start for some reason, are we sure
> task_entry() is reached (e.g. do we hit a breakpoint there?)
>
> Could you inspect the current thread list under gdb when the program hangs?
>
> Also, I would recommend to enable full debugging for now
> (--enable-debug=full) to get accurate line information, assuming the
> issue should still show up with a non-optimized code. Hopefully.
>
> --
> Philippe.
>

I ran the task-1 test under gdb with this Xenomai configuration:
--with-core=mercury \
--enable-debug=full \
--enable-registry \
--enable-smp \
--enable-pshared \
--enable-condvar-workaround

It appears that the new thread is being launched, and getting stuck in
threadobj_wait_start() within task_prologue_2(), as you indicated might be
the case.
I have attached the thread list and a full backtrace for each thread (in
separate files by thread id).

As per your other message, my kernel configs all include CONFIG_FUTEX.

I have tried glibc 2.19 and 2.21, as well as RT patched and vanilla kernels.

Interestingly, when I removed --enable-pshared from my configuration, the
task-1 test passed.

- Charles
-------------- next part --------------
(gdb) thread 1
[Switching to thread 1 (Thread 0xb6ff0000 (LWP 14056))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46	in ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S
(gdb) bt full
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
No locals.
#1  0xb6f5f6f2 in __pthread_cond_wait (cond=0xb6caa194, mutex=0xb6caa09c) at pthread_cond_wait.c:177
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 3
        _a3 = 3
        _a1 = -1228234344
        _v1tmp = -1228234596
        _a4tmp = 0
        _a1tmp = -1228234344
        _a4 = 0
        _v1 = -1228234596
        _name = 240
        __ret = <optimized out>
        futex_val = 3
        buffer = {__routine = 0xb6f5f3e1 <__condvar_cleanup>, __arg = 0xbefffa00, __canceltype = -1225039472, __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6caa194, mutex = 0xb6caa09c, bc_seq = 0}
        err = <optimized out>
        pshared = <optimized out>
        pi_flag = 1
        val = <optimized out>
        seq = <optimized out>
#2  0xb6f9cd9a in threadobj_cond_wait (cond=0xb6caa194, lock=0xb6caa09c) at ../../../xenomai-3/lib/copperplate/threadobj.c:959
        ret = -1228234604
#3  0xb6f9d348 in wait_on_barrier (thobj=0xb6caa094, mask=16) at ../../../xenomai-3/lib/copperplate/threadobj.c:1210
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-1446288694, -1580381599, -1090520024, 0, 0, -1090520496, 0, 0, -1224740864, 
                0 <repeats 17 times>, 2, 0, 5, 0, 1, -1224758200, -1659757782, -1224874387, 0, -1224738124, -1224757760, -1090520312, -1224757760, 0, -1, 
                -1224796312, -1225176080, -1224794112, -1225021852, -1224802304, -1224801088, 1, 0, -1224859353, -1224796312, 1, 5, 0, 0, 1, -1225039472, 
                0, -1090520024, 0, -1090520232, -1090520024, 0, 0}, __mask_was_saved = 0}}, __pad = {0xbefffc28, 0x0, 0x0, 0xb6ca496c}}
        __cancel_routine = 0xb6f5e225 <__GI___pthread_mutex_unlock>
        __cancel_arg = 0xb6caa09c
        __not_first_call = 0
        oldstate = 0
        status = 73
#4  0xb6f9d3fa in threadobj_start (thobj=0xb6caa094) at ../../../xenomai-3/lib/copperplate/threadobj.c:1247
        current = 0xb6ca496c
        ret = 0
        oldstate = 1
#5  0xb6fc2cc8 in rt_task_start (task=0x20ed0 <t_main>, entry=0x10985 <main_task>, arg=0xdeadbeef) at ../../../xenomai-3/lib/alchemy/task.c:634
        tcb = 0xb6ca9f6c
        svc = {cancel_type = -559038737}
        ret = 67973
#6  0x00010a62 in main (argc=1, argv=0x21538) at task-1.c:26
        ret = 0

-------------- next part --------------
(gdb) thread 2
[Switching to thread 2 (Thread 0xb6ca1460 (LWP 14063))]
#0  0xb6f61cbe in read () at ../sysdeps/unix/syscall-template.S:81
81	../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt full
#0  0xb6f61cbe in read () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1  0xb6f1936e in ?? () from /lib/arm-linux-gnueabihf/libfuse.so.2
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

-------------- next part --------------
(gdb) thread 3
[Switching to thread 3 (Thread 0xb6c6f460 (LWP 14065))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46	../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt full
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
No locals.
#1  0xb6f5f6f2 in __pthread_cond_wait (cond=0xb6caa194, mutex=0xb6caa09c) at pthread_cond_wait.c:177
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 1
        _a3 = 1
        _a1 = -1228234344
        _v1tmp = -1228234596
        _a4tmp = 0
        _a1tmp = -1228234344
        _a4 = 0
        _v1 = -1228234596
        _name = 240
        __ret = <optimized out>
        futex_val = 1
        buffer = {__routine = 0xb6f5f3e1 <__condvar_cleanup>, __arg = 0xb6c6ec18, __canceltype = -1225039472, __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6caa194, mutex = 0xb6caa09c, bc_seq = 0}
        err = <optimized out>
        pshared = <optimized out>
        pi_flag = 1
        val = <optimized out>
        seq = <optimized out>
#2  0xb6f9cd9a in threadobj_cond_wait (cond=0xb6caa194, lock=0xb6caa09c) at ../../../xenomai-3/lib/copperplate/threadobj.c:959
        ret = -1228234596
#3  0xb6f9d348 in wait_on_barrier (thobj=0xb6caa094, mask=5) at ../../../xenomai-3/lib/copperplate/threadobj.c:1210
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-1577945870, -1580381599, -1228476860, -1090520124, 0, -1228477336, -1228476512, 
                -1224802304, 0, -1228474732, 0 <repeats 16 times>, 1, -1224758200, -1228477152, -1228477004, 0, -1228477120, -1224757760, -1228477168, 
                -1224757760, -1224738124, -1, 0, -1225177696, -1224794112, 1, -1224794112, 0, 0, 0, 0, -1224793672, -1228477112, -1224793672, 0, -1, 0, 
                -1225027584, 119632, -1225021852, 0, -1228234596, -1090520124, 0, -1228477040, -1228476512, -1224802304, 0, -1225404053}, 
              __mask_was_saved = 0}}, __pad = {0xb6c6ee80, 0x0, 0x0, 0xb6c6ed90}}
        __cancel_routine = 0xb6f5e225 <__GI___pthread_mutex_unlock>
        __cancel_arg = 0xb6caa09c
        __not_first_call = 0
        oldstate = 0
        status = 8
#4  0xb6f9d45e in threadobj_wait_start () at ../../../xenomai-3/lib/copperplate/threadobj.c:1276
        current = 0xb6caa094
        status = -1228234900
#5  0xb6fc2728 in task_prologue_2 (tcb=0xb6ca9f6c) at ../../../xenomai-3/lib/alchemy/task.c:211
        ret = 0
#6  0xb6fc276e in task_entry (arg=0xb6ca9f6c) at ../../../xenomai-3/lib/alchemy/task.c:227
        __ret = -1228476896
        tcb = 0xb6ca9f6c
        svc = {cancel_type = 1}
        ret = -1228475296
        __FUNCTION__ = "task_entry"
#7  0xb6f9a288 in thread_trampoline (arg=0xbefffb94) at ../../../xenomai-3/lib/copperplate/internal.c:251
        cta = 0xbefffb94
        _cta = {stacksize = 0, detachstate = 1, policy = 1, param_ex = {__sched_priority = 99, sched_u = {rr = {__sched_rr_quantum = {tv_sec = -1225255424, 
                  tv_nsec = 0}}}}, prologue = 0xb6fc26d1 <task_prologue_1>, run = 0xb6fc275d <task_entry>, arg = 0xb6ca9f6c, __reserved = {status = -38, 
            warm = {__size = "\001\000\000\000\000\000\000\000\060m\376\266\001\000\000", __align = 1}, released = 0x10cac}}
        released = {__size = '\000' <repeats 15 times>, __align = 0}
        ret = 0
        __FUNCTION__ = "thread_trampoline"
#8  0xb6f5b424 in start_thread (arg=0x0) at pthread_create.c:335
        pd = 0x0
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1577945366, -1581175653, -1228475296, -1090520280, 0, -1228476816, -1228476512, -1224802304, 0, 
                -1228474732, 0 <repeats 54 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, 
              canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#9  0xb6eb243c in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:89 from /lib/arm-linux-gnueabihf/libc.so.6
No locals.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

-------------- next part --------------
(gdb) info threads
  Id   Target Id         Frame 
  3    Thread 0xb6c6f460 (LWP 14065) "main_task" __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  2    Thread 0xb6ca1460 (LWP 14063) "task-1" 0xb6f61cbe in read () at ../sysdeps/unix/syscall-template.S:81
* 1    Thread 0xb6ff0000 (LWP 14056) "task-1" __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-12 14:08       ` Charles Kiorpes
@ 2016-02-12 15:25         ` Philippe Gerum
  2016-02-12 19:07           ` Charles Kiorpes
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2016-02-12 15:25 UTC (permalink / raw)
  To: Charles Kiorpes; +Cc: xenomai

On 02/12/2016 03:08 PM, Charles Kiorpes wrote:
> 
> 
> On Fri, Feb 12, 2016 at 5:43 AM, Philippe Gerum <rpm@xenomai.org
> <mailto:rpm@xenomai.org>> wrote:
> 
>     On 02/11/2016 01:57 PM, Charles Kiorpes wrote:
>     >
>     > I attempted to run several tests:  'task-1', 'event-1', and 'mutex-1'.
>     > Each of these hung indefinitely.  A gdb trace indicated that they were
>     > hanging on __libc_do_syscall() within __pthread_cond_wait() within
>     > threadobj_cond_wait().
>     >
>     > I have attached the full backtrace from mutex-1 as mutex-1_bt.txt
>     >
> 
>     Ok, if the test suite does not pass, something is badly wrong, so we
>     should investigate that hang issue before anything else.
> 
>     The backtrace reveals that copperplate cannot handshake with a newly
>     spawned task, this is the purpose of the wait_on_barrier() call over the
>     context of rt_task_start(). That barrier should be signaled by a call to
>     threadobj_notify_entry() from the internal trampoline code of the
>     emerging thread (task_entry() in alchemy/task.c).
> 
>     - maybe task_prologue_2() (alchemy/task.c) which is called earlier hangs
>     indefinitely, and therefore prevents threadobj_notify_entry() from
>     running?
> 
>     - maybe the new thread does not even start for some reason, are we sure
>     task_entry() is reached (e.g. do we hit a breakpoint there?)
> 
>     Could you inspect the current thread list under gdb when the program
>     hangs?
> 
>     Also, I would recommend to enable full debugging for now
>     (--enable-debug=full) to get accurate line information, assuming the
>     issue should still show up with a non-optimized code. Hopefully.
> 
>     --
>     Philippe.
> 
> 
> I ran the task-1 test under gdb with this Xenomai configuration:
> --with-core=mercury \
> --enable-debug=full \
> --enable-registry \
> --enable-smp \
> --enable-pshared \
> --enable-condvar-workaround
> 
> It appears that the new thread is being launched, and getting stuck in
> threadobj_wait_start() within task_prologue_2(), as you indicated might
> be the case.
> I have attached the thread list and a full backtrace for each thread (in
> separate files by thread id).
> 
> As per your other message, my kernel configs all include CONFIG_FUTEX.
> 
> I have tried glibc 2.19 and 2.21, as well as RT patched and vanilla kernels.
> 
> Interestingly, when I removed --enable-pshared from my configuration,
> the task-1 test passed.
> 

Here is the sync pattern the code normally achieves, once the parent has successfully spawned a child thread, which has to wait for a start signal before it may run application code:

1. parent calls threadobj_start(child)
	1.1 child->status |= __THREAD_S_STARTED
	1.2 wait for child->status & __THREAD_S_ACTIVE

2. child calls threadobj_wait_start(self)
	2.1 wait for self->status & __THREAD_S_STARTED
	2.2 raise self->status |= __THREAD_S_ACTIVE

All accesses to the status bits are serialized by a per-thread mutex, operated by the threadobj_lock/unlock accessors, which also covers the condvar signaling/waiting as one would expect.

When running in pshared mode, thread descriptors (holding ->status, mutex and barrier sync) are obtained from /dev/shm. If --disable-pshared, we are using 100% process-private memory.

Case 1: a race when manipulating the thread status due to inconsistent locking. I could not find any so far.

Case 2: a cache coherence issue in SMP, also caused by improper locking. Otherwise, the locking should enforce memory barriers as expected.

Case 3: anything not mentioned in other cases...

- Could you paste/copy the disassembly (objdump -dl rather than gdb's disass) of the wait_on_barrier() function?

- Does running both programs with --cpu-affinity=0/1 change the outcome?

- Without specifying any affinity this time, could you run the current test with the debug patch below applied (this is clearly not a fix)? The patch forces the code to read the value of the ->status field before waiting on the barrier. With that code in and a backtrace showing locals, we should be able to check the status word when threadobj_wait_start() is entered.

diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
index cc64caa..ed85a12 100644
--- a/lib/copperplate/threadobj.c
+++ b/lib/copperplate/threadobj.c
@@ -1273,7 +1273,9 @@ void threadobj_wait_start(void) /* current->lock free. */
 	int status;
 
 	threadobj_lock(current);
-	status = wait_on_barrier(current, __THREAD_S_STARTED|__THREAD_S_ABORTED);
+	status = current->status;
+	if (!(status & __THREAD_S_STARTED))
+		status = wait_on_barrier(current, __THREAD_S_STARTED|__THREAD_S_ABORTED);
 	threadobj_unlock(current);
 
 	/*

-- 
Philippe.


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-12 15:25         ` Philippe Gerum
@ 2016-02-12 19:07           ` Charles Kiorpes
  2016-02-16 15:40             ` Philippe Gerum
  0 siblings, 1 reply; 10+ messages in thread
From: Charles Kiorpes @ 2016-02-12 19:07 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

> Here is the sync pattern the code normally achieves, once the parent has
> successfully spawned a child thread, which has to wait for a start signal
> before it may run application code:
>
> 1. parent calls threadobj_start(child)
>         1.1 child->status |= __THREAD_S_STARTED
>         1.2 wait for child->status & __THREAD_S_ACTIVE
>
> 2. child calls threadobj_wait_start(self)
>         2.1 wait for self->status & __THREAD_S_STARTED
>         2.2 raise self->status |= __THREAD_S_ACTIVE
>
> All accesses to the status bits are serialized by a per-thread mutex,
> operated by the threadobj_lock/unlock accessors, which also covers the
> condvar signaling/waiting as one would expect.
>
> When running in pshared mode, thread descriptors (holding ->status, mutex
> and barrier sync) are obtained from /dev/shm. If --disable-pshared, we are
> using 100% process-private memory.
>
> Case 1: a race when manipulating the thread status due to inconsistent
> locking. I could not find any so far.
>
> Case 2: a cache coherence issue in SMP, also caused by improper locking.
> Otherwise, the locking should enforce memory barriers as expected.
>
> Case 3: anything not mentioned in other cases...
>
> - Could you paste/copy the disassembly (objdump -dl rather than gdb's
> disass) of the wait_on_barrier() function?
>
>
I have attached the disassembly as wait_on_barrier_disas.txt


> - Does running both programs with --cpu-affinity=0/1 change the outcome?
>
>
There is no change in behavior when trying any combination of cpu
affinities, with either the "task-1" alchemy test or my event test apps.


> - Without specifying any affinity this time, could you run the current
> test with the debug patch below applied (this is clearly not a fix)? The
> patch forces the code to read the value of the ->status field before
> waiting on the barrier. With that code in and a backtrace showing locals,
> we should be able to check the status word when threadobj_wait_start() is
> entered.
>

diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
> index cc64caa..ed85a12 100644
> --- a/lib/copperplate/threadobj.c
> +++ b/lib/copperplate/threadobj.c
> @@ -1273,7 +1273,9 @@ void threadobj_wait_start(void) /* current->lock
> free. */
>         int status;
>
>         threadobj_lock(current);
> -       status = wait_on_barrier(current,
> __THREAD_S_STARTED|__THREAD_S_ABORTED);
> +       status = current->status;
> +       if (!(status & __THREAD_S_STARTED))
> +               status = wait_on_barrier(current,
> __THREAD_S_STARTED|__THREAD_S_ABORTED);
>         threadobj_unlock(current);
>
>         /*
>
> --
> Philippe.
>

I patched in the debug and I have attached the full backtraces of threads 1
and 3 of the "task-1" alchemy test.

At the time of the hang:
 - parent sees status = 73    (matches the flags set during
threadobj_start())
 - child sees status = 8        (locked?)

- Charles
-------------- next part --------------
00001868 <wait_on_barrier>:
wait_on_barrier():
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1200
    1868:	b580      	push	{r7, lr}
    186a:	b0ce      	sub	sp, #312	; 0x138
    186c:	af00      	add	r7, sp, #0
    186e:	1d3b      	adds	r3, r7, #4
    1870:	6018      	str	r0, [r3, #0]
    1872:	463b      	mov	r3, r7
    1874:	6019      	str	r1, [r3, #0]
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1204
    1876:	1d3b      	adds	r3, r7, #4
    1878:	681b      	ldr	r3, [r3, #0]
    187a:	6a9b      	ldr	r3, [r3, #40]	; 0x28
    187c:	f8c7 3134 	str.w	r3, [r7, #308]	; 0x134
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1205
    1880:	463b      	mov	r3, r7
    1882:	f8d7 2134 	ldr.w	r2, [r7, #308]	; 0x134
    1886:	681b      	ldr	r3, [r3, #0]
    1888:	4013      	ands	r3, r2
    188a:	2b00      	cmp	r3, #0
    188c:	d148      	bne.n	1920 <wait_on_barrier+0xb8>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1207
    188e:	1d3b      	adds	r3, r7, #4
    1890:	681b      	ldr	r3, [r3, #0]
    1892:	6a5b      	ldr	r3, [r3, #36]	; 0x24
    1894:	f8c7 3130 	str.w	r3, [r7, #304]	; 0x130
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1208
    1898:	f240 0300 	movw	r3, #0
    189c:	f2c0 0300 	movt	r3, #0
    18a0:	f8c7 312c 	str.w	r3, [r7, #300]	; 0x12c
    18a4:	1d3b      	adds	r3, r7, #4
    18a6:	681b      	ldr	r3, [r3, #0]
    18a8:	3308      	adds	r3, #8
    18aa:	f8c7 3128 	str.w	r3, [r7, #296]	; 0x128
    18ae:	f107 0308 	add.w	r3, r7, #8
    18b2:	2100      	movs	r1, #0
    18b4:	4618      	mov	r0, r3
    18b6:	f7ff fffe 	bl	0 <__sigsetjmp>
    18ba:	f8c7 0124 	str.w	r0, [r7, #292]	; 0x124
    18be:	f8d7 3124 	ldr.w	r3, [r7, #292]	; 0x124
    18c2:	2b00      	cmp	r3, #0
    18c4:	d009      	beq.n	18da <wait_on_barrier+0x72>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1208 (discriminator 2)
    18c6:	f8d7 312c 	ldr.w	r3, [r7, #300]	; 0x12c
    18ca:	f8d7 0128 	ldr.w	r0, [r7, #296]	; 0x128
    18ce:	4798      	blx	r3
    18d0:	f107 0308 	add.w	r3, r7, #8
    18d4:	4618      	mov	r0, r3
    18d6:	f7ff fffe 	bl	0 <__pthread_unwind_next>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1208 (discriminator 3)
    18da:	f107 0308 	add.w	r3, r7, #8
    18de:	4618      	mov	r0, r3
    18e0:	f7ff fffe 	bl	0 <__pthread_register_cancel>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1209 (discriminator 3)
    18e4:	1d3b      	adds	r3, r7, #4
    18e6:	6818      	ldr	r0, [r3, #0]
    18e8:	f7fe fd80 	bl	3ec <__threadobj_tag_unlocked>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1210 (discriminator 3)
    18ec:	1d3b      	adds	r3, r7, #4
    18ee:	681b      	ldr	r3, [r3, #0]
    18f0:	f503 7280 	add.w	r2, r3, #256	; 0x100
    18f4:	1d3b      	adds	r3, r7, #4
    18f6:	681b      	ldr	r3, [r3, #0]
    18f8:	3308      	adds	r3, #8
    18fa:	4619      	mov	r1, r3
    18fc:	4610      	mov	r0, r2
    18fe:	f7ff fffe 	bl	1364 <threadobj_cond_wait>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1211 (discriminator 3)
    1902:	1d3b      	adds	r3, r7, #4
    1904:	6818      	ldr	r0, [r3, #0]
    1906:	f7fe fd61 	bl	3cc <__threadobj_tag_locked>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1212 (discriminator 3)
    190a:	f107 0308 	add.w	r3, r7, #8
    190e:	4618      	mov	r0, r3
    1910:	f7ff fffe 	bl	0 <__pthread_unregister_cancel>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1213 (discriminator 3)
    1914:	1d3b      	adds	r3, r7, #4
    1916:	681b      	ldr	r3, [r3, #0]
    1918:	f8d7 2130 	ldr.w	r2, [r7, #304]	; 0x130
    191c:	625a      	str	r2, [r3, #36]	; 0x24
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1214 (discriminator 3)
    191e:	e7aa      	b.n	1876 <wait_on_barrier+0xe>
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1206
    1920:	bf00      	nop
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1216
    1922:	f8d7 3134 	ldr.w	r3, [r7, #308]	; 0x134
/home/debian/Development/build/lib/copperplate/../../../xenomai-3/lib/copperplate/threadobj.c:1217
    1926:	4618      	mov	r0, r3
    1928:	f507 779c 	add.w	r7, r7, #312	; 0x138
    192c:	46bd      	mov	sp, r7
    192e:	bd80      	pop	{r7, pc}

-------------- next part --------------
(gdb) thread 1
[Switching to thread 1 (Thread 0xb6ff0000 (LWP 4007))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46	in ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S
(gdb) bt full
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
No locals.
#1  0xb6f5f6f2 in __pthread_cond_wait (cond=0xb6caa194, mutex=0xb6caa09c) at pthread_cond_wait.c:177
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 3
        _a3 = 3
        _a1 = -1228234344
        _v1tmp = -1228234596
        _a4tmp = 0
        _a1tmp = -1228234344
        _a4 = 0
        _v1 = -1228234596
        _name = 240
        __ret = <optimized out>
        futex_val = 3
        buffer = {__routine = 0xb6f5f3e1 <__condvar_cleanup>, __arg = 0xbefffa00, __canceltype = -1225038976, __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6caa194, mutex = 0xb6caa09c, bc_seq = 0}
        err = <optimized out>
        pshared = <optimized out>
        pi_flag = 1
        val = <optimized out>
        seq = <optimized out>
#2  0xb6f9cfa6 in threadobj_cond_wait (cond=0xb6caa194, lock=0xb6caa09c) at ../../../xenomai-3/lib/copperplate/threadobj.c:980
        ret = -1228234604
#3  0xb6f9d554 in wait_on_barrier (thobj=0xb6caa094, mask=16) at ../../../xenomai-3/lib/copperplate/threadobj.c:1231
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-619902981, -754243412, -1090520024, 0, 0, -1090520496, 0, 0, -1224740864, 0 <repeats 17 times>, 
                2, 0, 5, 0, 1, -1224758200, -1659757782, -1224874387, 0, -1224738124, -1224757760, -1090520312, -1224757760, 0, -1, -1224796312, -1225176048, 
                -1224794112, -1225021852, -1224802304, -1224801088, 1, 0, -1224859353, -1224796312, 1, 5, 0, 0, 1, -1225038976, 0, -1090520024, 0, -1090520232, 
                -1090520024, 0, 0}, __mask_was_saved = 0}}, __pad = {0xbefffc28, 0x0, 0x0, 0xb6ca496c}}
        __cancel_routine = 0xb6f5e225 <__GI___pthread_mutex_unlock>
        __cancel_arg = 0xb6caa09c
        __not_first_call = 0
        oldstate = 0
        status = 73
#4  0xb6f9d606 in threadobj_start (thobj=0xb6caa094) at ../../../xenomai-3/lib/copperplate/threadobj.c:1268
        current = 0xb6ca496c
        ret = 0
        oldstate = 1
#5  0xb6fc2d10 in rt_task_start (task=0x20ed0 <t_main>, entry=0x10985 <main_task>, arg=0xdeadbeef) at ../../../xenomai-3/lib/alchemy/task.c:634
        tcb = 0xb6ca9f6c
        svc = {cancel_type = -559038737}
        ret = 67973
#6  0x00010a62 in main (argc=1, argv=0x21548) at task-1.c:26
        ret = 0
(gdb)
-------------- next part --------------
(gdb) thread 3
[Switching to thread 3 (Thread 0xb6c6f460 (LWP 4016))]
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
46	in ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S
(gdb) bt full
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
No locals.
#1  0xb6f5f6f2 in __pthread_cond_wait (cond=0xb6caa194, mutex=0xb6caa09c) at pthread_cond_wait.c:177
        _a2tmp = 11
        _a2 = 11
        _nametmp = 240
        _a3tmp = 1
        _a3 = 1
        _a1 = -1228234344
        _v1tmp = -1228234596
        _a4tmp = 0
        _a1tmp = -1228234344
        _a4 = 0
        _v1 = -1228234596
        _name = 240
        __ret = <optimized out>
        futex_val = 1
        buffer = {__routine = 0xb6f5f3e1 <__condvar_cleanup>, __arg = 0xb6c6ec18, __canceltype = -1225038976, __prev = 0x0}
        cbuffer = {oldtype = 1, cond = 0xb6caa194, mutex = 0xb6caa09c, bc_seq = 0}
        err = <optimized out>
        pshared = <optimized out>
        pi_flag = 1
        val = <optimized out>
        seq = <optimized out>
#2  0xb6f9cfa6 in threadobj_cond_wait (cond=0xb6caa194, lock=0xb6caa09c) at ../../../xenomai-3/lib/copperplate/threadobj.c:980
        ret = -1228234596
#3  0xb6f9d554 in wait_on_barrier (thobj=0xb6caa094, mask=5) at ../../../xenomai-3/lib/copperplate/threadobj.c:1231
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-751562301, -754243412, -1228476860, -1090520124, 0, -1228477336, -1228476512, -1224802304, 0, 
                -1228474732, 0 <repeats 16 times>, 1, -1224758200, -1228477152, -1228477004, 0, -1228477120, -1224757760, -1228477168, -1224757760, -1224738124, 
                -1, 0, -1225177664, -1224794112, 1, -1224794112, 0, 0, 0, 0, -1224793672, -1228477112, -1224793672, 0, -1, 0, -1225027584, 119632, -1225021852, 
                0, -1228234596, -1090520124, 0, -1228477040, -1228476512, -1224802304, 0, -1225404053}, __mask_was_saved = 0}}, __pad = {0xb6c6ee80, 0x0, 0x0, 
            0xb6c6ed90}}
        __cancel_routine = 0xb6f5e225 <__GI___pthread_mutex_unlock>
        __cancel_arg = 0xb6caa09c
        __not_first_call = 0
        oldstate = 0
        status = 8
#4  0xb6f9d67a in threadobj_wait_start () at ../../../xenomai-3/lib/copperplate/threadobj.c:1299
        current = 0xb6caa094
        status = 8
#5  0xb6fc2770 in task_prologue_2 (tcb=0xb6ca9f6c) at ../../../xenomai-3/lib/alchemy/task.c:211
        ret = 0
#6  0xb6fc27b6 in task_entry (arg=0xb6ca9f6c) at ../../../xenomai-3/lib/alchemy/task.c:227
        __ret = -1228476896
        tcb = 0xb6ca9f6c
        svc = {cancel_type = 1}
        ret = -1228475296
        __FUNCTION__ = "task_entry"
#7  0xb6f9a2d8 in thread_trampoline (arg=0xbefffb94) at ../../../xenomai-3/lib/copperplate/internal.c:251
        cta = 0xbefffb94
        _cta = {stacksize = 0, detachstate = 1, policy = 1, param_ex = {__sched_priority = 99, sched_u = {rr = {__sched_rr_quantum = {tv_sec = -1225254836, 
                  tv_nsec = 0}}}}, prologue = 0xb6fc2719 <task_prologue_1>, run = 0xb6fc27a5 <task_entry>, arg = 0xb6ca9f6c, __reserved = {status = -38, warm = {
              __size = "\001\000\000\000\000\000\000\000\060m\376\266\001\000\000", __align = 1}, released = 0x10cac}}
        released = {__size = '\000' <repeats 15 times>, __align = 0}
        ret = 0
        __FUNCTION__ = "thread_trampoline"
#8  0xb6f5b424 in start_thread (arg=0x0) at pthread_create.c:335
        pd = 0x0
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-751561765, -754497110, -1228475296, -1090520280, 0, -1228476816, -1228476512, -1224802304, 0, -1228474732, 
                0 <repeats 54 times>}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#9  0xb6eb243c in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:89 from /lib/arm-linux-gnueabihf/libc.so.6
No locals.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-12 19:07           ` Charles Kiorpes
@ 2016-02-16 15:40             ` Philippe Gerum
  2016-02-17 21:34               ` Charles Kiorpes
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2016-02-16 15:40 UTC (permalink / raw)
  To: Charles Kiorpes; +Cc: xenomai

On 02/12/2016 08:07 PM, Charles Kiorpes wrote:
> 
>     Here is the sync pattern the code normally achieves, once the parent
>     has successfully spawned a child thread, which has to wait for a
>     start signal before it may run application code:
> 
>     1. parent calls threadobj_start(child)
>             1.1 child->status |= __THREAD_S_STARTED
>             1.2 wait for child->status & __THREAD_S_ACTIVE
> 
>     2. child calls threadobj_wait_start(self)
>             2.1 wait for self->status & __THREAD_S_STARTED
>             2.2 raise self->status |= __THREAD_S_ACTIVE
> 
>     All accesses to the status bits are serialized by a per-thread
>     mutex, operated by the threadobj_lock/unlock accessors, which also
>     covers the condvar signaling/waiting as one would expect.
> 
>     When running in pshared mode, thread descriptors (holding ->status,
>     mutex and barrier sync) are obtained from /dev/shm. If
>     --disable-pshared, we are using 100% process-private memory.
> 
>     Case 1: a race when manipulating the thread status due to
>     inconsistent locking. I could not find any so far.
> 
>     Case 2: a cache coherence issue in SMP, also caused by improper
>     locking. Otherwise, the locking should enforce memory barriers as
>     expected.
> 
>     Case 3: anything not mentioned in other cases...
> 
>     - Could you paste/copy the disassembly (objdump -dl rather than
>     gdb's disass) of the wait_on_barrier() function?
> 
>  
> I have attached the disassembly as wait_on_barrier_disas.txt
>  
> 
>     - Does running both programs with --cpu-affinity=0/1 change the outcome?
> 
>  
> There is no change in behavior when trying any combination of cpu
> affinities, with either the "task-1" alchemy test or my event test apps.
>  
> 
>     - Without specifying any affinity this time, could you run the
>     current test with the debug patch below applied (this is clearly not
>     a fix)? The patch forces the code to read the value of the ->status
>     field before waiting on the barrier. With that code in and a
>     backtrace showing locals, we should be able to check the status word
>     when threadobj_wait_start() is entered.
>      
> 
>     diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
>     index cc64caa..ed85a12 100644
>     --- a/lib/copperplate/threadobj.c
>     +++ b/lib/copperplate/threadobj.c
>     @@ -1273,7 +1273,9 @@ void threadobj_wait_start(void) /*
>     current->lock free. */
>             int status;
> 
>             threadobj_lock(current);
>     -       status = wait_on_barrier(current,
>     __THREAD_S_STARTED|__THREAD_S_ABORTED);
>     +       status = current->status;
>     +       if (!(status & __THREAD_S_STARTED))
>     +               status = wait_on_barrier(current,
>     __THREAD_S_STARTED|__THREAD_S_ABORTED);
>             threadobj_unlock(current);
> 
>             /*
> 
>     --
>     Philippe.
> 
> 
> I patched in the debug and I have attached the full backtraces of
> threads 1 and 3 of the "task-1" alchemy test.
> 
> At the time of the hang:
>  - parent sees status = 73    (matches the flags set during
> threadobj_start())
>  - child sees status = 8        (locked?)
> 

Yes, this is a debugging status with --enable-debug to check for locking
consistency, also raised while holding the lock. From the traces, it
looks like the child never sees any of the status bits raised by the
parent when entering wait_on_barrier(), although its priority is
strictly lower.

At this point, we need to consider the toolchain. Which one are you
using, and specifically, is it built for supporting multi-thread
applications (such as implementing atomic operations)?

I have attached a test code basically reproducing what copperplate does
under the hood over a native kernel.

e.g.:
(term-1) ./basic-shm --create
(term-2) ./basic-shm --listen

or you can invert the logic by having the listener create the shm
segment, waiting for the signals:

(term-1) ./basic-shm --create --listen
(term-2) ./basic-shm

The code is not smart enough to detect when the listener attempts to
reuse an obsolete shared memory file from a previous run. For this
reason, the creator should always run first in a new test.

-- 
Philippe.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: basic-shm.c
Type: text/x-csrc
Size: 3294 bytes
Desc: not available
URL: <http://xenomai.org/pipermail/xenomai/attachments/20160216/1e20b172/attachment.c>
-------------- next part --------------

CFLAGS = -O2 -g
LDFLAGS =

all: basic-shm

%: %.c
	$(CROSS_COMPILE)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $< -lpthread -lrt

clean:
	$(RM) basic-shm

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core
  2016-02-16 15:40             ` Philippe Gerum
@ 2016-02-17 21:34               ` Charles Kiorpes
  0 siblings, 0 replies; 10+ messages in thread
From: Charles Kiorpes @ 2016-02-17 21:34 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

The provided test program also did not perform correctly in the current
environment (no signals were received by the listener).

However, after testing various toolchains, I tried it on the same hardware
(Beagleboard-x15 sample) with a TI-supplied rt kernel, and it did run
correctly.  This indicates to me that your earlier instinct was correct;
this is a kernel configuration problem.

I am in the process of doing a thorough comparison of the kernel configs to
determine what option(s) made the difference.

If I can find the offending kernel configuration option(s), I will post
them here.  This will take some time though, as I will need to do quite a
bit of building and testing of kernels.

Thank you for your assistance,
- Charles

On Tue, Feb 16, 2016 at 10:40 AM, Philippe Gerum <rpm@xenomai.org> wrote:

> On 02/12/2016 08:07 PM, Charles Kiorpes wrote:
> >
> >     Here is the sync pattern the code normally achieves, once the parent
> >     has successfully spawned a child thread, which has to wait for a
> >     start signal before it may run application code:
> >
> >     1. parent calls threadobj_start(child)
> >             1.1 child->status |= __THREAD_S_STARTED
> >             1.2 wait for child->status & __THREAD_S_ACTIVE
> >
> >     2. child calls threadobj_wait_start(self)
> >             2.1 wait for self->status & __THREAD_S_STARTED
> >             2.2 raise self->status |= __THREAD_S_ACTIVE
> >
> >     All accesses to the status bits are serialized by a per-thread
> >     mutex, operated by the threadobj_lock/unlock accessors, which also
> >     covers the condvar signaling/waiting as one would expect.
> >
> >     When running in pshared mode, thread descriptors (holding ->status,
> >     mutex and barrier sync) are obtained from /dev/shm. If
> >     --disable-pshared, we are using 100% process-private memory.
> >
> >     Case 1: a race when manipulating the thread status due to
> >     inconsistent locking. I could not find any so far.
> >
> >     Case 2: a cache coherence issue in SMP, also caused by improper
> >     locking. Otherwise, the locking should enforce memory barriers as
> >     expected.
> >
> >     Case 3: anything not mentioned in other cases...
> >
> >     - Could you paste/copy the disassembly (objdump -dl rather than
> >     gdb's disass) of the wait_on_barrier() function?
> >
> >
> > I have attached the disassembly as wait_on_barrier_disas.txt
> >
> >
> >     - Does running both programs with --cpu-affinity=0/1 change the
> outcome?
> >
> >
> > There is no change in behavior when trying any combination of cpu
> > affinities, with either the "task-1" alchemy test or my event test apps.
> >
> >
> >     - Without specifying any affinity this time, could you run the
> >     current test with the debug patch below applied (this is clearly not
> >     a fix)? The patch forces the code to read the value of the ->status
> >     field before waiting on the barrier. With that code in and a
> >     backtrace showing locals, we should be able to check the status word
> >     when threadobj_wait_start() is entered.
> >
> >
> >     diff --git a/lib/copperplate/threadobj.c
> b/lib/copperplate/threadobj.c
> >     index cc64caa..ed85a12 100644
> >     --- a/lib/copperplate/threadobj.c
> >     +++ b/lib/copperplate/threadobj.c
> >     @@ -1273,7 +1273,9 @@ void threadobj_wait_start(void) /*
> >     current->lock free. */
> >             int status;
> >
> >             threadobj_lock(current);
> >     -       status = wait_on_barrier(current,
> >     __THREAD_S_STARTED|__THREAD_S_ABORTED);
> >     +       status = current->status;
> >     +       if (!(status & __THREAD_S_STARTED))
> >     +               status = wait_on_barrier(current,
> >     __THREAD_S_STARTED|__THREAD_S_ABORTED);
> >             threadobj_unlock(current);
> >
> >             /*
> >
> >     --
> >     Philippe.
> >
> >
> > I patched in the debug and I have attached the full backtraces of
> > threads 1 and 3 of the "task-1" alchemy test.
> >
> > At the time of the hang:
> >  - parent sees status = 73    (matches the flags set during
> > threadobj_start())
> >  - child sees status = 8        (locked?)
> >
>
> Yes, this is a debugging status with --enable-debug to check for locking
> consistency, also raised while holding the lock. From the traces, it
> looks like the child never sees any of the status bits raised by the
> parent when entering wait_on_barrier(), although its priority is
> strictly lower.
>
> At this point, we need to consider the toolchain. Which one are you
> using, and specifically, is it built for supporting multi-thread
> applications (such as implementing atomic operations)?
>
> I have attached a test code basically reproducing what copperplate does
> under the hood over a native kernel.
>
> e.g.:
> (term-1) ./basic-shm --create
> (term-2) ./basic-shm --listen
>
> or you can invert the logic by having the listener create the shm
> segment, waiting for the signals:
>
> (term-1) ./basic-shm --create --listen
> (term-2) ./basic-shm
>
> The code is not smart enough to detect when the listener attempts to
> reuse an obsolete shared memory file from a previous run. For this
> reason, the creator should always run first in a new test.
>
> --
> Philippe.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-02-17 21:34 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-10 18:41 [Xenomai] Process shared rt_event_wait() never signaled on ARM with Mercury core Charles Kiorpes
2016-02-10 20:21 ` Philippe Gerum
2016-02-11 12:57   ` Charles Kiorpes
2016-02-12 10:43     ` Philippe Gerum
2016-02-12 14:08       ` Charles Kiorpes
2016-02-12 15:25         ` Philippe Gerum
2016-02-12 19:07           ` Charles Kiorpes
2016-02-16 15:40             ` Philippe Gerum
2016-02-17 21:34               ` Charles Kiorpes
2016-02-12 10:55     ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.