* [Xenomai-help] hang in rtcansend
@ 2011-12-22 0:41 Andrew Tannenbaum
2011-12-22 9:48 ` Gilles Chanteperdrix
2011-12-26 22:56 ` Gilles Chanteperdrix
0 siblings, 2 replies; 19+ messages in thread
From: Andrew Tannenbaum @ 2011-12-22 0:41 UTC (permalink / raw)
To: xenomai
Summary: I am having a problem running rtcansend/recv on Xenomai 2.6.0,
with the processes hanging in their cleanup code.
I had been running Xenomai on an Intel Atom system with a PEAK PCI
SJA1000 CAN adapter.
I was running Linux 2.6.35.7 with Xenomai 2.5.5.2. I connected a servo
and motor to the PEAK adapter, and I was able to talk with it using
rtcansend and rtcanrecv.
After working on other things for a few months, I need to return to this
project, so I downloaded the latest Linux/Xenomai pair, which I think is
Linux 2.6.38.8 and Xenomai 2.6.0.
I was able to compile these (using the Debian build advice, generating
.deb files for Linux and Xenomai, which I install with dpkg -i). I used
a Linux .config derived from my older build.
With both the new and old installs, I am able to run xeno-test and get
decent latencies and such, though some of the tests fail depending on
what I have configured in Realtime/Drivers/Testing Drivers. That is not
what I'm asking about.
I am having a problem running rtcansend/recv on Xenomai 2.6.0:
I can run rtcanconfig and it sets up my rtcan0 properly so I can see and
configure the servo. The data in /proc/rtcan looks ok.
But when I try to talk with the servo using rtcansend, the rtcansend
process hangs during the close phase. It looks like this:
$ rtcansend rtcan0 -v -i 0x0 0x82 0x1
interface rtcan0
s=0, ifr_name=rtcan0
<0x000> [2] 82 01
Cleaning up...
^CSignal 2 received
Cleaning up...
$
So it hangs after the first "Cleaning up..." and I hit Control-C and
then it catches the ^C and exits. The code at the bottom of
xenomai-2.6.0/src/utils/can/rtcansend.c
looks like this:
...
cleanup();
return 0;
failure:
cleanup();
return -1;
}
and cleanup is printing "Cleaning up..." twice, so maybe both of those
are being called? I'm not sure.
Under 2.5.5.2, the same command runs normally and exits cleanly:
$ rtcansend rtcan0 -v -i 0x0 0x82 0x1
interface rtcan0
s=0, ifr_name=rtcan0
<0x000> [2] 82 01
Cleaning up...
$
I have strace output from both of the builds (running something
similar, where rtcansend loops with a delay long enough for me to
catch the problem):
Working 2.5.5.2 strace:
$ tail -25 str.rtcs.2.5.5.2.ok.out
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "Cleaning up...\n", 15) = 15
nanosleep({0, 100000000}, NULL) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=69045, ...}) = 0
mmap2(NULL, 69045, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76c6000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libgcc_s.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p#\0\0004\0\0\0"..., 512) = 512
brk(0) = 0x9b3c000
brk(0x9b5d000) = 0x9b5d000
fstat64(3, {st_mode=S_IFREG|0644, st_size=120368, ...}) = 0
mmap2(NULL, 123432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76a7000
mmap2(0xb76c4000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0xb76c4000
close(3) = 0
mprotect(0xb76c4000, 4096, PROT_READ) = 0
munmap(0xb76c6000, 69045) = 0
futex(0xb76c50e8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
exit_group(0) = ?
Process 1766 detached
Broken 2.6.0 strace:
$ tail -25 str.rtcs.2.6.0.hang.out
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "<0x000> [2] 82 01\n", 18) = 18
write(1, "Cleaning up...\n", 15) = 15
nanosleep({0, 100000000}, NULL) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=68769, ...}) = 0
mmap2(NULL, 68769, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7590000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/libgcc_s.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p#\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=120368, ...}) = 0
mmap2(NULL, 123432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7571000
mmap2(0xb758e000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0xb758e000
close(3) = 0
mprotect(0xb758e000, 4096, PROT_READ) = 0
munmap(0xb7590000, 68769) = 0
futex(0xb758f0e8, FUTEX_WAKE_PRIVATE, 2147483647) = 0
_exit(0) = ?
Process 8114 detached
$
If I just run rtcansend with no args (so it prints a help message), it
exits ok on 2.6.0 with no hang. But if I run rtcansend or rtcanrecv
with any CAN i/o, they hang in the cleanup function, though they do
their processing before then correctly.
Any idea why rtcansend/recv would hang for me here on 2.6.0?
Andy
* Re: [Xenomai-help] hang in rtcansend
From: Gilles Chanteperdrix @ 2011-12-22 9:48 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/22/2011 01:41 AM, Andrew Tannenbaum wrote:
> If I just run rtcansend with no args (so it prints a help message), it
> exits ok on 2.6.0 with no hang. But if I run rtcansend or rtcanrecv
> with any CAN i/o, they hang in the cleanup function, though they do
> their processing before then correctly.
>
> Any idea why rtcansend/recv would hang for me here on 2.6.0?
Is this behaviour reproducible with the virtual rtcan driver?
--
Gilles.
* Re: [Xenomai-help] hang in rtcansend
From: Willy Lambert @ 2011-12-22 13:33 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: Andrew Tannenbaum, xenomai
2011/12/22 Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>:
> On 12/22/2011 01:41 AM, Andrew Tannenbaum wrote:
>> If I just run rtcansend with no args (so it prints a help message), it
>> exits ok on 2.6.0 with no hang. But if I run rtcansend or rtcanrecv
>> with any CAN i/o, they hang in the cleanup function, though they do
>> their processing before then correctly.
>>
>> Any idea why rtcansend/recv would hang for me here on 2.6.0?
>
> Is this behaviour reproducible with the virtual rtcan driver?
>
>
For your information, I see similar behaviour with the same Linux
2.6.38.8 and Xenomai 2.6.0 configuration. As I had not used CAN before,
I did not report it.
* Re: [Xenomai-help] hang in rtcansend
From: Gilles Chanteperdrix @ 2011-12-22 15:36 UTC (permalink / raw)
To: Willy Lambert; +Cc: Andrew Tannenbaum, xenomai
On 12/22/2011 02:33 PM, Willy Lambert wrote:
> For your information, I see similar behaviour with the same Linux
> 2.6.38.8 and Xenomai 2.6.0 configuration. As I had not used CAN before,
> I did not report it.
Have you tried attaching gdb when rtcansend is locked to see where the
bug happens?
--
Gilles.
* Re: [Xenomai-help] hang in rtcansend
From: Willy Lambert @ 2011-12-22 15:47 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: Andrew Tannenbaum, xenomai
2011/12/22 Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>:
> On 12/22/2011 02:33 PM, Willy Lambert wrote:
>> For your information, I see similar behaviour with the same Linux
>> 2.6.38.8 and Xenomai 2.6.0 configuration. As I had not used CAN before,
>> I did not report it.
>
> Have you tried attaching gdb when rtcansend is locked to see where the
> bug happens?
>
No, it was just a test to see whether Xenomai could be part of my
design. As something in my framework does not support Xenomai, I
decided to stay with plain GNU/Linux for the time being, so I can't
test it for now. I'm sorry, I only wanted to say that the problem is
not limited to Andrew's setup.
* Re: [Xenomai-help] hang in rtcansend
From: Andrew Tannenbaum @ 2011-12-22 18:59 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
Gilles,
The virtual driver gives me the same hang problem. It loaded fine:
Dec 22 12:04:18 atom1 kernel: [ 2457.833703] rtcan: registered rtcan2
Dec 22 12:04:18 atom1 kernel: [ 2457.833721] rtcan2: VIRT driver loaded
Dec 22 12:04:18 atom1 kernel: [ 2457.833775] rtcan: registered rtcan3
Dec 22 12:04:18 atom1 kernel: [ 2457.833789] rtcan3: VIRT driver loaded
All notes below operate on the virtual device rtcan2.
I get:
$ rtcansend rtcan2 -v -i 0x0 0x82 0x1
interface rtcan2
s=0, ifr_name=rtcan2
<0x000> [2] 82 01
Cleaning up...
^CSignal 2 received
Cleaning up...
$
So it's the same behavior - it hangs, I hit ^C, and it exits.
Here's what I get with gdb, first on /usr/bin/rtcansend with no
breakpoints:
$ gdb rtcansend
GNU gdb (GDB) 7.1-ubuntu
...
Reading symbols from /usr/bin/rtcansend...(no debugging symbols found)...done.
(gdb) run rtcan2 -v -i 0x0 0x82 0x1
Starting program: /usr/bin/rtcansend rtcan2 -v -i 0x0 0x82 0x1
[Thread debugging using libthread_db enabled]
[New Thread 0xb7fdfb70 (LWP 1850)]
interface rtcan2
s=0, ifr_name=rtcan2
<0x000> [2] 82 01
Cleaning up...
And it hangs here and I can't ^C, though a kill -9 killed it.
Here's gdb with breakpoints on the suspect functions (here I cd into
src/utils/can/.libs):
(gdb) b rt_dev_close
Breakpoint 1 at 0x8048900
(gdb) b rt_task_delete
Breakpoint 2 at 0x8048960
(gdb) run rtcan2 -v -i 0x0 0x82 0x1
Starting program: /home/imt/src/xenomai-2.5.5.2/src/utils/can/.libs/rtcansend rtcan2 -v -i 0x0 0x82 0x1
[Thread debugging using libthread_db enabled]
[New Thread 0xb7fdfb70 (LWP 2326)]
interface rtcan2
s=0, ifr_name=rtcan2
<0x000> [2] 82 01
Cleaning up...
Breakpoint 1, 0xb7fc46f0 in rt_dev_close () from /usr/lib/librtdm.so.1
(gdb) c
Continuing.
Breakpoint 2, 0xb7fcbeb5 in rt_task_delete () from /usr/lib/libnative.so.3
(gdb) bt
#0 0xb7fcbeb5 in rt_task_delete () from /usr/lib/libnative.so.3
#1 0x08048d36 in cleanup () at rtcansend.c:65
#2 0x08049268 in main (argc=7, argv=0xbffff754) at rtcansend.c:303
(gdb) info threads
2 Thread 0xb7fdfb70 (LWP 2326) 0xb7fe2424 in __kernel_vsyscall ()
* 1 Thread 0xb7e3e6d0 (LWP 2323) 0xb7fcbeb5 in rt_task_delete () from /usr/lib/libnative.so.3
(gdb) thread 2
[Switching to thread 2 (Thread 0xb7fdfb70 (LWP 2326))]#0 0xb7fe2424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7fe2424 in __kernel_vsyscall ()
#1 0xb7fb0736 in nanosleep () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7fbe8de in ?? () from /usr/lib/libxenomai.so.0
#3 0xb7fa896e in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#4 0xb7f16a4e in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) c
Continuing.
And it hangs there until I kill -9 it. I don't know the gdb commands to
investigate more deeply. The rtcansend process is a zombie at this point.
$ ps -ef (edited)
imt 2321 1227 0 13:51 pts/1 00:00:00 gdb rtcansend
imt 2323 2321 0 13:51 pts/1 00:00:00 [rtcansend-2323] <defunct>
Re "narrowing down" the problem, note in the strace outputs in my post
at the beginning of the thread that I think it's hanging in cleanup(),
and the straces show that 2.5.5.2 calls exit_group(0) (and closes
correctly) where 2.6.0 calls _exit(0) (and hangs). I assume that
exit_group() is cleaning up threads where _exit() is probably doing
less cleanup work, or none. I'm curious about why one has
exit_group() and the other has _exit().
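The difference between the two system calls can be reproduced outside
Xenomai. What follows is a minimal sketch, assuming Linux with glibc;
child_terminates() and its parameters are hypothetical names used only
for illustration. It forks a child that starts a second thread, then has
the child's main thread issue either the raw exit system call (which
strace prints as _exit) or exit_group, and reports whether the child
went away by itself within a timeout.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Second thread that never returns; it stands in for any helper thread. */
static void *loop(void *cookie)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 100000000 };

    (void)cookie;
    for (;;)
        nanosleep(&ts, NULL);
    return NULL;
}

/*
 * Hypothetical helper: returns 1 if the child terminated on its own
 * within timeout_ms, 0 if it had to be killed, -1 if fork() failed.
 */
int child_terminates(int use_exit_group, int timeout_ms)
{
    struct timespec tick = { .tv_sec = 0, .tv_nsec = 10000000 };
    pid_t pid = fork();
    int waited;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t tid;

        if (pthread_create(&tid, NULL, loop, NULL) != 0)
            _exit(127);
        if (use_exit_group)
            syscall(SYS_exit_group, 0); /* ends every thread */
        else
            syscall(SYS_exit, 0);       /* ends the calling thread only */
        _exit(126);                     /* not reached */
    }
    /* Poll for the child in 10 ms steps instead of blocking forever. */
    for (waited = 0; waited < timeout_ms; waited += 10) {
        if (waitpid(pid, NULL, WNOHANG) == pid)
            return 1;
        nanosleep(&tick, NULL);
    }
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return 0;
}
```

With use_exit_group set to 0, the child's main thread is gone but the
process stays around because the second thread is still alive, which
would be consistent with the <defunct> entry in the ps output above.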
-Andy
* Re: [Xenomai-help] hang in rtcansend
From: Gilles Chanteperdrix @ 2011-12-26 22:56 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/22/2011 01:41 AM, Andrew Tannenbaum wrote:
> Summary: I am having a problem running rtcansend/recv on Xenomai 2.6.0,
> with the processes hanging in their cleanup code.
After various attempts, I found that the hang happens when the main
thread exits with pthread_exit while other threads still exist in the
process. The problem was already there in 2.5.6 at least, but we did not
see it with rtcansend because the main thread was the only thread,
whereas in 2.6.0 the rt_print thread is also running.
--
Gilles.
* Re: [Xenomai-help] hang in rtcansend
From: Gilles Chanteperdrix @ 2011-12-26 23:04 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/26/2011 11:56 PM, Gilles Chanteperdrix wrote:
> After various attempts, the bug happens when the main thread exits with
> pthread_exit while other threads exist in the process. It was already
> there in 2.5.6 at least, but we did not see it with rtcansend because
> there was no other thread than the main thread, while in 2.6.0, there is
> now the rt_print thread running.
>
And it is in fact Linux/glibc behaviour: a test program compiled
without Xenomai exhibits the same hang. Here is the test program,
reduced to the minimum:
#include <pthread.h>
#include <sys/mman.h>
#include <time.h>

/* Detached thread that sleeps in a loop and never returns. */
void *loop(void *cookie)
{
	struct timespec ts;

	ts.tv_sec = 0;
	ts.tv_nsec = 100000000;

	pthread_detach(pthread_self());

	for (;;)
		nanosleep(&ts, NULL);
}

int main(void)
{
	pthread_t tid;

	mlockall(MCL_CURRENT | MCL_FUTURE);

	pthread_create(&tid, NULL, loop, NULL);

	/* Ends the main thread only; loop() keeps the process alive. */
	pthread_exit(NULL);
}
So, rtcansend should call exit.
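For comparison, here is a minimal sketch of the same scenario ending
with exit() instead of pthread_exit(). It is not the rtcansend change
itself; run_child_and_exit() is a hypothetical helper that runs the
scenario in a forked child so that a parent can observe that the child
really terminates, and with which status.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Second thread that never returns, as in the test program. */
static void *loop(void *cookie)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 100000000 };

    (void)cookie;
    for (;;)
        nanosleep(&ts, NULL);
    return NULL;
}

/*
 * Hypothetical helper: the child starts the looping thread and then
 * calls exit(status). Returns the child's exit status, or -1 on error.
 */
int run_child_and_exit(int status)
{
    pid_t pid = fork();
    int wstatus;

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t tid;

        if (pthread_create(&tid, NULL, loop, NULL) != 0)
            _exit(127);
        exit(status);   /* terminates the whole process, all threads */
    }
    if (waitpid(pid, &wstatus, 0) < 0 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

The blocking waitpid() returns here because exit() ends the process
regardless of how many threads are still running.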
--
Gilles.
* Re: [Xenomai-help] hang in rtcansend
From: Andrew Tannenbaum @ 2011-12-28 19:09 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On 12/26/2011 06:04 PM, Gilles Chanteperdrix wrote:
> On 12/26/2011 11:56 PM, Gilles Chanteperdrix wrote:
>> After various attempts, the bug happens when the main thread exits with
>> pthread_exit while other threads exist in the process. It was already
>> there in 2.5.6 at least, but we did not see it with rtcansend because
>> there was no other thread than the main thread, while in 2.6.0, there is
>> now the rt_print thread running.
>>
>
> And it is in fact a linux/glibc behaviour. A test program compiled
> without xenomai exhibits the same behaviour. Here is the test program,
> simplified to the max:
>
> #include <pthread.h>
> #include <sys/mman.h>
> #include <time.h>
>
> void *loop(void *cookie)
> {
> struct timespec ts;
>
> ts.tv_sec = 0;
> ts.tv_nsec = 100000000;
>
> pthread_detach(pthread_self());
>
> for(;;)
> nanosleep(&ts, NULL);
> }
>
> int main(void)
> {
> pthread_t tid;
>
> mlockall(MCL_CURRENT | MCL_FUTURE);
>
> pthread_create(&tid, NULL, loop, NULL);
>
> pthread_exit(NULL);
> }
>
> So, rtcansend should call exit.
>
Gilles,
Thank you for your help, it explains and resolves my immediate needs. I
am not sure I understand the underlying problem, and I have more
questions about it.
Re the new loose private rt_print pthread, I am not comfortable with the
suggestion to call exit() explicitly (instead of pthread_exit() or
rt_task_delete()). Asking the user to call exit() instead of
rt_task_delete() is not intuitive.
In your simple example case, a simple solution would be to call
pthread_cancel(tid) before pthread_exit(). I understand that in a
Xenomai program using rt_print, the user isn't really handling the
rt_print thread. If rt_task_delete() doesn't mean process exit, the
question gets more difficult.
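The pthread_cancel() idea for the simple example could look like the
sketch below. It makes two assumptions that differ from the test
program: the thread is left joinable (no pthread_detach), so the
cancellation can be confirmed with pthread_join(), and
start_and_cancel_worker() is a hypothetical name. It relies on
nanosleep() being a cancellation point.

```c
#include <pthread.h>
#include <time.h>

/* Worker that sleeps in a loop; nanosleep() is a cancellation point. */
static void *loop(void *cookie)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 100000000 };

    (void)cookie;
    for (;;)
        nanosleep(&ts, NULL);
    return NULL;
}

/*
 * Hypothetical helper: start the worker, cancel it, and wait for it.
 * Returns 0 if the worker was confirmed cancelled, -1 otherwise.
 */
int start_and_cancel_worker(void)
{
    pthread_t tid;
    void *result;

    if (pthread_create(&tid, NULL, loop, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    /* Joining guarantees the worker is gone before we go on. */
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

Once this returns 0, the calling thread is the only one left, so a
following pthread_exit(NULL) would end the process as well.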
Can the rt_print pthread be cleaned up automatically? atexit()?
use-count in rt_task_delete()? If not, should rt_print be started and
stopped explicitly by the user?
I'm wondering about old programs that may hang when they are ported from
Xenomai pre-2.6 to post-2.6.
-Andy
* Re: [Xenomai-help] hang in rtcansend
From: Gilles Chanteperdrix @ 2011-12-28 20:50 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/28/2011 08:09 PM, Andrew Tannenbaum wrote:
> Gilles,
>
> Thank you for your help, it explains and resolves my immediate needs. I
> am not sure I understand the underlying problem, and I have more
> questions about it.
>
> Re the new loose private rt_print pthread, I am not comfortable with the
> suggestion to call exit() explicitly (instead of pthread_exit() or
> rt_task_delete()). Asking the user to call exit() instead of
> rt_task_delete() is not intuitive.
>
> In your simple example case, a simple solution would be to call
> pthread_cancel(tid) before pthread_exit(). I understand that in a
> Xenomai program using rt_print, the user isn't really handling the
> rt_print thread. If rt_task_delete() doesn't mean process exit, the
> question gets more difficult.
rt_task_delete never meant process exit.
>
> Can the rt_print pthread be cleaned up automatically? atexit()?
> use-count in rt_task_delete()? If not, should rt_print be started and
> stopped explicitly by the user?
atexit will not work: routines registered with atexit will only be
called when exit is called, not when pthread_exit is called.
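This can be checked with a small experiment. The sketch below assumes
Linux with glibc; atexit_handler_ran() and the pipe-based reporting are
illustrative choices, not anything from Xenomai. A forked child
registers an atexit handler that writes one byte to a pipe, starts a
second thread, and then leaves its main thread either through exit() or
through pthread_exit(). The parent reports whether the byte arrived
within a timeout.

```c
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static int report_fd = -1;

/* atexit handler: tell the parent that we ran. */
static void on_exit_handler(void)
{
    char c = 'x';
    ssize_t n = write(report_fd, &c, 1);

    (void)n;
}

static void *loop(void *cookie)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 100000000 };

    (void)cookie;
    for (;;)
        nanosleep(&ts, NULL);
    return NULL;
}

/*
 * Hypothetical helper: returns 1 if the atexit handler ran within
 * timeout_ms, 0 if it did not, -1 on setup failure in the parent.
 */
int atexit_handler_ran(int use_pthread_exit, int timeout_ms)
{
    struct pollfd pfd;
    int fds[2], ran = 0;
    pid_t pid;
    char c;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        pthread_t tid;

        close(fds[0]);
        report_fd = fds[1];
        if (atexit(on_exit_handler) != 0 ||
            pthread_create(&tid, NULL, loop, NULL) != 0)
            _exit(127);
        if (use_pthread_exit)
            pthread_exit(NULL); /* second thread keeps the process alive */
        exit(0);                /* runs the atexit handler */
    }
    close(fds[1]);
    pfd.fd = fds[0];
    pfd.events = POLLIN;
    if (poll(&pfd, 1, timeout_ms) > 0 && read(fds[0], &c, 1) == 1)
        ran = 1;
    kill(pid, SIGKILL);         /* reap the child in either case */
    waitpid(pid, NULL, 0);
    close(fds[0]);
    return ran;
}
```

In the pthread_exit() case the handler does not run because the process
never reaches exit processing while the second thread is still running.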
>
> I'm wondering about old programs that may hang when they are ported from
> Xenomai pre-2.6 to post-2.6.
We can probably work something out, but is it worth the trouble? As the
example I posted shows, when you want to terminate a process you should
call exit, not pthread_exit or rt_task_delete. Calling these and relying
on only one thread being left is fragile. Besides, programs with just
one thread are probably the exception rather than the rule.
>
> -Andy
>
--
Gilles.
* Re: [Xenomai-help] hang in rtcansend
From: Gilles Chanteperdrix @ 2011-12-29 11:25 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/28/2011 08:09 PM, Andrew Tannenbaum wrote:
> Gilles,
>
> Thank you for your help, it explains and resolves my immediate needs. I
> am not sure I understand the underlying problem, and I have more
> questions about it.
>
> Re the new loose private rt_print pthread, I am not comfortable with the
> suggestion to call exit() explicitly (instead of pthread_exit() or
> rt_task_delete()). Asking the user to call exit() instead of
> rt_task_delete() is not intuitive.
>
> In your simple example case, a simple solution would be to call
> pthread_cancel(tid) before pthread_exit(). I understand that in a
> Xenomai program using rt_print, the user isn't really handling the
> rt_print thread. If rt_task_delete() doesn't mean process exit, the
> question gets more difficult.
>
> Can the rt_print pthread be cleaned up automatically? atexit()?
> use-count in rt_task_delete()? If not, should rt_print be started and
> stopped explicitly by the user?
>
> I'm wondering about old programs that may hang when they are ported from
> Xenomai pre-2.6 to post-2.6.
Here is a patch which spawns the rt_print thread only if the user calls
rt_print_auto_init(1) or rt_print_init(). If you have called either of
these services, you are expected to call rt_print_cleanup() to cancel the
rt_print thread before calling rt_task_delete().
diff --git a/src/skins/common/rt_print.c b/src/skins/common/rt_print.c
index c1849a5..5533e29 100644
--- a/src/skins/common/rt_print.c
+++ b/src/skins/common/rt_print.c
@@ -91,8 +91,11 @@ static unsigned pool_buf_size;
static unsigned long pool_start, pool_len;
#endif /* CONFIG_XENO_FASTSYNCH */
+static pthread_once_t init_once = PTHREAD_ONCE_INIT;
+
static void cleanup_buffer(struct print_buffer *buffer);
static void print_buffers(void);
+static void spawn_printer_thread(void);
/* *** rt_print API *** */
@@ -344,6 +347,8 @@ int rt_print_init(size_t buffer_size, const char
*buffer_name)
unsigned long old_bitmap;
unsigned j;
+ pthread_once(&init_once, spawn_printer_thread);
+
if (!size)
size = default_buffer_size;
else if (size < RT_PRINT_LINE_BREAK)
@@ -415,6 +420,8 @@ int rt_print_init(size_t buffer_size, const char
*buffer_name)
void rt_print_auto_init(int enable)
{
auto_init = enable;
+ if (enable)
+ pthread_once(&init_once, spawn_printer_thread);
}
void rt_print_cleanup(void)
@@ -432,6 +439,7 @@ void rt_print_cleanup(void)
}
pthread_cancel(printer_thread);
+ printer_thread = 0;
}
const char *rt_print_buffer_name(void)
@@ -596,9 +604,16 @@ static void print_buffers(void)
}
}
+static void unlock(void *cookie)
+{
+ pthread_mutex_t *mutex = (pthread_mutex_t *)cookie;
+ pthread_mutex_unlock(mutex);
+}
+
static void *printer_loop(void *arg)
{
while (1) {
+ pthread_cleanup_push(unlock, &buffer_lock);
pthread_mutex_lock(&buffer_lock);
while (buffers == 0)
@@ -606,7 +621,7 @@ static void *printer_loop(void *arg)
print_buffers();
- pthread_mutex_unlock(&buffer_lock);
+ pthread_cleanup_pop(1);
nanosleep(&print_period, NULL);
}
@@ -620,6 +635,7 @@ static void spawn_printer_thread(void)
pthread_attr_init(&thattr);
pthread_attr_setstacksize(&thattr, xeno_stacksize(0));
+ pthread_attr_setdetachstate(&thattr, PTHREAD_CREATE_DETACHED);
pthread_create(&printer_thread, &thattr, printer_loop, NULL);
}
@@ -653,10 +669,11 @@ static void forked_child_init(void)
cleanup_buffer(*pbuffer);
}
- spawn_printer_thread();
+ if (printer_thread)
+ spawn_printer_thread();
}
-static __attribute__ ((constructor)) void __rt_print_init(void)
+static __attribute__((constructor)) void __rt_print_init(void)
{
const char *value_str;
unsigned long long period;
@@ -752,7 +769,6 @@ static __attribute__ ((constructor)) void
__rt_print_init(void)
pthread_cond_init(&printer_wakeup, NULL);
- spawn_printer_thread();
pthread_atfork(NULL, NULL, forked_child_init);
}
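The pattern behind this patch (start the helper thread lazily, cancel it
explicitly, and release the mutex from a cancellation handler) can be
sketched with plain POSIX threads. This is only an illustration under stated
assumptions, not the Xenomai code: printer_start(), printer_stop() and
printer_running are hypothetical names, and unlike the patch the thread is
left joinable so that printer_stop() can wait until the cancellation has
actually completed.

```c
#include <pthread.h>
#include <time.h>

static pthread_once_t printer_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t buffer_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t printer_tid;
static int printer_running;	/* hypothetical flag, not in rt_print.c */

/* Cancellation handler: releases the mutex if the thread is cancelled. */
static void unlock_mutex(void *cookie)
{
	pthread_mutex_unlock((pthread_mutex_t *)cookie);
}

static void *printer_loop(void *arg)
{
	struct timespec period = { 0, 10000000 };	/* 10 ms */

	(void)arg;
	for (;;) {
		/* pthread_mutex_lock() is not a cancellation point, so the
		   handler is always registered with the mutex held. */
		pthread_mutex_lock(&buffer_lock);
		pthread_cleanup_push(unlock_mutex, &buffer_lock);
		/* ... flush the print buffers here ... */
		pthread_cleanup_pop(1);		/* runs unlock_mutex() */
		nanosleep(&period, NULL);	/* cancellation point */
	}
	return NULL;
}

static void spawn_printer(void)
{
	printer_running =
		pthread_create(&printer_tid, NULL, printer_loop, NULL) == 0;
}

/* Starts the helper thread on first use only; returns 0 if it is running.
   Because of pthread_once(), it is not restarted after printer_stop(). */
int printer_start(void)
{
	pthread_once(&printer_once, spawn_printer);
	return printer_running ? 0 : -1;
}

/* Cancels the helper thread and waits for it; returns 0 on success. */
int printer_stop(void)
{
	void *ret;

	if (!printer_running)
		return -1;
	pthread_cancel(printer_tid);
	if (pthread_join(printer_tid, &ret) != 0)
		return -1;
	printer_running = 0;
	return ret == PTHREAD_CANCELED ? 0 : -1;
}
```

In this sketch a program would call printer_start() before printing and
printer_stop() before its main thread leaves with pthread_exit(), so that no
helper thread is left behind to keep the process alive.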
>
> -Andy
>
--
Gilles.
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [Xenomai-help] hang in rtcansend
2011-12-29 11:25 ` Gilles Chanteperdrix
@ 2011-12-29 13:13 ` Gilles Chanteperdrix
2011-12-29 15:18 ` Andrew Tannenbaum
0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2011-12-29 13:13 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/29/2011 12:25 PM, Gilles Chanteperdrix wrote:
> Here is a patch which only spawns the rt_print thread if the user calls
> rt_print_auto_init(1), or rt_print_init(). Then if you have called these
> services, you are expected to call rt_print_cleanup() to cancel the
> rt_print thread, before calling rt_task_delete().
This is only a proposal. What do you think?
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai-help] hang in rtcansend
2011-12-29 13:13 ` Gilles Chanteperdrix
@ 2011-12-29 15:18 ` Andrew Tannenbaum
2011-12-29 15:34 ` Gilles Chanteperdrix
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Tannenbaum @ 2011-12-29 15:18 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On Thu, Dec 29, 2011 at 8:13 AM, Gilles Chanteperdrix
<gilles.chanteperdrix@xenomai.org> wrote:
>
> It is a proposition. What do you think?
>
> --
> Gilles.
>
Gilles,
I will not be in my office until 2-Jan, so I will not be able to try the
patch until then.
Giving the user control of explicitly loading and unloading the
rt_print system and thread sounds good to me.
One thing is not clear to me from a quick look at the patch: what will
happen if a user calls rt_printf() without first calling the
rt_print_init code?
-Andy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai-help] hang in rtcansend
2011-12-29 15:18 ` Andrew Tannenbaum
@ 2011-12-29 15:34 ` Gilles Chanteperdrix
2012-01-03 23:47 ` Andrew Tannenbaum
0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2011-12-29 15:34 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 12/29/2011 04:18 PM, Andrew Tannenbaum wrote:
> Gilles,
>
> I will not be in my office until 2-Jan, I will not be able to try the
> patch until then.
>
> Giving the user control of explicitly loading and unloading the
> rt_print system and thread sounds good to me.
>
> It's not clear to me from a quick look at the patch, what will happen
> if a user calls rt_printf() without first calling the rt_print_init
> code?
If rt_print_auto_init(1) has been called, the initialization happens
automatically; otherwise nothing happens. It has always been that way.
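The behaviour described here can be pictured with a small stand-alone
sketch. It is not the rt_print implementation: sketch_print_auto_init(),
sketch_print_init() and sketch_printf() are hypothetical stand-ins, a single
static buffer replaces whatever buffering the library really does, and the
-1 return value for the "nothing happens" case is an assumption made for
illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

enum { LINE_BUF_SIZE = 256 };

static int auto_init;	/* set through sketch_print_auto_init() */
static char *line_buf;	/* hypothetical buffer, NULL until initialized */

void sketch_print_auto_init(int enable)
{
	auto_init = enable;
}

/* Explicit initialization: allocates the buffer once. */
int sketch_print_init(void)
{
	if (!line_buf)
		line_buf = malloc(LINE_BUF_SIZE);
	return line_buf ? 0 : -1;
}

/* Formats into the buffer. Without a buffer, it initializes on demand
   only when auto-init is enabled; otherwise it does nothing and
   returns -1 (an assumption of this sketch). */
int sketch_printf(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!line_buf && (!auto_init || sketch_print_init() != 0))
		return -1;

	va_start(ap, fmt);
	n = vsnprintf(line_buf, LINE_BUF_SIZE, fmt, ap);
	va_end(ap);
	return n;
}
```

With this sketch, a call to sketch_printf() made before any initialization
is dropped, while the same call made after sketch_print_auto_init(1) sets up
the buffer on demand and formats the message.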
--
Gilles.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai-help] hang in rtcansend
2011-12-29 15:34 ` Gilles Chanteperdrix
@ 2012-01-03 23:47 ` Andrew Tannenbaum
2012-01-10 17:50 ` Andrew Tannenbaum
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Tannenbaum @ 2012-01-03 23:47 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On 12/29/2011 10:34 AM, Gilles Chanteperdrix wrote:
> if rt_print_auto_init(1) has been called, the initialization happens
> automatically, otherwise nothing happens, but it always has been that way.
>
Gilles,
I think your changes should be ok, but I am trying to test them on my
own system. Until now, I have been building with debuild (for Ubuntu
x86) and I have been patching files by hand, to deal with the rt_print
problem, and also with the line-buffering problem in rtcanrecv, which I
submitted a patch for, but it was never integrated:
https://mail.gna.org/public/xenomai-help/2011-06/msg00219.html
At this point, I would like to download and build a fresh copy of the
latest 2.6 tree. It looks like there is not a "nightly" .tar.bz2 bundle
of this tree in the same format as:
http://download.gna.org/xenomai/stable/xenomai-2.6.0.tar.bz2
I thought I could follow the git instructions in:
http://www.xenomai.org/index.php/Building_Debian_packages
I am not familiar with git, but I downloaded what looks like the latest
2.6.0 tree with:
$ git clone git://xenomai.org/xenomai-2.6.git
I proceeded to do:
$ git fetch origin
which didn't seem to do much, so I did:
$ git pull git://xenomai.org/xenomai-2.6.git
which seemed to pull the latest deltas to my machine.
At this point, I was hoping to generate the .debs I needed with:
$ git checkout -b v2.6.0-deb v2.6.0
$ git-buildpackage \
--git-debian-branch=v2.6.0-deb \
--git-export-dir=.. \
-uc -us
Eventually the build fails:
<<<
...
/usr/bin/install -c xeno-config xeno wrap-link.sh
'/home/imt/src/git/xenomai-2.6.0/debian/tmp//usr/bin'
make[3]: Nothing to be done for `install-data-am'.
make[3]: Leaving directory `/home/imt/src/git/xenomai-2.6.0/scripts'
make[2]: Leaving directory `/home/imt/src/git/xenomai-2.6.0/scripts'
make[2]: Entering directory `/home/imt/src/git/xenomai-2.6.0'
make[3]: Entering directory `/home/imt/src/git/xenomai-2.6.0'
make[3]: Nothing to be done for `install-data-am'.
make[3]: Leaving directory `/home/imt/src/git/xenomai-2.6.0'
make[2]: Leaving directory `/home/imt/src/git/xenomai-2.6.0'
make[1]: Leaving directory `/home/imt/src/git/xenomai-2.6.0'
dh_install --sourcedir=/home/imt/src/git/xenomai-2.6.0/debian/tmp
cp: cannot stat `debian/tmp/usr/share/xenomai': No such file or directory
dh_install: cp -a debian/tmp/usr/share/xenomai
debian/xenomai-runtime//usr/share/ returned exit code 1
make: *** [install] Error 2
dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit
status 2
debuild: fatal error at line 1340:
dpkg-buildpackage -rfakeroot -D -us -uc -i -I failed
debuild -i -I returned 25
Couldn't run 'debuild -i -I -uc -us'
>>>
My questions:
1) Must I use git, or is there a way I can get a .tar.bz2 "export" build
tree with the latest changes (a nightly build) that I can build with
debuild? Can I get this from http://download.gna.org/xenomai, or can I
have git generate it?
Or, if I must use git...
2) What is causing:
cp: cannot stat `debian/tmp/usr/share/xenomai': No such file or directory
There is a xenomai-2.6/debian/ dir there with files in it, but not
xenomai-2.6/debian/tmp/ . The // in xenomai-2.6.0/debian/tmp//usr/bin
above looks suspicious.
3) Once I have a way to make .debs, if I must use git and I edit files
in my tree, I want git-buildpackage to use my edited copies. It looks
like git-buildpackage does not use my working copies of files, it only
uses "checked in" files. Do I do that with "git commit"? (I assume
that it checks in to my local git tree rather than the remote Xenomai git
tree.)
-Andy
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Xenomai-help] hang in rtcansend
2012-01-03 23:47 ` Andrew Tannenbaum
@ 2012-01-10 17:50 ` Andrew Tannenbaum
2012-01-10 18:11 ` Gilles Chanteperdrix
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Tannenbaum @ 2012-01-10 17:50 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On 01/03/2012 06:47 PM, Andrew Tannenbaum wrote:
> On 12/29/2011 10:34 AM, Gilles Chanteperdrix wrote:
>> On 12/29/2011 04:18 PM, Andrew Tannenbaum wrote:
>>> On Thu, Dec 29, 2011 at 8:13 AM, Gilles Chanteperdrix
>>> <gilles.chanteperdrix@xenomai.org> wrote:
>>>> On 12/29/2011 12:25 PM, Gilles Chanteperdrix wrote:
>>>>> On 12/28/2011 08:09 PM, Andrew Tannenbaum wrote:
>>>>>> On 12/26/2011 06:04 PM, Gilles Chanteperdrix wrote:
>>>>>>> On 12/26/2011 11:56 PM, Gilles Chanteperdrix wrote:
>>>>>>>> On 12/22/2011 01:41 AM, Andrew Tannenbaum wrote:
>>>>>>>>> Summary: I am having a problem running rtcansend/recv on Xenomai 2.6.0,
>>>>>>>>> with the processes hanging in their cleanup code.
>>>>>>>>>
>>>>>>>>> I had been running Xenomai on an Intel Atom system with a PEAK PCI
>>>>>>>>> SJA1000 CAN adapter.
>>>>>>>>>
>>>>>>>>> I was running Linux 2.5.35.7 with Xenomai 2.5.5.2. I connected a servo
>>>>>>>>> and motor to the PEAK adapter, and I was able to talk with it using
>>>>>>>>> rtcansend and rtcanrecv.
>>>>>>>>>
>>>>>>>>> After working on other things for a few months, I need to return to this
>>>>>>>>> project, so I downloaded the latest Linux/Xenomai pair, which I think is
>>>>>>>>> Linux 2.5.38.8 and Xenomai 2.6.0.
>>>>>>>>>
>>>>>>>>> I was able to compile these (using the Debian build advice, generating
>>>>>>>>> .deb files for Linux and Xenomai, which I install with dpkg -i). I used
>>>>>>>>> a Linux .config derived from my older build.
>>>>>>>>>
>>>>>>>>> With both the new and old installs, I am able to run xeno-test and get
>>>>>>>>> decent latencies and such, though some of the tests fail depending on
>>>>>>>>> what I have configured in Realtime/Drivers/Testing Drivers. That is not
>>>>>>>>> what I'm asking about.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am having a problem running rtcansend/recv on Xenomai 2.6.0:
>>>>>>>>>
>>>>>>>>> I can run rtcanconfig and it sets up my rtcan0 properly so I can see and
>>>>>>>>> configure the servo. The data in /proc/rtcan looks ok.
>>>>>>>>>
>>>>>>>>> But when I try to talk with the servo using rtcansend, the rtcansend
>>>>>>>>> process fails during the close phase, it looks like this:
>>>>>>>>>
>>>>>>>>> $ rtcansend rtcan0 -v -i 0x0 0x82 0x1
>>>>>>>>> interface rtcan0
>>>>>>>>> s=0, ifr_name=rtcan0
>>>>>>>>> <0x000> [2] 82 01
>>>>>>>>> Cleaning up...
>>>>>>>>> ^CSignal 2 received
>>>>>>>>> Cleaning up...
>>>>>>>>> $
>>>>>>>>>
>>>>>>>>> So it hangs after the first "Cleaning up..." and I hit Control-C and
>>>>>>>>> then it catches the ^C and exits. The code at the bottom of
>>>>>>>>
>>>>>>>> After various attempts, the bug happens when the main thread exits with
>>>>>>>> pthread_exit while other threads exist in the process. It was already
>>>>>>>> there in 2.5.6 at least, but we did not see it with rtcansend because
>>>>>>>> there was no other thread than the main thread, while in 2.6.0, there is
>>>>>>>> now the rt_print thread running.
>>>>>>>>
>>>>>>>
>>>>>>> And it is in fact a linux/glibc behaviour. A test program compiled
>>>>>>> without xenomai exhibits the same behaviour. Here is the test program,
>>>>>>> simplified to the max:
>>>>>>>
>>>>>>> #include <pthread.h>
>>>>>>> #include <sys/mman.h>
>>>>>>> #include <time.h>
>>>>>>>
>>>>>>> void *loop(void *cookie)
>>>>>>> {
>>>>>>> struct timespec ts;
>>>>>>>
>>>>>>> ts.tv_sec = 0;
>>>>>>> ts.tv_nsec = 100000000;
>>>>>>>
>>>>>>> pthread_detach(pthread_self());
>>>>>>>
>>>>>>> for(;;)
>>>>>>> nanosleep(&ts, NULL);
>>>>>>> }
>>>>>>>
>>>>>>> int main(void)
>>>>>>> {
>>>>>>> pthread_t tid;
>>>>>>>
>>>>>>> mlockall(MCL_CURRENT | MCL_FUTURE);
>>>>>>>
>>>>>>> pthread_create(&tid, NULL, loop, NULL);
>>>>>>>
>>>>>>> pthread_exit(NULL);
>>>>>>> }
>>>>>>>
>>>>>>> So, rtcansend should call exit.
>>>>>>>
>>>>>>
>>>>>> Gilles,
>>>>>>
>>>>>> Thank you for your help, it explains and resolves my immediate needs. I
>>>>>> am not sure I understand the underlying problem, and I have more
>>>>>> questions about it.
>>>>>>
>>>>>> Re the new loose private rt_print pthread, I am not comfortable with the
>>>>>> suggestion to call exit() explicitly (instead of pthread_exit() or
>>>>>> rt_task_delete()). Asking the user to call exit() instead of
>>>>>> rt_task_delete() is not intuitive.
>>>>>>
>>>>>> In your simple example case, a simple solution would be to call
>>>>>> pthread_cancel(tid) before pthread_exit(). I understand that in a
>>>>>> Xenomai program using rt_print, the user isn't really handling the
>>>>>> rt_print thread. If rt_task_delete() doesn't mean process exit, the
>>>>>> question gets more difficult.
>>>>>>
>>>>>> Can the rt_print pthread be cleaned up automatically? atexit()?
>>>>>> use-count in rt_task_delete()? If not, should rt_print be started and
>>>>>> stopped explicitly by the user?
>>>>>>
>>>>>> I'm wondering about old programs that may hang when they are ported from
>>>>>> Xenomai pre-2.6 to post-2.6.
>>>>>
>>>>> Here is a patch which only spawns the rt_print thread if the user calls
>>>>> rt_print_auto_init(1), or rt_print_init(). Then if you have called these
>>>>> services, you are expected to call rt_print_cleanup() to cancel the
>>>>> rt_print thread, before calling rt_task_delete().
>>>>
>>>> It is a proposition. What do you think?
>>>>
>>>> --
>>>> Gilles.
>>>>
>>>
>>> Gilles,
>>>
>>> I will not be in my office until 2-Jan, I will not be able to try the
>>> patch until then.
>>>
>>> Giving the user control of explicitly loading and unloading the
>>> rt_print system and thread sounds good to me.
>>>
>>> It's not clear to me from a quick look at the patch, what will happen
>>> if a user calls rt_printf() without first calling the rt_print_init
>>> code?
>>
>> if rt_print_auto_init(1) has been called, the initialization happens
>> automatically, otherwise nothing happens, but it always has been that way.
>>
>
> Gilles,
>
> I think your changes should be ok, but I am trying to test them on my
> own system. Until now, I have been building with debuild (for Ubuntu
> x86) and I have been patching files by hand, to deal with the rt_print
> problem, and also with the line-buffering problem in rtcanrecv, which I
> submitted a patch for, but it was never integrated:
>
> https://mail.gna.org/public/xenomai-help/2011-06/msg00219.html
>
> At this point, I would like to download and build a fresh copy of the
> latest 2.6 tree. It looks like there is not a "nightly" .tar.bz2 bundle
> of this tree in the same format as:
>
> http://download.gna.org/xenomai/stable/xenomai-2.6.0.tar.bz2
>
> I thought I could follow the git instructions in:
>
> http://www.xenomai.org/index.php/Building_Debian_packages
>
> I am not familiar with git, but I downloaded what looks like the latest
> 2.6.0 tree with:
>
> $ git clone git://xenomai.org/xenomai-2.6.git
>
> I proceeded to do:
>
> $ git fetch origin
>
> which didn't seem to do much, so I did:
>
> $ git pull git://xenomai.org/xenomai-2.6.git
>
> which seemed to pull the latest deltas to my machine.
>
> At this point, I was hoping to generate the .debs I needed with:
>
> $ git checkout -b v2.6.0-deb v2.6.0
>
> $ git-buildpackage \
> --git-debian-branch=v2.6.0-deb \
> --git-export-dir=.. \
> -uc -us
>
> Eventually the build fails:
>
> <<<
> ...
> /usr/bin/install -c xeno-config xeno wrap-link.sh
> '/home/imt/src/git/xenomai-2.6.0/debian/tmp//usr/bin'
> make[3]: Nothing to be done for `install-data-am'.
> make[3]: Leaving directory `/home/imt/src/git/xenomai-2.6.0/scripts'
> make[2]: Leaving directory `/home/imt/src/git/xenomai-2.6.0/scripts'
> make[2]: Entering directory `/home/imt/src/git/xenomai-2.6.0'
> make[3]: Entering directory `/home/imt/src/git/xenomai-2.6.0'
> make[3]: Nothing to be done for `install-data-am'.
> make[3]: Leaving directory `/home/imt/src/git/xenomai-2.6.0'
> make[2]: Leaving directory `/home/imt/src/git/xenomai-2.6.0'
> make[1]: Leaving directory `/home/imt/src/git/xenomai-2.6.0'
> dh_install --sourcedir=/home/imt/src/git/xenomai-2.6.0/debian/tmp
> cp: cannot stat `debian/tmp/usr/share/xenomai': No such file or directory
> dh_install: cp -a debian/tmp/usr/share/xenomai
> debian/xenomai-runtime//usr/share/ returned exit code 1
> make: *** [install] Error 2
> dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit
> status 2
> debuild: fatal error at line 1340:
> dpkg-buildpackage -rfakeroot -D -us -uc -i -I failed
> debuild -i -I returned 25
> Couldn't run 'debuild -i -I -uc -us'
> >>>
>
> My questions:
>
> 1) Must I use git, or is there a way I can get a .tar.bz2 "export" build
> tree with the latest changes (a nightly build), that I can build with
> debuild? Can I either get this from either
> http://download.gna.org/xenomai or can I have git generate it?
>
> Or, if I must use git...
>
> 2) What is causing:
>
> cp: cannot stat `debian/tmp/usr/share/xenomai': No such file or directory
>
> There is a xenomai-2.6/debian/ dir there with files in it, but not
> xenomai-2.6/debian/tmp/ . The // in xenomai-2.6.0/debian/tmp//usr/bin
> above looks suspicious.
>
> 3) Once I have a way to make .debs, if I must use git and I edit files
> in my tree, I want git-buildpackage to use my edited copies. It looks
> like git-buildpackage does not use my working copies of files, it only
> uses "checked in" files. Do I do that with "git commit" ? (I assume
> that checks in to my local git tree rather than the remote Xenomai git
> tree?)
>
> -Andy
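For readers following the quoted discussion above: POSIX keeps a process alive after the main thread calls pthread_exit() until its last thread has terminated, which matches the hang described there. The sketch below is plain pthreads, not Xenomai code. It illustrates the alternative raised in the thread: stop the helper thread explicitly before the main thread leaves. The function name start_and_stop_helper is hypothetical, and unlike the quoted test program the helper is not detached, so that it can be joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Stand-in for a background thread such as a print helper:
 * it never returns on its own. */
static void *loop(void *cookie)
{
    struct timespec ts;

    (void)cookie;
    ts.tv_sec = 0;
    ts.tv_nsec = 100000000;

    for (;;)
        nanosleep(&ts, NULL);   /* nanosleep() is a cancellation point */
}

/* Hypothetical helper: start the background thread, then cancel and
 * join it so that no thread is left behind. Returns 0 on success. */
int start_and_stop_helper(void)
{
    pthread_t tid;
    void *res = NULL;

    if (pthread_create(&tid, NULL, loop, NULL) != 0)
        return -1;

    if (pthread_cancel(tid) != 0)
        return -1;

    if (pthread_join(tid, &res) != 0)
        return -1;

    /* loop() never returns, so a completed join implies cancellation. */
    assert(res == PTHREAD_CANCELED);
    return 0;
}
```

Calling start_and_stop_helper() from main() and then returning (or calling exit()) ends the process with no leftover thread; build with -pthread. Whether rt_print_cleanup() in the proposed patch works this way internally is not something this sketch shows.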
As I noted in my previous message, I have been trying to compile the
latest Xenomai 2.6.0 from the git repository.
I was getting
dh_install --sourcedir=/home/imt/src/git/xenomai-2.6.0/debian/tmp
cp: cannot stat `debian/tmp/usr/share/xenomai': No such file or directory
dh_install: cp -a debian/tmp/usr/share/xenomai debian/xenomai-runtime//usr/share/ returned exit code 1
and I don't know why. Here are more details:
I download the 2.6.0 repository with git clone, as in the instructions here:
http://www.xenomai.org/index.php/Building_Debian_packages
except that the instructions are for 2.5.6, so I replace 2.5 with 2.6 and
2.5.6 with 2.6.0.
When I build this 2.6.0 base tree with git-buildpackage, the build succeeds.
I start with this base tree and then merge the latest version:
git merge origin/master
(I do not make any other local changes to the tree.)
When I then try to build again with git-buildpackage, the build fails
with the error:
=<<<
dh_install --sourcedir=/home/imt/src/git/xenomai-2.6.0/debian/tmp
cp: cannot stat `debian/tmp/usr/share/xenomai': No such file or directory
dh_install: cp -a debian/tmp/usr/share/xenomai debian/xenomai-runtime//usr/share/ returned exit code 1
make: *** [install] Error 2
dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit status 2
debuild: fatal error at line 1340:
dpkg-buildpackage -rfakeroot -D -us -uc -i -I failed
debuild -i -I returned 29
Couldn't run 'debuild -i -I -uc -us'
=>>>
(as above).
The successful build looks like this:
=<<<
dh_install --sourcedir=/home/imt/src/git/xenomai-2.6.0/debian/tmp
# xeno-config should be only in libxenomai-dev
rm -f /home/imt/src/git/xenomai-2.6.0/debian/xenomai-runtime/usr/bin/xeno-config
rm -f /home/imt/src/git/xenomai-2.6.0/debian/xenomai-runtime/usr/share/man/man1/xeno-config.1
for f in /home/imt/src/git/xenomai-2.6.0/ksrc/nucleus/udev/*.rules ; do \
cat $f >> /home/imt/src/git/xenomai-2.6.0/debian/libxenomai1/etc/udev/xenomai.rules ; \
done
install -m 644 debian/libxenomai1.modprobe /home/imt/src/git/xenomai-2.6.0/debian/libxenomai1/etc/modprobe.d/xenomai.conf
# remove empty directory
...
=>>>
I have tried several variations, like running make dist and then just
running debuild, but it fails the same way.
I notice that when I get a dist tarball, I compile from the
xenomai-2.6.0 directory, but when I start with the git tree, I compile
from xenomai-2.6 and it generates some files in the ../xenomai-2.6.0
directory. Is this part of the problem? It works for the base version
but not the merged version, so I'm not clear on this.
-Andy
* Re: [Xenomai-help] hang in rtcansend
2012-01-10 17:50 ` Andrew Tannenbaum
@ 2012-01-10 18:11 ` Gilles Chanteperdrix
2012-01-11 16:24 ` Andrew Tannenbaum
0 siblings, 1 reply; 19+ messages in thread
From: Gilles Chanteperdrix @ 2012-01-10 18:11 UTC (permalink / raw)
To: Andrew Tannenbaum; +Cc: xenomai
On 01/10/2012 06:50 PM, Andrew Tannenbaum wrote:
> I have tried several variations, like running make dist and then just
> running debuild, but it fails the same way.
>
> I notice that when I get a dist tarball, I compile from the
> xenomai-2.6.0 directory, but when I start with the git tree, I compile
> from xenomai-2.6 and it generates some files in the ../xenomai-2.6.0
> directory. Is this part of the problem? It works for the base version
> but not the merged version, so I'm not clear on this.
Normally, when you run "make dist", the directory should be called
xenomai-2.6.0, not xenomai-2.6. Or am I missing something?
--
Gilles.
* Re: [Xenomai-help] hang in rtcansend
2012-01-10 18:11 ` Gilles Chanteperdrix
@ 2012-01-11 16:24 ` Andrew Tannenbaum
2012-01-16 23:19 ` [Xenomai-help] problem with Debian Xenomai build, was " Andrew Tannenbaum
0 siblings, 1 reply; 19+ messages in thread
From: Andrew Tannenbaum @ 2012-01-11 16:24 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai
On 01/10/2012 01:11 PM, Gilles Chanteperdrix wrote:
> On 01/10/2012 06:50 PM, Andrew Tannenbaum wrote:
>> I have tried several variations, like running make dist and then just
>> running debuild, but it fails the same way.
>>
>> I notice that when I get a dist tarball, I compile from the
>> xenomai-2.6.0 directory, but when I start with the git tree, I compile
>> from xenomai-2.6 and it generates some files in the ../xenomai-2.6.0
>> directory. Is this part of the problem? It works for the base version
>> but not the merged version, so I'm not clear on this.
>
> Normally, when you run "make dist", the directory should be called
> xenomai-2.6.0, not xenomai-2.6. Or am I missing something?
>
The instructions here for building from a Git repository:
http://www.xenomai.org/index.php/Building_Debian_packages
show the process of building Xenomai 2.5.6 in the directory xenomai-2.5.
-Andy
* [Xenomai-help] problem with Debian Xenomai build, was Re: hang in rtcansend
2012-01-11 16:24 ` Andrew Tannenbaum
@ 2012-01-16 23:19 ` Andrew Tannenbaum
0 siblings, 0 replies; 19+ messages in thread
From: Andrew Tannenbaum @ 2012-01-16 23:19 UTC (permalink / raw)
To: xenomai
I found a problem with 2.6.0 and Gilles Chanteperdrix posted some fixes
to the Git tree, and I want to test them. These are discussed in the
thread: [Xenomai-help] hang in rtcansend
I have been having trouble building 2.6.0 with the latest git tree. I'm
trying to follow the instructions here:
http://www.xenomai.org/index.php/Building_Debian_packages
I am building 2.6.0 rather than 2.5.6. The base 2.6.0 tree from git
builds correctly, but when I merge the
latest changes with
git merge origin/master
I get build errors. I posted notes about this here:
https://mail.gna.org/public/xenomai-help/2012-01/msg00057.html
but I have received no solution. Gilles helped me with the Xenomai
problem I found, but he is not familiar with the Debian build system, so
he suggested that I re-post with Debian in the subject line.
So I would like help building the latest 2.6.0 from the git repository.
-Andy