* [Xenomai-help] rt_task_send / receive problems various issues and bug trace
@ 2008-05-07 15:27 Karch, Joshua
  2008-05-07 17:24 ` Philippe Gerum
  0 siblings, 1 reply; 9+ messages in thread
From: Karch, Joshua @ 2008-05-07 15:27 UTC (permalink / raw)
  To: xenomai



Hello,

I'm using rt_task_send from a talker task and rt_task_receive/reply from a listener task. When I launch the two tasks in the order listener task, then talker task, everything runs normally.

However, when I launch the talker task first and the listener task second, I receive rt_task_send error -22, and after a bit of time the listener task starts up. I know this is logical and to be expected. However, it appears that calling rt_task_send on a task that hasn't been started occasionally ties up enough resources to prevent the listener task from starting. By controlling the task startup order, I was able to work around this issue. Both tasks have similar priorities (50 and 51).

Additionally, I am unable to use rt_task_send with TM_NONBLOCK:
len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);

I get a kernel BUG and have to reset the machine (see below).

The reason I want to use TM_NONBLOCK is so that the producer task can send a trigger message to the consumer task, prompting it to act on the data received, without requiring a reply. I am using the rt_task_send trigger message to gate and synchronize the consumer task. Is a reply required for every rt_task_send call?

It seems that if I don't send a reply when rt_task_send has a timeout specified, the sending task locks up and the listening task runs rampant: rt_task_receive no longer blocks, so the loop runs with no delay and essentially locks up the machine, since I don't use rt_task_set_periodic on the listening task.

Here is the trace; recovering requires a reboot. I can also attach the code. It is written in C++, with two separate classes, as a model of the application I am building.

Thank you,

Joshua Karch

talker task started
rt_task_send error
len=-22, opcode=0
listener task started
rt_task_send error
len=-110, opcode=0
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000020c
printing eip: c014f014
*pde = 00000000
Oops: 0000 [#1] PREEMPT 
Modules linked in: xeno_timerbench nfs ipv6 nfsd lockd nfs_acl sunrpc exportfs dm_snapshot dm_mirror dm_mod loop pcmcia firmware_class serio_raw psmouse yenta_socket rsrc_nonstatic pcmcia_core cs5535_gpio joydev evdev ext3 jbd mbcache usbhid ide_disk generic ohci_hcd ehci_hcd amd74xx usbcore ide_core e100 mii

Pid: 1973, comm: listen_task Not tainted (2.6.24.4 #12)
EIP: 0060:[<c014f014>] EFLAGS: 00010093 CPU: 0
EIP is at rt_task_receive+0x113/0x18b
EAX: cdc71f28 EBX: fffffd98 ECX: c03c5500 EDX: cd820acc
ESI: ffffff97 EDI: cd820610 EBP: cdc71edc ESP: cdc71ec4
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process listen_task (pid: 1973, ti=cdc70000 task=cdc6d320 task.ti=cdc70000)<0>
I-pipe domain Linux
Stack: 00000000 cdc71f28 00000000 cdc71ee8 cdc6d320 cdc71fb8 cdc71f4c c01515bf 
       b7d083ea cdc71f00 c0102cb1 ce46bc70 c03c7e90 cd820620 cdc6d620 cd80ff44 
       c029f55b cdc71f34 00000082 b91264ee 0000002e cdc6d320 b9127ee2 0000002e 
Call Trace:
 [<c01045f1>] show_trace_log_lvl+0x1a/0x2f
 [<c01046a3>] show_stack_log_lvl+0x9d/0xa5
 [<c0104769>] show_registers+0xbe/0x1fd
 [<c01049c1>] die+0x119/0x20a
 [<c010ef52>] do_page_fault+0x480/0x57e
 [<c010ca34>] __ipipe_handle_exception+0x11e/0x166
 [<c02a0e7f>] error_code+0x6f/0x80
 [<c01515bf>] __rt_task_receive+0xbe/0x113
 [<c0149b0d>] losyscall_event+0x99/0x13d
 [<c013f05b>] __ipipe_dispatch_event+0xac/0x16c
 [<c010c8b1>] __ipipe_syscall_root+0x6a/0xcf
 [<c0103e89>] system_call+0x29/0x4a
 =======================
Code: 00 00 8d 87 bc 04 00 00 39 c2 0f 95 c0 0f b6 c0 f7 d8 21 d0 31 db 3d 58 02 00 00 74 06 8d 98 98 fd ff ff 8b 45 ec be 97 ff ff ff <8b> 93 74 04 00 00 3b 50 0c 77 23 85 d2 74 19 89 d1 8b b3 70 04 
EIP: [<c014f014>] rt_task_receive+0x113/0x18b SS:ESP 0068:cdc71ec4
---[ end trace 523bcd2b73b75979 ]---






* Re: [Xenomai-help] rt_task_send / receive problems various issues and bug trace
  2008-05-07 15:27 [Xenomai-help] rt_task_send / receive problems various issues and bug trace Karch, Joshua
@ 2008-05-07 17:24 ` Philippe Gerum
  2008-05-07 19:33   ` Karch, Joshua
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2008-05-07 17:24 UTC (permalink / raw)
  To: Karch, Joshua; +Cc: xenomai

Karch, Joshua wrote:
> 
> Hello,
> 
> I'm using rt_task_send from a talker task and rt_task_receive/reply from
> a listener task.  When I launch the two tasks in the following order:
> listener task,  talker task,  everything runs normally.
> 
> However, when I launch the talker task first, and then the listener task
> second, I receive rt_task_send error -22 and after a bit of time the
> listener task starts up.

-EINVAL is not on the error list for rt_task_send(). Could you confirm this result?

> I know this is logical and to be expected,

No, it's not. rt_task_send() does wait for the receiver to start listening,
unless you asked for a non-blocking call by passing TM_NONBLOCK as the timeout.

> however, it appears that issuing the rt_task_send command to a task that
> hasn't been started occasionally locks up sufficient resources to
> prevent the listener task from starting. 

Any chance your code enters a tight loop due to rt_task_send() failing repeatedly?

> By controlling task startup
> order, I was able to circumvent this issue.  Both tasks have similar
> priorities (50, 51).
> 
> Additionally, I am unable to use rt_task_send with TM_NONBLOCK
> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
> 
> I get a bug failure and have to reset the machine-- see below:
> 

Please disassemble your "vmlinux" kernel image, the exact one that causes the bug:
$ objdump -d vmlinux > foo.txt
In that large file, search for the "__rt_task_receive" symbol (note the double
underscore prefix; there is also an "rt_task_receive" symbol, but we don't need
that code at the moment), then copy and paste the disassembly for that
function. I'll have a look.

Step #2 is to send a simple piece of code that exhibits the problem. This will
speed up the debugging and fixing process.

> The reason I want to use TM_NONBLOCK is so that I can send a trigger
> message from the producer task to the consumer task without requiring a
> reply to trigger the consumer task to act on the data received. I am
> using the rt_task_send trigger message to gate and synchronize the
> consumer task. Is a reply required for all rt_task_send? 
> 
> It seems if I don't send a reply when rt_task_send has a timeout
> specified the sending task locks up and the listening task runs rampant,
> i.e. rt_task_receive no longer blocks and the loop runs with no delay
> and essentially locks up the machine, since I don't use
> rt_task_set_periodic on the listening task.
> 
> Here is the trace, and it requires a reboot.  I also can attach the
> code- it is written in c++ with two separate classes as a model of the
> application I am building.
> 
> Thank you,
> 
> Joshua Karch
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Xenomai-help mailing list
> Xenomai-help@domain.hid
> https://mail.gna.org/listinfo/xenomai-help


-- 
Philippe.



* Re: [Xenomai-help] rt_task_send / receive problems various issues and bug trace
  2008-05-07 17:24 ` Philippe Gerum
@ 2008-05-07 19:33   ` Karch, Joshua
  2008-05-07 19:35     ` [Xenomai-help] rt_task_send / receive problems various issuesand " Karch, Joshua
  0 siblings, 1 reply; 9+ messages in thread
From: Karch, Joshua @ 2008-05-07 19:33 UTC (permalink / raw)
  To: rpm; +Cc: xenomai


Philippe,

Here is a disassembly of __rt_task_receive and the source code used to generate the error. The error happens if I use TM_NONBLOCK, regardless of whether or not I enable a reply. In the following example, I use rt_task_send with no reply requested (NULL), and all replies commented out.

I got three errors this time (-22, -11 and -110), and then finally:
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000020c
printing eip: c014f014 *pde = 00000000
I typed the above in by hand, since I didn't capture it through a serial terminal.

From what I can see here, there's no place for a loop to form in the talker task. I have the wait period set to the main rate, defined as 20 ms for this example. Calling rt_task_send with or without an RT_TASK_MCB receive struct, and without a reply from the listener task, still results in the bug. My platform is a Geode, and I'm running Linux 2.6.24.4 with Xenomai 2.4.3.

Thank you,

Josh




c0151501 <__rt_task_receive>:
c0151501:	55                   	push   %ebp
c0151502:	89 e5                	mov    %esp,%ebp
c0151504:	57                   	push   %edi
c0151505:	89 d7                	mov    %edx,%edi
c0151507:	56                   	push   %esi
c0151508:	89 c6                	mov    %eax,%esi
c015150a:	53                   	push   %ebx
c015150b:	83 ec 5c             	sub    $0x5c,%esp
c015150e:	8b 1a                	mov    (%edx),%ebx
c0151510:	8b 40 04             	mov    0x4(%eax),%eax
c0151513:	89 da                	mov    %ebx,%edx
c0151515:	83 c2 10             	add    $0x10,%edx
c0151518:	19 c9                	sbb    %ecx,%ecx
c015151a:	39 50 18             	cmp    %edx,0x18(%eax)
c015151d:	83 d9 00             	sbb    $0x0,%ecx
c0151520:	85 c9                	test   %ecx,%ecx
c0151522:	0f 85 dd 00 00 00    	jne    c0151605 <__rt_task_receive+0x104>
c0151528:	b9 10 00 00 00       	mov    $0x10,%ecx
c015152d:	89 da                	mov    %ebx,%edx
c015152f:	8d 45 dc             	lea    0xffffffdc(%ebp),%eax
c0151532:	e8 29 a2 08 00       	call   c01db760 <__copy_from_user_ll_nozero>
c0151537:	8b 4d e8             	mov    0xffffffe8(%ebp),%ecx
c015153a:	85 c9                	test   %ecx,%ecx
c015153c:	74 18                	je     c0151556 <__rt_task_receive+0x55>
c015153e:	8b 55 e4             	mov    0xffffffe4(%ebp),%edx
c0151541:	8b 46 04             	mov    0x4(%esi),%eax
c0151544:	01 ca                	add    %ecx,%edx
c0151546:	19 db                	sbb    %ebx,%ebx
c0151548:	39 50 18             	cmp    %edx,0x18(%eax)
c015154b:	83 db 00             	sbb    $0x0,%ebx
c015154e:	85 db                	test   %ebx,%ebx
c0151550:	0f 85 af 00 00 00    	jne    c0151605 <__rt_task_receive+0x104>
c0151556:	8b 5f 04             	mov    0x4(%edi),%ebx
c0151559:	8b 46 04             	mov    0x4(%esi),%eax
c015155c:	89 da                	mov    %ebx,%edx
c015155e:	83 c2 08             	add    $0x8,%edx
c0151561:	19 c9                	sbb    %ecx,%ecx
c0151563:	39 50 18             	cmp    %edx,0x18(%eax)
c0151566:	83 d9 00             	sbb    $0x0,%ecx
c0151569:	85 c9                	test   %ecx,%ecx
c015156b:	0f 85 94 00 00 00    	jne    c0151605 <__rt_task_receive+0x104>
c0151571:	89 da                	mov    %ebx,%edx
c0151573:	b9 08 00 00 00       	mov    $0x8,%ecx
c0151578:	8d 45 ec             	lea    0xffffffec(%ebp),%eax
c015157b:	e8 e0 a1 08 00       	call   c01db760 <__copy_from_user_ll_nozero>
c0151580:	8b 5d e4             	mov    0xffffffe4(%ebp),%ebx
c0151583:	8b 55 e8             	mov    0xffffffe8(%ebp),%edx
c0151586:	89 5d 98             	mov    %ebx,0xffffff98(%ebp)
c0151589:	31 db                	xor    %ebx,%ebx
c015158b:	85 d2                	test   %edx,%edx
c015158d:	74 22                	je     c01515b1 <__rt_task_receive+0xb0>
c015158f:	83 fa 40             	cmp    $0x40,%edx
c0151592:	77 05                	ja     c0151599 <__rt_task_receive+0x98>
c0151594:	8d 5d 9c             	lea    0xffffff9c(%ebp),%ebx
c0151597:	eb 15                	jmp    c01515ae <__rt_task_receive+0xad>
c0151599:	b8 60 b9 3c c0       	mov    $0xc03cb960,%eax
c015159e:	be f4 ff ff ff       	mov    $0xfffffff4,%esi
c01515a3:	e8 25 f2 fe ff       	call   c01407cd <xnheap_alloc>
c01515a8:	85 c0                	test   %eax,%eax
c01515aa:	74 5e                	je     c015160a <__rt_task_receive+0x109>
c01515ac:	89 c3                	mov    %eax,%ebx
c01515ae:	89 5d e4             	mov    %ebx,0xffffffe4(%ebp)
c01515b1:	8b 55 ec             	mov    0xffffffec(%ebp),%edx
c01515b4:	8d 45 dc             	lea    0xffffffdc(%ebp),%eax
c01515b7:	8b 4d f0             	mov    0xfffffff0(%ebp),%ecx
c01515ba:	e8 42 d9 ff ff       	call   c014ef01 <rt_task_receive>
c01515bf:	85 c0                	test   %eax,%eax
c01515c1:	89 c6                	mov    %eax,%esi
c01515c3:	7e 12                	jle    c01515d7 <__rt_task_receive+0xd6>
c01515c5:	8b 4d e8             	mov    0xffffffe8(%ebp),%ecx
c01515c8:	85 c9                	test   %ecx,%ecx
c01515ca:	74 0b                	je     c01515d7 <__rt_task_receive+0xd6>
c01515cc:	8b 55 e4             	mov    0xffffffe4(%ebp),%edx
c01515cf:	8b 45 98             	mov    0xffffff98(%ebp),%eax
c01515d2:	e8 e9 9f 08 00       	call   c01db5c0 <__copy_to_user_ll>
c01515d7:	8b 45 98             	mov    0xffffff98(%ebp),%eax
c01515da:	8d 55 dc             	lea    0xffffffdc(%ebp),%edx
c01515dd:	b9 10 00 00 00       	mov    $0x10,%ecx
c01515e2:	89 45 e4             	mov    %eax,0xffffffe4(%ebp)
c01515e5:	8b 07                	mov    (%edi),%eax
c01515e7:	e8 d4 9f 08 00       	call   c01db5c0 <__copy_to_user_ll>
c01515ec:	85 db                	test   %ebx,%ebx
c01515ee:	74 1a                	je     c015160a <__rt_task_receive+0x109>
c01515f0:	8d 45 9c             	lea    0xffffff9c(%ebp),%eax
c01515f3:	39 c3                	cmp    %eax,%ebx
c01515f5:	74 13                	je     c015160a <__rt_task_receive+0x109>
c01515f7:	89 da                	mov    %ebx,%edx
c01515f9:	b8 60 b9 3c c0       	mov    $0xc03cb960,%eax
c01515fe:	e8 a4 f0 fe ff       	call   c01406a7 <xnheap_free>
c0151603:	eb 05                	jmp    c015160a <__rt_task_receive+0x109>
c0151605:	be f2 ff ff ff       	mov    $0xfffffff2,%esi
c015160a:	83 c4 5c             	add    $0x5c,%esp
c015160d:	89 f0                	mov    %esi,%eax
c015160f:	5b                   	pop    %ebx
c0151610:	5e                   	pop    %esi
c0151611:	5f                   	pop    %edi
c0151612:	5d                   	pop    %ebp
c0151613:	c3                   	ret    




Associated source code:


//main.cpp
#include "talker.h"
#include "listener.h"
#include <sys/mman.h>
#include <native/task.h>
#include <signal.h>

volatile int exitprogram;
RT_TASK talk_task, listen_task;


void catch_signal(int sig)
{
	exitprogram=1;
}

int main(int argc, char *argv[])
{
	exitprogram=0;
	
	mlockall(MCL_CURRENT | MCL_FUTURE);
	//wait for ctrl-c
	signal(SIGTERM, catch_signal);
	signal(SIGINT, catch_signal);
	talker *talk = new talker();
	listener *listen = new listener();
	talk->startup(&talk_task, &listen_task);
	sleep(1);
	listen->startup(&talk_task, &listen_task);
	pause();
	sleep(2);
	rt_task_join(&talk_task);
	rt_task_join(&listen_task);
	rt_task_delete(&talk_task);
	rt_task_delete(&listen_task);
	return 0;
}

//talker.h
#ifndef TALKER_H_
#define TALKER_H_
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <native/task.h>
#include <native/timer.h>
#define TIMEOUT (100000000)
#define MAIN_RATE_NS 20000000


extern volatile int exitprogram;

class talker {
public:
	talker();
	virtual ~talker();
	void mainloop();
	int startup(RT_TASK *talktask, RT_TASK *listentask);

private:
	RT_TASK *talk_task, *listen_task;
	void shutdown();
	int iret;

protected:
	static void thunk(void * param);
};



#endif /*TALKER_H_*/



//talker.cpp
#include "talker.h"

talker::talker()
{
}
talker::~talker()
{
}

int talker::startup(RT_TASK *talktask, RT_TASK *listentask)
{
	talk_task=talktask;
	listen_task=listentask;
	rt_task_create(talk_task, "talk_task", 0, 51, 0);
	iret= rt_task_start(talk_task,thunk,(void*)(this));
	return(0);


}
void talker::thunk(void * param)
{
	talker *instance = (talker *) param;
	instance->mainloop();
}
void talker::mainloop()
{
	printf("talker task started\n");
	int len;
	RT_TASK_MCB talk_send, talk_reply;

	rt_task_set_periodic(NULL,TM_NOW,MAIN_RATE_NS);
	while (1)
	{
		if(exitprogram)
			break;
		rt_task_wait_period(NULL);
		talk_send.opcode = 0x01;
		talk_send.data = NULL;
		talk_send.size = 0;
		talk_reply.size = 0;
		talk_reply.data = NULL;
		len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
		if (len < 0) printf("rt_task_send error\n");
		if (talk_reply.opcode != 4)
			printf("len=%d, opcode=%d\n", len, talk_reply.opcode);
	}
	shutdown();
}

void talker::shutdown()
{
    printf("Talker exits with return %d\n",iret);
}


//listener.h
#ifndef LISTENER_H_
#define LISTENER_H_
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <native/task.h>
#include <native/timer.h>



extern volatile int exitprogram;

class listener {
public:
	listener();
	virtual ~listener();
	void mainloop();
	int startup(RT_TASK *talktask, RT_TASK *listentask);

private:
	RT_TASK *talk_task, *listen_task;
	void shutdown();
	int iret;

protected:
	static void thunk(void * param);
};



#endif /*LISTENER_H_*/


//listener.cpp

#include "listener.h"

listener::listener()
{
}

listener::~listener()
{
}

int listener::startup(RT_TASK *talktask, RT_TASK *listentask)
{
	talk_task=talktask;
	listen_task=listentask;
	rt_task_create(listen_task, "listen_task", 0, 50, 0);
	iret= rt_task_start(listen_task,thunk,(void*)(this));
	return(0);


}

void listener::thunk(void * param)
{
	listener *instance = (listener *) param;
	instance->mainloop();
}

void listener::mainloop()
{
	printf("listener task started\n");
	unsigned char buf[10];
	RT_TASK_MCB listen_rcv, listen_reply;
	
	while (1)
	{
		int taskid;
		if(exitprogram)
			break;
		listen_rcv.data = (caddr_t)buf;
		listen_rcv.size = sizeof(buf);
		taskid = rt_task_receive(&listen_rcv,TM_INFINITE);
		printf("received data with opcode %d\n",listen_rcv.opcode);
		listen_reply.opcode = 4;
		listen_reply.size = 0;
		listen_reply.data = NULL;
		rt_task_reply(taskid, &listen_reply);
	}
	shutdown();
}

void listener::shutdown()
{
    printf("listener exits with return %d\n",iret);
}







* Re: [Xenomai-help] rt_task_send / receive problems various issuesand bug trace
  2008-05-07 19:33   ` Karch, Joshua
@ 2008-05-07 19:35     ` Karch, Joshua
  2008-05-12 12:36       ` [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire Karch, Joshua
  2008-05-17 21:18       ` [Xenomai-help] rt_task_send / receive problems various issuesand bug trace Philippe Gerum
  0 siblings, 2 replies; 9+ messages in thread
From: Karch, Joshua @ 2008-05-07 19:35 UTC (permalink / raw)
  To: Karch, Joshua, rpm; +Cc: xenomai


Philippe, a small clarification: the example in my previous message has replies enabled and TM_NONBLOCK on rt_task_send. I tried the code two different ways:
len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
len = rt_task_send(listen_task,&talk_send,NULL,TM_NONBLOCK);// also deleted rt_task_reply from listen_task and commented out all appropriate RT_TASK_MCB reply structs

Both end in the same bug.

Josh






Associated source code:


//main.cpp
#include "talker.h"
#include "listener.h"
#include <sys/mman.h>
#include <native/task.h>
#include <signal.h>

volatile int exitprogram;
RT_TASK talk_task, listen_task;


void catch_signal(int sig)
{
	exitprogram=1;
}

int main(int argc, char *argv[])
{
	exitprogram=0;
	
	mlockall(MCL_CURRENT | MCL_FUTURE);
	//wait for ctrl-c
	signal(SIGTERM, catch_signal);
	signal(SIGINT, catch_signal);
	talker *talk = new talker();
	listener *listen = new listener();
	talk->startup(&talk_task, &listen_task);
	sleep(1);
	listen->startup(&talk_task, &listen_task);
	pause();
	sleep(2);
	rt_task_join(&talk_task);
	rt_task_join(&listen_task);
	rt_task_delete(&talk_task);
	rt_task_delete(&listen_task);
	return 0;
}

//talker.h
#ifndef TALKER_H_
#define TALKER_H_
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <native/task.h>
#include <native/timer.h>
#define TIMEOUT (100000000)
#define MAIN_RATE_NS 20000000


extern volatile int exitprogram;

class talker {
public:
	talker();
	virtual ~talker();
	void mainloop();
	int startup(RT_TASK *talktask, RT_TASK *listentask);

private:
	RT_TASK *talk_task, *listen_task;
	void shutdown();
	int iret;

protected:
	static void thunk(void * param);
};



#endif /*TALKER_H_*/



//talker.cpp
#include "talker.h"

talker::talker()
{
}
talker::~talker()
{
}

int talker::startup(RT_TASK *talktask, RT_TASK *listentask)
{
	talk_task=talktask;
	listen_task=listentask;
	rt_task_create(talk_task, "talk_task", 0, 51, 0);
	iret= rt_task_start(talk_task,thunk,(void*)(this));
	return(0);


}
void talker::thunk(void * param)
{
	talker *instance = (talker *) param;
	instance->mainloop();
}
void talker::mainloop()
{
	printf("talker task started\n");
	int len;
	RT_TASK_MCB talk_send, talk_reply;

	rt_task_set_periodic(NULL,TM_NOW,MAIN_RATE_NS);
   	while (1)
	{
		if(exitprogram)
			break;
		rt_task_wait_period(NULL);
		talk_send.opcode = 0x01;
		talk_send.data = NULL;
		talk_send.size = 0;
		talk_reply.opcode = 0;	// initialize: the check below reads it even when the send fails
		talk_reply.size = 0;
		talk_reply.data = NULL;
		len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
		if (len < 0) printf("rt_task_send error\n");
		if (talk_reply.opcode != 4)
			printf("len=%d, opcode=%d\n", len, talk_reply.opcode);
	}
	shutdown();
}

void talker::shutdown()
{
    printf("Talker exits with return %d\n",iret);
}


//listener.h
#ifndef LISTENER_H_
#define LISTENER_H_
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <native/task.h>
#include <native/timer.h>



extern volatile int exitprogram;

class listener {
public:
	listener();
	virtual ~listener();
	void mainloop();
	int startup(RT_TASK *talktask, RT_TASK *listentask);

private:
	RT_TASK *talk_task, *listen_task;
	void shutdown();
	int iret;

protected:
	static void thunk(void * param);
};



#endif /*LISTENER_H_*/


//listener.cpp

#include "listener.h"

listener::listener()
{
}

listener::~listener()
{
}

int listener::startup(RT_TASK *talktask, RT_TASK *listentask)
{
	talk_task=talktask;
	listen_task=listentask;
	rt_task_create(listen_task, "listen_task", 0, 50, 0);
	iret= rt_task_start(listen_task,thunk,(void*)(this));
	return(0);


}

void listener::thunk(void * param)
{
	listener *instance = (listener *) param;
	instance->mainloop();
}

void listener::mainloop()
{
	printf("listener task started\n");
	unsigned char buf[10];
	RT_TASK_MCB listen_rcv, listen_reply;
	
	while (1)
	{
		int taskid;
		if(exitprogram)
			break;
		listen_rcv.data = (caddr_t)buf;
		listen_rcv.size = sizeof(buf);
		taskid = rt_task_receive(&listen_rcv,TM_INFINITE);
		printf("received data with opcode %d\n",listen_rcv.opcode);
		listen_reply.opcode = 4;
		listen_reply.size = 0;
		listen_reply.data = NULL;
		rt_task_reply(taskid, &listen_reply);
	}
	shutdown();
}

void listener::shutdown()
{
    printf("listener exits with return %d\n",iret);
}






-----Original Message-----
From: Philippe Gerum on behalf of Philippe Gerum
Sent: Wed 5/7/2008 1:24 PM
To: Karch, Joshua
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] rt_task_send / receive problems various issues and	bug trace
 
Karch, Joshua wrote:
> 
> Hello,
> 
> I'm using rt_task_send from a talker task and rt_task_receive/reply from
> a listener task.  When I launch the two tasks in the following order:
> listener task,  talker task,  everything runs normally.
> 
> However, when I launch the talker task first, and then the listener task
> second, I receive rt_task_send error -22 and after a bit of time the
> listener task starts up.

-EINVAL is not on the error list for rt_task_send(). Could you confirm this result?
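
For reference, the negative return values quoted in this thread (-22, -11 and -110) are negated errno values. A minimal sketch of decoding them follows, assuming Linux errno numbering; errno_name is a hypothetical helper written for illustration and is not part of the Xenomai API.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>

// Hypothetical helper (not part of Xenomai): map a negative return
// value to its errno symbol. Only the three values reported in this
// thread are handled, and Linux errno numbering is assumed.
static const char *errno_name(int ret)
{
	assert(ret < 0);	// callers pass the raw negative return value
	switch (-ret) {
	case EINVAL:    return "EINVAL";	// 22 on Linux
	case EAGAIN:    return "EAGAIN";	// 11 on Linux, same value as EWOULDBLOCK
	case ETIMEDOUT: return "ETIMEDOUT";	// 110 on Linux
	default:        return "unknown";
	}
}
```

Passing the len value returned by rt_task_send straight to such a helper makes log lines like "len=-22" easier to read while debugging.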

> I know this is logical and to be expected,

No, it's not. rt_task_send() does wait for the receiver to listen, unless you
asked for a non-blocking call using TM_NONBLOCK as a timeout.

> however, it appears that issuing the rt_task_send command to a task that
> hasn't been started occasionally locks up sufficient resources to
> prevent the listener task from starting. 

Any chance your code enters a tight loop due to rt_task_send() failing repeatedly?

> By controlling task startup
> order, I was able to circumvent this issue.  Both tasks have similar
> priorities (50, 51).
> 
> Additionally, I am unable to use rt_task_send with TM_NONBLOCK
> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
> 
> I get a bug failure and have to reset the machine-- see below:
> 

Please disassemble your "vmlinux" kernel image, the exact one that causes the bug:
$ objdump -d vmlinux > foo.txt
In that large file, search for the "__rt_task_receive" symbol (notice the double
underscore prefix; we also have the "rt_task_receive" symbol, but we don't need
that code at the moment), then copy and paste the disassembly for that
function. I'll have a look.

Step #2 is to send a simple piece of code that exhibits the problem. This will
speed up the debugging and fixing process.

> The reason I want to use TM_NONBLOCK is so that I can send a trigger
> message from the producer task to the consumer task without requiring a
> reply to trigger the consumer task to act on the data received. I am
> using the rt_task_send trigger message to gate and synchronize the
> consumer task. Is a reply required for all rt_task_send? 
> 
> It seems if I don't send a reply when rt_task_send has a timeout
> specified the sending task locks up and the listening task runs rampant,
> i.e. rt_task_receive no longer blocks and the loop runs with no delay
> and essentially locks up the machine, since I don't use
> rt_set_task_periodic on the listening task.
> 
> Here is the trace, and it requires a reboot.  I also can attach the
> code- it is written in c++ with two separate classes as a model of the
> application I am building.
> 
> Thank you,
> 
> Joshua Karch
> 
> talker task started
> rt_task_send error
> len=-22, opcode=0
> listener task started
> rt_task_send error
> len=-110, opcode=0
> BUG: unable to handle kernel NULL pointer dereference at virtual address 0000020c
> printing eip: c014f014
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT
> Modules linked in: xeno_timerbench nfs ipv6 nfsd lockd nfs_acl sunrpc
> exportfs dm_snapshot dm_mirror dm_mod loop pcmcia firmware_class
> serio_raw psmouse yenta_socket rsrc_nonstatic pcmcia_core cs5535_gpio
> joydev evdev ext3 jbd mbcache usbhid ide_disk generic ohci_hcd ehci_hcd
> amd74xx usbcore ide_core e100 mii
> 
> Pid: 1973, comm: listen_task Not tainted (2.6.24.4 #12)
> EIP: 0060:[<c014f014>] EFLAGS: 00010093 CPU: 0
> EIP is at rt_task_receive+0x113/0x18b
> EAX: cdc71f28 EBX: fffffd98 ECX: c03c5500 EDX: cd820acc
> ESI: ffffff97 EDI: cd820610 EBP: cdc71edc ESP: cdc71ec4
>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process listen_task (pid: 1973, ti=cdc70000 task=cdc6d320 task.ti=cdc70000)
> I-pipe domain Linux
> Stack: 00000000 cdc71f28 00000000 cdc71ee8 cdc6d320 cdc71fb8 cdc71f4c c01515bf
>        b7d083ea cdc71f00 c0102cb1 ce46bc70 c03c7e90 cd820620 cdc6d620 cd80ff44
>        c029f55b cdc71f34 00000082 b91264ee 0000002e cdc6d320 b9127ee2 0000002e
> Call Trace:
>  [<c01045f1>] show_trace_log_lvl+0x1a/0x2f
>  [<c01046a3>] show_stack_log_lvl+0x9d/0xa5
>  [<c0104769>] show_registers+0xbe/0x1fd
>  [<c01049c1>] die+0x119/0x20a
>  [<c010ef52>] do_page_fault+0x480/0x57e
>  [<c010ca34>] __ipipe_handle_exception+0x11e/0x166
>  [<c02a0e7f>] error_code+0x6f/0x80
>  [<c01515bf>] __rt_task_receive+0xbe/0x113
>  [<c0149b0d>] losyscall_event+0x99/0x13d
>  [<c013f05b>] __ipipe_dispatch_event+0xac/0x16c
>  [<c010c8b1>] __ipipe_syscall_root+0x6a/0xcf
>  [<c0103e89>] system_call+0x29/0x4a
>  =======================
> Code: 00 00 8d 87 bc 04 00 00 39 c2 0f 95 c0 0f b6 c0 f7 d8 21 d0 31 db
> 3d 58 02 00 00 74 06 8d 98 98 fd ff ff 8b 45 ec be 97 ff ff ff <8b> 93
> 74 04 00 00 3b 50 0c 77 23 85 d2 74 19 89 d1 8b b3 70 04
> EIP: [<c014f014>] rt_task_receive+0x113/0x18b SS:ESP 0068:cdc71ec4
> ---[ end trace 523bcd2b73b75979 ]---
> 
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Xenomai-help mailing list
> Xenomai-help@domain.hid
> https://mail.gna.org/listinfo/xenomai-help


-- 
Philippe.




* Re: [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire
  2008-05-07 19:35     ` [Xenomai-help] rt_task_send / receive problems various issuesand " Karch, Joshua
@ 2008-05-12 12:36       ` Karch, Joshua
  2008-05-17 21:26         ` Philippe Gerum
  2008-05-17 21:18       ` [Xenomai-help] rt_task_send / receive problems various issuesand bug trace Philippe Gerum
  1 sibling, 1 reply; 9+ messages in thread
From: Karch, Joshua @ 2008-05-12 12:36 UTC (permalink / raw)
  To: rpm; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 19494 bytes --]

Philippe,

I was wondering if you had made any progress with the rt_task_send/receive bug?  I was able to get the synchronized serial chain to work: the sensor task issues an rt_task_send with a timeout at 50 Hz to the control task, which then synchronously issues an rt_task_send with a timeout to the motor driver on the second serial port.  When I set the motor driver's timeout to 100 msec and unplug the motor driver's serial interface to force that timeout, I immediately get rt_task_send errors.  If and only if rt_task_send expects a reply, some of these errors are bad enough to crash the system; otherwise, with the reply MCB argument set to NULL, rt_task_send fails indefinitely until the program is killed and a segfault occurs.  Clearly, with a 100 msec timeout and a 20 msec control period, up to five rt_task_send calls will come from the control law, which is chain-linked to the sensor task, before a single reply is sent back from the servo controller.

What is the expected behavior of repeatedly issuing rt_task_send when the other task has not replied within the timeout passed to rt_task_send?  The 100 msec timeout is akin to having three cars on a highway, with the front one (the servo motor) slamming on the brakes and causing a major collision.  Basically, the servo controller must always respond within 20 msec, the execution period of the control task, to prevent the Xenomai equivalent of a traffic jam.  Can and should this be avoided by using a function like rt_task_inquire to check whether the servo serial driver is blocked on a serial read and, if so, skipping the rt_task_send call?  That way, the synchronous relationship between the serial sensor and the control law would never be broken.
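
The "up to five" figure and the idea of skipping a send while the downstream task is still busy can be sketched in plain C++. This is only an application-level stand-in under stated assumptions: it does not call rt_task_inquire or any other Xenomai service, and periods_per_timeout and send_gate are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Number of sender periods that elapse while one request is still
// waiting for its reply (integer division, both values in ns).
static inline std::uint64_t periods_per_timeout(std::uint64_t timeout_ns,
						std::uint64_t period_ns)
{
	assert(period_ns > 0);
	return timeout_ns / period_ns;
}

// Hypothetical guard that allows at most one request in flight.
class send_gate {
public:
	// Returns true if the caller may send now; false if a previous
	// request is still outstanding and this cycle should be skipped.
	bool try_begin() { return !busy_.exchange(true); }

	// Called once the reply has arrived or the request has timed out.
	void end() { busy_.store(false); }

private:
	std::atomic<bool> busy_{false};
};
```

With the values from this message, periods_per_timeout(100000000, 20000000) gives 5. A control loop that calls try_begin() before each send and skips the cycle when it returns false never stacks a second request behind a stalled one.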

Regardless, bringing the serial port timeout down to 10 msec, which is reasonable, resolves this problem for now, though it limits the control law's ability to process while the servo motor serial task has timed out.

That's basically my update.  I'd still much prefer to be able to send a non-blocking trigger message without a reply for all rt_task_send calls from the sensor to the control law, to regulate the control law's loop rate, and perhaps do the same for the motor controller, which currently places its response in shared memory.
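
The trigger-without-reply pattern described above can be sketched as a counting trigger: the producer posts and moves on, and the consumer blocks until something has been posted. The sketch below uses only standard C++, so it is not real-time safe and is not Xenomai code; the class name trigger and its methods are hypothetical. The native skin's semaphore services (rt_sem_v / rt_sem_p) may be a closer fit in a real application, but that is a suggestion and not something confirmed in this thread.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical counting trigger: post() never waits for the consumer
// and needs no reply; wait() blocks the consumer until at least one
// post is pending.
class trigger {
public:
	void post()
	{
		{
			std::lock_guard<std::mutex> lock(m_);
			++pending_;
		}
		cv_.notify_one();
	}

	void wait()
	{
		std::unique_lock<std::mutex> lock(m_);
		cv_.wait(lock, [this] { return pending_ > 0; });
		--pending_;
	}

	// Non-blocking variant: consumes one pending post if there is one.
	bool try_wait()
	{
		std::lock_guard<std::mutex> lock(m_);
		if (pending_ == 0)
			return false;
		--pending_;
		return true;
	}

private:
	std::mutex m_;
	std::condition_variable cv_;
	unsigned pending_ = 0;
};
```

Because posts are counted, a consumer that falls behind still sees every trigger, and the producer's loop rate is never tied to the consumer's response time.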

Thank you,

Josh



* Re: [Xenomai-help] rt_task_send / receive problems various issuesand bug trace
  2008-05-07 19:35     ` [Xenomai-help] rt_task_send / receive problems various issuesand " Karch, Joshua
  2008-05-12 12:36       ` [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire Karch, Joshua
@ 2008-05-17 21:18       ` Philippe Gerum
  2008-05-19 13:57         ` Karch, Joshua
  1 sibling, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2008-05-17 21:18 UTC (permalink / raw)
  To: Karch, Joshua; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 19057 bytes --]

Karch, Joshua wrote:
> Philippe: small clarification: the following example has replies enabled
> and TM_NONBLOCK on rt_task_send.  I tried the code two different ways:
> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
> len = rt_task_send(listen_task,&talk_send,NULL,TM_NONBLOCK);// also
> deleted rt_task_reply from listen_task and commented out all appropriate
> RT_TASK_MCB reply structs
> 
> both have the same bug in the end.
>

The attached patch fixes the issue.

> Josh
> 
> 
> -----Original Message-----
> From: xenomai-help-bounces@domain.hid on behalf of Karch, Joshua
> Sent: Wed 5/7/2008 3:33 PM
> To: rpm@xenomai.org
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-help] rt_task_send / receive problems various
> issuesand   bug trace
> 
> Phillipe,
> 
> here is a disassembly of __rt_task_receive and the source code used to
> generate the error.  The error happens if I use TM_NONBLOCK, regardless
> of whether or not I enable a reply.  In the following example, I use
> rt_task_send with no reply requested (NULL), and all replies commented out.
> 
> I got three errors this time, -22, -11, -110, and then finally
> BUG: unable to handle kernel NULL pointer dereference at virtual address
> 0000020c
> printing eip: c014f014 *pde = 00000000
> I hand typed the above in since I didn't capture it through a serial
> terminal.
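
For reference, on Linux the three return values quoted above correspond to -EINVAL (-22), -EAGAIN (-11, the same value as -EWOULDBLOCK) and -ETIMEDOUT (-110). Below is a minimal sketch for turning such a return value into a readable name. errno_name() and errno_name_demo() are hypothetical helpers written for illustration; they are not part of the Xenomai API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a negative errno-style return value to a name.
 * Only the codes discussed in this thread are covered. */
const char *errno_name(int ret)
{
    if (ret >= 0)
        return "success";

    switch (-ret) {
    case EINVAL:    return "EINVAL";
    case EAGAIN:    return "EAGAIN/EWOULDBLOCK"; /* same value on Linux */
    case ETIMEDOUT: return "ETIMEDOUT";
    case ESRCH:     return "ESRCH";
    default:        return "unknown";
    }
}

/* Usage example: decode the three values reported in the thread. */
void errno_name_demo(void)
{
    static const int seen[] = { -22, -11, -110 };
    size_t i;

    assert(strcmp(errno_name(-EINVAL), "EINVAL") == 0);

    for (i = 0; i < sizeof(seen) / sizeof(seen[0]); i++)
        printf("%d -> %s (%s)\n", seen[i], errno_name(seen[i]),
               strerror(-seen[i]));
}
```

Printing the symbolic name next to "rt_task_send error" in the talker loop would make the sequence of failures easier to follow than the raw number.
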
> 
> From what I can see here, there's no place for a local loop to form in
> the talker task. I have the wait period set to the main rate, defined as
> a 20 msec for this example.  Calling rt_task_send with or without a
> RT_TASK_MCB receive struct and without a reply from the listener task
> still results in the bug occurring.  My platform is a Geode, and I'm
> running 2.6.24-4 with Xenomai 2.4.3
> 
> Thank you,
> 
> Josh
> 
> 
> 
> 
> c0151501 <__rt_task_receive>:
> c0151501:       55                      push   %ebp
> c0151502:       89 e5                   mov    %esp,%ebp
> c0151504:       57                      push   %edi
> c0151505:       89 d7                   mov    %edx,%edi
> c0151507:       56                      push   %esi
> c0151508:       89 c6                   mov    %eax,%esi
> c015150a:       53                      push   %ebx
> c015150b:       83 ec 5c                sub    $0x5c,%esp
> c015150e:       8b 1a                   mov    (%edx),%ebx
> c0151510:       8b 40 04                mov    0x4(%eax),%eax
> c0151513:       89 da                   mov    %ebx,%edx
> c0151515:       83 c2 10                add    $0x10,%edx
> c0151518:       19 c9                   sbb    %ecx,%ecx
> c015151a:       39 50 18                cmp    %edx,0x18(%eax)
> c015151d:       83 d9 00                sbb    $0x0,%ecx
> c0151520:       85 c9                   test   %ecx,%ecx
> c0151522:       0f 85 dd 00 00 00       jne    c0151605
> <__rt_task_receive+0x104>
> c0151528:       b9 10 00 00 00          mov    $0x10,%ecx
> c015152d:       89 da                   mov    %ebx,%edx
> c015152f:       8d 45 dc                lea    0xffffffdc(%ebp),%eax
> c0151532:       e8 29 a2 08 00          call   c01db760
> <__copy_from_user_ll_nozero>
> c0151537:       8b 4d e8                mov    0xffffffe8(%ebp),%ecx
> c015153a:       85 c9                   test   %ecx,%ecx
> c015153c:       74 18                   je     c0151556
> <__rt_task_receive+0x55>
> c015153e:       8b 55 e4                mov    0xffffffe4(%ebp),%edx
> c0151541:       8b 46 04                mov    0x4(%esi),%eax
> c0151544:       01 ca                   add    %ecx,%edx
> c0151546:       19 db                   sbb    %ebx,%ebx
> c0151548:       39 50 18                cmp    %edx,0x18(%eax)
> c015154b:       83 db 00                sbb    $0x0,%ebx
> c015154e:       85 db                   test   %ebx,%ebx
> c0151550:       0f 85 af 00 00 00       jne    c0151605
> <__rt_task_receive+0x104>
> c0151556:       8b 5f 04                mov    0x4(%edi),%ebx
> c0151559:       8b 46 04                mov    0x4(%esi),%eax
> c015155c:       89 da                   mov    %ebx,%edx
> c015155e:       83 c2 08                add    $0x8,%edx
> c0151561:       19 c9                   sbb    %ecx,%ecx
> c0151563:       39 50 18                cmp    %edx,0x18(%eax)
> c0151566:       83 d9 00                sbb    $0x0,%ecx
> c0151569:       85 c9                   test   %ecx,%ecx
> c015156b:       0f 85 94 00 00 00       jne    c0151605
> <__rt_task_receive+0x104>
> c0151571:       89 da                   mov    %ebx,%edx
> c0151573:       b9 08 00 00 00          mov    $0x8,%ecx
> c0151578:       8d 45 ec                lea    0xffffffec(%ebp),%eax
> c015157b:       e8 e0 a1 08 00          call   c01db760
> <__copy_from_user_ll_nozero>
> c0151580:       8b 5d e4                mov    0xffffffe4(%ebp),%ebx
> c0151583:       8b 55 e8                mov    0xffffffe8(%ebp),%edx
> c0151586:       89 5d 98                mov    %ebx,0xffffff98(%ebp)
> c0151589:       31 db                   xor    %ebx,%ebx
> c015158b:       85 d2                   test   %edx,%edx
> c015158d:       74 22                   je     c01515b1
> <__rt_task_receive+0xb0>
> c015158f:       83 fa 40                cmp    $0x40,%edx
> c0151592:       77 05                   ja     c0151599
> <__rt_task_receive+0x98>
> c0151594:       8d 5d 9c                lea    0xffffff9c(%ebp),%ebx
> c0151597:       eb 15                   jmp    c01515ae
> <__rt_task_receive+0xad>
> c0151599:       b8 60 b9 3c c0          mov    $0xc03cb960,%eax
> c015159e:       be f4 ff ff ff          mov    $0xfffffff4,%esi
> c01515a3:       e8 25 f2 fe ff          call   c01407cd <xnheap_alloc>
> c01515a8:       85 c0                   test   %eax,%eax
> c01515aa:       74 5e                   je     c015160a
> <__rt_task_receive+0x109>
> c01515ac:       89 c3                   mov    %eax,%ebx
> c01515ae:       89 5d e4                mov    %ebx,0xffffffe4(%ebp)
> c01515b1:       8b 55 ec                mov    0xffffffec(%ebp),%edx
> c01515b4:       8d 45 dc                lea    0xffffffdc(%ebp),%eax
> c01515b7:       8b 4d f0                mov    0xfffffff0(%ebp),%ecx
> c01515ba:       e8 42 d9 ff ff          call   c014ef01 <rt_task_receive>
> c01515bf:       85 c0                   test   %eax,%eax
> c01515c1:       89 c6                   mov    %eax,%esi
> c01515c3:       7e 12                   jle    c01515d7
> <__rt_task_receive+0xd6>
> c01515c5:       8b 4d e8                mov    0xffffffe8(%ebp),%ecx
> c01515c8:       85 c9                   test   %ecx,%ecx
> c01515ca:       74 0b                   je     c01515d7
> <__rt_task_receive+0xd6>
> c01515cc:       8b 55 e4                mov    0xffffffe4(%ebp),%edx
> c01515cf:       8b 45 98                mov    0xffffff98(%ebp),%eax
> c01515d2:       e8 e9 9f 08 00          call   c01db5c0 <__copy_to_user_ll>
> c01515d7:       8b 45 98                mov    0xffffff98(%ebp),%eax
> c01515da:       8d 55 dc                lea    0xffffffdc(%ebp),%edx
> c01515dd:       b9 10 00 00 00          mov    $0x10,%ecx
> c01515e2:       89 45 e4                mov    %eax,0xffffffe4(%ebp)
> c01515e5:       8b 07                   mov    (%edi),%eax
> c01515e7:       e8 d4 9f 08 00          call   c01db5c0 <__copy_to_user_ll>
> c01515ec:       85 db                   test   %ebx,%ebx
> c01515ee:       74 1a                   je     c015160a
> <__rt_task_receive+0x109>
> c01515f0:       8d 45 9c                lea    0xffffff9c(%ebp),%eax
> c01515f3:       39 c3                   cmp    %eax,%ebx
> c01515f5:       74 13                   je     c015160a
> <__rt_task_receive+0x109>
> c01515f7:       89 da                   mov    %ebx,%edx
> c01515f9:       b8 60 b9 3c c0          mov    $0xc03cb960,%eax
> c01515fe:       e8 a4 f0 fe ff          call   c01406a7 <xnheap_free>
> c0151603:       eb 05                   jmp    c015160a
> <__rt_task_receive+0x109>
> c0151605:       be f2 ff ff ff          mov    $0xfffffff2,%esi
> c015160a:       83 c4 5c                add    $0x5c,%esp
> c015160d:       89 f0                   mov    %esi,%eax
> c015160f:       5b                      pop    %ebx
> c0151610:       5e                      pop    %esi
> c0151611:       5f                      pop    %edi
> c0151612:       5d                      pop    %ebp
> c0151613:       c3                      ret   
> 
> 
> 
> 
> Associated source code:
> 
> 
> //main.cpp
> #include "talker.h"
> #include "listener.h"
> #include <sys/mman.h>
> #include <native/task.h>
> #include <signal.h>
> 
> volatile int exitprogram;
> RT_TASK talk_task, listen_task;
> 
> 
> void catch_signal(int sig)
> {
>         exitprogram=1;
> }
> 
> int main(int argc, char *argv[])
> {
>         exitprogram=0;
>        
>         mlockall(MCL_CURRENT | MCL_FUTURE);
>         //wait for ctrl-c
>         signal(SIGTERM, catch_signal);
>         signal(SIGINT, catch_signal);
>         talker *talk = new talker();
>         listener *listen = new listener();
>         talk->startup(&talk_task, &listen_task);
>         sleep(1);
>         listen->startup(&talk_task, &listen_task);
>         pause();
>         sleep(2);
>         rt_task_join(&talk_task);
>         rt_task_join(&listen_task);
>         rt_task_delete(&talk_task);
>         rt_task_delete(&listen_task);
>         return 0;
> }
> 
> //talker.h
> #ifndef TALKER_H_
> #define TALKER_H_
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <native/task.h>
> #include <native/timer.h>
> #define TIMEOUT (100000000)
> #define MAIN_RATE_NS 20000000
> 
> 
> extern volatile int exitprogram;
> 
> class talker {
> public:
>         talker();
>         virtual ~talker();
>         void mainloop();
>         int startup(RT_TASK *talktask, RT_TASK *listentask);
> 
> private:
>         RT_TASK *talk_task, *listen_task;
>         void shutdown();
>         int iret;
> 
> protected:
>         static void thunk(void * param);
> };
> 
> 
> 
> #endif /*TALKER_H_*/
> 
> 
> 
> //talker.cpp
> #include "talker.h"
> 
> talker::talker()
> {
> }
> talker::~talker()
> {
> }
> 
> int talker::startup(RT_TASK *talktask, RT_TASK *listentask)
> {
>         talk_task=talktask;
>         listen_task=listentask;
>         rt_task_create(talk_task, "talk_task", 0, 51, 0);
>         iret= rt_task_start(talk_task,thunk,(void*)(this));
>         return(0);
> 
> 
> }
> void talker::thunk(void * param)
> {
>         talker *instance = (talker *) param;
>         instance->mainloop();
> }
> void talker::mainloop()
> {
>         printf("talker task started\n");
>         int len;
>         RT_TASK_MCB talk_send, talk_reply;
> 
>         rt_task_set_periodic(NULL,TM_NOW,MAIN_RATE_NS);
>         while (1)
>         {
>                 if(exitprogram)
>                         break;
>                 rt_task_wait_period(NULL);
>                 talk_send.opcode = 0x01;
>                 talk_send.data = NULL;
>                 talk_send.size = 0;
>                 talk_reply.size = 0;
>                 talk_reply.data = NULL;
>                 len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
>                 if (len < 0) printf("rt_task_send error\n");
>                 if (talk_reply.opcode != 4)
>                 printf("len=%d, opcode=%d\n", len, talk_reply.opcode);
>         }
>         shutdown();
> }
> 
> void talker::shutdown()
> {
>     printf("Talker exits with return %d\n",iret);
> }
> 
> 
> //listener.h
> #ifndef LISTENER_H_
> #define LISTENER_H_
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <native/task.h>
> #include <native/timer.h>
> 
> 
> 
> extern volatile int exitprogram;
> 
> class listener {
> public:
>         listener();
>         virtual ~listener();
>         void mainloop();
>         int startup(RT_TASK *talktask, RT_TASK *listentask);
> 
> private:
>         RT_TASK *talk_task, *listen_task;
>         void shutdown();
>         int iret;
> 
> protected:
>         static void thunk(void * param);
> };
> 
> 
> 
> #endif /*LISTENER_H_*/
> 
> 
> //listener.cpp
> 
> #include "listener.h"
> 
> listener::listener()
> {
> }
> 
> listener::~listener()
> {
> }
> 
> int listener::startup(RT_TASK *talktask, RT_TASK *listentask)
> {
>         talk_task=talktask;
>         listen_task=listentask;
>         rt_task_create(listen_task, "listen_task", 0, 50, 0);
>         iret= rt_task_start(listen_task,thunk,(void*)(this));
>         return(0);
> 
> 
> }
> 
> void listener::thunk(void * param)
> {
>         listener *instance = (listener *) param;
>         instance->mainloop();
> }
> 
> void listener::mainloop()
> {
>         printf("listener task started\n");
>         unsigned char buf[10];
>         RT_TASK_MCB listen_rcv, listen_reply;
>        
>         while (1)
>         {
>                 int taskid;
>                 if(exitprogram)
>                         break;
>                 listen_rcv.data = (caddr_t)buf;
>                 listen_rcv.size = sizeof(buf);
>                 taskid = rt_task_receive(&listen_rcv,TM_INFINITE);
>                 printf("received data with opcode %d\n",listen_rcv.opcode);
>                 listen_reply.opcode = 4;
>                 listen_reply.size = 0;
>                 listen_reply.data = NULL;
>                 rt_task_reply(taskid, &listen_reply);
>         }
>         shutdown();
> }
> 
> void listener::shutdown()
> {
>     printf("listener exits with return %d\n",iret);
> }
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Philippe Gerum on behalf of Philippe Gerum
> Sent: Wed 5/7/2008 1:24 PM
> To: Karch, Joshua
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-help] rt_task_send / receive problems various
> issues and  bug trace
> 
> Karch, Joshua wrote:
>>
>> Hello,
>>
>> I'm using rt_task_send from a talker task and rt_task_receive/reply from
>> a listener task.  When I launch the two tasks in the following order:
>> listener task,  talker task,  everything runs normally.
>>
>> However, when I launch the talker task first, and then the listener task
>> second, I receive rt_task_send error -22 and after a bit of time the
>> listener task starts up.
> 
> -EINVAL is not on the error list for rt_task_send(). Could you confirm
> this result?
> 
>  I know this is logical and to be expected,
> 
> No, it's not. rt_task_send() does wait for the receiver to start listening,
> unless you asked for a non-blocking call using TM_NONBLOCK as a timeout.
> 
>> however, it appears that issuing the rt_task_send command to a task that
>> hasn't been started occasionally locks up sufficient resources to
>> prevent the listener task from starting.
> 
> Any chance your code enters a tight loop due to rt_task_send() failing
> repeatedly?
> 
>  By controlling task startup
>> order, I was able to circumvent this issue.  Both tasks have similar
>> priorities (50, 51).
>>
>> Additionally, I am unable to use rt_task_send with TM_NONBLOCK
>> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
>>
>> I get a bug failure and have to reset the machine-- see below:
>>
> 
> Please disassemble your "vmlinux" kernel image, the exact one that
> causes the bug:
> $ objdump -d vmlinux > foo.txt
> In that large file, search for the "__rt_task_receive" symbol (notice
> the double underscore prefix; we also have the "rt_task_receive" symbol,
> but we don't need that code at the moment), then copy and paste the
> disassembly for that function. I'll have a look.
> 
> Step #2 is to send a simple piece of code that exhibits the problem.
> This will speed up the debugging and fixing process.
> 
>> The reason I want to use TM_NONBLOCK is so that I can send a trigger
>> message from the producer task to the consumer task without requiring a
>> reply to trigger the consumer task to act on the data received. I am
>> using the rt_task_send trigger message to gate and synchronize the
>> consumer task. Is a reply required for all rt_task_send?
>>
>> It seems if I don't send a reply when rt_task_send has a timeout
>> specified the sending task locks up and the listening task runs rampant,
>> i.e. rt_task_receive no longer blocks and the loop runs with no delay
>> and essentially locks up the machine, since I don't use
>> rt_set_task_periodic on the listening task.
>>
>> Here is the trace, and it requires a reboot.  I also can attach the
>> code- it is written in c++ with two separate classes as a model of the
>> application I am building.
>>
>> Thank you,
>>
>> Joshua Karch
>>
>> talker task started
>> rt_task_send error
>> len=-22, opcode=0
>> listener task stBUG: unable to handle kernel NULL pointer dereferencearted
>> rt_task_s at virtual address 0000020c
>> end error
>> len=-printing eip: c014f014 110, opcode=0
>> *pde = 00000000
>> Oops: 0000 [#1] PREEMPT
>> Modules linked in: xeno_timerbench nfs ipv6 nfsd lockd nfs_acl sunrpc
>> exportfs dm_snapshot dm_mirror dm_mod loop pcmcia firmware_class
>> serio_raw psmouse yenta_socket rsrc_nonstatic pcmcia_core cs5535_gpio
>> joydev evdev ext3 jbd mbcache usbhid ide_disk generic ohci_hcd ehci_hcd
>> amd74xx usbcore ide_core e100 mii
>>
>> Pid: 1973, comm: listen_task Not tainted (2.6.24.4 #12)
>> EIP: 0060:[<c014f014>] EFLAGS: 00010093 CPU: 0
>> EIP is at rt_task_receive+0x113/0x18b
>> EAX: cdc71f28 EBX: fffffd98 ECX: c03c5500 EDX: cd820acc
>> ESI: ffffff97 EDI: cd820610 EBP: cdc71edc ESP: cdc71ec4
>>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> Process listen_task (pid: 1973, ti=cdc70000 task=cdc6d320
>> task.ti=cdc70000)<0>
>> I-pipe domain Linux
>> Stack: 00000000 cdc71f28 00000000 cdc71ee8 cdc6d320 cdc71fb8 cdc71f4c
>> c01515bf
>>        b7d083ea cdc71f00 c0102cb1 ce46bc70 c03c7e90 cd820620 cdc6d620
>> cd80ff44
>>        c029f55b cdc71f34 00000082 b91264ee 0000002e cdc6d320 b9127ee2
>> 0000002e
>> Call Trace:
>>  [<c01045f1>] show_trace_log_lvl+0x1a/0x2f
>>  [<c01046a3>] show_stack_log_lvl+0x9d/0xa5
>>  [<c0104769>] show_registers+0xbe/0x1fd
>>  [<c01049c1>] die+0x119/0x20a
>>  [<c010ef52>] do_page_fault+0x480/0x57e
>>  [<c010ca34>] __ipipe_handle_exception+0x11e/0x166
>>  [<c02a0e7f>] error_code+0x6f/0x80
>>  [<c01515bf>] __rt_task_receive+0xbe/0x113
>>  [<c0149b0d>] losyscall_event+0x99/0x13d
>>  [<c013f05b>] __ipipe_dispatch_event+0xac/0x16c
>>  [<c010c8b1>] __ipipe_syscall_root+0x6a/0xcf
>>  [<c0103e89>] system_call+0x29/0x4a
>>  =======================
>> Code: 00 00 8d 87 bc 04 00 00 39 c2 0f 95 c0 0f b6 c0 f7 d8 21 d0 31 db
>> 3d 58 02 00 00 74 06 8d 98 98 fd ff ff 8b 45 ec be 97 ff ff ff <8b> 93
>> 74 04 00 00 3b 50 0c 77 23 85 d2 74 19 89 d1 8b b3 70 04
>> EIP: [<c014f014>] rt_task_receive+0x113/0x18b SS:ESP 0068:cdc71ec4
>> ---[ end trace 523bcd2b73b75979 ]---
>>
>>
>>
>>
>>
> 
> 
> --
> Philippe.
> 
> 


-- 
Philippe.

[-- Attachment #2: msend-fix-non-blocking.patch --]
[-- Type: text/x-diff, Size: 2270 bytes --]

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 3796)
+++ ChangeLog	(working copy)
@@ -1,3 +1,11 @@
+2008-05-17  Philippe Gerum  <rpm@xenomai.org>
+
+	* ksrc/skins/native/task.c (rt_task_send): Apply infinite timeout
+	to wait for a reply once a remote server has been found, in the
+	non-blocking call case. NOTE: Passing TM_NONBLOCK does NOT mean
+	not to wait for a reply, but it means to wait for a reply UNLESS
+	no server is listening to us before we send the request.
+
 2008-05-15  Philippe Gerum  <rpm@xenomai.org>
 
 	* src/skins/vxworks, ksrc/skins/vxworks: Add taskInfoGet()
Index: ksrc/skins/native/task.c
===================================================================
--- ksrc/skins/native/task.c	(revision 3796)
+++ ksrc/skins/native/task.c	(working copy)
@@ -1691,7 +1691,8 @@
  * remote task eventually replies. Passing TM_NONBLOCK causes the
  * service to return immediately without waiting if the remote task is
  * not waiting for messages (i.e. if @a task is not currently blocked
- * on the rt_task_receive() service).
+ * on the rt_task_receive() service); however, the caller will wait
+ * indefinitely for a reply from that remote task if present.
  *
  * @return A positive value is returned upon success, representing the
  * length (in bytes) of the reply message returned by the remote
@@ -1718,6 +1719,9 @@
  * from a context which cannot sleep (e.g. interrupt, non-realtime or
  * scheduler locked).
  *
+ * - -ESRCH is returned if @a task cannot be found (when called from
+ *    user-space only).
+ *
  * Environments:
  *
  * This service can be called from:
@@ -1755,10 +1759,17 @@
 		goto unlock_and_exit;
 	}
 
-	if (timeout == TM_NONBLOCK && xnsynch_nsleepers(&task->mrecv) == 0) {
-		/* Can't block and no server listening; just bail out. */
-		err = -EWOULDBLOCK;
-		goto unlock_and_exit;
+	if (timeout == TM_NONBLOCK) {
+		if (xnsynch_nsleepers(&task->mrecv) == 0) {
+			/* Can't block and no server listening; just bail out. */
+			err = -EWOULDBLOCK;
+			goto unlock_and_exit;
+		} else
+			/*
+			 * Make sure we'll wait indefinitely once we
+			 * know that a remote task is listening.
+			 */
+			timeout = TM_INFINITE;
 	}
 
 	if (xnpod_unblockable_p()) {
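
To make the behaviour change in the hunk above easier to follow, here is a minimal stand-alone model of the new timeout handling. It is only a sketch under stated assumptions: model_send_timeout(), MODEL_TM_INFINITE and MODEL_TM_NONBLOCK are hypothetical stand-ins rather than the Xenomai symbols, their numeric values are chosen for illustration, and the "listeners" argument plays the role of xnsynch_nsleepers(&task->mrecv).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the native skin timeout constants. */
typedef long long model_timeout_t;
#define MODEL_TM_INFINITE  0LL
#define MODEL_TM_NONBLOCK  (-1LL)

/*
 * Model of the patched check in rt_task_send():
 *  - non-blocking request, nobody listening -> fail with -EWOULDBLOCK;
 *  - non-blocking request, a server is listening -> wait for the reply
 *    with an infinite timeout;
 *  - any other timeout is used unchanged.
 * On success, returns 0 and stores the timeout used for the reply wait.
 */
int model_send_timeout(model_timeout_t requested, int listeners,
                       model_timeout_t *effective)
{
    if (requested == MODEL_TM_NONBLOCK) {
        if (listeners == 0)
            return -EWOULDBLOCK;
        *effective = MODEL_TM_INFINITE;
        return 0;
    }

    *effective = requested;
    return 0;
}

/* Usage example covering the three cases. */
void model_send_timeout_demo(void)
{
    model_timeout_t t = 12345;

    /* No server listening: the call bails out and leaves t untouched. */
    assert(model_send_timeout(MODEL_TM_NONBLOCK, 0, &t) == -EWOULDBLOCK);
    assert(t == 12345);

    /* A server is listening: the reply is awaited indefinitely. */
    assert(model_send_timeout(MODEL_TM_NONBLOCK, 1, &t) == 0);
    assert(t == MODEL_TM_INFINITE);

    /* A finite timeout is passed through as-is. */
    assert(model_send_timeout(100000000LL, 0, &t) == 0);
    assert(t == 100000000LL);
}
```

The point of the model is that TM_NONBLOCK only governs whether the sender is willing to wait for a listener, never whether it waits for the reply.
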


* Re: [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire
  2008-05-12 12:36       ` [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire Karch, Joshua
@ 2008-05-17 21:26         ` Philippe Gerum
  2008-05-19 13:59           ` Karch, Joshua
  0 siblings, 1 reply; 9+ messages in thread
From: Philippe Gerum @ 2008-05-17 21:26 UTC (permalink / raw)
  To: Karch, Joshua; +Cc: xenomai

Karch, Joshua wrote:
> Phillipe,
> 
> I was wondering if you had any progress with the rt_task_send/receive
> bug?  I was able to get the synchronized serial chain to work from
> sensor task through an rt_task_send at 50 Hz with a timeout to the
> control task, which then synchronously sends an rt_task_send with
> timeout to the motor driver on the second serial port.  When I set the
> motor driver's timeout to be 100 msec and I unplug the motor driver's
> serial interface to cause this timeout, I immediately get rt_task_send
> errors, some bad enough to cause a system crash if and only if the
> rt_task_send looks for a reply, otherwise, with the reply MCB field set
> to NULL, rt_task_send fails indefinitely until the program is killed and
> a segfault occurs.  Clearly, with a 100 msec timeout, this means that up
> to five rt_task send commands will come from the control law which is
> chain-linked to the sensor task before a single reply is sent back from
> the servo controller.
> 
> What is the expected behavior of repeatedly issuing rt_task_send when
> the other task has not replied within a timeout specified by
> rt_task_send?  The 100 msec timeout is akin to having 3 cars on a
> highway, with the front one (the servo motor) slamming on the brakes and
> causing a major collision.  Basically, the servo controller must always
> respond within 20 msec- the time of execution for the control task in
> order to prevent the xenomai equivalent of a traffic jam.  Can and
> should this be avoided by using a function like rt_task_inquire to see
> if the servo serial driver is blocking on serial read, and if so, not
> issue an rt_task_send command?  In this way, the synchronous
> relationship between the serial sensor and the control law will never be
> broken.
> 
> Regardless, bringing the serial port timeout down to 10msec, which is
> reasonable, resolves this problem for now, though it limits the control
> law's ability to process while the servo motor serial task has timed out.
> 
> That's basically my update.  I'd still much prefer to be able to send a
> non-blocking trigger message without a reply for all rt_task_send calls

It's ok unless you end up implementing some kind of asynchronous protocol using
that message passing interface: this won't work. rt_task_send/receive/reply are
meant to implement synchronous message passing. Just to make sure the
implementation is well understood:

- passing mcb_r == NULL to rt_task_send() does not mean that you won't wait for
any answer from the remote task, you will wait for rt_task_reply() to be called;
that answer will just be discarded before rt_task_send() returns.

- passing timeout == TM_NONBLOCK to rt_task_send() does NOT mean not to wait for
a reply, but it means to wait for a reply indefinitely UNLESS no server is
listening to the sender before it issues the request, in which case
rt_task_send() should return immediately.

In other words, it's purely synchronous stuff. If you need asynchronous message
passing, then you should use another IPC mechanism.
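
To illustrate the distinction (this is not Xenomai code): with an asynchronous mechanism the producer posts a trigger and carries on, and the consumer picks it up later; nobody waits for a reply. Below is a minimal single-threaded sketch of that idea. struct mbox, mbox_post() and mbox_take() are hypothetical names and the fixed capacity is an arbitrary assumption. A real implementation would rely on the locking, queue or semaphore services of the RTOS instead.

```c
#include <assert.h>
#include <stddef.h>

#define MBOX_SLOTS 8 /* capacity chosen arbitrarily for this sketch */

/* Hypothetical fixed-size mailbox holding pending trigger opcodes. */
struct mbox {
    int opcode[MBOX_SLOTS];
    size_t head;  /* index of the oldest queued opcode */
    size_t count; /* number of queued opcodes */
};

/* Post never waits: returns 0 on success, -1 if the mailbox is full. */
int mbox_post(struct mbox *mb, int opcode)
{
    if (mb->count == MBOX_SLOTS)
        return -1;
    mb->opcode[(mb->head + mb->count) % MBOX_SLOTS] = opcode;
    mb->count++;
    return 0;
}

/* Take the oldest opcode: returns 0 on success, -1 if nothing is queued. */
int mbox_take(struct mbox *mb, int *opcode)
{
    if (mb->count == 0)
        return -1;
    *opcode = mb->opcode[mb->head];
    mb->head = (mb->head + 1) % MBOX_SLOTS;
    mb->count--;
    return 0;
}

/* Usage example: two triggers are posted before the consumer runs. */
void mbox_demo(void)
{
    struct mbox mb = { {0}, 0, 0 };
    int op = 0;

    assert(mbox_post(&mb, 0x01) == 0); /* producer continues at once */
    assert(mbox_post(&mb, 0x02) == 0);
    assert(mbox_take(&mb, &op) == 0 && op == 0x01); /* FIFO order */
    assert(mbox_take(&mb, &op) == 0 && op == 0x02);
    assert(mbox_take(&mb, &op) == -1); /* nothing left, no blocking */
}
```

Contrast this with rt_task_send(), where the sender stays blocked until rt_task_reply() is called, even when the reply MCB is NULL.
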

> from the sensor to the control law to regulate the control law's loop
> rate, and perhaps also do the same for the motor controller, which
> currently places its response in shared memory.
> 
> Thank you,
> 
> Josh
> 
> 
> -----Original Message-----
> From: Karch, Joshua
> Sent: Wed 5/7/2008 3:35 PM
> To: Karch, Joshua; rpm@xenomai.org
> Cc: xenomai@xenomai.org
> Subject: RE: [Xenomai-help] rt_task_send / receive problems various
> issuesand   bug trace
> 
> Phillipe: small clarification: the following example has replies enabled
> and TM_NONBLOCK on rt_task_send.  I tried the code two different ways:
> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
> len = rt_task_send(listen_task,&talk_send,NULL,TM_NONBLOCK);// also
> deleted rt_task_reply from listen_task and commented out all appropriate
> RT_TASK_MCB reply structs
> 
> both have the same bug in the end.
> 
> Josh
> 
> 
> -----Original Message-----
> From: xenomai-help-bounces@domain.hid on behalf of Karch, Joshua
> Sent: Wed 5/7/2008 3:33 PM
> To: rpm@xenomai.org
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-help] rt_task_send / receive problems various
> issuesand   bug trace
> 
> Phillipe,
> 
> here is a disassembly of __rt_task_receive and the source code used to
> generate the error.  The error happens if I use TM_NONBLOCK, regardless
> of whether or not I enable a reply.  In the following example, I use
> rt_task_send with no reply requested (NULL), and all replies commented out.
> 
> I got three errors this time, -22, -11, -110, and then finally
> BUG: unable to handle kernel NULL pointer dereference at virtual address
> 0000020c
> printing eip: c014f014 *pde = 00000000
> I hand typed the above in since I didn't capture it through a serial
> terminal.
> 
> From what I can see here, there's no place for a local loop to form in
> the talker task. I have the wait period set to the main rate, defined as
> a 20 msec for this example.  Calling rt_task_send with or without a
> RT_TASK_MCB receive struct and without a reply from the listener task
> still results in the bug occurring.  My platform is a Geode, and I'm
> running 2.6.24-4 with Xenomai 2.4.3
> 
> Thank you,
> 
> Josh
> 
> 
> 
> 
> c0151501 <__rt_task_receive>:
> c0151501:       55                      push   %ebp
> c0151502:       89 e5                   mov    %esp,%ebp
> c0151504:       57                      push   %edi
> c0151505:       89 d7                   mov    %edx,%edi
> c0151507:       56                      push   %esi
> c0151508:       89 c6                   mov    %eax,%esi
> c015150a:       53                      push   %ebx
> c015150b:       83 ec 5c                sub    $0x5c,%esp
> c015150e:       8b 1a                   mov    (%edx),%ebx
> c0151510:       8b 40 04                mov    0x4(%eax),%eax
> c0151513:       89 da                   mov    %ebx,%edx
> c0151515:       83 c2 10                add    $0x10,%edx
> c0151518:       19 c9                   sbb    %ecx,%ecx
> c015151a:       39 50 18                cmp    %edx,0x18(%eax)
> c015151d:       83 d9 00                sbb    $0x0,%ecx
> c0151520:       85 c9                   test   %ecx,%ecx
> c0151522:       0f 85 dd 00 00 00       jne    c0151605
> <__rt_task_receive+0x104>
> c0151528:       b9 10 00 00 00          mov    $0x10,%ecx
> c015152d:       89 da                   mov    %ebx,%edx
> c015152f:       8d 45 dc                lea    0xffffffdc(%ebp),%eax
> c0151532:       e8 29 a2 08 00          call   c01db760
> <__copy_from_user_ll_nozero>
> c0151537:       8b 4d e8                mov    0xffffffe8(%ebp),%ecx
> c015153a:       85 c9                   test   %ecx,%ecx
> c015153c:       74 18                   je     c0151556
> <__rt_task_receive+0x55>
> c015153e:       8b 55 e4                mov    0xffffffe4(%ebp),%edx
> c0151541:       8b 46 04                mov    0x4(%esi),%eax
> c0151544:       01 ca                   add    %ecx,%edx
> c0151546:       19 db                   sbb    %ebx,%ebx
> c0151548:       39 50 18                cmp    %edx,0x18(%eax)
> c015154b:       83 db 00                sbb    $0x0,%ebx
> c015154e:       85 db                   test   %ebx,%ebx
> c0151550:       0f 85 af 00 00 00       jne    c0151605
> <__rt_task_receive+0x104>
> c0151556:       8b 5f 04                mov    0x4(%edi),%ebx
> c0151559:       8b 46 04                mov    0x4(%esi),%eax
> c015155c:       89 da                   mov    %ebx,%edx
> c015155e:       83 c2 08                add    $0x8,%edx
> c0151561:       19 c9                   sbb    %ecx,%ecx
> c0151563:       39 50 18                cmp    %edx,0x18(%eax)
> c0151566:       83 d9 00                sbb    $0x0,%ecx
> c0151569:       85 c9                   test   %ecx,%ecx
> c015156b:       0f 85 94 00 00 00       jne    c0151605
> <__rt_task_receive+0x104>
> c0151571:       89 da                   mov    %ebx,%edx
> c0151573:       b9 08 00 00 00          mov    $0x8,%ecx
> c0151578:       8d 45 ec                lea    0xffffffec(%ebp),%eax
> c015157b:       e8 e0 a1 08 00          call   c01db760
> <__copy_from_user_ll_nozero>
> c0151580:       8b 5d e4                mov    0xffffffe4(%ebp),%ebx
> c0151583:       8b 55 e8                mov    0xffffffe8(%ebp),%edx
> c0151586:       89 5d 98                mov    %ebx,0xffffff98(%ebp)
> c0151589:       31 db                   xor    %ebx,%ebx
> c015158b:       85 d2                   test   %edx,%edx
> c015158d:       74 22                   je     c01515b1
> <__rt_task_receive+0xb0>
> c015158f:       83 fa 40                cmp    $0x40,%edx
> c0151592:       77 05                   ja     c0151599
> <__rt_task_receive+0x98>
> c0151594:       8d 5d 9c                lea    0xffffff9c(%ebp),%ebx
> c0151597:       eb 15                   jmp    c01515ae
> <__rt_task_receive+0xad>
> c0151599:       b8 60 b9 3c c0          mov    $0xc03cb960,%eax
> c015159e:       be f4 ff ff ff          mov    $0xfffffff4,%esi
> c01515a3:       e8 25 f2 fe ff          call   c01407cd <xnheap_alloc>
> c01515a8:       85 c0                   test   %eax,%eax
> c01515aa:       74 5e                   je     c015160a
> <__rt_task_receive+0x109>
> c01515ac:       89 c3                   mov    %eax,%ebx
> c01515ae:       89 5d e4                mov    %ebx,0xffffffe4(%ebp)
> c01515b1:       8b 55 ec                mov    0xffffffec(%ebp),%edx
> c01515b4:       8d 45 dc                lea    0xffffffdc(%ebp),%eax
> c01515b7:       8b 4d f0                mov    0xfffffff0(%ebp),%ecx
> c01515ba:       e8 42 d9 ff ff          call   c014ef01 <rt_task_receive>
> c01515bf:       85 c0                   test   %eax,%eax
> c01515c1:       89 c6                   mov    %eax,%esi
> c01515c3:       7e 12                   jle    c01515d7
> <__rt_task_receive+0xd6>
> c01515c5:       8b 4d e8                mov    0xffffffe8(%ebp),%ecx
> c01515c8:       85 c9                   test   %ecx,%ecx
> c01515ca:       74 0b                   je     c01515d7
> <__rt_task_receive+0xd6>
> c01515cc:       8b 55 e4                mov    0xffffffe4(%ebp),%edx
> c01515cf:       8b 45 98                mov    0xffffff98(%ebp),%eax
> c01515d2:       e8 e9 9f 08 00          call   c01db5c0 <__copy_to_user_ll>
> c01515d7:       8b 45 98                mov    0xffffff98(%ebp),%eax
> c01515da:       8d 55 dc                lea    0xffffffdc(%ebp),%edx
> c01515dd:       b9 10 00 00 00          mov    $0x10,%ecx
> c01515e2:       89 45 e4                mov    %eax,0xffffffe4(%ebp)
> c01515e5:       8b 07                   mov    (%edi),%eax
> c01515e7:       e8 d4 9f 08 00          call   c01db5c0 <__copy_to_user_ll>
> c01515ec:       85 db                   test   %ebx,%ebx
> c01515ee:       74 1a                   je     c015160a
> <__rt_task_receive+0x109>
> c01515f0:       8d 45 9c                lea    0xffffff9c(%ebp),%eax
> c01515f3:       39 c3                   cmp    %eax,%ebx
> c01515f5:       74 13                   je     c015160a
> <__rt_task_receive+0x109>
> c01515f7:       89 da                   mov    %ebx,%edx
> c01515f9:       b8 60 b9 3c c0          mov    $0xc03cb960,%eax
> c01515fe:       e8 a4 f0 fe ff          call   c01406a7 <xnheap_free>
> c0151603:       eb 05                   jmp    c015160a
> <__rt_task_receive+0x109>
> c0151605:       be f2 ff ff ff          mov    $0xfffffff2,%esi
> c015160a:       83 c4 5c                add    $0x5c,%esp
> c015160d:       89 f0                   mov    %esi,%eax
> c015160f:       5b                      pop    %ebx
> c0151610:       5e                      pop    %esi
> c0151611:       5f                      pop    %edi
> c0151612:       5d                      pop    %ebp
> c0151613:       c3                      ret   
> 
> 
> 
> 
> Associated source code:
> 
> 
> //main.cpp
> #include "talker.h"
> #include "listener.h"
> #include <sys/mman.h>
> #include <native/task.h>
> #include <signal.h>
> 
> volatile int exitprogram;
> RT_TASK talk_task, listen_task;
> 
> 
> void catch_signal(int sig)
> {
>         exitprogram=1;
> }
> 
> int main(int argc, char *argv[])
> {
>         exitprogram=0;
>        
>         mlockall(MCL_CURRENT | MCL_FUTURE);
>         //wait for ctrl-c
>         signal(SIGTERM, catch_signal);
>         signal(SIGINT, catch_signal);
>         talker *talk = new talker();
>         listener *listen = new listener();
>         talk->startup(&talk_task, &listen_task);
>         sleep(1);
>         listen->startup(&talk_task, &listen_task);
>         pause();
>         sleep(2);
>         rt_task_join(&talk_task);
>         rt_task_join(&listen_task);
>         rt_task_delete(&talk_task);
>         rt_task_delete(&listen_task);
>         return 0;
> }
> 
> //talker.h
> #ifndef TALKER_H_
> #define TALKER_H_
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <native/task.h>
> #include <native/timer.h>
> #define TIMEOUT (100000000)
> #define MAIN_RATE_NS 20000000
> 
> 
> extern volatile int exitprogram;
> 
> class talker {
> public:
>         talker();
>         virtual ~talker();
>         void mainloop();
>         int startup(RT_TASK *talktask, RT_TASK *listentask);
> 
> private:
>         RT_TASK *talk_task, *listen_task;
>         void shutdown();
>         int iret;
> 
> protected:
>         static void thunk(void * param);
> };
> 
> 
> 
> #endif /*TALKER_H_*/
> 
> 
> 
> //talker.cpp
> #include "talker.h"
> 
> talker::talker()
> {
> }
> talker::~talker()
> {
> }
> 
> int talker::startup(RT_TASK *talktask, RT_TASK *listentask)
> {
>         talk_task=talktask;
>         listen_task=listentask;
>         rt_task_create(talk_task, "talk_task", 0, 51, 0);
>         iret= rt_task_start(talk_task,thunk,(void*)(this));
>         return(0);
> 
> 
> }
> void talker::thunk(void * param)
> {
>         talker *instance = (talker *) param;
>         instance->mainloop();
> }
> void talker::mainloop()
> {
>         printf("talker task started\n");
>         int len;
>         RT_TASK_MCB talk_send, talk_reply;
> 
>         rt_task_set_periodic(NULL,TM_NOW,MAIN_RATE_NS);
>         while (1)
>         {
>                 if(exitprogram)
>                         break;
>                 rt_task_wait_period(NULL);
>                 talk_send.opcode = 0x01;
>                 talk_send.data = NULL;
>                 talk_send.size = 0;
>                 talk_reply.size = 0;
>                 talk_reply.data = NULL;
>                 len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
>                 if (len < 0) printf("rt_task_send error\n");
>                 if (talk_reply.opcode != 4)
>                 printf("len=%d, opcode=%d\n", len, talk_reply.opcode);
>         }
>         shutdown();
> }
> 
> void talker::shutdown()
> {
>     printf("Talker exits with return %d\n",iret);
> }
> 
> 
> //listener.h
> #ifndef LISTENER_H_
> #define LISTENER_H_
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <native/task.h>
> #include <native/timer.h>
> 
> 
> 
> extern volatile int exitprogram;
> 
> class listener {
> public:
>         listener();
>         virtual ~listener();
>         void mainloop();
>         int startup(RT_TASK *talktask, RT_TASK *listentask);
> 
> private:
>         RT_TASK *talk_task, *listen_task;
>         void shutdown();
>         int iret;
> 
> protected:
>         static void thunk(void * param);
> };
> 
> 
> 
> #endif /*LISTENER_H_*/
> 
> 
> //listener.cpp
> 
> #include "listener.h"
> 
> listener::listener()
> {
> }
> 
> listener::~listener()
> {
> }
> 
> int listener::startup(RT_TASK *talktask, RT_TASK *listentask)
> {
>         talk_task=talktask;
>         listen_task=listentask;
>         rt_task_create(listen_task, "listen_task", 0, 50, 0);
>         iret= rt_task_start(listen_task,thunk,(void*)(this));
>         return(0);
> 
> 
> }
> 
> void listener::thunk(void * param)
> {
>         listener *instance = (listener *) param;
>         instance->mainloop();
> }
> 
> void listener::mainloop()
> {
>         printf("listener task started\n");
>         unsigned char buf[10];
>         RT_TASK_MCB listen_rcv, listen_reply;
>        
>         while (1)
>         {
>                 int taskid;
>                 if(exitprogram)
>                         break;
>                 listen_rcv.data = (caddr_t)buf;
>                 listen_rcv.size = sizeof(buf);
>                 taskid = rt_task_receive(&listen_rcv,TM_INFINITE);
>                 printf("received data with opcode %d\n",listen_rcv.opcode);
>                 listen_reply.opcode = 4;
>                 listen_reply.size = 0;
>                 listen_reply.data = NULL;
>                 rt_task_reply(taskid, &listen_reply);
>         }
>         shutdown();
> }
> 
> void listener::shutdown()
> {
>     printf("listener exits with return %d\n",iret);
> }
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Philippe Gerum on behalf of Philippe Gerum
> Sent: Wed 5/7/2008 1:24 PM
> To: Karch, Joshua
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-help] rt_task_send / receive problems various
> issues and  bug trace
> 
> Karch, Joshua wrote:
>>
>> Hello,
>>
>> I'm using rt_task_send from a talker task and rt_task_receive/reply from
>> a listener task.  When I launch the two tasks in the following order:
>> listener task,  talker task,  everything runs normally.
>>
>> However, when I launch the talker task first, and then the listener task
>> second, I receive rt_task_send error -22 and after a bit of time the
>> listener task starts up.
> 
> -EINVAL is not on the error list for rt_task_send(). Could you confirm
> this result?
> 
>  I know this is logical and to be expected,
> 
> No, it's not. rt_task_send() does wait for the receiver to listen to,
> unless you
>  asked for a non blocking call using TM_NONBLOCK as a timeout.
> 
>> however, it appears that issuing the rt_task_send command to a task that
>> hasn't been started occasionally locks up sufficient resources to
>> prevent the listener task from starting.
> 
> Any chance your code enters a tight loop due to rt_task_send() failing
> repeatedly?
> 
>  By controlling task startup
>> order, I was able to circumvent this issue.  Both tasks have similar
>> priorities (50, 51).
>>
>> Additionally, I am unable to use rt_task_send with TM_NONBLOCK
>> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
>>
>> I get a bug failure and have to reset the machine-- see below:
>>
> 
> Please disassemble your "vmlinux" kernel image, the exact one that
> causes a bug:
> $ objdump -d vmlinux > foo.txt.
> In that large file, search for the "__rt_task_receive" symbol (notice
> the double
> underscore prefix, we also have the "rt_task_receive" symbol, but we
> don't need
> this code at the moment), then paste&copy the disassembly code for that
> function. I'll have a look.
> 
> Step #2 is to send a simple piece of code that exhibits the problem.
> This will
> speed up the debugging and fixing process.
> 
>> The reason I want to use TM_NONBLOCK is so that I can send a trigger
>> message from the producer task to the consumer task without requiring a
>> reply to trigger the consumer task to act on the data received. I am
>> using the rt_task_send trigger message to gate and synchronize the
>> consumer task. Is a reply required for all rt_task_send?
>>
>> It seems if I don't send a reply when rt_task_send has a timeout
>> specified the sending task locks up and the listening task runs rampant,
>> i.e. rt_task_receive no longer blocks and the loop runs with no delay
>> and essentially locks up the machine, since I don't use
>> rt_set_task_periodic on the listening task.
>>
>> Here is the trace, and it requires a reboot.  I also can attach the
>> code- it is written in c++ with two separate classes as a model of the
>> application I am building.
>>
>> Thank you,
>>
>> Joshua Karch
>>
>> talker task started
>> rt_task_send error
>> len=-22, opcode=0
>> listener task started
>> rt_task_send error
>> len=-110, opcode=0
>> BUG: unable to handle kernel NULL pointer dereference at virtual address 0000020c
>> printing eip: c014f014
>> *pde = 00000000
>> Oops: 0000 [#1] PREEMPT
>> Modules linked in: xeno_timerbench nfs ipv6 nfsd lockd nfs_acl sunrpc
>> exportfs dm_snapshot dm_mirror dm_mod loop pcmcia firmware_class
>> serio_raw psmouse yenta_socket rsrc_nonstatic pcmcia_core cs5535_gpio
>> joydev evdev ext3 jbd mbcache usbhid ide_disk generic ohci_hcd ehci_hcd
>> amd74xx usbcore ide_core e100 mii
>>
>> Pid: 1973, comm: listen_task Not tainted (2.6.24.4 #12)
>> EIP: 0060:[<c014f014>] EFLAGS: 00010093 CPU: 0
>> EIP is at rt_task_receive+0x113/0x18b
>> EAX: cdc71f28 EBX: fffffd98 ECX: c03c5500 EDX: cd820acc
>> ESI: ffffff97 EDI: cd820610 EBP: cdc71edc ESP: cdc71ec4
>>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> Process listen_task (pid: 1973, ti=cdc70000 task=cdc6d320
>> task.ti=cdc70000)<0>
>> I-pipe domain Linux
>> Stack: 00000000 cdc71f28 00000000 cdc71ee8 cdc6d320 cdc71fb8 cdc71f4c
>> c01515bf
>>        b7d083ea cdc71f00 c0102cb1 ce46bc70 c03c7e90 cd820620 cdc6d620
>> cd80ff44
>>        c029f55b cdc71f34 00000082 b91264ee 0000002e cdc6d320 b9127ee2
>> 0000002e
>> Call Trace:
>>  [<c01045f1>] show_trace_log_lvl+0x1a/0x2f
>>  [<c01046a3>] show_stack_log_lvl+0x9d/0xa5
>>  [<c0104769>] show_registers+0xbe/0x1fd
>>  [<c01049c1>] die+0x119/0x20a
>>  [<c010ef52>] do_page_fault+0x480/0x57e
>>  [<c010ca34>] __ipipe_handle_exception+0x11e/0x166
>>  [<c02a0e7f>] error_code+0x6f/0x80
>>  [<c01515bf>] __rt_task_receive+0xbe/0x113
>>  [<c0149b0d>] losyscall_event+0x99/0x13d
>>  [<c013f05b>] __ipipe_dispatch_event+0xac/0x16c
>>  [<c010c8b1>] __ipipe_syscall_root+0x6a/0xcf
>>  [<c0103e89>] system_call+0x29/0x4a
>>  =======================
>> Code: 00 00 8d 87 bc 04 00 00 39 c2 0f 95 c0 0f b6 c0 f7 d8 21 d0 31 db
>> 3d 58 02 00 00 74 06 8d 98 98 fd ff ff 8b 45 ec be 97 ff ff ff <8b> 93
>> 74 04 00 00 3b 50 0c 77 23 85 d2 74 19 89 d1 8b b3 70 04
>> EIP: [<c014f014>] rt_task_receive+0x113/0x18b SS:ESP 0068:cdc71ec4
>> ---[ end trace 523bcd2b73b75979 ]---
>>
>>
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Xenomai-help mailing list
>> Xenomai-help@domain.hid
>> https://mail.gna.org/listinfo/xenomai-help
> 
> 
> --
> Philippe.
> 
> 
> 
> 
> 


-- 
Philippe.
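
The negative values printed by the test program (len=-22, len=-110) are negated Linux errno codes, which is how -22 is read as -EINVAL in this thread. A minimal sketch for printing such return values by name, assuming Linux errno numbering; `describe_rt_error` is a hypothetical helper for illustration, not part of the Xenomai API:

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not a Xenomai call): turn a negative return value
// such as -22 into a readable string. Assumes Linux errno numbering.
std::string describe_rt_error(int ret)
{
    if (ret >= 0)
        return "no error";              // non-negative values are not errors

    const int err = -ret;
    const char *name = "unknown";
    switch (err) {
    case EINVAL:    name = "EINVAL";      break; // 22 on Linux
    case EAGAIN:    name = "EWOULDBLOCK"; break; // 11 on Linux (same value as EAGAIN)
    case ETIMEDOUT: name = "ETIMEDOUT";   break; // 110 on Linux
    default:        break;
    }
    // Append the C library's description of the error number.
    return std::string(name) + " (" + std::strerror(err) + ")";
}
```

Calling `describe_rt_error(len)` right after a failed rt_task_send() gives a name to search for in the documented error list instead of a bare number.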


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xenomai-help] rt_task_send / receive problems various issues and bug trace
  2008-05-17 21:18       ` [Xenomai-help] rt_task_send / receive problems various issues and bug trace Philippe Gerum
@ 2008-05-19 13:57         ` Karch, Joshua
  0 siblings, 0 replies; 9+ messages in thread
From: Karch, Joshua @ 2008-05-19 13:57 UTC (permalink / raw)
  To: rpm; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 20008 bytes --]

Philippe,

Thank you for diagnosing and fixing the problem. I'll recompile Xenomai and let you know how it works.

Regards,

Joshua Karch


-----Original Message-----
From: Philippe Gerum on behalf of Philippe Gerum
Sent: Sat 5/17/2008 5:18 PM
To: Karch, Joshua
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] rt_task_send / receive problems various issues and bug trace
 
Karch, Joshua wrote:
> Phillipe: small clarification: the following example has replies enabled
> and TM_NONBLOCK on rt_task_send.  I tried the code two different ways:
> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
> len = rt_task_send(listen_task,&talk_send,NULL,TM_NONBLOCK);// also
> deleted rt_task_reply from listen_task and commented out all appropriate
> RT_TASK_MCB reply structs
> 
> both have the same bug in the end.
>

The attached patch fixes the issue.

> Josh
> 
> 


-- 
Philippe.


[-- Attachment #2: Type: text/html, Size: 41965 bytes --]


* Re: [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire
  2008-05-17 21:26         ` Philippe Gerum
@ 2008-05-19 13:59           ` Karch, Joshua
  0 siblings, 0 replies; 9+ messages in thread
From: Karch, Joshua @ 2008-05-19 13:59 UTC (permalink / raw)
  To: rpm; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 23968 bytes --]

Philippe,

So essentially rt_task_reply is a required part of synchronous messaging, and TM_NONBLOCK only makes rt_task_send return immediately if the listening task is not waiting for a message.  Since a reply is always necessary, it's probably better for now to stay with synchronous tasks.  I was able to get the timeout down to 1 msec because the servo controller always responds within 500 nsec, so I could set the timeout threshold much more aggressively and therefore have approximately 18 msec and change to do control law work.
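
The timing above can be checked with a little arithmetic. What follows is only a sketch in plain C++: the 20 msec period is MAIN_RATE_NS from the test program quoted further down, the helper names are made up for illustration, and the results are upper bounds that ignore the time spent in the serial driver itself.

```cpp
#include <cstdint>

// Period of the control loop, taken from MAIN_RATE_NS in the test program.
constexpr std::int64_t kPeriodNs = 20'000'000;

// Hypothetical helper: how many whole periods elapse while one send is
// still waiting for its timeout to expire.
constexpr std::int64_t periods_elapsed(std::int64_t timeout_ns,
                                       std::int64_t period_ns) {
    return timeout_ns / period_ns;
}

// Hypothetical helper: time left in the period if the send uses up its
// whole timeout (clamped at zero when the timeout exceeds the period).
constexpr std::int64_t worst_case_budget(std::int64_t timeout_ns,
                                         std::int64_t period_ns) {
    return timeout_ns >= period_ns ? 0 : period_ns - timeout_ns;
}

// 100 msec timeout: five 20 msec periods go by before the timeout fires.
static_assert(periods_elapsed(100'000'000, kPeriodNs) == 5, "five periods");
// 1 msec timeout: at most 19 msec of the period remain for other work.
static_assert(worst_case_budget(1'000'000, kPeriodNs) == 19'000'000, "19 msec");
```

With the 1 msec timeout the bound is 19 msec per cycle, which is consistent with the "18 msec and change" figure once other overhead is subtracted.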

Sounds good--

Cheers,

Josh



-----Original Message-----
From: Philippe Gerum on behalf of Philippe Gerum
Sent: Sat 5/17/2008 5:26 PM
To: Karch, Joshua
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire
 
Karch, Joshua wrote:
> Philippe,
> 
> I was wondering if you had any progress with the rt_task_send/receive
> bug?  I was able to get the synchronized serial chain to work from
> sensor task through an rt_task_send at 50 Hz with a timeout to the
> control task, which then synchronously sends an rt_task_send with
> timeout to the motor driver on the second serial port.  When I set the
> motor driver's timeout to be 100 msec and I unplug the motor driver's
> serial interface to cause this timeout, I immediately get rt_task_send
> errors, some bad enough to cause a system crash if and only if the
> rt_task_send looks for a reply, otherwise, with the reply MCB field set
> to NULL, rt_task_send fails indefinitely until the program is killed and
> a segfault occurs.  Clearly, with a 100 msec timeout, this means that up
> to five rt_task send commands will come from the control law which is
> chain-linked to the sensor task before a single reply is sent back from
> the servo controller.
> 
> What is the expected behavior of repeatedly issuing rt_task_send when
> the other task has not replied within a timeout specified by
> rt_task_send?  The 100 msec timeout is akin to having 3 cars on a
> highway, with the front one (the servo motor) slamming on the brakes and
> causing a major collision.  Basically, the servo controller must always
> respond within 20 msec- the time of execution for the control task in
> order to prevent the xenomai equivalent of a traffic jam.  Can and
> should this be avoided by using a function like rt_task_inquire to see
> if the servo serial driver is blocking on serial read, and if so, not
> issue an rt_task_send command?  In this way, the synchronous
> relationship between the serial sensor and the control law will never be
> broken.
> 
> Regardless, bringing the serial port timeout down to 10msec, which is
> reasonable, resolves this problem for now, though it limits the control
> law's ability to process while the servo motor serial task has timed out.
> 
> That's basically my update.  I'd still much prefer to be able to send a
> non-blocking trigger message without a reply for all rt_task_send calls

It's ok unless you end up implementing some kind of asynchronous protocol using
that message passing interface: this won't work. rt_task_send/receive/reply are
meant to implement synchronous message passing. Just to make sure the
implementation is well understood:

- passing mcb_r == NULL to rt_task_send() does not mean that you won't wait for
any answer from the remote task, you will wait for rt_task_reply() to be called;
that answer will just be discarded before rt_task_send() returns.

- passing timeout == TM_NONBLOCK to rt_task_send() does NOT mean not to wait for
a reply, but it means to wait for a reply indefinitely UNLESS no server is
listening to the sender before it issues the request, in which case
rt_task_send() should return immediately.
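
To make the two points above concrete, here is a small model of the rendezvous in portable C++ (threads and a condition variable). It is not Xenomai code and says nothing about how the nucleus implements it; SyncChannel, demo_exchange and the one-client/one-server restriction are assumptions made for the sketch. The only behaviour it mirrors is what is described above: send() always waits for reply(), a null reply pointer merely discards the answer, and the non-blocking flag only matters when no server is blocked in receive().

```cpp
#include <cassert>
#include <cerrno>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy model of a synchronous send/receive/reply rendezvous between one
// client and one server. Not Xenomai code.
class SyncChannel {
public:
    // Client side. Returns 0 after the server has replied, or -EWOULDBLOCK
    // if nonblock is set and no server is blocked in receive() right now.
    // A null reply_opcode discards the reply but still waits for it.
    int send(int opcode, int* reply_opcode, bool nonblock) {
        std::unique_lock<std::mutex> lk(m_);
        if (nonblock && !server_waiting_)
            return -EWOULDBLOCK;
        request_ = opcode;
        state_ = State::Requested;
        cv_.notify_all();
        cv_.wait(lk, [this] { return state_ == State::Replied; });
        if (reply_opcode)
            *reply_opcode = reply_;
        state_ = State::Idle;
        return 0;
    }

    // Server side: block until a request arrives, then return its opcode.
    int receive() {
        std::unique_lock<std::mutex> lk(m_);
        server_waiting_ = true;
        cv_.wait(lk, [this] { return state_ == State::Requested; });
        server_waiting_ = false;
        state_ = State::Processing;
        return request_;
    }

    // Server side: answer the request being processed and wake the client.
    void reply(int opcode) {
        std::lock_guard<std::mutex> lk(m_);
        assert(state_ == State::Processing);
        reply_ = opcode;
        state_ = State::Replied;
        cv_.notify_all();
    }

    // True while the server is blocked in receive().
    bool server_waiting() {
        std::lock_guard<std::mutex> lk(m_);
        return server_waiting_;
    }

private:
    enum class State { Idle, Requested, Processing, Replied };
    std::mutex m_;
    std::condition_variable cv_;
    State state_ = State::Idle;
    bool server_waiting_ = false;
    int request_ = 0;
    int reply_ = 0;
};

// One exchange: the server replies with opcode 4, as the listener in the
// test program does. Returns the reply opcode, or the negative error code.
int demo_exchange() {
    SyncChannel ch;
    std::thread server([&ch] {
        ch.receive();
        ch.reply(4);
    });
    while (!ch.server_waiting())
        std::this_thread::yield();  // make sure the server is listening
    int reply = 0;
    int rc = ch.send(1, &reply, /*nonblock=*/true);
    server.join();
    return rc == 0 ? reply : rc;
}
```

In this model a non-blocking send() on a channel with no listener fails at once, while the same call with a listener present still blocks until reply() runs.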

In other words, it's purely synchronous stuff. If you need asynchronous message
passing, then you should use another IPC.
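
As an illustration of what "another IPC" could look like for the trigger use case, here is a sketch of a one-way counting trigger in portable C++. This is an assumption about one possible shape, not something taken from the Xenomai API: Trigger and demo_trigger are made-up names, and on a real system a native semaphore, event flag group or message queue would presumably play this role. The point is only that post() never waits for the consumer.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy one-way trigger: the producer posts and moves on, the consumer
// blocks until at least one trigger is pending. Not Xenomai code.
class Trigger {
public:
    // Producer side: record one trigger and return immediately.
    void post() {
        {
            std::lock_guard<std::mutex> lk(m_);
            ++pending_;
        }
        cv_.notify_one();
    }

    // Consumer side: block until a trigger is pending, then consume it.
    void wait() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return pending_ > 0; });
        assert(pending_ > 0);
        --pending_;
    }

    // Number of triggers posted but not yet consumed.
    unsigned pending() {
        std::lock_guard<std::mutex> lk(m_);
        return pending_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    unsigned pending_ = 0;
};

// The producer posts n triggers without waiting for any answer; the
// consumer handles each one. Returns the number of triggers handled.
int demo_trigger(int n) {
    Trigger t;
    int handled = 0;
    std::thread consumer([&t, &handled, n] {
        for (int i = 0; i < n; ++i) {
            t.wait();
            ++handled;
        }
    });
    for (int i = 0; i < n; ++i)
        t.post();
    consumer.join();
    return handled;
}
```

Because the count is kept, triggers posted while the consumer is busy are not lost; whether that is the desired policy for a control loop is a separate design decision.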

> from the sensor to the control law to regulate the control law's loop
> rate, and perhaps also do the same for the motor controller, which
> currently places its response in shared memory.
> 
> Thank you,
> 
> Josh
> 
> 
> -----Original Message-----
> From: Karch, Joshua
> Sent: Wed 5/7/2008 3:35 PM
> To: Karch, Joshua; rpm@xenomai.org
> Cc: xenomai@xenomai.org
> Subject: RE: [Xenomai-help] rt_task_send / receive problems various
> issuesand   bug trace
> 
> Philippe: small clarification: the following example has replies enabled
> and TM_NONBLOCK on rt_task_send.  I tried the code two different ways:
> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
> len = rt_task_send(listen_task,&talk_send,NULL,TM_NONBLOCK);// also
> deleted rt_task_reply from listen_task and commented out all appropriate
> RT_TASK_MCB reply structs
> 
> both have the same bug in the end.
> 
> Josh
> 
> 
> -----Original Message-----
> From: xenomai-help-bounces@domain.hid on behalf of Karch, Joshua
> Sent: Wed 5/7/2008 3:33 PM
> To: rpm@xenomai.org
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-help] rt_task_send / receive problems various
> issuesand   bug trace
> 
> Philippe,
> 
> here is a disassembly of __rt_task_receive and the source code used to
> generate the error.  The error happens if I use TM_NONBLOCK, regardless
> of whether or not I enable a reply.  In the following example, I use
> rt_task_send with no reply requested (NULL), and all replies commented out.
> 
> I got three errors this time, -22, -11, -110, and then finally
> BUG: unable to handle kernel NULL pointer dereference at virtual address
> 0000020c
> printing eip: c014f014 *pde = 00000000
> I hand typed the above in since I didn't capture it through a serial
> terminal.
> 
> From what I can see here, there's no place for a local loop to form in
> the talker task. I have the wait period set to the main rate, defined as
> 20 msec for this example.  Calling rt_task_send with or without a
> RT_TASK_MCB receive struct and without a reply from the listener task
> still results in the bug occurring.  My platform is a Geode, and I'm
> running 2.6.24.4 with Xenomai 2.4.3.
> 
> Thank you,
> 
> Josh
> 
> 
> 
> 
> c0151501 <__rt_task_receive>:
> c0151501:       55                      push   %ebp
> c0151502:       89 e5                   mov    %esp,%ebp
> c0151504:       57                      push   %edi
> c0151505:       89 d7                   mov    %edx,%edi
> c0151507:       56                      push   %esi
> c0151508:       89 c6                   mov    %eax,%esi
> c015150a:       53                      push   %ebx
> c015150b:       83 ec 5c                sub    $0x5c,%esp
> c015150e:       8b 1a                   mov    (%edx),%ebx
> c0151510:       8b 40 04                mov    0x4(%eax),%eax
> c0151513:       89 da                   mov    %ebx,%edx
> c0151515:       83 c2 10                add    $0x10,%edx
> c0151518:       19 c9                   sbb    %ecx,%ecx
> c015151a:       39 50 18                cmp    %edx,0x18(%eax)
> c015151d:       83 d9 00                sbb    $0x0,%ecx
> c0151520:       85 c9                   test   %ecx,%ecx
> c0151522:       0f 85 dd 00 00 00       jne    c0151605 <__rt_task_receive+0x104>
> c0151528:       b9 10 00 00 00          mov    $0x10,%ecx
> c015152d:       89 da                   mov    %ebx,%edx
> c015152f:       8d 45 dc                lea    0xffffffdc(%ebp),%eax
> c0151532:       e8 29 a2 08 00          call   c01db760 <__copy_from_user_ll_nozero>
> c0151537:       8b 4d e8                mov    0xffffffe8(%ebp),%ecx
> c015153a:       85 c9                   test   %ecx,%ecx
> c015153c:       74 18                   je     c0151556 <__rt_task_receive+0x55>
> c015153e:       8b 55 e4                mov    0xffffffe4(%ebp),%edx
> c0151541:       8b 46 04                mov    0x4(%esi),%eax
> c0151544:       01 ca                   add    %ecx,%edx
> c0151546:       19 db                   sbb    %ebx,%ebx
> c0151548:       39 50 18                cmp    %edx,0x18(%eax)
> c015154b:       83 db 00                sbb    $0x0,%ebx
> c015154e:       85 db                   test   %ebx,%ebx
> c0151550:       0f 85 af 00 00 00       jne    c0151605 <__rt_task_receive+0x104>
> c0151556:       8b 5f 04                mov    0x4(%edi),%ebx
> c0151559:       8b 46 04                mov    0x4(%esi),%eax
> c015155c:       89 da                   mov    %ebx,%edx
> c015155e:       83 c2 08                add    $0x8,%edx
> c0151561:       19 c9                   sbb    %ecx,%ecx
> c0151563:       39 50 18                cmp    %edx,0x18(%eax)
> c0151566:       83 d9 00                sbb    $0x0,%ecx
> c0151569:       85 c9                   test   %ecx,%ecx
> c015156b:       0f 85 94 00 00 00       jne    c0151605 <__rt_task_receive+0x104>
> c0151571:       89 da                   mov    %ebx,%edx
> c0151573:       b9 08 00 00 00          mov    $0x8,%ecx
> c0151578:       8d 45 ec                lea    0xffffffec(%ebp),%eax
> c015157b:       e8 e0 a1 08 00          call   c01db760 <__copy_from_user_ll_nozero>
> c0151580:       8b 5d e4                mov    0xffffffe4(%ebp),%ebx
> c0151583:       8b 55 e8                mov    0xffffffe8(%ebp),%edx
> c0151586:       89 5d 98                mov    %ebx,0xffffff98(%ebp)
> c0151589:       31 db                   xor    %ebx,%ebx
> c015158b:       85 d2                   test   %edx,%edx
> c015158d:       74 22                   je     c01515b1 <__rt_task_receive+0xb0>
> c015158f:       83 fa 40                cmp    $0x40,%edx
> c0151592:       77 05                   ja     c0151599 <__rt_task_receive+0x98>
> c0151594:       8d 5d 9c                lea    0xffffff9c(%ebp),%ebx
> c0151597:       eb 15                   jmp    c01515ae <__rt_task_receive+0xad>
> c0151599:       b8 60 b9 3c c0          mov    $0xc03cb960,%eax
> c015159e:       be f4 ff ff ff          mov    $0xfffffff4,%esi
> c01515a3:       e8 25 f2 fe ff          call   c01407cd <xnheap_alloc>
> c01515a8:       85 c0                   test   %eax,%eax
> c01515aa:       74 5e                   je     c015160a <__rt_task_receive+0x109>
> c01515ac:       89 c3                   mov    %eax,%ebx
> c01515ae:       89 5d e4                mov    %ebx,0xffffffe4(%ebp)
> c01515b1:       8b 55 ec                mov    0xffffffec(%ebp),%edx
> c01515b4:       8d 45 dc                lea    0xffffffdc(%ebp),%eax
> c01515b7:       8b 4d f0                mov    0xfffffff0(%ebp),%ecx
> c01515ba:       e8 42 d9 ff ff          call   c014ef01 <rt_task_receive>
> c01515bf:       85 c0                   test   %eax,%eax
> c01515c1:       89 c6                   mov    %eax,%esi
> c01515c3:       7e 12                   jle    c01515d7 <__rt_task_receive+0xd6>
> c01515c5:       8b 4d e8                mov    0xffffffe8(%ebp),%ecx
> c01515c8:       85 c9                   test   %ecx,%ecx
> c01515ca:       74 0b                   je     c01515d7 <__rt_task_receive+0xd6>
> c01515cc:       8b 55 e4                mov    0xffffffe4(%ebp),%edx
> c01515cf:       8b 45 98                mov    0xffffff98(%ebp),%eax
> c01515d2:       e8 e9 9f 08 00          call   c01db5c0 <__copy_to_user_ll>
> c01515d7:       8b 45 98                mov    0xffffff98(%ebp),%eax
> c01515da:       8d 55 dc                lea    0xffffffdc(%ebp),%edx
> c01515dd:       b9 10 00 00 00          mov    $0x10,%ecx
> c01515e2:       89 45 e4                mov    %eax,0xffffffe4(%ebp)
> c01515e5:       8b 07                   mov    (%edi),%eax
> c01515e7:       e8 d4 9f 08 00          call   c01db5c0 <__copy_to_user_ll>
> c01515ec:       85 db                   test   %ebx,%ebx
> c01515ee:       74 1a                   je     c015160a <__rt_task_receive+0x109>
> c01515f0:       8d 45 9c                lea    0xffffff9c(%ebp),%eax
> c01515f3:       39 c3                   cmp    %eax,%ebx
> c01515f5:       74 13                   je     c015160a <__rt_task_receive+0x109>
> c01515f7:       89 da                   mov    %ebx,%edx
> c01515f9:       b8 60 b9 3c c0          mov    $0xc03cb960,%eax
> c01515fe:       e8 a4 f0 fe ff          call   c01406a7 <xnheap_free>
> c0151603:       eb 05                   jmp    c015160a <__rt_task_receive+0x109>
> c0151605:       be f2 ff ff ff          mov    $0xfffffff2,%esi
> c015160a:       83 c4 5c                add    $0x5c,%esp
> c015160d:       89 f0                   mov    %esi,%eax
> c015160f:       5b                      pop    %ebx
> c0151610:       5e                      pop    %esi
> c0151611:       5f                      pop    %edi
> c0151612:       5d                      pop    %ebp
> c0151613:       c3                      ret   
> 
> 
> 
> 
> Associated source code:
> 
> 
> //main.cpp
> #include "talker.h"
> #include "listener.h"
> #include <sys/mman.h>
> #include <native/task.h>
> #include <signal.h>
> 
> volatile int exitprogram;
> RT_TASK talk_task, listen_task;
> 
> 
> void catch_signal(int sig)
> {
>         exitprogram=1;
> }
> 
> int main(int argc, char *argv[])
> {
>         exitprogram=0;
>        
>         mlockall(MCL_CURRENT | MCL_FUTURE);
>         //wait for ctrl-c
>         signal(SIGTERM, catch_signal);
>         signal(SIGINT, catch_signal);
>         talker *talk = new talker();
>         listener *listen = new listener();
>         talk->startup(&talk_task, &listen_task);
>         sleep(1);
>         listen->startup(&talk_task, &listen_task);
>         pause();
>         sleep(2);
>         rt_task_join(&talk_task);
>         rt_task_join(&listen_task);
>         rt_task_delete(&talk_task);
>         rt_task_delete(&listen_task);
>         return 0;
> }
> 
> //talker.h
> #ifndef TALKER_H_
> #define TALKER_H_
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <native/task.h>
> #include <native/timer.h>
> #define TIMEOUT (100000000)
> #define MAIN_RATE_NS 20000000
> 
> 
> extern volatile int exitprogram;
> 
> class talker {
> public:
>         talker();
>         virtual ~talker();
>         void mainloop();
>         int startup(RT_TASK *talktask, RT_TASK *listentask);
> 
> private:
>         RT_TASK *talk_task, *listen_task;
>         void shutdown();
>         int iret;
> 
> protected:
>         static void thunk(void * param);
> };
> 
> 
> 
> #endif /*TALKER_H_*/
> 
> 
> 
> //talker.cpp
> #include "talker.h"
> 
> talker::talker()
> {
> }
> talker::~talker()
> {
> }
> 
> int talker::startup(RT_TASK *talktask, RT_TASK *listentask)
> {
>         talk_task=talktask;
>         listen_task=listentask;
>         rt_task_create(talk_task, "talk_task", 0, 51, 0);
>         iret= rt_task_start(talk_task,thunk,(void*)(this));
>         return(0);
> 
> 
> }
> void talker::thunk(void * param)
> {
>         talker *instance = (talker *) param;
>         instance->mainloop();
> }
> void talker::mainloop()
> {
>         printf("talker task started\n");
>         int len;
>         RT_TASK_MCB talk_send, talk_reply;
> 
>         rt_task_set_periodic(NULL,TM_NOW,MAIN_RATE_NS);
>         while (1)
>         {
>                 if(exitprogram)
>                         break;
>                 rt_task_wait_period(NULL);
>                 talk_send.opcode = 0x01;
>                 talk_send.data = NULL;
>                 talk_send.size = 0;
>                 talk_reply.size = 0;
>                 talk_reply.data = NULL;
>                 len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
>                 if (len < 0) printf("rt_task_send error\n");
>                 if (talk_reply.opcode != 4)
>                 printf("len=%d, opcode=%d\n", len, talk_reply.opcode);
>         }
>         shutdown();
> }
> 
> void talker::shutdown()
> {
>     printf("Talker exits with return %d\n",iret);
> }
> 
> 
> //listener.h
> #ifndef LISTENER_H_
> #define LISTENER_H_
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/time.h>
> #include <unistd.h>
> #include <sys/mman.h>
> #include <native/task.h>
> #include <native/timer.h>
> 
> 
> 
> extern volatile int exitprogram;
> 
> class listener {
> public:
>         listener();
>         virtual ~listener();
>         void mainloop();
>         int startup(RT_TASK *talktask, RT_TASK *listentask);
> 
> private:
>         RT_TASK *talk_task, *listen_task;
>         void shutdown();
>         int iret;
> 
> protected:
>         static void thunk(void * param);
> };
> 
> 
> 
> #endif /*LISTENER_H_*/
> 
> 
> //listener.cpp
> 
> #include "listener.h"
> 
> listener::listener()
> {
> }
> 
> listener::~listener()
> {
> }
> 
> int listener::startup(RT_TASK *talktask, RT_TASK *listentask)
> {
>         talk_task=talktask;
>         listen_task=listentask;
>         rt_task_create(listen_task, "listen_task", 0, 50, 0);
>         iret= rt_task_start(listen_task,thunk,(void*)(this));
>         return(0);
> 
> 
> }
> 
> void listener::thunk(void * param)
> {
>         listener *instance = (listener *) param;
>         instance->mainloop();
> }
> 
> void listener::mainloop()
> {
>         printf("listener task started\n");
>         unsigned char buf[10];
>         RT_TASK_MCB listen_rcv, listen_reply;
>        
>         while (1)
>         {
>                 int taskid;
>                 if(exitprogram)
>                         break;
>                 listen_rcv.data = (caddr_t)buf;
>                 listen_rcv.size = sizeof(buf);
>                 taskid = rt_task_receive(&listen_rcv,TM_INFINITE);
>                 printf("received data with opcode %d\n",listen_rcv.opcode);
>                 listen_reply.opcode = 4;
>                 listen_reply.size = 0;
>                 listen_reply.data = NULL;
>                 rt_task_reply(taskid, &listen_reply);
>         }
>         shutdown();
> }
> 
> void listener::shutdown()
> {
>     printf("listener exits with return %d\n",iret);
> }
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Philippe Gerum on behalf of Philippe Gerum
> Sent: Wed 5/7/2008 1:24 PM
> To: Karch, Joshua
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-help] rt_task_send / receive problems various
> issues and  bug trace
> 
> Karch, Joshua wrote:
>>
>> Hello,
>>
>> I'm using rt_task_send from a talker task and rt_task_receive/reply from
>> a listener task.  When I launch the two tasks in the following order:
>> listener task,  talker task,  everything runs normally.
>>
>> However, when I launch the talker task first, and then the listener task
>> second, I receive rt_task_send error -22 and after a bit of time the
>> listener task starts up.
> 
> -EINVAL is not on the error list for rt_task_send(). Could you confirm
> this result?
> 
>  I know this is logical and to be expected,
> 
> No, it's not. rt_task_send() does wait for the receiver to start
> listening, unless you asked for a non-blocking call using TM_NONBLOCK
> as a timeout.
> 
>> however, it appears that issuing the rt_task_send command to a task that
>> hasn't been started occasionally locks up sufficient resources to
>> prevent the listener task from starting.
> 
> Any chance your code enters a tight loop due to rt_task_send() failing
> repeatedly?
> 
>  By controlling task startup
>> order, I was able to circumvent this issue.  Both tasks have similar
>> priorities (50, 51).
>>
>> Additionally, I am unable to use rt_task_send with TM_NONBLOCK
>> len = rt_task_send(listen_task,&talk_send,&talk_reply,TM_NONBLOCK);
>>
>> I get a bug failure and have to reset the machine-- see below:
>>
> 
> Please disassemble your "vmlinux" kernel image, the exact one that
> causes the bug:
> $ objdump -d vmlinux > foo.txt
> In that large file, search for the "__rt_task_receive" symbol (notice
> the double underscore prefix; we also have the "rt_task_receive" symbol,
> but we don't need this code at the moment), then copy and paste the
> disassembly code for that function. I'll have a look.
> 
> Step #2 is to send a simple piece of code that exhibits the problem.
> This will speed up the debugging and fixing process.
> 
>> The reason I want to use TM_NONBLOCK is so that I can send a trigger
>> message from the producer task to the consumer task without requiring a
>> reply to trigger the consumer task to act on the data received. I am
>> using the rt_task_send trigger message to gate and synchronize the
>> consumer task. Is a reply required for all rt_task_send?
>>
>> It seems if I don't send a reply when rt_task_send has a timeout
>> specified the sending task locks up and the listening task runs rampant,
>> i.e. rt_task_receive no longer blocks and the loop runs with no delay
>> and essentially locks up the machine, since I don't use
>> rt_task_set_periodic on the listening task.
>>
>> Here is the trace, and it requires a reboot.  I also can attach the
>> code- it is written in c++ with two separate classes as a model of the
>> application I am building.
>>
>> Thank you,
>>
>> Joshua Karch
>>
>> talker task started
>> rt_task_send error
>> len=-22, opcode=0
>> listener task stBUG: unable to handle kernel NULL pointer dereferencearted
>> rt_task_s at virtual address 0000020c
>> end error
>> len=-printing eip: c014f014 110, opcode=0
>> *pde = 00000000
>> Oops: 0000 [#1] PREEMPT
>> Modules linked in: xeno_timerbench nfs ipv6 nfsd lockd nfs_acl sunrpc
>> exportfs dm_snapshot dm_mirror dm_mod loop pcmcia firmware_class
>> serio_raw psmouse yenta_socket rsrc_nonstatic pcmcia_core cs5535_gpio
>> joydev evdev ext3 jbd mbcache usbhid ide_disk generic ohci_hcd ehci_hcd
>> amd74xx usbcore ide_core e100 mii
>>
>> Pid: 1973, comm: listen_task Not tainted (2.6.24.4 #12)
>> EIP: 0060:[<c014f014>] EFLAGS: 00010093 CPU: 0
>> EIP is at rt_task_receive+0x113/0x18b
>> EAX: cdc71f28 EBX: fffffd98 ECX: c03c5500 EDX: cd820acc
>> ESI: ffffff97 EDI: cd820610 EBP: cdc71edc ESP: cdc71ec4
>>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
>> Process listen_task (pid: 1973, ti=cdc70000 task=cdc6d320
>> task.ti=cdc70000)<0>
>> I-pipe domain Linux
>> Stack: 00000000 cdc71f28 00000000 cdc71ee8 cdc6d320 cdc71fb8 cdc71f4c
>> c01515bf
>>        b7d083ea cdc71f00 c0102cb1 ce46bc70 c03c7e90 cd820620 cdc6d620
>> cd80ff44
>>        c029f55b cdc71f34 00000082 b91264ee 0000002e cdc6d320 b9127ee2
>> 0000002e
>> Call Trace:
>>  [<c01045f1>] show_trace_log_lvl+0x1a/0x2f
>>  [<c01046a3>] show_stack_log_lvl+0x9d/0xa5
>>  [<c0104769>] show_registers+0xbe/0x1fd
>>  [<c01049c1>] die+0x119/0x20a
>>  [<c010ef52>] do_page_fault+0x480/0x57e
>>  [<c010ca34>] __ipipe_handle_exception+0x11e/0x166
>>  [<c02a0e7f>] error_code+0x6f/0x80
>>  [<c01515bf>] __rt_task_receive+0xbe/0x113
>>  [<c0149b0d>] losyscall_event+0x99/0x13d
>>  [<c013f05b>] __ipipe_dispatch_event+0xac/0x16c
>>  [<c010c8b1>] __ipipe_syscall_root+0x6a/0xcf
>>  [<c0103e89>] system_call+0x29/0x4a
>>  =======================
>> Code: 00 00 8d 87 bc 04 00 00 39 c2 0f 95 c0 0f b6 c0 f7 d8 21 d0 31 db
>> 3d 58 02 00 00 74 06 8d 98 98 fd ff ff 8b 45 ec be 97 ff ff ff <8b> 93
>> 74 04 00 00 3b 50 0c 77 23 85 d2 74 19 89 d1 8b b3 70 04
>> EIP: [<c014f014>] rt_task_receive+0x113/0x18b SS:ESP 0068:cdc71ec4
>> ---[ end trace 523bcd2b73b75979 ]---
>>
>>
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Xenomai-help mailing list
>> Xenomai-help@domain.hid
>> https://mail.gna.org/listinfo/xenomai-help
> 
> 
> --
> Philippe.
> 
> 
> 
> 
> 


-- 
Philippe.


[-- Attachment #2: Type: text/html, Size: 46455 bytes --]


end of thread, other threads:[~2008-05-19 13:59 UTC | newest]

Thread overview: 9+ messages
2008-05-07 15:27 [Xenomai-help] rt_task_send / receive problems various issues and bug trace Karch, Joshua
2008-05-07 17:24 ` Philippe Gerum
2008-05-07 19:33   ` Karch, Joshua
2008-05-07 19:35     ` [Xenomai-help] rt_task_send / receive problems various issuesand " Karch, Joshua
2008-05-12 12:36       ` [Xenomai-help] rt_task_send / receive bug from last week and use of rt_task_inquire Karch, Joshua
2008-05-17 21:26         ` Philippe Gerum
2008-05-19 13:59           ` Karch, Joshua
2008-05-17 21:18       ` [Xenomai-help] rt_task_send / receive problems various issuesand bug trace Philippe Gerum
2008-05-19 13:57         ` Karch, Joshua
