* [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed.
@ 2007-08-13 9:45 Roland Tollenaar
2007-08-13 11:41 ` Wolfgang Grandegger
0 siblings, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 9:45 UTC (permalink / raw)
To: EML users, Xenomai-help
Hi,
In a task with a 1 ms period in my Xenomai application, everything works
perfectly with my Beckhoff devices, which I address via EtherCAT using EML,
until I activate rtcan. Then sometimes (not always) I get incessant warnings
from EML which read:
EC_Telegram:: check_index(): Index field does not correspond with
received data.
low_level_input(): framebuilding failed.
The cycle time of the task is not being violated, as far as I can see.
In dmesg the following can be found when the conflict occurs:
RTnet:rtskb allocation from real-time cache failed.
Assertion failed! drivers/xenomai/can/rtcan_raw.c: rtcan_tx_push:168
dev->tx_socket=0 (3) TX skb still in use.
Can anyone make any suggestions as to what might be the problem here or
what I could try to look at to establish this?
Regards.
Roland.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 9:45 [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed Roland Tollenaar
@ 2007-08-13 11:41 ` Wolfgang Grandegger
2007-08-13 12:41 ` Roland Tollenaar
0 siblings, 1 reply; 23+ messages in thread
From: Wolfgang Grandegger @ 2007-08-13 11:41 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users
Roland Tollenaar wrote:
> Hi,
>
> in a 1 ms period task of my xenomai application things work perfectly
> with my BeckHoff devices addressing them via Ethercat with EML until I
> activate rtcan. Then sometimes (not always) I get incessant warnings
> from EML which read:
>
> EC_Telegram:: check_index(): Index field does not correspond with
> received data.
> low_level_input(): framebuilding failed.
>
> The cycle time of the task is not being violated AFAI can see.
>
> In dmesg the following can be found when the conflict occurs
>
> RTnet:rtskb allocation from real-time cache failed.
> Assertion failed! drivers/xenomai/can/rtcan_raw.c: rtcan_tx_push:168
> dev->tx_socket=0 (3) TX skb still in use.
Hm, this is not supposed to happen.
> Can anyone make any suggestions as to what might be the problem here or
> what I could try to look at to establish this?
Can you show the output of /proc/rtcan/devices and /proc/rtcan/sockets
before and after the problem shows up?
Wolfgang.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 11:41 ` Wolfgang Grandegger
@ 2007-08-13 12:41 ` Roland Tollenaar
2007-08-13 13:03 ` Wolfgang Grandegger
0 siblings, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 12:41 UTC (permalink / raw)
To: Wolfgang Grandegger, Xenomai-help, EML users
Hi
>> RTnet:rtskb allocation from real-time cache failed.
>> Assertion failed! drivers/xenomai/can/rtcan_raw.c: rtcan_tx_push:168
>> dev->tx_socket=0 (3) TX skb still in use.
>
> Hm, this is not supposed to happen.
Which of the two?
> Can you show the output of /proc/rtcan/devices and /proc/rtcan/sockets
> before and after the problem showed up.
Below is an accumulation of what I think you are asking for. I am not
convinced that the rtskb allocation failed message is serious: as you
will see from the syslog and my comment above it, it only occurs when I
close my application. Although I try to close all connections neatly,
certain threads still seem to be busy. See the errors I get on closing
the application.
App running with no problem:
root@domain.hid:~# cat /proc/rtcan/sockets
fd Name___________ Filter ErrMask RX_Timeout_ns TX_Timeout_ns RX_BufFull TX_Lo
2 rtcan2 1 0x00000 infinite infinite 0 1
0 rtcan2 -1 0x00000 infinite infinite 0 1
root@domain.hid:~# cat /proc/rtcan/devices
Name___________ _Baudrate State___ TX_Counter RX_Counter ____Errors
rtcan0 undefined stopped 0 0 0
rtcan1 undefined stopped 0 0 0
rtcan2 1000000 active 16321347 27633347 2367116
App running with messages failing
root@domain.hid# cat /proc/rtcan/sockets
fd Name___________ Filter ErrMask RX_Timeout_ns TX_Timeout_ns RX_BufFull TX_Lo
2 rtcan2 1 0x00000 infinite infinite 0 1
0 rtcan2 -1 0x00000 infinite infinite 0 1
root@domain.hid# cat /proc/rtcan/devices
Name___________ _Baudrate State___ TX_Counter RX_Counter ____Errors
rtcan0 undefined stopped 0 0 0
rtcan1 undefined stopped 0 0 0
rtcan2 1000000 active 16850473 28691571 2367116
cat /var/syslog shows that the error only seems to come up when the
application closes.
Only occurs on closing the application
Aug 13 13:01:28 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 13:02:14 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:02:34 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:03:36 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:18:39 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:19:33 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:19:58 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:21:27 (none) kernel: RTnet: rtskb allocation from real-time cache failed
Aug 13 14:22:10 (none) kernel: RTnet: rtskb allocation from real-time cache failed
When I close the application I get these errors
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_recv: aborted because socket was closed
rt_dev_ioctl: Bad file descriptor
Waiting for tasks to stop....low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_output(): Cannot Send
low_level_txandrx: failed: MAX_TRIES_TX: Giving up
DLL::txandrx() Error
PD_Buffer: Error sending PD
txandrx failed:
Does this shed any light on the matter?
Roland
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 12:41 ` Roland Tollenaar
@ 2007-08-13 13:03 ` Wolfgang Grandegger
2007-08-13 13:11 ` Roland Tollenaar
2007-08-13 14:00 ` Roland Tollenaar
0 siblings, 2 replies; 23+ messages in thread
From: Wolfgang Grandegger @ 2007-08-13 13:03 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users
Roland Tollenaar wrote:
> Hi
>
>>> RTnet:rtskb allocation from real-time cache failed.
>>> Assertion failed! drivers/xenomai/can/rtcan_raw.c: rtcan_tx_push:168
>>> dev->tx_socket=0 (3) TX skb still in use.
>>
>> Hm, this is not supposed to happen.
> Which of the two?
The RTCAN assertion. Well, in fact, it can happen when the device goes
bus-off or is stopped while a TX message is pending. The next message
after (re-)start will then trigger this message. This is a bug, but it
should _not_ do any harm (I will either remove the assertion or properly
reset the value of dev->tx_socket).
The first one should be pretty clear: the rtskb pool seems to be exhausted.
>
>> Can you show the output of /proc/rtcan/devices and /proc/rtcan/sockets
>> before and after the problem showed up.
>
> Below is an accumulation of what I think you are asking for. I am not
> convinced that the rtskb allocation failed message is serious, as you
> will see from the syslog and my comment above it only takes place when i
> close my application. Although I try to close all connections neatly
> certain threads still seem to be busy. See the errors I get on closing
> the application.
>
> App running with no problem:
>
> root@domain.hid:~# cat /proc/rtcan/sockets
> fd Name___________ Filter ErrMask RX_Timeout_ns TX_Timeout_ns RX_BufFull TX_Lo
> 2 rtcan2 1 0x00000 infinite infinite 0 1
> 0 rtcan2 -1 0x00000 infinite infinite 0 1
>
> root@domain.hid:~# cat /proc/rtcan/devices
> Name___________ _Baudrate State___ TX_Counter RX_Counter ____Errors
> rtcan0 undefined stopped 0 0 0
> rtcan1 undefined stopped 0 0 0
> rtcan2 1000000 active 16321347 27633347 2367116
>
>
> App running with messages failing
>
> root@domain.hid# cat /proc/rtcan/sockets
> fd Name___________ Filter ErrMask RX_Timeout_ns TX_Timeout_ns RX_BufFull TX_Lo
> 2 rtcan2 1 0x00000 infinite infinite 0 1
> 0 rtcan2 -1 0x00000 infinite infinite 0 1
>
>
> root@domain.hid# cat /proc/rtcan/devices
> Name___________ _Baudrate State___ TX_Counter RX_Counter ____Errors
> rtcan0 undefined stopped 0 0 0
> rtcan1 undefined stopped 0 0 0
> rtcan2 1000000 active 16850473 28691571 2367116
Oops, that many errors?
> cat /var/syslog shows that the error only seems to come up when the
> application closes.
>
> Only occurs on closing the application
> Aug 13 13:01:28 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 13:02:14 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:02:34 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:03:36 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:18:39 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:19:33 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:19:58 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:21:27 (none) kernel: RTnet: rtskb allocation from real-time cache failed
> Aug 13 14:22:10 (none) kernel: RTnet: rtskb allocation from real-time cache failed
>
>
> When I close the application I get these errors
>
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
> rt_dev_recv: aborted because socket was closed
You should handle this error properly.
> rt_dev_ioctl: Bad file descriptor
> Waiting for tasks to stop....low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_output(): Cannot Send
> low_level_txandrx: failed: MAX_TRIES_TX: Giving up
> DLL::txandrx() Error
> PD_Buffer: Error sending PD
> txandrx failed:
>
>
> Does this shed any light on the matter?
Hm, seems that your shutdown is not implemented properly.
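The shutdown order hinted at here can be sketched as follows. This is only an illustration of the sequence (ask the receiver to stop, wait for it, and only then close the socket), written in Python with ordinary threads and sockets rather than the Xenomai/RTDM API used in the application. The names run_receiver and shutdown_demo are hypothetical.

```python
import socket
import threading

def run_receiver(sock, stop, log):
    """Receive until asked to stop; treat a vanished socket as a normal exit."""
    sock.settimeout(0.05)              # wake up regularly to check the stop flag
    while not stop.is_set():
        try:
            data = sock.recv(64)
        except socket.timeout:
            continue                   # nothing received, re-check the stop flag
        except OSError as exc:
            log.append(f"receive aborted: {exc}")
            break                      # socket was closed underneath us
        if not data:
            break                      # peer closed the connection
        log.append(data)

def shutdown_demo():
    a, b = socket.socketpair()
    stop = threading.Event()
    log = []
    worker = threading.Thread(target=run_receiver, args=(b, stop, log))
    worker.start()
    a.send(b"frame")
    stop.set()                         # 1. ask the receiver to stop
    worker.join()                      # 2. wait until it has really finished
    a.close()                          # 3. only now close the sockets
    b.close()
    return worker.is_alive(), log
```

Because the sockets are closed only after the receiver has been joined, the receiver never sees a closed descriptor and no "aborted" entries end up in the log.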
Wolfgang.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 13:03 ` Wolfgang Grandegger
@ 2007-08-13 13:11 ` Roland Tollenaar
2007-08-13 14:00 ` Roland Tollenaar
1 sibling, 0 replies; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 13:11 UTC (permalink / raw)
To: Wolfgang Grandegger; +Cc: Xenomai-help, EML users
Hi Wolfgang,
> The RTCAN assertion. Well, in fact, it can happen when the device goes
> bus-off or is stopped while a TX message is pending. The next message
> after (re-)start will the trigger this message. This is a bug but it
> should _not_ harm (either I remove the assertion or I reset properly the
> value of dev->tx_socket).
Clear. Thanks.
> The first one should be pretty clear. The rtskb pool seems to be exhausted.
Sorry, but this is not clear to me. What is the rtskb pool, and what are
the implications of it being exhausted?
>>
>> root@domain.hid# cat /proc/rtcan/devices
>> Name___________ _Baudrate State___ TX_Counter RX_Counter ____Errors
>> rtcan0 undefined stopped 0 0 0
>> rtcan1 undefined stopped 0 0 0
>> rtcan2 1000000 active 16850473 28691571 2367116
>
> Oops, that much errors?
Er, yes. I started up CAN after having had it disabled for a very long
time while I was working on the EtherCAT side. I seem to have forgotten
that CAN is not wireless and forgot to plug in the bus. So I think those
errors were picked up then; they did not seem to increase later on.
>> rt_dev_recv: aborted because socket was closed
>> rt_dev_recv: aborted because socket was closed
>> rt_dev_recv: aborted because socket was closed
>> rt_dev_recv: aborted because socket was closed
>
> You should handle this error properly.
You are right. I think I am not closing the threads in the correct
sequence, and I am not sure I know how to yet. But can this be the cause
of my problem? Where does the conflict or complication between rtcan and
EML arise? I do understand that this is an almost impossible question to
answer across two separate lists. :(
>
> Hm, seems that your shutdown is not implemented properly.
I'd say this assessment is rather accurate. Will look into it:)
Roland
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 13:03 ` Wolfgang Grandegger
2007-08-13 13:11 ` Roland Tollenaar
@ 2007-08-13 14:00 ` Roland Tollenaar
2007-08-13 14:51 ` [Xenomai-help] [Ethercatmaster-users] " Jan Kiszka
1 sibling, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 14:00 UTC (permalink / raw)
To: Wolfgang Grandegger; +Cc: Xenomai-help, EML users
Hi,
All closing & shutting down has been perfected. There are no more errors
on closing my application.
Yet the problem clearly persists. Rtcan and EML can run
separately and never throw up any errors. As soon as they are used in
combination, the framebuilding in EML gets messed up in 50% of the
cases (as per the error message).
There is definitely something between the two that is not right.
>>>> RTnet:rtskb allocation from real-time cache failed.
Could I get some tips as to what I can do about this? I seem to get it
even when I do not have rtcan activity running in my application and
(because I am clueless) I would like to prevent this message which may
signify the root of the problem.
Regards,
Roland.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 14:00 ` Roland Tollenaar
@ 2007-08-13 14:51 ` Jan Kiszka
2007-08-13 15:55 ` Roland Tollenaar
2007-08-14 13:56 ` Roland Tollenaar
0 siblings, 2 replies; 23+ messages in thread
From: Jan Kiszka @ 2007-08-13 14:51 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users
Roland Tollenaar wrote:
> Hi,
>
> All closing & shutting down has been perfected. There are no more errors
> on closing my application.
>
> Yet the problem persists very explicitly. Rtcan and EML can run
> separately and never throw up any errors. As soon as they are used in
> combination then in 50% of the cases the framebuilding in EML gets
> messed up (as per the error message)
>
> There is definitely something between the two that is not right.
>
In 9 of 10 cases (if not more): timing. Running each one alone may not
expose a timing issue (race) or transient overload that shows up when both
run together. I can't help with EML complaints; maybe the FMTC guys have
an idea what can trigger this and how to debug it.
>
>>>>> RTnet:rtskb allocation from real-time cache failed.
>
> Could I get some tips as to what I can do about this? I seem to get it
> even when I do not have rtcan activity running in my application and
> (because I am clueless) I would like to prevent this message which may
> signify the root of the problem.
You have created the socket for some/all EML activity from primary mode
of some Xenomai thread, thus network buffer allocation has to run
against the real-time rtskb pool, which is empty by default :p. See
README.pools from the RTnet documentation on this.
I don't have the EML design at hand, but you might be able to avoid this
by initialising before creating the shadow task or by explicitly
switching to secondary mode before initialising. [Sorry for this issue,
it's at least partly due to some outdated RTnet design.]
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 14:51 ` [Xenomai-help] [Ethercatmaster-users] " Jan Kiszka
@ 2007-08-13 15:55 ` Roland Tollenaar
2007-08-13 16:57 ` Jan Kiszka
2007-08-14 13:56 ` Roland Tollenaar
1 sibling, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 15:55 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai-help, EML users
Hi
>> There is definitely something between the two that is not right.
>>
>
> In 9 of 10 cases (if not more): timing. Running both alone doesn't
> expose some timing issue (race) or transient overload. I can't help with
> EML complaints, maybe the FMTC guys have an idea what can trigger this
> and how to debug it.
Out of interest, what timing exactly? These two systems (rtcan and EML)
run separately; they don't need to access the same address space or
otherwise share resources that would require timing. What am I not
understanding?
>>>>>> RTnet:rtskb allocation from real-time cache failed.
>> Could I get some tips as to what I can do about this? I seem to get it
>> even when I do not have rtcan activity running in my application and
>> (because I am clueless) I would like to prevent this message which may
>> signify the root of the problem.
>
> You have created the socket for some/all EML activity from primary mode
> of some Xenomai thread,
100% correct.
> thus network buffer allocation has to run
> against the real-time rtskb pool - which is by default empty :p. See
> README.pools from the RTnet documentation on this.
> I don't have the EML design at hand, but you might be able to avoid this
> by initialising before creating the shadow task or by explicitly
In fact this is what I tried initially. It does not work at all, so I
ended up initializing in the thread. Is that a problem?
Is this allocation possibly the cause of the problem or is the rtnet
warning harmless? At least it is not related to rtcan in any manner
because it appears even if rtcan is not activated in the application.
Roland
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 15:55 ` Roland Tollenaar
@ 2007-08-13 16:57 ` Jan Kiszka
2007-08-13 17:40 ` Roland Tollenaar
0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2007-08-13 16:57 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users
Roland Tollenaar wrote:
> Hi
>
>>> There is definitely something between the two that is not right.
>>>
>>
>> In 9 of 10 cases (if not more): timing. Running both alone doesn't
>> expose some timing issue (race) or transient overload. I can't help with
>> EML complaints, maybe the FMTC guys have an idea what can trigger this
>> and how to debug it.
> Out of interest, what timing exactly? These two systems (rtcan and eml)
> run separately, they don;t need to access the same address space or
> otherwise share resources that would require timing? What am I not
> understanding?
They share the same CPU? Varying the load can change the execution
order of otherwise independent components.
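As a rough illustration of this point, the classic fixed-priority response-time iteration shows how the completion time of one task shifts once a second, higher-priority activity shares the CPU. All numbers below are hypothetical and not measured on this setup, and response_time is an illustrative helper, not part of Xenomai, RTnet or EML.

```python
import math

def response_time(cost_us, higher_prio, deadline_us):
    """Worst-case response time of a task under fixed-priority preemptive
    scheduling. higher_prio is a list of (cost_us, period_us) tuples.
    Returns None if the result would exceed deadline_us."""
    r = cost_us
    while r <= deadline_us:
        # Add the execution time of every higher-priority release within r.
        nxt = cost_us + sum(math.ceil(r / period) * cost
                            for cost, period in higher_prio)
        if nxt == r:
            return r                   # fixed point reached
        r = nxt
    return None

# Hypothetical 1 ms cyclic task needing 300 us of CPU time per cycle:
alone = response_time(300, [], 1000)                # 300
shared = response_time(300, [(100, 500)], 1000)     # 400
```

With these assumed figures the cyclic task still meets its 1 ms period, but it finishes 100 us later than when running alone, which is enough to change the order in which otherwise independent components see their data.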
>
>
>>>>>>> RTnet:rtskb allocation from real-time cache failed.
>>> Could I get some tips as to what I can do about this? I seem to get
>>> it even when I do not have rtcan activity running in my application
>>> and (because I am clueless) I would like to prevent this message
>>> which may signify the root of the problem.
>>
>> You have created the socket for some/all EML activity from primary mode
>> of some Xenomai thread,
> 100% correct.
>
>
> thus network buffer allocation is ought to run
>> against the real-time rtskb pool - which is by default empty :p. See
>> README.pools from the RTnet documentation on this.
>
>
>> I don't have the EML design at hand, but you might be able to avoid this
>> by initialising before creating the shadow task or by explicitly
> In fact this is what I tried initially. IT does not work at all. so I
> ended up initializing in the thread. Problem?
Not necessarily. But it would have been nice to report the other issue
as well, because maybe there is something to be fixed (either in the
code or in the docs). Initialisation almost always happens in non-RT
context, and you shouldn't be forced to do this under RT constraints. If
this is an RTnet and/or EML problem, please report it on the related lists!
>
> Is this allocation possibly the cause of the problem or is the rtnet
> warning harmless? At least it is not related to rtcan in any manner
> because it appears even if rtcan is not activated in the application.
Did you set the rtskb_cache_size module parameter for rtnet.ko? Did
you choose it large enough that the buffer pool is not exhausted if
RTnet is blocked by other system activity? Again, check the documentation.
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 16:57 ` Jan Kiszka
@ 2007-08-13 17:40 ` Roland Tollenaar
2007-08-13 17:57 ` Jan Kiszka
0 siblings, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 17:40 UTC (permalink / raw)
To: Jan Kiszka, Xenomai-help, rtnet-users, EML users
Hi Jan,
>> thus network buffer allocation is ought to run
>>> against the real-time rtskb pool - which is by default empty :p. See
>>> README.pools from the RTnet documentation on this.
I read this documentation. Together with an archived email from this list, I
understand that if I load rtnet.ko like
insmod rtnet.ko rtskb_cache_size=64
(for the benefit of other poor souls in the future :))
it should help. And it does make a huge difference. Now, instead of not
giving a problem 1 out of 5 times, it's more like giving a problem 1 in
every 10 times.
The 64 is a value I got from the mailing list. How large can I make this
and what am I compromising?
>>> I don't have the EML design at hand, but you might be able to avoid this
>>> by initialising before creating the shadow task or by explicitly
>> In fact this is what I tried initially. IT does not work at all. so I
>> ended up initializing in the thread. Problem?
>
> Not necessarily. But it would have been nice to report the other issue
> as well, because maybe there is something to be fixed (either in the
> code or in the docs). Initialisation almost always happens in non-RT
> context, and you shouldn't be force to do this under RT constraints. If
> this is an RTnet and/or EML problem, please report it on the related lists!
Will do so with your compliments and regards. :) I tried to initialize
like I initialize rtcan in non-rt but it really does not work.
> Did you set the rtskb_cache_size module parameter for the rtnet.ko? Did
> you choose it appropriately large so that buffer pool do not exhaust if
> RTnet is blocked by other system activity? Again, check the documentation.
As stated, this seems to mitigate the problem. What is not clear to me
is why the default of the rtskb pool is zero?
Roland.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 17:40 ` Roland Tollenaar
@ 2007-08-13 17:57 ` Jan Kiszka
2007-08-13 18:17 ` Roland Tollenaar
0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2007-08-13 17:57 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users, rtnet-users
Roland Tollenaar wrote:
> Hi Jan,
>
>
>>> thus network buffer allocation is ought to run
>>>> against the real-time rtskb pool - which is by default empty :p. See
>>>> README.pools from the RTnet documentation on this.
> I read this documentation. Together with an archive email of this list I
> understand that if I load rtnet.ko like
>
> insmod rtnet.ko rtskb_cache_size=64
>
> (for the benfit of other poor souls in the future :))
>
> it should help. And it does make a huge difference. Now instead of not
> giving a problem 1 out of 5 times its more like giving a problem 1 every
> 10 times.
>
> The 64 is a value I got from the mailing list. How large can I make this
> and what am I compromising?
Each buffer is slightly more than 1.5 KB in size. Do your maths :). How
many buffers you need depends on how many incoming and outgoing frames
might be queued until they are processed. And that depends on the frame
rate and the time your EML stack needs to handle them in the worst case. I
can't give you numbers on this; that depends on _your_ setup.
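The arithmetic described above can be sketched as follows. The per-buffer size of 1536 bytes is an assumption (a lower bound for "slightly more than 1.5 KB"), the 5 ms worst-case delay is a hypothetical figure, and buffers_needed and pool_memory_kib are illustrative helpers, not RTnet functions.

```python
import math

RTSKB_BYTES = 1536  # assumption: lower bound for "slightly more than 1.5 KB"

def buffers_needed(frames_per_s, worst_case_delay_ms, directions=2):
    """Frames that can pile up while the stack is delayed, counted once
    for incoming and once for outgoing traffic (directions=2)."""
    queued = math.ceil(frames_per_s * worst_case_delay_ms / 1000)
    return queued * directions

def pool_memory_kib(n_buffers, buffer_bytes=RTSKB_BYTES):
    """Approximate memory taken by a pool of n_buffers rtskbs, in KiB."""
    return n_buffers * buffer_bytes / 1024

# One frame per 1 ms cycle (1000 frames/s), stack delayed by up to 5 ms:
needed = buffers_needed(1000, 5)       # 10 buffers
cost = pool_memory_kib(64)             # 96.0 KiB for rtskb_cache_size=64
```

Under these assumptions a pool of 64 buffers costs on the order of 100 KiB, so choosing a generous value is cheap compared with running the pool dry.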
>
>
>>>> I don't have the EML design at hand, but you might be able to avoid
>>>> this
>>>> by initialising before creating the shadow task or by explicitly
>>> In fact this is what I tried initially. IT does not work at all. so I
>>> ended up initializing in the thread. Problem?
>>
>> Not necessarily. But it would have been nice to report the other issue
>> as well, because maybe there is something to be fixed (either in the
>> code or in the docs). Initialisation almost always happens in non-RT
>> context, and you shouldn't be force to do this under RT constraints. If
>> this is an RTnet and/or EML problem, please report it on the related
>> lists!
> Will do so with your compliments and regards. :) I tried to initialize
> like I initialize rtcan in non-rt but it really does not work.
That sounds like a bug, in whichever component it may be.
>
>
>> Did you set the rtskb_cache_size module parameter for the rtnet.ko? Did
>> you choose it appropriately large so that buffer pool do not exhaust if
>> RTnet is blocked by other system activity? Again, check the
>> documentation.
> As stated, this seems to mitigate the problem. What is not clear to me
> is why the default of the rtskb pool is zero?
Because you _normally_ don't need it and would thus waste the allocated
memory.
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 17:57 ` Jan Kiszka
@ 2007-08-13 18:17 ` Roland Tollenaar
2007-08-13 18:30 ` Jan Kiszka
0 siblings, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-13 18:17 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai-help, EML users, rtnet-users
Hi,
>> The 64 is a value I got from the mailing list. How large can I make this
>> and what am I compromising?
>
> Each buffer is slightly more than 1.5 KB heavy. Do your maths :). How
> many buffers you need depend on how many incoming and outgoing frames
> might be queued into they are processed. And that depends on the frame
> rate and the time your EML stack has to handle it in the worst case. I
> can't give you numbers on this, that depends on _your_ setup.
That much is clear. Will make it big out of sheer inability to
calculate. The lost memory is of almost no concern.
Thanks.
Roland
>
> Jan
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 18:17 ` Roland Tollenaar
@ 2007-08-13 18:30 ` Jan Kiszka
0 siblings, 0 replies; 23+ messages in thread
From: Jan Kiszka @ 2007-08-13 18:30 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users, rtnet-users
Roland Tollenaar wrote:
> Hi,
>
>>> The 64 is a value I got from the mailing list. How large can I make this
>>> and what am I compromising?
>>
>> Each buffer is slightly more than 1.5 KB heavy. Do your maths :). How
>> many buffers you need depend on how many incoming and outgoing frames
>> might be queued into they are processed. And that depends on the frame
s/into/until/ (my brain-based dictionary must be broken)
>> rate and the time your EML stack has to handle it in the worst case. I
>> can't give you numbers on this, that depends on _your_ setup.
>
> That much is clear. Will make it big out of shear inability to
> calculate. The lost memory is of almost no concern.
Just make sure that picking an arbitrarily large pool size doesn't paper
over some real system design issue that may manifest in huge latencies.
Again, I don't know your numbers, so I cannot tell what is reasonable
and what is an indication of a problem. A system-level analysis of the
event flows would be a good job for LTTng now - if it only worked already...
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-13 14:51 ` [Xenomai-help] [Ethercatmaster-users] " Jan Kiszka
2007-08-13 15:55 ` Roland Tollenaar
@ 2007-08-14 13:56 ` Roland Tollenaar
2007-08-14 14:47 ` Klaas Gadeyne
1 sibling, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-14 13:56 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai-help, EML users, rtnet-users
Hi,
Some new insight:
I have included the rtnet list because I think this is of interest and I
don't know exactly where the problem arises. Most likely it is in EML,
which goes unnecessarily haywire when Xenomai has a latency glitch. That
glitch seems to be caused by rtcan but may also come from rtnet (so the
rest of you can stop reading here :) )
To refresh: I am running EML and rtcan together. Separately they
appear to function perfectly, but when combined, EML sometimes starts
emitting the error:
EC_Telegram::index_check() : index not the same,
low_level_input() : framebuilding failed.
Or something pretty close to that effect.
Despite EML warning that the frame could not be built, the inputs (which
rely pretty stringently on the framebuilding being successful) seem to
function perfectly. It seems as though EML is emitting a false warning.
I have now silenced the warning from EML with the following hack:
In EC_Telegram::check_index() I have effectively disabled the check by
suppressing the message and always returning true. Now the application no
longer emits warnings and everything functions well (or so it seems). I
have a sawtooth analog output on a scope which triggers well. It
fluttered sometimes when EML emitted the warning. Presumably the
fluttering is caused by some latency, which I thought might be the result
of EML emitting the warnings. However, even without the warnings from EML
this fluttering occasionally takes place. Generally I can increase the
chance of this flutter (I will confirm this with a latency test later
on) by clicking about in the user interface of my application (Qt).
So EML seems to go haywire when there is latency and there seems to be
latency when I am using rtcan. (Bit of latency can also be caused by
clicking about but I think it is mainly rtcan causing the latency spikes
and the clicking about just knocks it over the edge more often.)
I have also made the two indices that supposedly differ visible. For
the EML chaps: these are index and m_idx, and the output looks something
like this
index m_idx
0 1
0 0
1 2
2 3
2 3
3 4
4 5
: :
: :
253 254
0 2
or something to that effect. Can anyone comment on this?
P.S.
Everything else is sorted out now: the application closes all sockets
neatly, there are no buffer overruns, no errors in syslog etc.
Also, I have managed to move the initialization of the socket into the
non-real-time context, but the problem persists in exactly the same manner.
I have increased the rtskb_cache_size. The problem occurs less
frequently but certainly does not subside completely, irrespective of
how big I make it after that. There is no longer any mention of a problem
in this regard in any of the logs (dmesg, syslog etc.)
Roland.
These are the comments that have been made which may be relevant:
Jan Kiszka wrote:
> In 9 of 10 cases (if not more): timing. Running both alone doesn't
> expose some timing issue (race) or transient overload. I can't help with
> EML complaints, maybe the FMTC guys have an idea what can trigger this
> and how to debug it.
>>>>>> RTnet:rtskb allocation from real-time cache failed.
> You have created the socket for some/all EML activity from primary mode
> of some Xenomai thread, thus network buffer allocation is ought to run
> against the real-time rtskb pool - which is by default empty :p. See
> README.pools from the RTnet documentation on this.
Although this was a problem
>
> I don't have the EML design at hand, but you might be able to avoid this
> by initialising before creating the shadow task or by explicitly
> switching to secondary mode before initialising. [Sorry for this issue,
> it's at least partly due to some outdated RTnet design.]
>
> Jan
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-14 13:56 ` Roland Tollenaar
@ 2007-08-14 14:47 ` Klaas Gadeyne
2007-08-14 18:03 ` Roland Tollenaar
0 siblings, 1 reply; 23+ messages in thread
From: Klaas Gadeyne @ 2007-08-14 14:47 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users, rtnet-users
On Tue, 14 Aug 2007, Roland Tollenaar wrote:
> To refresh, I am running EML and rtcan together and separately they
> appear to function perfectly but when combined, EML sometimes starts
> ejecting the error:
How many threads do you have sending process data, and what are their
priorities? (/proc/xenomai/sched IIRC)
> EC_Telegram::index_check() : index not the same,
> low_level_input() : framebuilding failed.
>
> Or something pretty closely to that effect.
>
> Despite EML warning that the frame could not be built, the inputs (which
> rely on the framebuilding being succesful pretty stringently) seem to
> function perfectly. It seems as though EML is emitting a false warning.
What do you mean with "inputs functioning perfectly"?
> I have now killed the warning from EML with the following hack:
>
> In EC_Telegram::check_index() I have effectively killed the check by
> snubbing the holler and always returning true. Now the application no
> longer emits warnings and everything functions well (or so it seems). I
> have a sawtooth analog output on a scope which triggers well.
Should I have read s/input/output/g in the above?
> It
> fluttered sometimes when EML emitted the warning. Presumably the
> fluttering is caused by some latency which I thought might be the result
> of EML emitting the warnings. However even without the warnings from EML
> this fluttering incidentally takes place. Generally I can increase the
> chance of this flutter (I will confirm this with a latency test later
> on) by clicking about in the user interface of my application (QT).
>
>
> So EML seems to go haywire when there is latency and there seems to be
> latency when I am using rtcan. (Bit of latency can also be caused by
> clicking about but I think it is mainly rtcan causing the latency spikes
> and the clicking about just knocks it over the edge more often.)
>
> I have also made the index that is supposedly not the same visible. For
> the EML chaps: index and m_idx and the output seems to be something like
> this
>
> index m_idx
> 0 1
> 0 0
> 1 2
> 2 3
> 2 3
> 3 4
> 4 5
> : :
> : :
> 253 254
> 0 2
>
> or something to that effect. Can anyone comment on this?
Is the above the index in the output captured with wireshark or
something else?
AFAIS from the code, this shouldn't be affected by latency of the PD
thread. You might uncomment the following log statements to get more
info too (I wonder why they are commented out anyway, Tom?)
static bool ec_rtdm_txandrx(struct EtherCAT_Frame *frame, struct netif *netif)
{
    int tries = 0;

    /* Retry the send/receive pair up to MAX_TRIES_TX times. */
    while (tries < MAX_TRIES_TX) {
        pthread_mutex_lock(&txandrx_mut);
        if (low_level_output(frame, netif)) {
            if (low_level_input(frame, netif)) {
                pthread_mutex_unlock(&txandrx_mut);
                return true;
            } else {
                //log(EC_LOG_ERROR, "low_level_txandrx: receiving failed\n");
                pthread_mutex_unlock(&txandrx_mut);
            }
        } else {
            //log(EC_LOG_ERROR, "low_level_txandrx: sending failed\n");
            pthread_mutex_unlock(&txandrx_mut);
        }
        tries++;
    }
    log(EC_LOG_FATAL, "low_level_txandrx: failed: MAX_TRIES_TX: Giving up\n");
    return false;
}
Klaas
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-14 14:47 ` Klaas Gadeyne
@ 2007-08-14 18:03 ` Roland Tollenaar
2007-08-14 19:17 ` Jan Kiszka
0 siblings, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-14 18:03 UTC (permalink / raw)
To: Klaas Gadeyne; +Cc: Xenomai-help, EML users, rtnet-users
Hi,
> How many threads do you have sending process data, and what are there
> priorities? (/proc/xenomai/sched IIRC)
I have 3 rt tasks running. Only one sends and receives process data. The
priorities are:
rt_task1 99
rt_task2 75
rt_task3 1
Period times are
1ms
3ms
indefinite(holds a blocking rt_can recv call to catch any incoming CAN
messages)
>
>> EC_Telegram::index_check() : index not the same,
>> low_level_input() : framebuilding failed.
>>
>> Or something pretty closely to that effect.
>>
>> Despite EML warning that the frame could not be built, the inputs (which
>> rely on the framebuilding being succesful pretty stringently) seem to
>> function perfectly. It seems as though EML is emitting a false warning.
>
> What do you mean with "inputs functioning perfectly"?
The digital inputs are packed into the frame, as are the digital outputs
and analog output process data. The outputs function as they should, but
the warning complains mainly about the retrieving part of the EtherCAT
cycle. Hence my comment that the digital inputs also function as they
should, that is to say the data arrives correctly and uncorrupted. AFAI
understand from ETG, the index is not changed by the ESCs, so I would
expect the check to always return true. But even if it does not, what does
that mean? Can it mean that EML is losing some frames that have been
transmitted? I.e. the index is incremented with every transmit and the
message with the same index is expected on the next read, but instead it
is only getting one later? If so, what could cause this?
>> I have now killed the warning from EML with the following hack:
>>
>> In EC_Telegram::check_index() I have effectively killed the check by
>> snubbing the holler and always returning true. Now the application no
>> longer emits warnings and everything functions well (or so it seems). I
>> have a sawtooth analog output on a scope which triggers well.
>
> Should I have read s/input/output/g in the above?
No, output. The output is incremented in task 1 and reset to -10V
when it reaches 10V. The steps take place at exactly 1ms with impressive
accuracy and consistency. This sawtooth wave is so stable that my scope
has no problem locking onto it and it remains stationary (allowing me to
admire the accurate 1ms steps :) ). Now and then, if I try hard, the wave
flickers (i.e. the triggering is lost), which I think indicates that the
1ms task did not increment the analog output on time. Whatever happens
to cause this (rtcan? We know that it displays some latency-breaking
behaviour when the buffer is full, but there is no longer any evidence of
a buffer overflow at present) might make an Ethernet frame
get lost. Ah! Or rather a delayed read ****, which causes the index shift
and consequently the irritating messages.
**** It occurs to me to ask whether there is an incoming buffer for
EtherCAT frames that is maybe read out in such a manner (e.g. FIFO) that
a message can get "buried", so that once a slip has occurred the index
shift stays resident. If anyone would care to enlighten me on how
this part of EML works and whether this hypothesis is a possibility or
not I would be much obliged.
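
The hypothesis can be sketched with a toy model in C. Everything here is an assumption made for illustration: the queue, the function names and the 8-bit wrapping index are hypothetical and do not describe how EML or RTnet actually buffer frames. It only shows that, if replies are read oldest-first, a single reply that arrives after its read leaves a permanent off-by-one, and that discarding stale frames removes it again.

```c
#include <stddef.h>
#include <stdint.h>

#define RXQ_SLOTS 8u

/* Toy receive FIFO holding only the index byte of each returned frame. */
struct rx_fifo {
    uint8_t idx[RXQ_SLOTS];
    size_t head;  /* next slot to read */
    size_t count; /* frames waiting */
};

static void rx_push(struct rx_fifo *q, uint8_t index)
{
    if (q->count == RXQ_SLOTS)
        return; /* queue full: the newest frame is dropped */
    q->idx[(q->head + q->count) % RXQ_SLOTS] = index;
    q->count++;
}

static int rx_pop(struct rx_fifo *q, uint8_t *index)
{
    if (q->count == 0)
        return 0;
    *index = q->idx[q->head];
    q->head = (q->head + 1u) % RXQ_SLOTS;
    q->count--;
    return 1;
}

/* Naive read: take the oldest frame and compare its index. */
static int read_naive(struct rx_fifo *q, uint8_t expected)
{
    uint8_t got;
    return rx_pop(q, &got) && got == expected;
}

/* Resynchronising read: discard stale frames until the expected index
 * turns up or the queue runs empty. */
static int read_resync(struct rx_fifo *q, uint8_t expected)
{
    uint8_t got;
    while (rx_pop(q, &got)) {
        if (got == expected)
            return 1;
    }
    return 0;
}

/* Run `cycles` send/receive cycles. In cycle `late_cycle` the reply only
 * arrives after the read. Returns the number of failed index checks. */
static unsigned simulate(unsigned cycles, unsigned late_cycle, int resync)
{
    struct rx_fifo q = { {0}, 0, 0 };
    unsigned failures = 0;
    unsigned c;

    for (c = 0; c < cycles; c++) {
        uint8_t sent = (uint8_t)c; /* index wraps after 255 */
        int late = (c == late_cycle);
        int ok;

        if (!late)
            rx_push(&q, sent); /* reply is back before the read */
        ok = resync ? read_resync(&q, sent) : read_naive(&q, sent);
        if (!ok)
            failures++;
        if (late)
            rx_push(&q, sent); /* reply shows up after the read */
    }
    return failures;
}
```

In this model the naive reader fails in every cycle from the late one onwards, while the resynchronising reader fails exactly once. Whether the real receive path behaves like this is exactly the open question.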
>> I have also made the index that is supposedly not the same visible. For
>> the EML chaps: index and m_idx and the output seems to be something like
>> this
>>
>> index m_idx
>> 0 1
>> 0 0
>> 1 2
>> 2 3
>> 2 3
>> 3 4
>> 4 5
>> : :
>> : :
>> 253 254
>> 0 2
>>
>> or something to that effect. Can anyone comment on this?
>
> Is the above the index in the output captured with wireshark or
> something else?
No. EML and switches are no friends of each other when the frame gets
long, I suspect. EML bombs out when I introduce a switch. I would be much
obliged to anyone who would tell me how to set up any timeout delay
measurement in EML.
What I did was simply put a printf line into EML to output index and
m_idx to screen. So unfortunately this does not tell us where the shift
is acquired.
Thanks. I'll give below a try.
Roland.
>
> AFAIS from the code shouldn't be affected by latency of the PD
> thread. You might uncomment the following log statements to get more
> info too (I wonder why they are commented out anyway, Tom?)
>
> static bool ec_rtdm_txandrx(struct EtherCAT_Frame *frame, struct netif *netif)
> {
>     int tries = 0;
>
>     while (tries < MAX_TRIES_TX) {
>         pthread_mutex_lock(&txandrx_mut);
>         if (low_level_output(frame, netif)) {
>             if (low_level_input(frame, netif)) {
>                 pthread_mutex_unlock(&txandrx_mut);
>                 return true;
>             } else {
>                 //log(EC_LOG_ERROR, "low_level_txandrx: receiving failed\n");
>                 pthread_mutex_unlock(&txandrx_mut);
>             }
>         } else {
>             //log(EC_LOG_ERROR, "low_level_txandrx: sending failed\n");
>             pthread_mutex_unlock(&txandrx_mut);
>         }
>         tries++;
>     }
>     log(EC_LOG_FATAL, "low_level_txandrx: failed: MAX_TRIES_TX: Giving up\n");
>     return false;
> }
>
> Klaas
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-14 18:03 ` Roland Tollenaar
@ 2007-08-14 19:17 ` Jan Kiszka
2007-08-15 6:11 ` Roland Tollenaar
0 siblings, 1 reply; 23+ messages in thread
From: Jan Kiszka @ 2007-08-14 19:17 UTC (permalink / raw)
To: rolandtollenaar; +Cc: EML users, rtnet-users, Xenomai-help
Roland Tollenaar wrote:
> Hi,
>
>> How many threads do you have sending process data, and what are there
>> priorities? (/proc/xenomai/sched IIRC)
> I have 3 rt tasks running. Only one sends and receives process data. The
> priorities are:
> rt_task1 99
Check the /proc output again, there should be also RTnet's stack manager
at prio 98. Maybe that is too low for your scenario and causes prio
inversions (note: every incoming Ethernet frame goes through its hands).
Try lowering the prio of your rt_task1 beneath 98.
> rt_task2 75
> rt_task3 1
>
> Period times are
> 1ms
> 3ms
> indefinite(holds a blocking rt_can recv call to catch any incoming CAN
> messages)
>
>
>>> EC_Telegram::index_check() : index not the same,
>>> low_level_input() : framebuilding failed.
>>>
>>> Or something pretty closely to that effect.
>>>
>>> Despite EML warning that the frame could not be built, the inputs (which
>>> rely on the framebuilding being succesful pretty stringently) seem to
>>> function perfectly. It seems as though EML is emitting a false warning.
>> What do you mean with "inputs functioning perfectly"?
>
> The digital inputs are packed into the frame as are the digital outputs
> and analog output process data. The outputs function as they should but
> the warning complains mainly about the retrieving part of the ethercat
> cycle. Hence my comment that the digital inputs also function as they
> should that is to say the data arrives correctly and uncorrupted. AFAI
> understand from ETG the index is not changed by the ESC's so I would
> expect the check always return true. But even if it does not what does
> that mean? Can it mean that EML is losing some frames that have been
> transmitted? I.e. the index is incremented with every transmit and the
> message with the same index is expected on the next read but instead it
> is only getting one later? If so, what could cause this?
If the problem persists (or you _really_ want to understand what
happens), you could try to put an xntrace_user_freeze(0, 1) before the
line which emits that EML warning, turn on the I-pipe tracer, set a
large back_trace_points value (a few thousand), enable verbose mode, and
grab what /proc/ipipe/trace/frozen reports after the hiccup. See [1]
for more howtos.
If you post the dump, we may be able to analyse what the system is doing
before the problem report, e.g. whether there are long delays due to
high-prio tasks.
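
A minimal sketch of where such a freeze call could go, in C. The function check_index_traced() is a hypothetical stand-in for the EML check, not its real code, and HAVE_XENOMAI_TRACE is a made-up build switch. The stub only exists so that the sketch builds without Xenomai; on a real system it would be replaced by the Xenomai header that declares xntrace_user_freeze.

```c
#include <stdio.h>

#ifndef HAVE_XENOMAI_TRACE /* hypothetical build switch */
/* Stand-in so the sketch builds without Xenomai; it only counts calls. */
static int freeze_calls;

static int xntrace_user_freeze(unsigned long v, int once)
{
    (void)v;
    (void)once;
    freeze_calls++;
    return 0;
}
#endif

/* Hypothetical stand-in for the check that emits the EML warning.
 * Returns 1 if the indices match, 0 otherwise. */
static int check_index_traced(unsigned char received, unsigned char expected)
{
    if (received != expected) {
        /* Freeze first, so the back trace ends at the mismatch and is
         * not pushed out by the logging that follows. */
        xntrace_user_freeze(0, 1);
        fprintf(stderr, "check_index: got %u, expected %u\n",
                (unsigned)received, (unsigned)expected);
        return 0;
    }
    return 1;
}
```

The point of the ordering is that the trace should capture what happened before the mismatch was detected, not the act of reporting it.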
Jan
[1] http://www.xenomai.org/index.php/I-pipe:Tracer
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-14 19:17 ` Jan Kiszka
@ 2007-08-15 6:11 ` Roland Tollenaar
2007-08-15 8:24 ` Jan Kiszka
0 siblings, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-15 6:11 UTC (permalink / raw)
To: Jan Kiszka; +Cc: EML users, rtnet-users, Xenomai-help
Hi,
> Check the /proc output again, there should be also RTnet's stack manager
> at prio 98. Maybe that is too low for your scenario and causes prio
> inversions (note: every incoming Ethernet frame goes through its hands).
> Try lowering the prio of your rt_task1 beneath 98.
Thanks. This seems to have made a big improvement. With the index_check
in EML still disabled, I have so far not once seen the scope lose lock
on the sawtooth. Before lowering the priority of my
task (to 97) I could still provoke what I suspect to be a latency spike.
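
For reference, a sketch of that priority choice in C using plain POSIX thread attributes. This is an assumption for illustration: the thread does not show how the application creates its tasks, and the helper name is hypothetical. The value 98 for RTnet's stack manager is the one quoted above.

```c
#include <pthread.h>
#include <sched.h>

/* Priority of RTnet's stack manager as quoted in this thread. */
#define RTNET_STACK_MGR_PRIO 98

/* Prepare attributes for a SCHED_FIFO thread at `prio`.
 * Returns 0 on success, -1 if `prio` would not sit below the stack
 * manager, or a positive errno value from the pthread_attr_* calls. */
static int rt_attr_below_stack_mgr(pthread_attr_t *attr, int prio)
{
    struct sched_param sp;
    int err;

    if (prio >= RTNET_STACK_MGR_PRIO)
        return -1;

    sp.sched_priority = prio;
    err = pthread_attr_init(attr);
    if (err != 0)
        return err;

    /* Use the settings below instead of inheriting the creator's. */
    err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (err == 0)
        err = pthread_attr_setschedparam(attr, &sp);
    if (err != 0)
        pthread_attr_destroy(attr);
    return err;
}
```

The attributes would then be passed to pthread_create() for the 1 ms task, so that every incoming Ethernet frame can be handled by the stack manager before the task that waits for it runs.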
If the index_check is enabled I now mostly have fewer problems too. There
is a chance at start-up of the application that there is a latency spike
and then the warning kicks in. Because the shift is
permanent, the error is persistent and this then destabilizes the
sawtooth a bit.
I will keep the check disabled but for the EML chaps I do think this is
a point of interest. I would be very interested how this index shift
occurs and why it is persistent after occurring once.
Sorry for the pragmatic qualifications here, but in the end it's the
stability of the outputs that will determine the behaviour of the
machine, so it's not a bad way to assess the software. :)
> If the problem persists (or your _really_ want to understand what
> happens), you could try to put an xntrace_user_freeze(0, 1) before the
> line which emits that EML warning, turn on the I-pipe tracer, set a
> large back_trace_points value (a few thousand), enable verbose mode, and
> grab what /proc/ipipe/trace/frozen reports after the hick-up. See [1]
> for more howtos.
Done this before so it should not be a problem. Don't think it is
necessary quite yet as the behaviour at the moment looks good.
Regards,
Roland.
>
> If you post the dump, we may be able to analyse what the system is doing
> before the problem report, if there are long delays due to high-prio
> tasks e.g.
>
> Jan
>
> [1] http://www.xenomai.org/index.php/I-pipe:Tracer
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-15 6:11 ` Roland Tollenaar
@ 2007-08-15 8:24 ` Jan Kiszka
2007-08-15 8:37 ` Roland Tollenaar
2007-08-15 9:50 ` Roland Tollenaar
0 siblings, 2 replies; 23+ messages in thread
From: Jan Kiszka @ 2007-08-15 8:24 UTC (permalink / raw)
To: rolandtollenaar; +Cc: EML users, rtnet-users, Xenomai-help
Roland Tollenaar wrote:
> Hi,
>
>> Check the /proc output again, there should be also RTnet's stack manager
>> at prio 98. Maybe that is too low for your scenario and causes prio
>> inversions (note: every incoming Ethernet frame goes through its hands).
>> Try lowering the prio of your rt_task1 beneath 98.
>
> Thanks. This seems to have made a big improvement. I have so far not
> once detected the scope to loose lock on the sawtooth when the
> index_check in eml is still disabled. Before lowering the priority of my
> task (to 97) I could still invoke what I suspect to be a latency spike.
>
> If the index_check is enabled I now mostly have less problems too. There
> is a chance in start-up of the application that there is a latency spike
> and then the warning kicks in. Due to the fact that the shift is
> permanent, the error is persistent and this then destabilizes the
> sawtooth a bit.
Hmm, this doesn't convince me yet. Such skews during startup may as well
be triggered by unusual load during runtime (non-RT activity or new RT
components). Did you put your system under adequate non-RT load as well
while measuring the outputs?
>
> I will keep the check disabled but for the EML chaps I do think this is
> a point of interest. I would be very interested how this index shift
> occurs and why it is persistent after occurring once.
>
> Sorry for the pragmatic qualifications here but in the end its the
> stability of the outputs that will determine the behaviour of the
> machine so its not a bad way to assess the software. :)
A problem isn't solved until it is also understood.
>
>> If the problem persists (or your _really_ want to understand what
>> happens), you could try to put an xntrace_user_freeze(0, 1) before the
>> line which emits that EML warning, turn on the I-pipe tracer, set a
>> large back_trace_points value (a few thousand), enable verbose mode, and
>> grab what /proc/ipipe/trace/frozen reports after the hick-up. See [1]
>> for more howtos.
>
> Done this before so it should not be a problem. Don't think it is
In that case, I would suggest even more strongly that you collect the
data, maybe now for the fragile startup case.
> necessary quite yet as the behaviour at the moment looks good.
Jan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-15 8:24 ` Jan Kiszka
@ 2007-08-15 8:37 ` Roland Tollenaar
2007-08-15 9:50 ` Roland Tollenaar
1 sibling, 0 replies; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-15 8:37 UTC (permalink / raw)
To: Jan Kiszka; +Cc: EML users, rtnet-users, Xenomai-help
> Hmm, this doesn't convince me yet. Such skews during startup may as well
> be triggered by unusual load during runtime (non-RT activity or new RT
> components). Did you put your system under adequate non-RT load as well
> while measuring the outputs?
Could you please just remind me how to do this again? Or can I just run
the latency test? It has dummy loading in it, does it not?
>> Sorry for the pragmatic qualifications here but in the end its the
>> stability of the outputs that will determine the behaviour of the
>> machine so its not a bad way to assess the software. :)
>
> A problem isn't solved until it is also understood.
You are so right. :(
>
>>> If the problem persists (or your _really_ want to understand what
>>> happens), you could try to put an xntrace_user_freeze(0, 1) before the
>>> line which emits that EML warning, turn on the I-pipe tracer, set a
>>> large back_trace_points value (a few thousand), enable verbose mode, and
>>> grab what /proc/ipipe/trace/frozen reports after the hick-up. See [1]
>>> for more howtos.
>> Done this before so it should not be a problem. Don't think it is
>
> In that case, I would even more suggest to collect the data, maybe now
> about the fragile startup case.
Have got it on my todo list. :)
Roland.
>
>> necessary quite yet as the behaviour at the moment looks good.
>
> Jan
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-15 8:24 ` Jan Kiszka
2007-08-15 8:37 ` Roland Tollenaar
@ 2007-08-15 9:50 ` Roland Tollenaar
2007-08-15 10:30 ` Wolfgang Grandegger
1 sibling, 1 reply; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-15 9:50 UTC (permalink / raw)
To: Jan Kiszka; +Cc: EML users, rtnet-users, Xenomai-help
Hi,
Some more interesting findings (no i-pipe trace yet though).
> Hmm, this doesn't convince me yet. Such skews during startup may as well
> be triggered by unusual load during runtime (non-RT activity or new RT
> components). Did you put your system under adequate non-RT load as well
> while measuring the outputs?
Running latencytest with my application shows an average latency of
about 40 and a max of 200ns. This was rather shocking so I turned off
rtcan in my application. Now the max latency is 60ns. Turn off EML and
turn on rtcan, and the max latency is 230ns. How is that for strange?
During the latency test I can see the scope output bobbing by 200ns, and
I can also see that if I run my application without the latency test the
huge max latency disappears entirely. Maybe it is time for the trace, but
then again I am still using CAN over the parallel port, so I will see what
it does on a machine with a PCI CAN adaptor first. Because I think I
know what happens: due to the external loading, the CAN recv interrupt
triggers the Rx ISR briefly before the 1ms task period ends. Due to the
priority of the ISR (huge debate over this) and its atomicity (if I
remember correctly), the reading out of the slow hardware delays the
start of the new task period.
Just thought it was interesting to mention. Btw when the latency appears
there are no overflow messages or anything like that which would support
the theory I have about the cause.
Btw2 the 200ns latency spikes do not cause the scope to lose lock on
the sawtooth, so whatever causes that problem is of a different nature
still.
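
A delayed start of the task period like this can also be measured from inside the task. The following C sketch is an assumption for illustration, not part of the latency test or of the application: it takes wake-up timestamps in nanoseconds (for example from clock_gettime() with CLOCK_MONOTONIC, recorded at the top of each cycle) and reports how far the observed periods deviate from the nominal one.

```c
#include <stddef.h>
#include <stdint.h>

struct jitter_stats {
    int64_t avg_ns; /* mean absolute deviation from the nominal period */
    int64_t max_ns; /* largest absolute deviation */
};

/* Compute period jitter from `n` consecutive wake-up timestamps.
 * With fewer than two timestamps both fields are returned as 0. */
static struct jitter_stats jitter_from_wakeups(const int64_t *wake_ns,
                                               size_t n, int64_t period_ns)
{
    struct jitter_stats s = { 0, 0 };
    int64_t sum = 0;
    size_t i;

    if (n < 2)
        return s;

    for (i = 1; i < n; i++) {
        /* Deviation of this period from the nominal one. */
        int64_t dev = (wake_ns[i] - wake_ns[i - 1]) - period_ns;

        if (dev < 0)
            dev = -dev;
        sum += dev;
        if (dev > s.max_ns)
            s.max_ns = dev;
    }
    s.avg_ns = sum / (int64_t)(n - 1);
    return s;
}
```

Keeping every value in nanoseconds internally and converting only when printing helps avoid unit mix-ups. Note that this measures period jitter, which is related to but not the same as the figure the latency test reports.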
Regards,
Roland.
>
>> I will keep the check disabled but for the EML chaps I do think this is
>> a point of interest. I would be very interested how this index shift
>> occurs and why it is persistent after occurring once.
>>
>> Sorry for the pragmatic qualifications here but in the end its the
>> stability of the outputs that will determine the behaviour of the
>> machine so its not a bad way to assess the software. :)
>
> A problem isn't solved until it is also understood.
>
>>> If the problem persists (or your _really_ want to understand what
>>> happens), you could try to put an xntrace_user_freeze(0, 1) before the
>>> line which emits that EML warning, turn on the I-pipe tracer, set a
>>> large back_trace_points value (a few thousand), enable verbose mode, and
>>> grab what /proc/ipipe/trace/frozen reports after the hick-up. See [1]
>>> for more howtos.
>> Done this before so it should not be a problem. Don't think it is
>
> In that case, I would even more suggest to collect the data, maybe now
> about the fragile startup case.
>
>> necessary quite yet as the behaviour at the moment looks good.
>
> Jan
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-15 9:50 ` Roland Tollenaar
@ 2007-08-15 10:30 ` Wolfgang Grandegger
2007-08-15 10:30 ` Roland Tollenaar
0 siblings, 1 reply; 23+ messages in thread
From: Wolfgang Grandegger @ 2007-08-15 10:30 UTC (permalink / raw)
To: rolandtollenaar; +Cc: Xenomai-help, EML users, Jan Kiszka, rtnet-users
Roland Tollenaar wrote:
> Hi,
>
> Some more interesting findings (no i-pipe trace yet though).
>
>> Hmm, this doesn't convince me yet. Such skews during startup may as well
>> be triggered by unusual load during runtime (non-RT activity or new RT
>> components). Did you put your system under adequate non-RT load as well
>> while measuring the outputs?
> Running latencytest with my application shows an average latency of
> about 40 and a max of 200ns. This was rather shocking so I turned off
> rtcan in my application. Now the max latecy is 60ns. Turn off EML and
> turn on rtcan, max latecy is 230ns. How is that for strange? But since I
> can see the scope output bobbing with 200ns during the latency test, I
> can also see that if I run my application without the latency test the
> huge max latency disappears entirely. Maybe it is time for the trace but
> then again I am still using CAN over the parallel port so will see what
> it does on a machine with a PCI CAN adaptor first. Because I think I
> know what happens: Due to the external loading the CAN recv interrupt
> triggers the Rx ISR briefly before the 1ms task period ends. Due to the
> priority of the ISR (huge debate over this) and its atomicness (if I
> remember correctly) the reading out of the slow hardware delays the
> start of the new task period.
>
> Just thought it was interesting to mention. Btw when the latency appears
> there are no overflow messages or anything like that which support the
> theory I have about the cause.
>
> Btw2 the 200ns latency spikes do not cause the scope to loose lock on
> the saw-tooth so whatever causes that problem is of a different nature
> still.
s/ns/us/ ?
Wolfgang.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Xenomai-help] [Ethercatmaster-users] EML conflict with RTCAN? low_level_input framebuilding failed.
2007-08-15 10:30 ` Wolfgang Grandegger
@ 2007-08-15 10:30 ` Roland Tollenaar
0 siblings, 0 replies; 23+ messages in thread
From: Roland Tollenaar @ 2007-08-15 10:30 UTC (permalink / raw)
To: Wolfgang Grandegger; +Cc: Xenomai-help, EML users, Jan Kiszka, rtnet-users
>>
>> Btw2 the 200ns latency spikes do not cause the scope to loose lock on
>> the saw-tooth so whatever causes that problem is of a different nature
>> still.
>
> s/ns/us/ ?
Indeed. Sorry.
Roland
>
> Wolfgang.
>
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2007-08-15 10:30 UTC | newest]
Thread overview: 23+ messages
2007-08-13 9:45 [Xenomai-help] EML conflict with RTCAN? low_level_input framebuilding failed Roland Tollenaar
2007-08-13 11:41 ` Wolfgang Grandegger
2007-08-13 12:41 ` Roland Tollenaar
2007-08-13 13:03 ` Wolfgang Grandegger
2007-08-13 13:11 ` Roland Tollenaar
2007-08-13 14:00 ` Roland Tollenaar
2007-08-13 14:51 ` [Xenomai-help] [Ethercatmaster-users] " Jan Kiszka
2007-08-13 15:55 ` Roland Tollenaar
2007-08-13 16:57 ` Jan Kiszka
2007-08-13 17:40 ` Roland Tollenaar
2007-08-13 17:57 ` Jan Kiszka
2007-08-13 18:17 ` Roland Tollenaar
2007-08-13 18:30 ` Jan Kiszka
2007-08-14 13:56 ` Roland Tollenaar
2007-08-14 14:47 ` Klaas Gadeyne
2007-08-14 18:03 ` Roland Tollenaar
2007-08-14 19:17 ` Jan Kiszka
2007-08-15 6:11 ` Roland Tollenaar
2007-08-15 8:24 ` Jan Kiszka
2007-08-15 8:37 ` Roland Tollenaar
2007-08-15 9:50 ` Roland Tollenaar
2007-08-15 10:30 ` Wolfgang Grandegger
2007-08-15 10:30 ` Roland Tollenaar