* [Xenomai-help] rtserial interface stalls
From: Vikesh Rambaran @ 2009-03-20 10:56 UTC
To: xenomai-help
Hi
We have the following setup :
Hardware : Intel Core Duo, Quatech PCI 200/300 4 channel serial card
Software : Ubuntu 8.04, Linux 2.6.24, Xenomai 2.4.5.
Application : 3 tasks with periods of 100 us (Task 1), 20 ms (Task 2) and 1 s (Task 3)
Tasks 1 and 3 are 'empty' at the moment. Task 2 transmits and receives
multiple messages on rtser0 and rtser1.
Serial ports are configured as follows:
serialConfig.config_mask       = 0xFFFF;
serialConfig.baud_rate         = configPtr->BitRate;      // = 115200
serialConfig.parity            = configPtr->Parity;       // = None
serialConfig.data_bits         = configPtr->DataBits;     // = 8
serialConfig.stop_bits         = configPtr->StopBits;     // = 1
serialConfig.handshake         = configPtr->FlowControl;  // = None
serialConfig.fifo_depth        = RTSER_DEF_FIFO_DEPTH;
serialConfig.rx_timeout        = RTDM_TIMEOUT_NONE;       // RTSER_DEF_TIMEOUT;
serialConfig.tx_timeout        = RTSER_DEF_TIMEOUT;
serialConfig.event_timeout     = RTSER_DEF_TIMEOUT;
serialConfig.timestamp_history = RTSER_RX_TIMESTAMP_HISTORY;
serialConfig.event_mask        = RTSER_DEF_EVENT_MASK;
The application runs for hours/days with a loopback on the PCI card connector.
Problem
-------
However, when connected to a second PC or to an external test unit, task 2
stops after a few minutes. The other tasks remain active.
/proc/xenomai/ while running
seeker@domain.hid$ cat stat
CPU  PID   MSW  CSW       PF  STAT       %CPU  NAME
0    0     0    43778348  0   00500080   82.9  ROOT/0
1    0     0    0         0   00500080  100.0  ROOT/1
0    8284  1    280862    0   00300184    5.0  ASU_SYNC
0    8285  2    3946      0   00300184    0.2  HILS
0    8286  1    33        0   00300184    0.0  DEBUG
0    0     0    8429      0   00000000    4.1  IRQ16: rtser0
1    0     0    0         0   00000000    0.0  IRQ16: rtser0
0    0     0    96987     0   00000000    6.1  IRQ16: rtser1
1    0     0    0         0   00000000    0.0  IRQ16: rtser1
0    0     0    45948236  0   00000000    1.2  IRQ233: [timer]
1    0     0    2102218   0   00000000    0.0  IRQ233: [timer]
seeker@domain.hid$ cat sched
CPU  PID   PRI  PERIOD      TIMEOUT    TIMEBASE  STAT  NAME
0    0     -1   0           0          master    R     ROOT/0
1    0     -1   0           0          master    R     ROOT/1
0    8284  3    100000      51111      master    D     ASU_SYNC
0    8285  2    20000000    13078512   master    D     HILS
0    8286  1    1000000000  553400299  master    D     DEBUG
/proc/xenomai/ when task stalls
seeker@domain.hid$ cat sched
CPU  PID   PRI  PERIOD      TIMEOUT    TIMEBASE  STAT  NAME
0    0     -1   0           0          master    R     ROOT/0
1    0     -1   0           0          master    R     ROOT/1
0    6223  3    100000      67714      master    D     ASU_SYNC
0    6224  2    20000000    0          master    W     HILS
0    6225  1    1000000000  669570391  master    D     DEBUG
seeker@domain.hid$ cat stat
CPU  PID   MSW  CSW       PF  STAT       %CPU  NAME
0    0     0    37141696  0   00500080   85.1  ROOT/0
1    0     0    0         0   00500080  100.0  ROOT/1
0    6223  1    16318556  0   00300184    4.9  ASU_SYNC
0    6224  1    9506      0   00300182    0.0  HILS
0    6225  1    1756      0   00300184    0.0  DEBUG
0    0     0    20694     0   00000000    2.8  IRQ16: rtser0
1    0     0    0         0   00000000    0.0  IRQ16: rtser0
0    0     0    5471547   0   00000000    5.5  IRQ16: rtser1
1    0     0    0         0   00000000    0.0  IRQ16: rtser1
0    0     0    39087824  0   00000000    1.2  IRQ233: [timer]
1    0     0    1891732   0   00000000    0.0  IRQ233: [timer]
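The only difference in the STAT column between the two dumps is HILS going from 00300184 to 00300182, which lines up with the sched output changing from D to W. A minimal sketch of how those low bits can be read, assuming the bit values below match the Xenomai 2.4 nucleus thread-state flags (the values are an assumption to be checked against the installed headers, and blocking_state is a hypothetical helper, not part of Xenomai):

```c
/* Assumed Xenomai 2.4 nucleus thread-state bits; verify against the
   nucleus headers of the installed version before relying on them. */
#define XNSUSP  0x00000001UL  /* suspended */
#define XNPEND  0x00000002UL  /* pending on a resource (sched shows 'W') */
#define XNDELAY 0x00000004UL  /* delayed on a timer (sched shows 'D') */

/* Hypothetical helper: describe the blocking condition encoded in the
   low bits of a STAT value taken from /proc/xenomai/stat. */
const char *blocking_state(unsigned long stat)
{
    if (stat & XNPEND)
        return "pending on a resource";
    if (stat & XNDELAY)
        return "delayed (timed wait)";
    if (stat & XNSUSP)
        return "suspended";
    return "not blocked";
}
```

Under that assumption, 00300184 decodes as a task sleeping until its next period and 00300182 as a task waiting on a resource, which is consistent with HILS being stuck in the write to rtser0 rather than in its periodic wait.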
Alternative tried (after a bit of debugging)
-----------------
Changed serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
to serialConfig.tx_timeout = RTDM_TIMEOUT_NONE;
Task 2 then runs continuously. Writing to rtser0 returns a valid number of
bytes written, but no data appears on the serial port pin. At the same time
rtser1 functions normally.
/proc/xenomai/stat shows rtser0 CSW incrementing. Disconnecting rtser0
stops CSW from incrementing.
Looks like the rtser0 tx interrupt gets 'lost' somehow and never
recovers. Restarting the application restores communication again, for a
while ...
Has anyone else experienced a similar situation?
Any suggestions on how to trace this further would be greatly
appreciated.
Thanx
Vicki
[PS Downloaded the latest Xenomai and kernel. Going to build and test
that later]
* Re: [Xenomai-help] rtserial interface stalls
From: Jan Kiszka @ 2009-03-20 11:46 UTC
To: Vikesh Rambaran; +Cc: xenomai-help
Vikesh Rambaran wrote:
> Hi
>
> We have the following setup :
>
> Hardware : Intel Core Duo, Quatech PCI 200/300 4 channel serial card
Do you see the same issue when using only one core?
> Software : Ubuntu 8.04, Linux 2.6.24, Xenomai 2.4.5.
>
> Application : 3 tasks running at (Task1) 100uS (Task2) 20mS (Task3) 1S
>
> Tasks 1 and 3 are 'empty' at the moment. Task 2 transmits and receives
> multiple messages on rtser0 and rtser1
>
> Serial ports are configured as follows
>
> serialConfig.config_mask = 0xFFFF;
> serialConfig.baud_rate = configPtr->BitRate; //= 115200
> serialConfig.parity = configPtr->Parity; //= None
> serialConfig.data_bits = configPtr->DataBits; //= 8
> serialConfig.stop_bits = configPtr->StopBits; //= 1
> serialConfig.handshake = configPtr->FlowControl;//= None
> serialConfig.fifo_depth = RTSER_DEF_FIFO_DEPTH;
> serialConfig.rx_timeout = RTDM_TIMEOUT_NONE;//RTSER_DEF_TIMEOUT;
> serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
> serialConfig.event_timeout = RTSER_DEF_TIMEOUT;
> serialConfig.timestamp_history= RTSER_RX_TIMESTAMP_HISTORY;
> serialConfig.event_mask = RTSER_DEF_EVENT_MASK;
>
>
> Application runs for hours/days with loop back on PCI card connector
>
>
> Problem
> -------
>
> However, between two PCs or external test unit, task 2 stops after a
> few minutes. The other tasks are still active.
>
>
> /proc/xenomai/ while running
>
>
> seeker@domain.hid$ cat stat
> CPU PID MSW CSW PF STAT %CPU NAME
> 0 0 0 43778348 0 00500080 82.9 ROOT/0
> 1 0 0 0 0 00500080 100.0 ROOT/1
> 0 8284 1 280862 0 00300184 5.0 ASU_SYNC
> 0 8285 2 3946 0 00300184 0.2 HILS
> 0 8286 1 33 0 00300184 0.0 DEBUG
> 0 0 0 8429 0 00000000 4.1 IRQ16: rtser0
> 1 0 0 0 0 00000000 0.0 IRQ16: rtser0
> 0 0 0 96987 0 00000000 6.1 IRQ16: rtser1
> 1 0 0 0 0 00000000 0.0 IRQ16: rtser1
> 0 0 0 45948236 0 00000000 1.2 IRQ233: [timer]
> 1 0 0 2102218 0 00000000 0.0 IRQ233: [timer]
> seeker@domain.hid$ cat sched
> CPU PID PRI PERIOD TIMEOUT TIMEBASE STAT NAME
> 0 0 -1 0 0 master R ROOT/0
> 1 0 -1 0 0 master R ROOT/1
> 0 8284 3 100000 51111 master D ASU_SYNC
> 0 8285 2 20000000 13078512 master D HILS
> 0 8286 1 1000000000 553400299 master D DEBUG
>
>
> /proc/xenomai/ when task stalls
>
> seeker@domain.hid$ cat sched
> CPU PID PRI PERIOD TIMEOUT TIMEBASE STAT NAME
> 0 0 -1 0 0 master R ROOT/0
> 1 0 -1 0 0 master R ROOT/1
> 0 6223 3 100000 67714 master D ASU_SYNC
> 0 6224 2 20000000 0 master W HILS
> 0 6225 1 1000000000 669570391 master D DEBUG
>
> seeker@domain.hid$ cat stat
> CPU PID MSW CSW PF STAT %CPU NAME
> 0 0 0 37141696 0 00500080 85.1 ROOT/0
> 1 0 0 0 0 00500080 100.0 ROOT/1
> 0 6223 1 16318556 0 00300184 4.9 ASU_SYNC
> 0 6224 1 9506 0 00300182 0.0 HILS
> 0 6225 1 1756 0 00300184 0.0 DEBUG
> 0 0 0 20694 0 00000000 2.8 IRQ16: rtser0
> 1 0 0 0 0 00000000 0.0 IRQ16: rtser0
> 0 0 0 5471547 0 00000000 5.5 IRQ16: rtser1
> 1 0 0 0 0 00000000 0.0 IRQ16: rtser1
> 0 0 0 39087824 0 00000000 1.2 IRQ233: [timer]
> 1 0 0 1891732 0 00000000 0.0 IRQ233: [timer]
>
>
> Alternative tried (after a bit of debugging)
> -----------------
>
> Changed serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
> to serialConfig.tx_timeout = RTDM_TIMEOUT_NONE;
>
> Task 2 then runs continuously. Writing to rtser0 returns valid number of
> bytes written but no data appears on serial port pin. At the same time
> rtser1 functions normally.
Without feedback from the device about its tx queue state you may
quickly overload it this way (definitely if written bytes > fifo length).
>
> /proc/xenomai/stat shows rtser0 CSW incrementing. Disconnecting rtser0
> stops CSW from incrementing.
>
> Looks like the rtser0 tx interrupt gets 'lost' somehow and never
> recovers. Restarting the application restores communication again, for a
> while ...
>
>
> Has any one else experienced a similar situation ?
Not with the current versions. But there are many factors that may
influence the situation.
>
> Any suggestions on how to trace this further, would be greatly
> appreciated.
First of all, it would in fact be good to rule out issues of the old
kernel/ipipe combination w.r.t. IRQ handling by giving the latest versions a
try (2.6.28 + Xenomai 2.4.7). Then you may want to consider setting up a
tracer:
The ipipe function tracer (see Xenomai wiki) would make sense when you
can identify a failure very quickly (within a few 100 us or so) and
trigger a stop. That is required as the ipipe tracer works on the lowest
level (kernel functions) and quickly fills up its circular buffer with
new events.
The LTTng tracer provides a higher level view on the problem and could
perfectly run over a longer period. See related postings on this list
for details (I'm currently maintaining a 2.6.28 port for ipipe, see also
git.kiszka.org).
Once you have a picture of what goes on in the kernel generally, you may
add ad-hoc instrumentations to driver or kernel (or we can discuss where
to add them) to find out what actually happens.
Jan
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
* Re: [Xenomai-help] rtserial interface stalls
From: Vikesh Rambaran @ 2009-03-20 15:38 UTC
To: Jan Kiszka; +Cc: xenomai-help
On Fri, 2009-03-20 at 12:46 +0100, Jan Kiszka wrote:
> Vikesh Rambaran wrote:
> > Hi
> >
> > We have the following setup :
> >
> > Hardware : Intel Core Duo, Quatech PCI 200/300 4 channel serial card
>
> Do you see the same issue when using only one core?
>
Set the CPU affinity to 1 using echo 1 > /proc/xenomai/affinity
(also tried echo 2 > /proc/xenomai/affinity)
I have confirmed that the tasks do get assigned to the specified cpu.
Also reverted to serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
as per your suggestion below.
Task 2 still ends up waiting on the write to rtser0.
> > Software : Ubuntu 8.04, Linux 2.6.24, Xenomai 2.4.5.
> >
> > Application : 3 tasks running at (Task1) 100uS (Task2) 20mS (Task3) 1S
> >
> > Tasks 1 and 3 are 'empty' at the moment. Task 2 transmits and receives
> > multiple messages on rtser0 and rtser1
> >
> > Serial ports are configured as follows
> >
> > serialConfig.config_mask = 0xFFFF;
> > serialConfig.baud_rate = configPtr->BitRate; //= 115200
> > serialConfig.parity = configPtr->Parity; //= None
> > serialConfig.data_bits = configPtr->DataBits; //= 8
> > serialConfig.stop_bits = configPtr->StopBits; //= 1
> > serialConfig.handshake = configPtr->FlowControl;//= None
> > serialConfig.fifo_depth = RTSER_DEF_FIFO_DEPTH;
> > serialConfig.rx_timeout = RTDM_TIMEOUT_NONE;//RTSER_DEF_TIMEOUT;
> > serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
> > serialConfig.event_timeout = RTSER_DEF_TIMEOUT;
> > serialConfig.timestamp_history= RTSER_RX_TIMESTAMP_HISTORY;
> > serialConfig.event_mask = RTSER_DEF_EVENT_MASK;
> >
> >
> > Application runs for hours/days with loop back on PCI card connector
> >
> >
> > Problem
> > -------
> >
> > However, between two PCs or external test unit, task 2 stops after a
> > few minutes. The other tasks are still active.
> >
> >
> > /proc/xenomai/ while running
> >
> >
> > seeker@domain.hid$ cat stat
> > CPU PID MSW CSW PF STAT %CPU NAME
> > 0 0 0 43778348 0 00500080 82.9 ROOT/0
> > 1 0 0 0 0 00500080 100.0 ROOT/1
> > 0 8284 1 280862 0 00300184 5.0 ASU_SYNC
> > 0 8285 2 3946 0 00300184 0.2 HILS
> > 0 8286 1 33 0 00300184 0.0 DEBUG
> > 0 0 0 8429 0 00000000 4.1 IRQ16: rtser0
> > 1 0 0 0 0 00000000 0.0 IRQ16: rtser0
> > 0 0 0 96987 0 00000000 6.1 IRQ16: rtser1
> > 1 0 0 0 0 00000000 0.0 IRQ16: rtser1
> > 0 0 0 45948236 0 00000000 1.2 IRQ233: [timer]
> > 1 0 0 2102218 0 00000000 0.0 IRQ233: [timer]
> > seeker@domain.hid$ cat sched
> > CPU PID PRI PERIOD TIMEOUT TIMEBASE STAT NAME
> > 0 0 -1 0 0 master R ROOT/0
> > 1 0 -1 0 0 master R ROOT/1
> > 0 8284 3 100000 51111 master D ASU_SYNC
> > 0 8285 2 20000000 13078512 master D HILS
> > 0 8286 1 1000000000 553400299 master D DEBUG
> >
> >
> > /proc/xenomai/ when task stalls
> >
> > seeker@domain.hid$ cat sched
> > CPU PID PRI PERIOD TIMEOUT TIMEBASE STAT NAME
> > 0 0 -1 0 0 master R ROOT/0
> > 1 0 -1 0 0 master R ROOT/1
> > 0 6223 3 100000 67714 master D ASU_SYNC
> > 0 6224 2 20000000 0 master W HILS
> > 0 6225 1 1000000000 669570391 master D DEBUG
> >
> > seeker@domain.hid$ cat stat
> > CPU PID MSW CSW PF STAT %CPU NAME
> > 0 0 0 37141696 0 00500080 85.1 ROOT/0
> > 1 0 0 0 0 00500080 100.0 ROOT/1
> > 0 6223 1 16318556 0 00300184 4.9 ASU_SYNC
> > 0 6224 1 9506 0 00300182 0.0 HILS
> > 0 6225 1 1756 0 00300184 0.0 DEBUG
> > 0 0 0 20694 0 00000000 2.8 IRQ16: rtser0
> > 1 0 0 0 0 00000000 0.0 IRQ16: rtser0
> > 0 0 0 5471547 0 00000000 5.5 IRQ16: rtser1
> > 1 0 0 0 0 00000000 0.0 IRQ16: rtser1
> > 0 0 0 39087824 0 00000000 1.2 IRQ233: [timer]
> > 1 0 0 1891732 0 00000000 0.0 IRQ233: [timer]
> >
> >
> > Alternative tried (after a bit of debugging)
> > -----------------
> >
> > Changed serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
> > to serialConfig.tx_timeout = RTDM_TIMEOUT_NONE;
> >
> > Task 2 then runs continuously. Writing to rtser0 returns valid number of
> > bytes written but no data appears on serial port pin. At the same time
> > rtser1 functions normally.
>
> Without feedback from the device about its tx queue state you may
> quickly overload it this way (definitely if written bytes > fifo length).
>
> >
The idea is to write data into the device's circular buffer and return
immediately. If there is not enough room in the buffer, I expected
the write call to return an error code or fewer bytes than were
requested. That would indicate a buffer overrun condition, which can
be flagged at application level. This way the task is not delayed
and other important functionality can be executed in a deterministic
way.
The data transmitted on each serial channel is less than 150 bytes at
115200 bit/s, with the task having a fixed period of 20 ms. This should not
overflow the default 4k buffers of the 16550A driver.
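The claim above can be checked with a little arithmetic. A minimal sketch, assuming 8N1 framing (one start bit, eight data bits, one stop bit, as in the configuration shown earlier); tx_time_us is a hypothetical helper introduced only for this illustration:

```c
/* Time in microseconds needed to shift `nbytes` out of a UART running at
   `baud` bit/s, assuming 8N1 framing: 1 start + 8 data + 1 stop = 10 bits
   on the wire per byte. */
double tx_time_us(unsigned nbytes, unsigned baud)
{
    const double bits_per_byte = 10.0;
    return nbytes * bits_per_byte * 1e6 / baud;
}
```

With these assumptions, tx_time_us(150, 115200) comes to roughly 13 ms, so a 150-byte message fits inside the 20 ms period and the transmit buffer should drain completely between writes as long as the transmitter keeps running.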
Well, that's the plan :) Did I perhaps misunderstand the implementation
of the tx_timeout?
> > /proc/xenomai/stat shows rtser0 CSW incrementing. Disconnecting rtser0
> > stops CSW from incrementing.
> >
> > Looks like the rtser0 tx interrupt gets 'lost' somehow and never
> > recovers. Restarting the application restores communication again, for a
> > while ...
> >
> >
> > Has any one else experienced a similar situation ?
>
> Not with the current versions. But there are many factors that may
> influence the situation.
>
> >
> > Any suggestions on how to trace this further, would be greatly
> > appreciated.
>
Will implement the rest of your suggestions and provide feedback.
> First of all, it would in fact be good to rule out issues of the old
> kernel/ipipe combination w.r.t. IRQ handling by giving the latest versions a
> try (2.6.28 + Xenomai 2.4.7). Then you may want to consider setting up a
> tracer:
>
> The ipipe function tracer (see Xenomai wiki) would make sense when you
> can identify a failure very quickly (within a few 100 us or so) and
> trigger a stop. That is required as the ipipe tracer works on the lowest
> level (kernel functions) and quickly fills up its circular buffer with
> new events.
>
> The LTTng tracer provides a higher level view on the problem and could
> perfectly run over a longer period. See related postings on this list
> for details (I'm currently maintaining a 2.6.28 port for ipipe, see also
> git.kiszka.org).
>
> Once you have a picture of what goes on in the kernel generally, you may
> add ad-hoc instrumentations to driver or kernel (or we can discuss where
> to add them) to find out what actually happens.
>
> Jan
>
* Re: [Xenomai-help] rtserial interface stalls
From: Jan Kiszka @ 2009-03-21 9:02 UTC
To: Vikesh Rambaran; +Cc: xenomai-help
Vikesh Rambaran wrote:
> On Fri, 2009-03-20 at 12:46 +0100, Jan Kiszka wrote:
>> Vikesh Rambaran wrote:
>>> ...
>>> Alternative tried (after a bit of debugging)
>>> -----------------
>>>
>>> Changed serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
>>> to serialConfig.tx_timeout = RTDM_TIMEOUT_NONE;
>>>
>>> Task 2 then runs continuously. Writing to rtser0 returns valid number of
>>> bytes written but no data appears on serial port pin. At the same time
>>> rtser1 functions normally.
>> Without feedback from the device about its tx queue state you may
>> quickly overload it this way (definitely if written bytes > fifo length).
>>
>
> The idea is to write data into the devices' circular buffer and return
> immediately. If there is not enough place in the buffer, i expected
> the write call to return an error code or fewer bytes than that which
> was requested. That would indicate a buffer overrun condition which can
> be flagged at application level. This way the task will not be delayed
> and other important functionality can be executed in a deterministic
> way.
>
> The data transmitted on each serial channel is less than 150 bytes at
> 115200kb/s with the task having a fixed period of 20mS. This should not
> overflow default 4k buffers of the 16550A driver.
>
> Well that's the plan:) Did i perhaps misunderstand the implementation
> for the tx_timeout ?
>
No, I agree your approach is valid, and even non-blocking write should
behave as you expected. I actually forgot that the uart driver is that
smart - at least in theory. Something else is fishy.
When you switch to non-blocking, is there no data at all written, or
does transmission stop roughly around where blocking write would stop
the writer?
Jan
* Re: [Xenomai-help] rtserial interface stalls
From: vikesh rambaran @ 2009-03-21 16:26 UTC
To: Jan Kiszka; +Cc: xenomai-help
On Sat, Mar 21, 2009 at 11:02 AM, Jan Kiszka <jan.kiszka@domain.hid> wrote:
> Vikesh Rambaran wrote:
> > On Fri, 2009-03-20 at 12:46 +0100, Jan Kiszka wrote:
> >> Vikesh Rambaran wrote:
> >>> ...
> >>> Alternative tried (after a bit of debugging)
> >>> -----------------
> >>>
> >>> Changed serialConfig.tx_timeout = RTSER_DEF_TIMEOUT;
> >>> to serialConfig.tx_timeout = RTDM_TIMEOUT_NONE;
> >>>
> >>> Task 2 then runs continuously. Writing to rtser0 returns valid number of
> >>> bytes written but no data appears on serial port pin. At the same time
> >>> rtser1 functions normally.
> >> Without feedback from the device about its tx queue state you may
> >> quickly overload it this way (definitely if written bytes > fifo length).
> >>
> >
> > The idea is to write data into the devices' circular buffer and return
> > immediately. If there is not enough place in the buffer, i expected
> > the write call to return an error code or fewer bytes than that which
> > was requested. That would indicate a buffer overrun condition which can
> > be flagged at application level. This way the task will not be delayed
> > and other important functionality can be executed in a deterministic
> > way.
> >
> > The data transmitted on each serial channel is less than 150 bytes at
> > 115200kb/s with the task having a fixed period of 20mS. This should not
> > overflow default 4k buffers of the 16550A driver.
> >
> > Well that's the plan:) Did i perhaps misunderstand the implementation
> > for the tx_timeout ?
> >
>
> No, I agree your approach is valid, and even non-blocking write should
> behave as you expected. I actually forgot that the uart driver is that
> smart - at least in theory. Something else is fishy.
:) Yes the uart driver is brimming with features !
>
> When you switch to non-blocking, is there no data at all written, or
> does transmission stop roughly around where blocking write would stop
> the writer?
>
I came across a *stupid* mistake that I made in displaying the error code
of the write function! Apologies for the noise on that one.
The return codes are exactly as we would expect in non-blocking mode.
After running normally for a while, the write function for rtser0 returns a
positive value that is less than the requested number of bytes. Thereafter
the function returns -EAGAIN for consecutive writes. This occurs at the same
place where the blocking write stops the task.
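The return-code pattern described above (a short write followed by -EAGAIN) can be turned into an application-level flag. A minimal sketch, assuming the write call returns either a byte count or a negative errno value as reported in this thread; the enum and the classify_write helper are hypothetical names, not part of the RTDM API:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical classification of a non-blocking serial write result. */
enum tx_status {
    TX_OK,       /* all requested bytes were queued */
    TX_PARTIAL,  /* buffer filled up part-way through the message */
    TX_FULL,     /* nothing queued: buffer already full (-EAGAIN) */
    TX_ERROR     /* any other negative return code */
};

/* `ret` is the value returned by the write call, `requested` the number
   of bytes the caller asked to send. */
enum tx_status classify_write(long ret, size_t requested)
{
    if (ret == -EAGAIN)
        return TX_FULL;
    if (ret < 0)
        return TX_ERROR;
    if ((size_t)ret < requested)
        return TX_PARTIAL;
    return TX_OK;
}
```

A periodic task could count consecutive TX_PARTIAL or TX_FULL results and raise a stall flag once the count exceeds what a healthy transmitter could explain, without ever blocking in the write.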
I've also displayed the error codes returned from the read calls. On some
occasions, just prior to the above error, the read call returns -EIO, with
LSR = 0x02 (overrun) and MSR = 0xFB.
Need to spend some time understanding that one ...
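For reference, the two register values can be decoded with the standard 16550 bit layout. A minimal sketch; the macro and function names are chosen for this illustration and are not taken from the Xenomai driver:

```c
#include <stdint.h>

/* 16550 Line Status Register (LSR) error bits. */
#define LSR_OE   0x02  /* overrun error */
#define LSR_PE   0x04  /* parity error */
#define LSR_FE   0x08  /* framing error */
#define LSR_BI   0x10  /* break interrupt */

/* 16550 Modem Status Register (MSR) bits. */
#define MSR_DCTS 0x01  /* CTS changed since last read */
#define MSR_DDSR 0x02  /* DSR changed since last read */
#define MSR_TERI 0x04  /* trailing edge of RI seen */
#define MSR_DDCD 0x08  /* DCD changed since last read */
#define MSR_CTS  0x10  /* current CTS state */
#define MSR_DSR  0x20  /* current DSR state */
#define MSR_RI   0x40  /* current RI state */
#define MSR_DCD  0x80  /* current DCD state */

/* Non-zero if the receiver reported an overrun. */
int lsr_overrun(uint8_t lsr)   { return (lsr & LSR_OE) != 0; }

/* Upper nibble: current modem line states. */
uint8_t msr_lines(uint8_t msr)  { return msr & 0xF0; }

/* Lower nibble: change flags latched since the previous MSR read. */
uint8_t msr_deltas(uint8_t msr) { return msr & 0x0F; }
```

For the values reported here, LSR = 0x02 has only the overrun bit set, and MSR = 0xFB shows all four line-state bits set together with the change flags for CTS, DSR and DCD. What that combination means for this particular card is not established in the thread.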
On another note, the RS422 signal levels on the 2nd PC interfacing to our
realtime simulator PC are a bit low. I will try to get another RS422
converter for the 2nd PC and repeat the tests to see if that is influencing
it adversely. For comparison, I did measure the signals when the realtime
simulator was interfaced to the actual hardware (autopilot), and the levels
were acceptable.
> Jan
>
Thanx for your help thus far
Vicki
* Re: [Xenomai-help] rtserial interface stalls
From: vikesh rambaran @ 2009-04-06 9:30 UTC
To: Jan Kiszka; +Cc: xenomai-help
Hi there
On Mon, Mar 23, 2009 at 10:22 AM, vikesh rambaran <vikesh.rambaran@domain.hid> wrote:
>
> Some feedback,
>
> On Fri, Mar 20, 2009 at 1:46 PM, Jan Kiszka <jan.kiszka@domain.hid> wrote:
>
>> First of all, it would in fact be good to rule out issues of the old
>> kernel/ipipe combination w.r.t. IRQ handling by giving the latest versions a
>> try (2.6.28 + Xenomai 2.4.7). Then you may want to consider setting up a
>> tracer:
>>
>
> Tried the above combination and the problem still persists (:
>
> Although SMIs cannot be globally disabled on this platform, I wonder if
> they are having an effect after all. Latency and serial loopback tests ran
> acceptably though, with a worst case of < 85 us for a 100 us task.
>
> I have rewired the communications to bypass the PCI communications card
> and use the motherboard RS232 ports instead. Although task latencies
> increased by about 150 us, the communications were stable after an hour ...
> (Changed task 1 to 250 us periodic instead of 100 us.)
>
The tests with the motherboard serial ports, interfacing to the actual
hardware via RS422 converters, ran smoothly for more than 72 hours.
The problem seems to lie in interfacing to the PCI serial card.
>
>>
>> The ipipe function tracer (see Xenomai wiki) would make sense when you
>> can identify a failure very quickly (within a few 100 us or so) and
>> trigger a stop. That is required as the ipipe tracer works on the lowest
>> level (kernel functions) and quickly fills up its circular buffer with
>> new events.
>
>
> Next on the TODO list
>
>
>> The LTTng tracer provides a higher level view on the problem and could
>> perfectly run over a longer period. See related postings on this list
>> for details (I'm currently maintaining a 2.6.28 port for ipipe, see also
>> git.kiszka.org).
>>
>
> Started a new thread on getting LTTng up and running. This feature will
> definitely be worth putting the effort into!
>
>
>> Once you have a picture of what goes on in the kernel generally, you may
>> add ad-hoc instrumentations to driver or kernel (or we can discuss where
>> to add them) to find out what actually happens.
>>
>
LTTng trace captured. With my limited background, the following seems to
occur after the application has been running for a while:
1. The RT task requests a write of 85 bytes to rtser0. The call returns
   64 bytes (not all bytes could be written).
2. Subsequent write requests return -11 (EAGAIN).
3. Read calls on rtser0 function normally.
4. rtser1 reads and writes function normally.
5. Approximately one second prior to the write call returning 64, one of the
   xenomai/rtdm internal event signals stops (this event could be related to
   the write event).
Since the thread writes 85 bytes every 20 ms to rtser0, the 4k write buffer
would fill up in about one second if no data were removed. This ties in with
the event stopping one second earlier.
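The one-second estimate follows directly from the numbers in the trace. A minimal sketch, assuming "4k" means a 4096-byte transmit ring buffer and that nothing at all is drained once the transmitter stops; fill_time_s is a hypothetical helper for this illustration:

```c
/* Seconds until a transmit buffer of `buf_size` bytes is full when
   `msg_len` bytes are queued every `period_ms` milliseconds and no data
   is removed from the buffer. */
double fill_time_s(unsigned buf_size, unsigned msg_len, unsigned period_ms)
{
    double writes_until_full = (double)buf_size / msg_len;
    return writes_until_full * period_ms / 1000.0;
}
```

For fill_time_s(4096, 85, 20) this gives about 48 writes, or roughly 0.96 s, which matches the observation that the internal event stops about one second before the first short write.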
The number of interrupts generated by the card, even when 'operating
normally', is a bit high though (typically 100 us apart!).
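To put the 100 us spacing in perspective, it can be compared with the time one character occupies on the wire. A back-of-envelope sketch, assuming 8N1 framing at 115200 bit/s; chars_per_interval is a hypothetical helper, and the reading below is an interpretation rather than a diagnosis:

```c
/* Number of characters that pass over the line during `interval_us`
   microseconds at `baud` bit/s, assuming 10 bits per character (8N1). */
double chars_per_interval(double interval_us, unsigned baud)
{
    const double bits_per_char = 10.0;
    return interval_us * baud / (bits_per_char * 1e6);
}
```

chars_per_interval(100.0, 115200) is about 1.15, so interrupts 100 us apart would correspond to roughly one interrupt per character on a busy port. That is what a one-byte receive trigger level would produce, but both ports share IRQ16 here, so the trace alone does not settle which source dominates.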
I can mail the zipped trace data privately if required. It is approx.
20 Mbyte.
I will try to contact the manufacturer for a register-level datasheet to see
if there are any considerations that need to be taken into account
specifically for this card.
Thanx for the help and suggestions
Vicki
PS Fortunately, the onboard serial ports do give us a workable solution for
our simulation environment, but it would be good to understand why the
problem occurs.