From: Jan Kiszka <jan.kiszka@domain.hid>
To: Vikesh Rambaran <vikesh.rambaran@domain.hid>
Cc: xenomai-help <xenomai@xenomai.org>
Subject: Re: [Xenomai-help] rtserial interface stalls
Date: Fri, 20 Mar 2009 12:46:00 +0100
Message-ID: <49C381F8.9070107@domain.hid>
In-Reply-To: <1237546591.9844.115.camel@domain.hid>

Vikesh Rambaran wrote:
> Hi
> 
> We have the following setup : 
> 
> Hardware : Intel Core Duo, Quatech PCI 200/300 4 channel serial card

Do you see the same issue when using only one core?

> Software : Ubuntu 8.04, Linux 2.6.24, Xenomai 2.4.5.
> 
> Application : 3 tasks running at (Task1) 100 us (Task2) 20 ms (Task3) 1 s
> 
> Tasks 1 and 3 are 'empty' at the moment. Task 2 transmits and receives
> multiple messages on rtser0 and rtser1
> 
> Serial ports are configured as follows
> 
>   serialConfig.config_mask      = 0xFFFF;
>   serialConfig.baud_rate        = configPtr->BitRate;    //= 115200
>   serialConfig.parity           = configPtr->Parity;     //= None
>   serialConfig.data_bits        = configPtr->DataBits;   //= 8
>   serialConfig.stop_bits        = configPtr->StopBits;   //= 1
>   serialConfig.handshake        = configPtr->FlowControl;//= None 
>   serialConfig.fifo_depth       = RTSER_DEF_FIFO_DEPTH;
>   serialConfig.rx_timeout       = RTDM_TIMEOUT_NONE;//RTSER_DEF_TIMEOUT;
>   serialConfig.tx_timeout       = RTSER_DEF_TIMEOUT;
>   serialConfig.event_timeout    = RTSER_DEF_TIMEOUT;
>   serialConfig.timestamp_history= RTSER_RX_TIMESTAMP_HISTORY;
>   serialConfig.event_mask       = RTSER_DEF_EVENT_MASK;
> 
> 
> Application runs for hours/days with loop back on PCI card connector
> 
> 
> Problem
> -------
> 
> However, between two PCs or external test unit, task 2 stops after a
> few minutes. The other tasks are still active.
> 
> 
> /proc/xenomai/ while running
> 
> 
> seeker@domain.hid$ cat stat 
> CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
>   0  0      0          43778348   0     00500080   82.9  ROOT/0
>   1  0      0          0          0     00500080  100.0  ROOT/1
>   0  8284   1          280862     0     00300184    5.0  ASU_SYNC
>   0  8285   2          3946       0     00300184    0.2  HILS
>   0  8286   1          33         0     00300184    0.0  DEBUG
>   0  0      0          8429       0     00000000    4.1  IRQ16: rtser0
>   1  0      0          0          0     00000000    0.0  IRQ16: rtser0
>   0  0      0          96987      0     00000000    6.1  IRQ16: rtser1
>   1  0      0          0          0     00000000    0.0  IRQ16: rtser1
>   0  0      0          45948236   0     00000000    1.2  IRQ233: [timer]
>   1  0      0          2102218    0     00000000    0.0  IRQ233: [timer]
> seeker@domain.hid$ cat sched 
> CPU  PID    PRI      PERIOD     TIMEOUT    TIMEBASE  STAT       NAME
>   0  0       -1      0          0          master    R          ROOT/0
>   1  0       -1      0          0          master    R          ROOT/1
>   0  8284     3      100000     51111      master    D          ASU_SYNC
>   0  8285     2      20000000   13078512   master    D          HILS
>   0  8286     1      1000000000 553400299  master    D          DEBUG
> 
> 
> /proc/xenomai/ when task stalls
> 
> seeker@domain.hid$ cat sched 
> CPU  PID    PRI      PERIOD     TIMEOUT    TIMEBASE  STAT       NAME
>   0  0       -1      0          0          master    R          ROOT/0
>   1  0       -1      0          0          master    R          ROOT/1
>   0  6223     3      100000     67714      master    D          ASU_SYNC
>   0  6224     2      20000000   0          master    W          HILS
>   0  6225     1      1000000000 669570391  master    D          DEBUG
> 
> seeker@domain.hid$ cat stat 
> CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
>   0  0      0          37141696   0     00500080   85.1  ROOT/0
>   1  0      0          0          0     00500080  100.0  ROOT/1
>   0  6223   1          16318556   0     00300184    4.9  ASU_SYNC
>   0  6224   1          9506       0     00300182    0.0  HILS
>   0  6225   1          1756       0     00300184    0.0  DEBUG
>   0  0      0          20694      0     00000000    2.8  IRQ16: rtser0
>   1  0      0          0          0     00000000    0.0  IRQ16: rtser0
>   0  0      0          5471547    0     00000000    5.5  IRQ16: rtser1
>   1  0      0          0          0     00000000    0.0  IRQ16: rtser1
>   0  0      0          39087824   0     00000000    1.2  IRQ233: [timer]
>   1  0      0          1891732    0     00000000    0.0  IRQ233: [timer]
> 
> 
> Alternative tried (after a bit of debugging)
> -----------------
> 
> Changed  serialConfig.tx_timeout       = RTSER_DEF_TIMEOUT;
> to       serialConfig.tx_timeout       = RTDM_TIMEOUT_NONE;
> 
> Task 2 then runs continuously. Writing to rtser0 returns valid number of
> bytes written but no data appears on serial port pin. At the same time
> rtser1 functions normally.

Without feedback from the device about its tx queue state you may
quickly overload it this way (definitely if written bytes > fifo length).
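
To put a rough number on that overload risk: with the configuration quoted
above (115200 baud, 8 data bits, no parity, 1 stop bit) every byte takes 10
bit times on the wire (start + 8 data + stop), so the line can carry at most
about 230 bytes per 20 ms period. The helper below is only an illustrative
sketch; its name is made up and it is not part of Xenomai or the rtserial
driver. It ignores inter-byte gaps, so the real budget can only be lower.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Xenomai API): upper bound on the number of
 * bytes a UART can physically shift out within one task period. */
uint32_t max_bytes_per_period(uint32_t baud, uint32_t bits_per_frame,
                              uint32_t period_us)
{
    assert(bits_per_frame > 0);

    /* Whole bit times available in the period: baud [bit/s] * period [s]. */
    uint64_t bits = (uint64_t)baud * period_us / 1000000u;

    /* Only complete frames count as transmitted bytes. */
    return (uint32_t)(bits / bits_per_frame);
}
```

If the task queues more than this per cycle on one port, the backlog grows
every period regardless of which tx_timeout is configured.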

> 
> /proc/xenomai/stat shows rtser0 CSW incrementing. Disconnecting rtser0
> stops CSW from incrementing.
> 
> Looks like the rtser0 tx interrupt gets 'lost' somehow and never
> recovers. Restarting the application restores communication again, for a
> while ...
> 
> 
> Has anyone else experienced a similar situation ?

Not with the current versions. But there are many factors that may
influence the situation.

> 
> Any suggestions on how to trace this further, would be greatly
> appreciated.

First of all, it would in fact be good to rule out issues of the old
kernel/ipipe combination w.r.t. IRQ handling by giving the latest versions a
try (2.6.28 + Xenomai 2.4.7). Then you may want to consider setting up a
tracer:

The ipipe function tracer (see Xenomai wiki) would make sense when you
can identify a failure very quickly (within a few 100 us or so) and
trigger a stop. That is required as the ipipe tracer works on the lowest
level (kernel functions) and quickly fills up its circular buffer with
new events.
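
Why the quick stop matters can be seen in a toy model of a circular trace
buffer. This is not the ipipe tracer's code or API, just a minimal sketch
with made-up names and a deliberately tiny buffer: every new event
overwrites the oldest one until recording is frozen.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_SLOTS 8 /* illustrative; a real tracer holds far more entries */

struct trace_ring {
    uint32_t events[TRACE_SLOTS];
    size_t   next;   /* slot the next event will overwrite */
    size_t   count;  /* valid entries, capped at TRACE_SLOTS */
    int      frozen; /* once set, new events are dropped */
};

/* Record one event, overwriting the oldest entry when the ring is full. */
void trace_record(struct trace_ring *r, uint32_t ev)
{
    if (r->frozen)
        return;
    r->events[r->next] = ev;
    r->next = (r->next + 1) % TRACE_SLOTS;
    if (r->count < TRACE_SLOTS)
        r->count++;
}

/* Stop recording so the history leading up to this point is preserved. */
void trace_freeze(struct trace_ring *r)
{
    r->frozen = 1;
}

/* Oldest event still present; the caller must ensure r->count > 0. */
uint32_t trace_oldest(const struct trace_ring *r)
{
    size_t idx = (r->count < TRACE_SLOTS) ? 0 : r->next;
    return r->events[idx];
}
```

After 20 events in an 8-slot ring only the last 8 survive. If the freeze
comes later than one buffer's worth of events after the failure, the
interesting part is already gone.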

The LTTng tracer provides a higher-level view of the problem and can
easily run over a longer period. See related postings on this list
for details (I'm currently maintaining a 2.6.28 port for ipipe, see also
git.kiszka.org).

Once you have a picture of what goes on in the kernel generally, you may
add ad-hoc instrumentation to the driver or kernel (or we can discuss where
to add it) to find out what actually happens.
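
As a user-space complement, a small watchdog could poll /proc/xenomai/sched
and flag the stall so that a trace can be stopped or a log written. The
sketch below is based only on the two dumps quoted above, where the stalled
HILS task shows STAT "W" with TIMEOUT 0 instead of "D" with a running
timeout. Treating that pair as the stall signature is an assumption, and the
function name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if `line` (one row of /proc/xenomai/sched, in the column layout
 * CPU PID PRI PERIOD TIMEOUT TIMEBASE STAT NAME) describes task `name` in
 * state "W" with TIMEOUT 0; returns 0 otherwise, including for rows that do
 * not parse, such as the header row. */
int sched_row_looks_stalled(const char *line, const char *name)
{
    int cpu, pid, pri;
    unsigned long long period, timeout;
    char timebase[16], stat[8], task[32];

    if (sscanf(line, "%d %d %d %llu %llu %15s %7s %31s",
               &cpu, &pid, &pri, &period, &timeout,
               timebase, stat, task) != 8)
        return 0;

    return strcmp(task, name) == 0 &&
           strcmp(stat, "W") == 0 &&
           timeout == 0;
}
```

A caller would read the file line by line (for example with fgets) and pass
each row together with the task name, "HILS" in this setup.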

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

