From: Jan Kiszka <jan.kiszka@domain.hid>
To: Petr Cervenka <grugh@domain.hid>
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] Kernel panic: not syncing
Date: Mon, 21 Jul 2008 13:26:42 +0200
Message-ID: <48847272.3080605@domain.hid>
In-Reply-To: <200807211258.30164@domain.hid>

Petr Cervenka wrote:
> Jan Kiszka wrote:
>> We likely see some race that causes weird memory corruptions. Its
>> probability often increases when the code execution frequency rises.
>>
>> However, reducing the test case is very important now to reduce the
>> search domain for this issue. E.g. try to fake peripheral access as far
>> as possible, unloading the unused driver and only leaving the test
>> program behind that is executable on an arbitrary Xenomai installation
>> (maybe finally on one of my boxes...).
>>
> I'm not sure I will be able to reduce the software. It depends on hardware, and it is controlled from another Windows computer with a GUI and a control application. And checking whether the error is still there usually takes a couple of days.
> I ran a test over the last weekend (and nothing went wrong). But the /proc/xenomai/stat output is strange. It is probably a type-cast error, because 18446744071739514846 = 0xFFFFFFFF8A939FDE, and the correct value should perhaps be 0x000000008A939FDE = 2324930526.
> 
> CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
>   0  0      0          18446744071739514846 0     00500088   69.8  ROOT/0
>   1  0      0          18446744071675175740 0     00500080   23.2  ROOT/1
>   0  5299   0          351459     0     00300182    0.0  LOGGER_TASK_1804289383
>   0  5100   8          283613     0     00300186    0.0
>   0  5317   0          40591      0     00300182    0.0
>   0  5034   2          2330696    0     00300184    0.0  MAIN_TASK_2056
>   0  5318   5          18446744071736105613 3     00300180   29.5  REG_TASK_2056
>   0  5319   28         36         0     00300182    0.0  WORK_TASK_2056
>   0  5321   38926      39159      0     00300380    0.0  CERECV_2056
>   0  5323   1159385    2438330    0     00300181    0.0  CESEND_2056
>   1  5710   0          18446744071675175740 0     00300184   76.8  HARDWARE_KERNEL
>   0  0      0          18446744071964064315 0     00000000    0.7  IRQ520: [timer]
>   1  0      0          232145209  0     00000000    0.0  IRQ520: [timer] 

OK, at least this bug is a bit easier to fix. Please try this patch
(which also takes the opportunity to extend the range of our stat
counters a bit):

Index: xenomai/include/nucleus/stat.h
===================================================================
--- xenomai/include/nucleus/stat.h	(Revision 4060)
+++ xenomai/include/nucleus/stat.h	(Arbeitskopie)
@@ -84,20 +84,20 @@ do { \
 
 
 typedef struct xnstat_counter {
-	int counter;
+	unsigned long counter;
 } xnstat_counter_t;
 
-static inline int xnstat_counter_inc(xnstat_counter_t *c)
+static inline unsigned long xnstat_counter_inc(xnstat_counter_t *c)
 {
 	return c->counter++;
 }
 
-static inline int xnstat_counter_get(xnstat_counter_t *c)
+static inline unsigned long xnstat_counter_get(xnstat_counter_t *c)
 {
 	return c->counter;
 }
 
-static inline void xnstat_counter_set(xnstat_counter_t *c, int value)
+static inline void xnstat_counter_set(xnstat_counter_t *c, unsigned long value)
 {
 	c->counter = value;
 }
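
For illustration only (this is not part of the patch): a minimal C sketch
of why a wrapped signed 32-bit counter can show up as a huge number. It
assumes that the reporting path widens the old int counter to a 64-bit
unsigned value; the struct and function names below are hypothetical and
do not exist in Xenomai, and uint64_t stands in for unsigned long on a
64-bit (LP64) kernel.

```c
#include <stdint.h>

/* Hypothetical stand-in for the counter before the patch (32-bit int). */
struct old_counter { int32_t counter; };

/* Hypothetical stand-in for the counter after the patch
 * (unsigned long, which is 64 bits wide on an LP64 kernel). */
struct new_counter { uint64_t counter; };

/* Value a signed 32-bit counter would hold after `events` increments
 * from zero, assuming two's-complement wraparound. Computed through a
 * 64-bit intermediate so the sketch itself avoids signed overflow. */
static inline int32_t wrap_to_int32(uint64_t events)
{
	uint32_t low = (uint32_t)events;

	if (low > (uint32_t)INT32_MAX)
		return (int32_t)((int64_t)low - 4294967296LL);
	return (int32_t)low;
}

/* Assumption: the old value is widened to 64-bit unsigned for output.
 * A negative int is sign-extended, so the upper 32 bits become ones. */
static inline uint64_t old_counter_report(const struct old_counter *c)
{
	return (uint64_t)(int64_t)c->counter;
}

/* With an unsigned 64-bit counter there is nothing to sign-extend. */
static inline uint64_t new_counter_report(const struct new_counter *c)
{
	return c->counter;
}
```

Under these assumptions, 2324930526 context switches wrap the old counter
to -1970036770, which is reported as 18446744071739514846
(0xFFFFFFFF8A939FDE), the value in the ROOT/0 row above. The widened
counter reports 2324930526.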

> 
> My theory is that an occasional "longer" piece of work or system call usage in the real-time task corrupts the rest of the system (under some special circumstances).

Yes, some nasty memory corruption is probably the reason. And that is
always hard to track down, specifically if it happens very
unpredictably. Nevertheless, if the issue continues to bug you, you will
not get around reducing the test case and trying to increase its
occurrence probability.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

