All of lore.kernel.org
 help / color / mirror / Atom feed
From: Philippe Gerum <rpm@xenomai.org>
To: Saul <xeno@domain.hid>
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] Non-deterministic behaviour in primary mode
Date: Thu, 30 Mar 2006 23:00:28 +0200	[thread overview]
Message-ID: <442C46EC.6000605@domain.hid> (raw)
In-Reply-To: <1143694058.30209.257882137@domain.hid>

Saul wrote:
> Philippe Gerum wrote:
> 
>>It would be nice to known whether the latency test from the testsuite 
>>has correct latency figures, so that we could rule out any SMI issue.
> 
> 
> See my reply to Jan Kiszka. The max is 55us, but I've just noticed that
> starting xawtv using the bttv driver causes an overrun in the latency
> test, but that may be unrelated.
> 
> 
>>>Anyway, the erratic execution times come and go with the smallest
>>>changes in the program e.g. moving the delay into it's own function or
>>>creating functions for timing the delay loop, so I wouldn't be too
>>>surprised if the problem can't be reproduced.
>>>
>>
>>What /proc/xenomai/stat says during execution btw?
> 
> 
> Here's /proc/xenomai/stat sampled every four seconds:
> 
> CPU  PID    MSW        CSW        PF    STAT      NAME
>   0  0      0/0        42572094   0     00500080  ROOT
> 
> CPU  PID    MSW        CSW        PF    STAT      NAME
>   0  0      0/0        42572110   0     00500080  ROOT
>   0  28088  1/0        8          0     00380084  timed-task
>   0  28089  8/7        15         0     00300081  display-task
> 
> CPU  PID    MSW        CSW        PF    STAT      NAME
>   0  0      0/0        42572164   0     00500080  ROOT
>   0  28088  1/0        35         0     00380084  timed-task
>   0  28089  35/34      69         0     00300081  display-task
> 
> CPU  PID    MSW        CSW        PF    STAT      NAME
>   0  0      0/0        42572216   0     00500080  ROOT
>   0  28088  1/0        61         0     00380084  timed-task
>   0  28089  61/60      121        0     00300081  display-task
> 
> CPU  PID    MSW        CSW        PF    STAT      NAME
>   0  0      0/0        42572272   0     00500080  ROOT
>   0  28088  1/0        89         0     00380084  timed-task
>   0  28089  89/88      177        0     00300081  display-task
> 
> CPU  PID    MSW        CSW        PF    STAT      NAME
>   0  0      0/0        42572296   0     00500080  ROOT
> 
> 
>>>Also, I found that using rt_timer_spin() instead of a real busy loop
>>>does not have the same problem, which indicates that the simple busy
>>>loop is in fact being interrupted during execution.
>>>
>>
>>Just to rule this out, maybe you could try something along these lines:
>>
>>		iopl(3)
>>...
>>		asm("cli");
>>                  RTIME prev_tsc = rt_timer_tsc();
>>                  my_processing();
>>                  RTIME tsc = rt_timer_tsc();
>>		asm("sti");
> 
> 
> Case 1: the original data was the case of the busy loop in a function &
> interrupts enabled.
>   1:  40.017 ms
>   2:  40.017 ms
>   3:  53.357 ms
>   4:  53.356 ms
>   5:  40.017 ms
>   6:  40.020 ms
>   7:  53.355 ms
>   8:  40.020 ms
>   9:  40.017 ms
> 10:  53.356 ms
> 
> Case 2: as suggested, this is the case of having the busy loop in a
> function & interrupts disabled.
>   1:  53.316 ms
>   2:  53.316 ms
>   3:  53.316 ms
>   4:  53.316 ms
>   5:  53.316 ms
>   6:  106.632 ms
>   7:  53.316 ms
>   8:  106.632 ms
>   9:  35.544 ms
>  10:  39.987 ms
> 
> It appears to have made the situation worse?! Note:
> 35.544 = 39.987 * 8/9
> 53.316 = 39.987 * 12/9
> 106.632 = 39.987 * 24/9
> OK, sending me crazy enough without delving to numerological lows.
> Next I tried two more cases...
> 
> Case 3: busy loop inline & interrupts enabled.
>   1:  35.571 ms
>   2:  40.019 ms
>   3:  40.020 ms
>   4:  40.017 ms
>   5:  40.017 ms
>   6:  40.019 ms
>   7:  40.017 ms
>   8:  40.020 ms
>   9:  40.017 ms
>  10:  40.019 ms
> ...
>  55:  53.357 ms
> 
> Case 4: busy loop inline & interrupts disabled.
>   1:  39.987 ms
>   2:  39.987 ms
>   3:  39.987 ms
>   4:  39.987 ms
>   5:  39.987 ms
>   6:  39.987 ms
>   7:  39.987 ms
>   8:  39.987 ms
>   9:  39.987 ms
>  10:  39.987 ms
> 
> Oooooh, that last one looks good :) But even it has produced a 106.632ms
> measure.
> 
> 
>>Out of curiosity, I'd also be interested in looking at the assembly dump 
>>of my_process in the working and non-working cases.
> 
> 
> Well, from the above tests the problem appears to always be there, just
> to varying degrees - I no longer believe there is a working case on my
> PC. Disabling interrupts removes some noise from the signal, but the
> real problem persists. The large increase in computation time (many ms)
> appears to be quantized, made more visible by disabling interrupts. I
> guess placing the delay in a function is simply pushing the execution
> time into the next quantization level. The assembly all looks about the
> same between cases:
> 
> ----------------------------------
> Case 1: busy loop in a function & interrupts enabled.
> my_processing:
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    $80000000, %eax
>         .p2align 4,,15
> .L2:
>         decl    %eax
>         jne     .L2
>         popl    %ebp
>         ret
> ...snip...
>         call    rt_timer_tsc
>         movl    %eax, %ebx
>         movl    %edx, %esi
>         call    my_processing
>         call    rt_timer_tsc
> 
> ----------------------------------
> Case 2: busy loop in a function & interrupts disabled.
> my_processing:
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    $80000000, %eax
>         .p2align 4,,15
> .L2:
>         decl    %eax
>         jne     .L2
>         popl    %ebp
>         ret
> ...snip...
> #APP
>         cli
> #NO_APP
>         call    rt_timer_tsc
>         movl    %eax, %ebx
>         movl    %edx, %esi
>         call    my_processing
>         call    rt_timer_tsc
> #APP
>         sti
> #NO_APP
> 
> ----------------------------------
> Case 3: busy loop inline & interrupts enabled.
>         call    rt_timer_tsc
>         movl    %eax, %ebx
>         movl    %edx, %esi
>         movl    $80000000, %eax
>         .p2align 4,,15
> .L13:
>         decl    %eax
>         jne     .L13
>         call    rt_timer_tsc
> 
> ----------------------------------
> Case 4: busy loop inline & interrupts disabled.
> #APP
>         cli
> #NO_APP
>         call    rt_timer_tsc
>         movl    %eax, %ebx
>         movl    %edx, %esi
>         movl    $80000000, %eax
>         .p2align 4,,15
> .L13:
>         decl    %eax
>         jne     .L13
>         call    rt_timer_tsc
> #APP
>         sti
> #NO_APP
> 
> 
> Disabling USB legacy had no effect - was already using a driver that
> should disable it anyway.
> 
> Disabling ACPI caused Case 1&3 to look more alike, and Case 2&4 to
> look more alike - it's still remains that:
> 1) when interrupts are disabled the noise level reduces
> 2) when interrupts are disabled delays are introduced less often
> 3) with interrupts enabled or disabled, large quantized delays occur
> Point 2) could be due to point 1).
> 
> I have no idea what to try next but thanks for the advice. I think some
> progress was made :)
> 

Mm, this really looks like some firmware/BIOS cyclic activity, that 
would hit the busy loop more or less frequently, depending on the 
internal timing of the outer loop. This reminds me that some recent 
Intel chipset are known to prevent global SMI disabling actually, maybe 
this is the case here, so our work-around would be basically useless. 
Another usual suspect is NMI handling (which the Adeos layer does not 
pipeline but rather delivers immediately), but I guess that you did not 
activate the NMI watchdog anyway, so back to square #1, I guess.

-- 

Philippe.


  parent reply	other threads:[~2006-03-30 21:00 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-03-29  6:06 [Xenomai-help] Non-deterministic behaviour in primary mode Saul
2006-03-29 11:28 ` Jan Kiszka
2006-03-30  4:27   ` Saul
2006-03-29 12:27 ` Philippe Gerum
2006-03-30  4:47   ` Saul
2006-03-30 12:05     ` Gilles Chanteperdrix
2006-03-31  6:44       ` Saul
2006-03-30 21:00     ` Philippe Gerum [this message]
2006-03-31  7:43       ` Saul
2006-03-31  9:55         ` Philippe Gerum
2006-03-31 15:35           ` Gilles Chanteperdrix
2006-04-11  3:06             ` Saul
2006-04-11 12:08               ` Gilles Chanteperdrix
2006-04-12  0:56                 ` Saul
2006-04-12  7:49                   ` Jan Kiszka
2006-03-29 12:41 ` Philippe Gerum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=442C46EC.6000605@domain.hid \
    --to=rpm@xenomai.org \
    --cc=xeno@domain.hid \
    --cc=xenomai@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.