From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: "Łukasz Majewski" <lukma@nabladev.com>,
"Giulio Moro" <giulio@bela.io>, Xenomai <xenomai@lists.linux.dev>
Subject: Re: Unexpected switches to in-band
Date: Fri, 31 Oct 2025 19:11:23 +0100 [thread overview]
Message-ID: <87qzujaqlw.fsf@xenomai.org> (raw)
In-Reply-To: <875xbvc5a3.fsf@xenomai.org> (Philippe Gerum's message of "Fri, 31 Oct 2025 19:09:08 +0100")
Philippe Gerum <rpm@xenomai.org> writes:
> Jan Kiszka <jan.kiszka@siemens.com> writes:
>
>> On 31.10.25 17:30, Philippe Gerum wrote:
>>> Łukasz Majewski <lukma@nabladev.com> writes:
>>>
>>>> Hi Philippe,
>>>>
>>>>> Hi Łukasz,
>>>>>
>>>>> Łukasz Majewski <lukma@nabladev.com> writes:
>>>>>
>>>>>>>> Could you get me a trace which goes back up to two secs before
>>>>>>>> the time reported by evl about the inband switch of the timer
>>>>>>>> responder? e.g.
>>>>>>
>>>>>> Why is it 2 seconds? Is it related to latmus "refresh time"? (which
>>>>>> is 1s)
>>>>>>
>>>>>
>>>>> Picking this delay is purely heuristical. Such a long backlog is
>>>>> likely going to include the smoking gun, if any.
>>>>>
>>>>>>>>
>>>>>>>> [ 75.214527] EVL: timer-responder:743 switching in-band
>>>>>>>> [pid=745, excpt=14, user_pc=0x77d08492a5fe]
>>>>>>>>
>>>>>>>> In the above case, I would need a trace going back to 73.214527
>>>>>>>> or so.
>>>>>>>
>>>>>>> Apparently the ftrace buffer was too small. I will adjust it to
>>>>>>> store more data and share results.
>>>>>>
>>>>>> It is also interesting, that:
>>>>>>
>>>>>> bash-5.2# echo 16384 > /sys/kernel/debug/tracing/buffer_size_kb
>>>>>> bash-5.2# cat /sys/kernel/debug/tracing/buffer_size_kb
>>>>>> 16387
>>>>>> bash-5.2# evl trace -eirq -f
>>>>>> tracing enabled
>>>>>> bash-5.2# cat /sys/kernel/debug/tracing/buffer_size_kb
>>>>>> 131
>>>>>>
>>>>>> Reduces the buffer size...
>>>>>>
>>>>>
>>>>> Yeah, the evl-trace script is not that smart. I can see the issue
>>>>> causing this, will fix.
>>>>>
>>>>>> Hence, I've used the "normal" ftrace with enabled evl functions -
>>>>>> the
>>>>>
>>>>> FWIW, EVL relies on the normal ftrace. However, evl-trace fiddles with
>>>>> the sysfs entries manually, and may be a bit naive at times. As Jan
>>>>> pointed out long ago, relying on the trace-cmd front-end would be a
>>>>> better approach than tweaking ftrace's sysfs-based controls manually
>>>>> at the very least. evl-trace adds EVL semantics to the tracing
>>>>> configuration which is handy, but the way it is done today may
>>>>> deserve a revamping, or maybe drop it entirely in favor of trace-cmd
>>>>> if we have enough flexibility there in order to introduce some EVL
>>>>> specifics.
>>>>>
>>>>>> detailed information about the ftrace config is with log files:
>>>>>> https://nextcloud.swupdate.org/index.php/s/czBd3Ydq9B9noZ7
>>>>>>
>>>>>>
>>>>>> To extract the cpu2 specific data:
>>>>>> bash-5.2# grep -rnE '\[002\]' Xenomai4-ISW-trace-6.6-logs.txt >
>>>>>> cpu2.txt
>>>>>>
>>>>>> Here again the ISW happens in the cpu2 execution "hole":
>>>>>> [ 128.466459] EVL: timer-responder:736 switching in-band [pid=738,
>>>>>> excpt=14, user_pc=0x7588e71985fe]
>>>>>>
>>>>>> Happens for cpu2 between:
>>>>>>
>>>>>> 11832546: <idle>-0 [002] *..2. 128.466438:
>>>>>> sched_idle_set_state <-cpuidle_enter_state
>>>>>> 11832547: <idle>-0 [002] *..2. 128.466468:
>>>>>> handle_irq_pipelined_prepare <-arch_pipeline_entry
>>>>>>
>>>>>> However, now the log is large enough to inspect what has happened in
>>>>>> the system for ~4s before.
>>>>>>
>>>>>>
>>>>>> With the logs:
>>>>>> timer-responder-738 [002] *.~2. 128.466214: evl_timer_shot:
>>>>>> [proxy-timer/2] at 128.466467 (delay: 253 us, 66918 cycles)
>>>>>>
>>>>>> and after ISW:
>>>>>>
>>>>>> <idle>-0 [002] *.~1. 128.466469: evl_timer_shot:
>>>>>> latmus_pulse_handler at 128.467208 (delay: 739 us, 195206 cycles)
>>>>>>
>>>>>> Why there is such increase in "delay" (253 vs 739 us) ?
>>>>>>
>>>>>> It looks like we have only 2us delay as the timer has fired on
>>>>>> 128.466469 vs. expected 128.466467.
>>>>>>
>>>>>
>>>>> Ok, so the issue is still the same: for some reason, the
>>>>> timer-responder thread is taking a (minor) fault when accessing some
>>>>> memory, on its way back from the ioctl(EVL_LATIOC_PULSE) syscall, or
>>>>> right after. This seems to be valid memory though, just not
>>>>> immediately accessible, otherwise we'd have received SEGV, which we
>>>>> did not.
>>>>
>>>> Yes, I also think so - like some "deferred allocation" despite of
>>>> mlockall.
>>>>
>>>> Please find the logs with the below patch applied:
>>>> https://nextcloud.swupdate.org/index.php/s/JXNW5bJrB3fQo5E
>>>>
>>>
>>> Thanks. I need to analyze the fault flags next week to see if I can make
>>> something of these, but I just stumbled on an obviously significant fact
>>> when looking at the multi-CPU view: the memory compaction daemon is
>>> running asynchronously on other cores while the responder is taking the
>>> fault. Compaction is all about moving memory around to free larger
>>> physical regions, so updating PTEs will happen as a result.
>>>
>>> Moreover, I could not reproduce the issue at hand, but I did not have
>>> CONFIG_COMPACTION enabled. In the meantime, could you try disabling this
>>> on your end, to see if the fault still appears?
>>
>> CONFIG_COMPACTION selects CONFIG_MIGRATION, and that is well-known to be
>> incompatible with RT. Xenomai 3 is warning about that that, EVL does not?
>>
>
> Dovetail should do that when CONFIG_DOVETAIL is set, this is by
> definition common to any rt core.
Also, it would be much better to find a way to exclude any mm bearing
the MMF_DOVETAIL flag from the movable set, instead of excluding
everything that sets CONFIG_COMPACTION.
--
Philippe.
next prev parent reply other threads:[~2025-10-31 18:25 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-09 4:33 Unexpected switches to in-band Giulio Moro
2025-10-09 13:17 ` Łukasz Majewski
2025-10-09 19:05 ` Giulio Moro
2025-10-10 10:24 ` Łukasz Majewski
2025-10-10 12:21 ` Giulio Moro
2025-10-10 13:08 ` Łukasz Majewski
2025-10-11 4:25 ` Giulio Moro
2025-10-11 15:55 ` Philippe Gerum
2025-10-11 16:10 ` Philippe Gerum
2025-10-11 16:47 ` Giulio Moro
2025-10-11 16:56 ` Philippe Gerum
2025-10-11 17:15 ` Philippe Gerum
2025-10-11 19:46 ` Giulio Moro
2025-10-12 8:54 ` Philippe Gerum
2025-10-12 14:44 ` Philippe Gerum
2025-10-20 7:47 ` Łukasz Majewski
2025-10-20 12:46 ` Giulio Moro
2025-10-20 14:01 ` Philippe Gerum
2025-10-21 11:13 ` Łukasz Majewski
2025-10-23 13:54 ` Łukasz Majewski
2025-10-26 20:04 ` Philippe Gerum
2025-10-27 11:05 ` Łukasz Majewski
2025-10-27 11:35 ` Philippe Gerum
2025-10-27 12:54 ` Łukasz Majewski
2025-10-27 16:25 ` Łukasz Majewski
2025-10-27 18:16 ` Giulio Moro
2025-10-27 22:42 ` Giulio Moro
2025-10-29 9:18 ` Philippe Gerum
2025-10-29 13:51 ` Łukasz Majewski
2025-10-30 12:26 ` Łukasz Majewski
2025-10-30 16:17 ` Philippe Gerum
2025-10-31 15:56 ` Łukasz Majewski
2025-10-31 16:30 ` Philippe Gerum
2025-10-31 17:34 ` Jan Kiszka
2025-10-31 18:09 ` Philippe Gerum
2025-10-31 18:11 ` Philippe Gerum [this message]
2025-11-01 11:32 ` Łukasz Majewski
2025-11-03 7:57 ` Florian Bezdeka
2025-11-03 9:29 ` Jan Kiszka
2025-11-01 11:31 ` Łukasz Majewski
2025-10-31 18:13 ` Philippe Gerum
2025-11-01 15:59 ` Łukasz Majewski
2025-11-01 16:33 ` Giulio Moro
2025-11-03 14:06 ` Philippe Gerum
2025-11-04 7:53 ` Łukasz Majewski
2025-11-04 8:19 ` Philippe Gerum
2025-11-03 14:00 ` Philippe Gerum
2025-10-30 16:26 ` Philippe Gerum
2025-10-11 17:43 ` Philippe Gerum
2025-10-11 15:37 ` Philippe Gerum
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87qzujaqlw.fsf@xenomai.org \
--to=rpm@xenomai.org \
--cc=giulio@bela.io \
--cc=jan.kiszka@siemens.com \
--cc=lukma@nabladev.com \
--cc=xenomai@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.