All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wolfgang Netbal <wolfgang.netbal@sigmatek.at>
To: gilles.chanteperdrix@xenomai.org
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai] Performance impact after switching from 2.6.2.1 to 2.6.4
Date: Tue, 28 Jun 2016 12:31:42 +0200	[thread overview]
Message-ID: <5772520E.8010407@sigmatek.at> (raw)
In-Reply-To: <20160628101957.GO18662@hermes.click-hack.org>



Am 2016-06-28 um 12:19 schrieb Gilles Chanteperdrix:
> On Tue, Jun 28, 2016 at 12:10:06PM +0200, Wolfgang Netbal wrote:
>>
>> Am 2016-06-28 um 11:55 schrieb Gilles Chanteperdrix:
>>> On Tue, Jun 28, 2016 at 11:51:39AM +0200, Wolfgang Netbal wrote:
>>>> Am 2016-06-28 um 11:29 schrieb Gilles Chanteperdrix:
>>>>> On Tue, Jun 28, 2016 at 11:28:19AM +0200, Wolfgang Netbal wrote:
>>>>>> Am 2016-06-28 um 11:17 schrieb Gilles Chanteperdrix:
>>>>>>> On Tue, Jun 28, 2016 at 11:15:14AM +0200, Wolfgang Netbal wrote:
>>>>>>>> Am 2016-06-28 um 10:34 schrieb Gilles Chanteperdrix:
>>>>>>>>> On Tue, Jun 28, 2016 at 10:31:00AM +0200, Wolfgang Netbal wrote:
>>>>>>>>>> Am 2016-06-27 um 18:46 schrieb Gilles Chanteperdrix:
>>>>>>>>>>> On Mon, Jun 27, 2016 at 05:55:12PM +0200, Wolfgang Netbal wrote:
>>>>>>>>>>>> Am 2016-06-07 um 19:00 schrieb Gilles Chanteperdrix:
>>>>>>>>>>>>> On Tue, Jun 07, 2016 at 04:13:07PM +0200, Wolfgang Netbal wrote:
>>>>>>>>>>>>>> Am 2016-06-06 um 17:35 schrieb Gilles Chanteperdrix:
>>>>>>>>>>>>>>> On Mon, Jun 06, 2016 at 09:03:40AM +0200, Wolfgang Netbal wrote:
>>>>>>>>>>>>>>>> Am 2016-06-02 um 10:23 schrieb Gilles Chanteperdrix:
>>>>>>>>>>>>>>>>> On Thu, Jun 02, 2016 at 10:15:41AM +0200, Wolfgang Netbal wrote:
>>>>>>>>>>>>>>>>>> Am 2016-06-01 um 16:12 schrieb Gilles Chanteperdrix:
>>>>>>>>>>>>>>>>>>> On Wed, Jun 01, 2016 at 03:52:06PM +0200, Wolfgang Netbal wrote:
>>>>>>>>>>>>>>>>>>>> Am 2016-05-31 um 16:16 schrieb Gilles Chanteperdrix:
>>>>>>>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 04:09:07PM +0200, Wolfgang Netbal wrote:
>>>>>>>>>>>>>>>>>>>>>> Dear all,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> we have moved our application from "XENOMAI 2.6.2.1 + Linux 3.0.43" to
>>>>>>>>>>>>>>>>>>>>>> "XENOMAI 2.6.4. + Linux 3.10.53". Our target is an i.MX6DL. The system
>>>>>>>>>>>>>>>>>>>>>> is now up and running and works stable. Unfortunately we see a
>>>>>>>>>>>>>>>>>>>>>> difference in the performance. Our old combination (XENOMAI 2.6.2.1 +
>>>>>>>>>>>>>>>>>>>>>> Linux 3.0.43) was slightly faster.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> At the moment it looks like that XENOMAI 2.6.4 calls
>>>>>>>>>>>>>>>>>>>>>> xnpod_schedule_handler much more often then XENOMAI 2.6.2.1 in our old
>>>>>>>>>>>>>>>>>>>>>> system.  Every call of xnpod_schedule_handler interrupts our main
>>>>>>>>>>>>>>>>>>>>>> XENOMAI task with priority = 95.
>>>>>>>>>>>>>> As I wrote above, I get interrupts 1037 handled by rthal_apc_handler()
>>>>>>>>>>>>>> and 1038 handled by xnpod_schedule_handler() while my realtime task
>>>>>>>>>>>>>> is running on kernel 3.10.53 with Xenomai 2.6.4.
>>>>>>>>>>>>>> On kernel 3.0.43 with Xenomai 2.6.4 there are no interrupts, except the
>>>>>>>>>>>>>> once that are send by my board using GPIOs, but this virtual interrupts
>>>>>>>>>>>>>> are assigned to Xenomai and Linux as well but I didn't see a handler
>>>>>>>>>>>>>> installed.
>>>>>>>>>>>>>> I'm pretty sure that these interrupts are slowing down my system, but
>>>>>>>>>>>>>> where do they come from ?
>>>>>>>>>>>>>> why didn't I see them on Kernel 3.0.43 with Xenomai 2.6.4 ?
>>>>>>>>>>>>>> how long do they need to process ?
>>>>>>>>>>>>> How do you mean you do not see them? If you are talking about the
>>>>>>>>>>>>> rescheduling API, it used no to be bound to a virq (so, it would
>>>>>>>>>>>>> have a different irq number on cortex A9, something between 0 and 31
>>>>>>>>>>>>> that would not show in the usual /proc files), I wonder if 3.0 is
>>>>>>>>>>>>> before or after that. You do not see them in /proc, or you see them
>>>>>>>>>>>>> and their count does not increase?
>>>>>>>>>>>> Sorry for the long delay, we ran a lot of tests to find out what could
>>>>>>>>>>>> be the reason for
>>>>>>>>>>>> the performance difference.
>>>>>>>>>>>>
>>>>>>>>>>>> If I call cat /proc/ipipe/Xenomai I dont see the IRQ handler assigned to
>>>>>>>>>>>> the virtual
>>>>>>>>>>>> IRQ on Kernel 3.0.43, but it looks like thats an issue of the Kernel
>>>>>>>>>>>>> As for where they come from, this is not a mystery, the reschedule
>>>>>>>>>>>>> IPI is triggered when code on one cpu changes the scheduler state
>>>>>>>>>>>>> (wakes up a thread for instance) on another cpu. If you want to
>>>>>>>>>>>>> avoid it, do not do that. That means, do not share mutex between
>>>>>>>>>>>>> threads running on different cpus, pay attention for timers to be
>>>>>>>>>>>>> running on the same cpu as the thread they signal, etc...
>>>>>>>>>>>>>
>>>>>>>>>>>>> The APC virq is used to multiplex several services, which you can
>>>>>>>>>>>>> find by grepping the sources for rthal_apc_alloc:
>>>>>>>>>>>>> ./ksrc/skins/posix/apc.c:       pse51_lostage_apc = rthal_apc_alloc("pse51_lostage_handler",
>>>>>>>>>>>>> ./ksrc/skins/rtdm/device.c:     rtdm_apc = rthal_apc_alloc("deferred RTDM close", rtdm_apc_handler,
>>>>>>>>>>>>> ./ksrc/nucleus/registry.c:          rthal_apc_alloc("registry_export", &registry_proc_schedule, NULL);
>>>>>>>>>>>>> ./ksrc/nucleus/pipe.c:      rthal_apc_alloc("pipe_wakeup", &xnpipe_wakeup_proc, NULL);
>>>>>>>>>>>>> ./ksrc/nucleus/shadow.c:            rthal_apc_alloc("lostage_handler", &lostage_handler, NULL);
>>>>>>>>>>>>> ./ksrc/nucleus/select.c:        xnselect_apc = rthal_apc_alloc("xnselectors_destroy",
>>>>>>>>>>>>>
>>>>>>>>>>>>> It would be interesting to know which of these services is triggered
>>>>>>>>>>>>> a lot. One possibility I see would be root thread priority
>>>>>>>>>>>>> inheritance, so it would be caused by mode switches. This brings the
>>>>>>>>>>>>> question: do your application have threads migrating between primary
>>>>>>>>>>>>> and secondary mode, do you see the count of mode switches increase
>>>>>>>>>>>>> with the kernel changes, do you have root thread priority
>>>>>>>>>>>>> inheritance enabled?
>>>>>>>>>>>>>
>>>>>>>>>>>> Here a short sum up of our tests and the results and at the end a few
>>>>>>>>>>>> questions :-)
>>>>>>>>>>>>
>>>>>>>>>>>> we are using a Freescale imx6dl on our hardware and upgraded our operating system from
>>>>>>>>>>>> Freescale Kernel 3.0.43 with Xenomai 2.6.2.1 and U-Boot 2013.04 as compiler we use GCC 4.7.2
>>>>>>>>>>>> Freescale Kernel 3.10.53 with Xenomai 2.6.4 and U-Boot 2016.01 as compiler we use GCC 4.8.2
>>>>>>>>>>>> On both Kernels the CONFIG_SMP is set.
>>>>>>>>>>>>
>>>>>>>>>>>> What we see is that when we running a customer project in a Xenomai task with priority 95
>>>>>>>>>>>> tooks 40% of the CPU time on Kernel 3.0.43
>>>>>>>>>>>> and 47% of CPU time on Kernel 3.10.53
>>>>>>>>>>>>
>>>>>>>>>>>> so the new system is slower by 7% if we sum up this to 100% CPU load we have a difference of 15%
>>>>>>>>>>>> To find out what is the reason for this difference we ran the following test.
>>>>>>>>>>>> We tried to get the new system faster by change some components of the system.
>>>>>>>>>>>>
>>>>>>>>>>>> -Changing U-Boot on new system                -> still 7% slower
>>>>>>>>>>>> -Copy Kernel 3.0.43 to new system            -> still 7% slower
>>>>>>>>>>>> -Creating Kernel 3.0.43 with
>>>>>>>>>>>>            Xenomai 2.6.4 and copy it to new system    -> still 7% slower
>>>>>>>>>>>> -Compiling the new system with
>>>>>>>>>>>>            old GCC version                                        -> still 7% slower
>>>>>>>>>>>> -We also checked the settings for RAM and CPU clock -> these are equal
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like that is not one of the big components,
>>>>>>>>>>>> so we started to test some special functions like rt_timer_tsc()
>>>>>>>>>>>> In the following example we stay for 800µs in the while loop and
>>>>>>>>>>>> start this loop again after 200µs delay.
>>>>>>>>>>>> The task application running this code has priotity 95.
>>>>>>>>>>>>
>>>>>>>>>>>> Here a simplified code snipped
>>>>>>>>>>>> start = rt_timer_tsc();
>>>>>>>>>>>> do
>>>>>>>>>>>> {
>>>>>>>>>>>> 	current = rt_timer_tsc();
>>>>>>>>>>>> 	i++;	
>>>>>>>>>>>> } while((current - start) < 800)
>>>>>>>>>>> If your CPU is running at 1 GHz and uses the global timer as clock
>>>>>>>>>>> source, the clock source runs at 500 MHz, so 800 ticks of the tsc is
>>>>>>>>>>> something around 1.6 us
>>>>>>>>>> Sorry I simplified the code snippet a little bit to much.
>>>>>>>>>> Thats the correct code.
>>>>>>>>>>
>>>>>>>>>> current = rt_timer_tsc2ns(rt_timer_tsc());
>>>>>>>>>>
>>>>>>>>>>> So, I do not really understand what you are talking about. But are
>>>>>>>>>>> you sure the two kernels use the same clocksource for xenomai?
>>>>>>>>>>>
>>>>>>>>>>> Could you show us the result of "dmesg | grep I-pipe" with the two
>>>>>>>>>>> kernels ?
>>>>>>>>>> Output of Kernel 3.10.53 with Xenomai 2.6.4
>>>>>>>>>> I-pipe, 3.000 MHz clocksource
>>>>>>>>>> I-pipe, 396.000 MHz clocksource
>>>>>>>>>> I-pipe, 396.000 MHz timer
>>>>>>>>>> I-pipe, 396.000 MHz timer
>>>>>>>>>> I-pipe: head domain Xenomai registered.
>>>>>>>>>>
>>>>>>>>>> Output of Kernel 3.0.43 with Xenomai 2.6.2.1
>>>>>>>>>> [    0.000000] I-pipe 1.18-13: pipeline enabled.
>>>>>>>>>> [    0.331999] I-pipe, 396.000 MHz timer
>>>>>>>>>> [    0.335720] I-pipe, 396.000 MHz clocksource
>>>>>>>>>> [    0.844016] I-pipe: Domain Xenomai registered.
>>>>>>>>>>
>>>>>>>>>> The controller is a imx6dl, this controller can run maximum 800MHz
>>>>>>>>> Ok, so the new kernel registers two tsc emulations, could you run
>>>>>>>>> the "tsc" regression test to measure the tsc latency? The two tsc
>>>>>>>>> emulations have very different latencies, so the result would be
>>>>>>>>> unmistakable.
>>>>>>>>>
>>>>>>>> Output of Kernel 3.10.53 with Xenomai 2.6.4
>>>>>>>> /usr/xenomai/bin/latency
>>>>>>> This test is named "latency", not "tsc". As the different names
>>>>>>> imply, they are not measuring the same thing.
>>>>>>>
>>>>>> Sorry for the stupied question,
>>>>>> but where do I find the "tsc" test, because in folder /usr/xenomai/bin/
>>>>>> is it not located
>>>>> You have millions of files on your target? Or you compiled busybox
>>>>> without support for the "find" utility ?
>>>>>
>>>> Sorry I searched for tsc befor but didn't find any executable called "tsc"
>>>> what I find is /usr/share/ghostscript and its config files and
>>>> /usr/bin/xtscal.
>>>>
>>>> I find in the xenomai sources the file tsc.c, could you please tell me
>>>> what option do I have to enable that this will be build ?
>>> There is no option that can disable its compilation/installation as
>>> far as I know.
>>>
>>> It is normally built and installed under
>>> @XENO_TEST_DIR@/regression/native
>>>
>>> So the installation directory depends on the value you passed to
>>> configure --with-testdir option (the default being /usr/bin).
>>>
>> Thanks a lot I found it and downloaded it to my target
>>
>> Here are the output for Kernel 3.10.53 and Xenomai 2.6.4
>>
>> #> ./tsc
>> Checking tsc for 1 minute(s)
>> min: 10, max: 596, avg: 10.5056
>> min: 10, max: 595, avg: 10.5053
>> min: 10, max: 603, avg: 10.5053
>> min: 10, max: 630, avg: 10.5052
>> min: 10, max: 600, avg: 10.505
>> min: 10, max: 595, avg: 10.5056
>> min: 10, max: 562, avg: 10.505
>> min: 10, max: 605, avg: 10.5056
>> min: 10, max: 602, avg: 10.5055
>> min: 10, max: 595, avg: 10.5052
> Could you let the test run until the end ? (1 minute is not so long).
>
Sorry I dont want to overload the mail

Here are the output for Kernel 3.10.53 and Xenomai 2.6.4


#> ./tsc
Checking tsc for 1 minute(s)
min: 10, max: 595, avg: 10.5055
min: 10, max: 596, avg: 10.5048
min: 10, max: 595, avg: 10.5048
min: 10, max: 563, avg: 10.5047
min: 10, max: 594, avg: 10.505
min: 10, max: 652, avg: 10.5047
min: 10, max: 604, avg: 10.505
min: 10, max: 595, avg: 10.5045
min: 10, max: 595, avg: 10.5046
min: 10, max: 594, avg: 10.5046
min: 10, max: 582, avg: 10.5045
min: 10, max: 599, avg: 10.5046
min: 10, max: 379, avg: 10.5046
min: 10, max: 571, avg: 10.5046
min: 10, max: 534, avg: 10.5051
min: 10, max: 591, avg: 10.5046
min: 10, max: 595, avg: 10.5048
min: 10, max: 630, avg: 10.5049
min: 10, max: 534, avg: 10.5047
min: 10, max: 564, avg: 10.5048
min: 10, max: 520, avg: 10.5049
min: 10, max: 255, avg: 10.5047
min: 10, max: 594, avg: 10.5047
min: 10, max: 677, avg: 10.505
min: 10, max: 128, avg: 10.5053
min: 10, max: 530, avg: 10.5046
min: 10, max: 595, avg: 10.5047
min: 10, max: 595, avg: 10.5048
min: 10, max: 121, avg: 10.5044
min: 10, max: 595, avg: 10.5053
min: 10, max: 598, avg: 10.505
min: 10, max: 595, avg: 10.505
min: 10, max: 595, avg: 10.505
min: 10, max: 598, avg: 10.5047
min: 10, max: 143, avg: 10.5058
min: 10, max: 125, avg: 10.505
min: 10, max: 202, avg: 10.5047
min: 10, max: 595, avg: 10.5047
min: 10, max: 295, avg: 10.5049
min: 10, max: 156, avg: 10.5047
min: 10, max: 546, avg: 10.5049
min: 10, max: 547, avg: 10.5048
min: 10, max: 153, avg: 10.5048
min: 10, max: 116, avg: 10.5046
min: 10, max: 591, avg: 10.5052
min: 10, max: 594, avg: 10.5046
min: 10, max: 563, avg: 10.5048
min: 10, max: 101, avg: 10.5046
min: 10, max: 117, avg: 10.5048
min: 10, max: 608, avg: 10.505
min: 10, max: 593, avg: 10.5048
min: 10, max: 617, avg: 10.5047
min: 10, max: 596, avg: 10.505
min: 10, max: 529, avg: 10.5048
min: 10, max: 557, avg: 10.5052
min: 10, max: 104, avg: 10.5047
min: 10, max: 595, avg: 10.5048
min: 10, max: 595, avg: 10.5046
min: 10, max: 115, avg: 10.5045
min: 10, max: 196, avg: 10.5046
min: 10, max: 677, avg: 10.5048 -> 0.0265273 us

Here are the output for Kernel 3.0.43 and Xenomai 2.6.2.1

#> ./tsc
Checking tsc for 1 minute(s)
min: 10, max: 141, avg: 11.5371
min: 10, max: 181, avg: 11.5377
min: 10, max: 512, avg: 11.5444
min: 10, max: 93, avg: 11.5335
min: 10, max: 578, avg: 11.5581
min: 10, max: 81, avg: 11.5838
min: 10, max: 290, avg: 11.5455
min: 10, max: 105, avg: 11.5441
min: 10, max: 543, avg: 11.5648
min: 10, max: 111, avg: 11.5602
min: 10, max: 96, avg: 11.5607
min: 10, max: 77, avg: 11.5758
min: 10, max: 122, avg: 11.5601
min: 10, max: 78, avg: 11.5494
min: 10, max: 142, avg: 11.581
min: 10, max: 95, avg: 11.5513
min: 10, max: 142, avg: 11.5825
min: 10, max: 580, avg: 11.5829
min: 10, max: 104, avg: 11.5656
min: 10, max: 77, avg: 11.5833
min: 10, max: 667, avg: 11.5623
min: 10, max: 91, avg: 11.5742
min: 10, max: 656, avg: 11.5742
min: 10, max: 78, avg: 11.5866
min: 10, max: 428, avg: 11.5958
min: 10, max: 77, avg: 11.5945
min: 10, max: 81, avg: 11.5948
min: 10, max: 96, avg: 11.5787
min: 10, max: 85, avg: 11.5975
min: 10, max: 122, avg: 11.6049
min: 10, max: 516, avg: 11.5776
min: 10, max: 95, avg: 11.597
min: 10, max: 550, avg: 11.5868
min: 10, max: 147, avg: 11.6125
min: 10, max: 582, avg: 11.611
min: 10, max: 541, avg: 11.6105
min: 10, max: 154, avg: 11.5942
min: 10, max: 94, avg: 11.6217
min: 10, max: 74, avg: 11.6066
min: 10, max: 91, avg: 11.653
min: 10, max: 539, avg: 11.6248
min: 10, max: 166, avg: 11.5915
min: 10, max: 81, avg: 11.6156
min: 10, max: 111, avg: 11.6099
min: 10, max: 578, avg: 11.6348
min: 10, max: 95, avg: 11.6329
min: 10, max: 640, avg: 11.629
min: 10, max: 597, avg: 11.6336
min: 10, max: 95, avg: 11.615
min: 10, max: 555, avg: 11.5757
min: 10, max: 94, avg: 11.5749
min: 10, max: 530, avg: 11.5542
min: 10, max: 82, avg: 11.5346
min: 10, max: 119, avg: 11.5466
min: 10, max: 94, avg: 11.5169
min: 10, max: 173, avg: 11.5044
min: 10, max: 147, avg: 11.5309
min: 10, max: 196, avg: 11.4833
min: 10, max: 542, avg: 11.4979
min: 10, max: 95, avg: 11.4895
min: 10, max: 667, avg: 11.5755 -> 0.029231 us




  reply	other threads:[~2016-06-28 10:31 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-31 14:09 [Xenomai] Performance impact after switching from 2.6.2.1 to 2.6.4 Wolfgang Netbal
2016-05-31 14:16 ` Gilles Chanteperdrix
2016-06-01 13:52   ` Wolfgang Netbal
2016-06-01 14:12     ` Gilles Chanteperdrix
2016-06-02  8:15       ` Wolfgang Netbal
2016-06-02  8:23         ` Gilles Chanteperdrix
2016-06-06  7:03           ` Wolfgang Netbal
2016-06-06 15:35             ` Gilles Chanteperdrix
2016-06-07 14:13               ` Wolfgang Netbal
2016-06-07 17:00                 ` Gilles Chanteperdrix
2016-06-27 15:55                   ` Wolfgang Netbal
2016-06-27 16:00                     ` Gilles Chanteperdrix
2016-06-28  8:08                       ` Wolfgang Netbal
2016-06-27 16:46                     ` Gilles Chanteperdrix
2016-06-28  8:31                       ` Wolfgang Netbal
2016-06-28  8:34                         ` Gilles Chanteperdrix
2016-06-28  9:15                           ` Wolfgang Netbal
2016-06-28  9:17                             ` Gilles Chanteperdrix
2016-06-28  9:28                               ` Wolfgang Netbal
2016-06-28  9:29                                 ` Gilles Chanteperdrix
2016-06-28  9:51                                   ` Wolfgang Netbal
2016-06-28  9:55                                     ` Gilles Chanteperdrix
2016-06-28 10:10                                       ` Wolfgang Netbal
2016-06-28 10:19                                         ` Gilles Chanteperdrix
2016-06-28 10:31                                           ` Wolfgang Netbal [this message]
2016-06-28 10:39                                             ` Gilles Chanteperdrix
2016-06-28 11:45                                               ` Wolfgang Netbal
2016-06-28 11:57                                                 ` Gilles Chanteperdrix
2016-06-28 11:55                                               ` Wolfgang Netbal
2016-06-28 12:01                                                 ` Gilles Chanteperdrix
2016-06-28 14:32                                                   ` Wolfgang Netbal
2016-06-28 14:42                                                     ` Gilles Chanteperdrix
2016-06-30  9:17                                                       ` Wolfgang Netbal
2016-06-30  9:39                                                         ` Gilles Chanteperdrix
2016-06-07 17:22                 ` Philippe Gerum
2016-05-31 15:08 ` Philippe Gerum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5772520E.8010407@sigmatek.at \
    --to=wolfgang.netbal@sigmatek.at \
    --cc=gilles.chanteperdrix@xenomai.org \
    --cc=xenomai@xenomai.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.