From: Wolfgang Netbal <wolfgang.netbal@sigmatek.at>
To: xenomai@xenomai.org
Subject: Re: [Xenomai] Performance impact after switching from 2.6.2.1 to 2.6.4
Date: Mon, 6 Jun 2016 09:03:40 +0200
Message-ID: <5755204C.6090701@sigmatek.at>
In-Reply-To: <20160602082318.GB1801@hermes.click-hack.org>
On 2016-06-02 at 10:23, Gilles Chanteperdrix wrote:
> On Thu, Jun 02, 2016 at 10:15:41AM +0200, Wolfgang Netbal wrote:
>>
>> On 2016-06-01 at 16:12, Gilles Chanteperdrix wrote:
>>> On Wed, Jun 01, 2016 at 03:52:06PM +0200, Wolfgang Netbal wrote:
>>>> On 2016-05-31 at 16:16, Gilles Chanteperdrix wrote:
>>>>> On Tue, May 31, 2016 at 04:09:07PM +0200, Wolfgang Netbal wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> we have moved our application from "XENOMAI 2.6.2.1 + Linux 3.0.43" to
>>>>>> "XENOMAI 2.6.4 + Linux 3.10.53". Our target is an i.MX6DL. The system
>>>>>> is now up and running and is stable. Unfortunately we see a
>>>>>> difference in performance. Our old combination (XENOMAI 2.6.2.1 +
>>>>>> Linux 3.0.43) was slightly faster.
>>>>>>
>>>>>> At the moment it looks like XENOMAI 2.6.4 calls
>>>>>> xnpod_schedule_handler much more often than XENOMAI 2.6.2.1 did in our
>>>>>> old system. Every call of xnpod_schedule_handler interrupts our main
>>>>>> XENOMAI task with priority = 95.
>>>>>>
>>>>>> I have compared the configuration of both XENOMAI versions but did not
>>>>>> find any difference. I checked the source code (new commits) but also
>>>>>> did not find a solution.
>>>>> Have you tried Xenomai 2.6.4 with Linux 3.0.43? In order to see
>>>>> whether it comes from the kernel update or the Xenomai update?
>>>> I've tried Linux 3.0.43 with Xenomai 2.6.4 and there is no difference to
>>>> Xenomai 2.6.2.1.
>>>> It looks like the cause is something other than Xenomai.
>>> Ok, one thing to pay attention to on imx6 is the L2 cache write
>>> allocate policy. You want to disable L2 write allocate on imx6 to
>>> get low latencies. I do not know which patches exactly you are
>>> using, so it is difficult to check, but the kernel normally displays
>>> the value set in the L2 auxiliary configuration register, you can
>>> check in the datasheet if it means that L2 write allocate is
>>> disabled or not. And check if you get the same value with 3.0 and
>>> 3.10.
>> Thank you for this hint. I looked around in the kernel config, but can't
>> find an option that sounds like L2 write allocate.
>> The only option I found was CACHE_L2X0, and that is enabled on both
>> kernels.
>> Do you have an idea what this configuration option is called, or where in
>> the kernel sources it is located, so that I can find the name of the
>> config flag by searching the source code?
> I never talked about any kernel configuration option. I am talking
> about checking the value passed to the L2 cache auxiliary configuration
> register, this is a hardware register. Also, as I said, the value
> passed to the L2 cache auxiliary register is printed by the kernel
> during boot.
>
>
Sorry Gilles,
I found the message in the kernel log, and you are right, the values are
different:
Kernel 3.0.43 shows
l2x0: 16 ways, CACHE_ID 0x410000c8, AUX_CTRL 0x02850000, Cache size: 524288 B
Kernel 3.10.53 shows
l2x0: 16 ways, CACHE_ID 0x410000c8, AUX_CTRL 0x32c50000, Cache size: 524288 B
Kernel 3.10.53 additionally sets bits 22 (Shared attribute override
enable), 28 (Data prefetch) and 29 (Instruction prefetch).
I applied the same settings on kernel 3.0.43 but the performance didn't
change, so it looks like these settings are not what slows down my
system.
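The bit comparison above can be checked mechanically. This is a minimal sketch in POSIX sh, assuming a shell whose arithmetic accepts hexadecimal constants (bash, dash and busybox ash do). diff_bits is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# List the bit positions that differ between two 32-bit register values.
# diff_bits is a hypothetical helper name used for this sketch only.
diff_bits() {
    diff=$(( $1 ^ $2 ))      # XOR leaves a 1 wherever the two values differ
    bit=0
    out=""
    while [ "$bit" -lt 32 ]; do
        if [ $(( (diff >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out }$bit"   # append, space-separated
        fi
        bit=$((bit + 1))
    done
    echo "$out"
}

# AUX_CTRL as printed by kernel 3.0.43 and by kernel 3.10.53
diff_bits 0x02850000 0x32c50000
```

For the two AUX_CTRL values this prints 22 28 29, which matches the bits listed above.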
While searching the kernel config I noticed that a few errata
workarounds are enabled as dependencies in 3.10.53.
To be sure that none of these errata is the source of my performance
reduction, I enabled them on 3.0.43 as well.
But again there was no difference to our default configuration.
To rule out that our application itself is what runs slower, I created a
shell script that increments a variable 10,000 times, and measured its
runtime with time:
#!/bin/sh
# Count from 0 up to the number given as the first argument.
var=0
while [ $var -lt $1 ]; do
    let var++
done
$ time /mnt/drive-C/CpuTime.sh 10000
In this test:
Kernel 3.0.43 with Xenomai 2.6.2.1 needs 480 ms
Kernel 3.10.53 with Xenomai 2.6.4 needs 820 ms
This difference is huge, and I'm not sure whether I can trust this test
because we also use a different busybox,
and the difference with our application is only between 2% and 3%
in the realtime task (Xenomai task with priority 95).
Do you have an idea why this is so much slower?
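Since the two systems use a different busybox, part of the measured difference may come from the shell itself and not from the kernel. As a sketch under that assumption, the same loop can be written with POSIX arithmetic only, so that one identical script (and ideally one identical shell binary) can be timed on both systems. count_to is a hypothetical name chosen for this example:

```shell
#!/bin/sh
# Count from 0 up to the first argument using only POSIX shell arithmetic,
# so the script does not rely on the non-POSIX 'let' builtin.
# count_to is a hypothetical helper name used for this sketch only.
count_to() {
    var=0
    while [ "$var" -lt "$1" ]; do
        var=$((var + 1))
    done
    echo "$var"
}

# Default to 10000 iterations when no argument is given.
count_to "${1:-10000}"
```

Saved under a name of your choice, it can be run with time in the same way as the original script. Whether [ runs as a shell builtin or as a separate process may depend on how busybox was built, so that is worth checking before comparing the numbers.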
I also see differences when I use the xeno-test command to check the speed.
Kernel 3.0.43 Xenomai 2.6.2.1
Started child 1209: /bin/sh /usr/xenomai/bin/xeno-test-run-wrapper
/usr/xenomai/bin/xeno-test
+ echo 0
+ /usr/xenomai/bin/arith
mul: 0x79364d93, shft: 26
integ: 30, frac: 0x4d9364d9364d9364
signed positive operation: 0x03ffffffffffffff * 1000000000 / 33000000
inline calibration: 0x0000000000000000: 43.260 ns, rejected 0/10000
inlined llimd: 0x79364d9364d9362f: 1476.384 ns, rejected 4/10000
inlined llmulshft: 0x79364d92ffffffe1: 35.131 ns, rejected 0/10000
inlined nodiv_llimd: 0x79364d9364d9362f: 47.745 ns, rejected 3/10000
out of line calibration: 0x0000000000000000: 49.235 ns, rejected 2/10000
out of line llimd: 0x79364d9364d9362f: 1483.759 ns, rejected 2/10000
out of line llmulshft: 0x79364d92ffffffe1: 31.719 ns, rejected 2/10000
out of line nodiv_llimd: 0x79364d9364d9362f: 49.376 ns, rejected 0/10000
signed negative operation: 0xfc00000000000001 * 1000000000 / 33000000
inline calibration: 0x0000000000000000: 41.872 ns, rejected 0/10000
inlined llimd: 0x86c9b26c9b26c9d1: 1485.415 ns, rejected 2/10000
inlined llmulshft: 0x86c9b26d0000001e: 39.234 ns, rejected 0/10000
inlined nodiv_llimd: 0x86c9b26c9b26c9d1: 54.266 ns, rejected 1/10000
out of line calibration: 0x0000000000000000: 49.237 ns, rejected 0/10000
out of line llimd: 0x86c9b26c9b26c9d1: 1489.059 ns, rejected 1/10000
out of line llmulshft: 0xd45d172d0000001e: 36.847 ns, rejected 0/10000
out of line nodiv_llimd: 0x86c9b26c9b26c9d1: 56.973 ns, rejected 2/10000
unsigned operation: 0x03ffffffffffffff * 1000000000 / 33000000
inline calibration: 0x0000000000000000: 42.432 ns, rejected 1/10000
inlined nodiv_ullimd: 0x79364d9364d9362f: 51.083 ns, rejected 0/10000
out of line calibration: 0x0000000000000000: 48.086 ns, rejected 0/10000
out of line nodiv_ullimd: 0x79364d9364d9362f: 44.964 ns, rejected 0/10000
+ /usr/xenomai/bin/clocktest -C 42 -T 30
Xenomai: POSIX skin or CONFIG_XENO_OPT_PERVASIVE disabled.
(modprobe xeno_posix?)
+ /usr/xenomai/bin/clocktest -T 30
Xenomai: POSIX skin or CONFIG_XENO_OPT_PERVASIVE disabled.
(modprobe xeno_posix?)
Kernel 3.10.53 Xenomai 2.6.4
Started child 729: /bin/sh /usr/xenomai/bin/xeno-test-run-wrapper
/usr/xenomai/bin/xeno-test
++ echo 0
++ /usr/xenomai/bin/arith
mul: 0x79364d93, shft: 26
integ: 30, frac: 0x4d9364d9364d9364
signed positive operation: 0x03ffffffffffffff * 1000000000 / 33000000
inline calibration: 0x0000000000000000: 42.979 ns, rejected 1/10000
inlined llimd: 0x79364d9364d9362f: 1491.632 ns, rejected 2/10000
inlined llmulshft: 0x79364d92ffffffe1: 37.873 ns, rejected 1/10000
inlined nodiv_llimd: 0x79364d9364d9362f: 50.520 ns, rejected 0/10000
out of line calibration: 0x0000000000000000: 50.611 ns, rejected 1/10000
out of line llimd: 0x79364d9364d9362f: 1476.381 ns, rejected 4/10000
out of line llmulshft: 0x79364d92ffffffe1: 25.364 ns, rejected 1/10000
out of line nodiv_llimd: 0x79364d9364d9362f: 45.493 ns, rejected 1/10000
signed negative operation: 0xfc00000000000001 * 1000000000 / 33000000
inline calibration: 0x0000000000000000: 42.962 ns, rejected 1/10000
inlined llimd: 0x86c9b26c9b26c9d1: 1488.811 ns, rejected 4/10000
inlined llmulshft: 0x86c9b26d0000001e: 42.972 ns, rejected 2/10000
inlined nodiv_llimd: 0x86c9b26c9b26c9d1: 55.611 ns, rejected 1/10000
out of line calibration: 0x0000000000000000: 50.572 ns, rejected 1/10000
out of line llimd: 0x86c9b26c9b26c9d1: 1481.904 ns, rejected 3/10000
out of line llmulshft: 0x86c9b26d0000001e: 27.818 ns, rejected 0/10000
out of line nodiv_llimd: 0x86c9b26c9b26c9d1: 53.008 ns, rejected 1/10000
unsigned operation: 0x03ffffffffffffff * 1000000000 / 33000000
inline calibration: 0x0000000000000000: 42.968 ns, rejected 0/10000
inlined nodiv_ullimd: 0x79364d9364d9362f: 53.060 ns, rejected 1/10000
out of line calibration: 0x0000000000000000: 50.591 ns, rejected 1/10000
out of line nodiv_ullimd: 0x79364d9364d9362f: 46.102 ns, rejected 1/10000
++ /usr/xenomai/bin/clocktest -C 42 -T 30
Xenomai: POSIX skin or CONFIG_XENO_OPT_PERVASIVE disabled.
(modprobe xeno_posix?)
++ /usr/xenomai/bin/clocktest -T 30
Xenomai: POSIX skin or CONFIG_XENO_OPT_PERVASIVE disabled.
(modprobe xeno_posix?)
Some of the operations are faster on the newer Xenomai, but a few are
much slower, for example inlined llimd.
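To put the individual figures into proportion, the relative change between two measurements can be computed directly. This is a minimal sketch, assuming awk is available on the target (sh has no floating-point arithmetic of its own). pct_change is a hypothetical helper name:

```shell
#!/bin/sh
# Print the relative change from an old to a new measurement, in percent.
# pct_change is a hypothetical helper name used for this sketch only.
pct_change() {
    awk -v old_v="$1" -v new_v="$2" \
        'BEGIN { printf "%+.2f%%\n", (new_v - old_v) / old_v * 100 }'
}

pct_change 1476.384 1491.632   # inlined llimd, signed positive (ns)
pct_change 31.719 25.364       # out of line llmulshft, signed positive (ns)
pct_change 480 820             # shell loop runtime (ms)
```

With the numbers quoted in this mail, inlined llimd in the signed positive case differs by about 1%, while the shell loop differs by about 71%.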
With every test I run, it looks more like the issue is located in neither
the kernel nor Xenomai.
Do you know of any speed issues in system libraries such as libc?
Kind regards
Wolfgang