* cyclictest better values with system load than without (OMAP3530 target)
From: Stefan Roese @ 2013-11-26 9:21 UTC
To: linux-rt-users

Hi!

I'm running cyclictest on an OMAP3530 target board and am a bit
astonished by the results, especially that the latency values
are better on a system with system load (hackbench) than on one
without system load. Here are the values I get:

With system load (hackbench):
-----------------------------
# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1853) P:80 I:1000 C: 10000 Min: 36 Act: 156 Avg: 154 Max: 244

Idle system:
------------
# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
# /dev/cpu_dma_latency set to 0us
T: 0 ( 2332) P:80 I:1000 C: 10000 Min: 81 Act: 530 Avg: 484 Max: 602

Some details about my test/system setup:
- Linux v3.8.13
- preempt-rt patch 3.8.13-rt14
- HW: TI OMAP3530 CM_T35 board
- Latest cyclictest from rt-tests git repository

I might have misconfigured the system. So here are some extracts from
my .config:

...
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
...
# CONFIG_CPU_FREQ is not set
# CONFIG_CPU_IDLE is not set
# CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED is not set
...
CONFIG_PREEMPT_RT_FULL=y
...

With CONFIG_NO_HZ disabled I get slightly better results:

With system load (hackbench):
-----------------------------
# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1840) P:80 I:1000 C: 10000 Min: 30 Act: 153 Avg: 154 Max: 238

Idle system:
-------------
# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1371) P:80 I:1000 C: 10000 Min: 40 Act: 465 Avg: 435 Max: 502

Any ideas/explanations are really appreciated.

Thanks,
Stefan
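For anyone who wants to compare such runs without reading the numbers
by eye, here is a minimal Python sketch. It is not part of rt-tests; the
regular expression and the name parse_summary are assumptions derived
only from the summary lines quoted above. cyclictest reports these
latencies in microseconds by default.

```python
import re

# Matches the per-thread summary line printed by cyclictest in quiet (-q) mode.
# The pattern is an assumption based on the lines quoted in this thread.
SUMMARY_RE = re.compile(
    r"T:\s*(?P<thread>\d+)\s*\(\s*(?P<pid>\d+)\)\s*"
    r"P:\s*(?P<prio>\d+)\s*I:\s*(?P<interval>\d+)\s*C:\s*(?P<cycles>\d+)\s*"
    r"Min:\s*(?P<min>-?\d+)\s*Act:\s*(?P<act>-?\d+)\s*"
    r"Avg:\s*(?P<avg>-?\d+)\s*Max:\s*(?P<max>-?\d+)"
)


def parse_summary(line):
    """Return the numeric fields of one summary line as a dict of ints."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a cyclictest summary line: {line!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


# Summary lines copied from the first pair of runs in the original posting.
loaded = parse_summary(
    "T: 0 ( 1853) P:80 I:1000 C: 10000 Min: 36 Act: 156 Avg: 154 Max: 244"
)
idle = parse_summary(
    "T: 0 ( 2332) P:80 I:1000 C: 10000 Min: 81 Act: 530 Avg: 484 Max: 602"
)

# Print how much worse (positive) the idle run is for each statistic.
for field in ("min", "avg", "max"):
    diff = idle[field] - loaded[field]
    print(f"{field}: idle={idle[field]} loaded={loaded[field]} diff={diff}")
```

A positive diff means the idle system was slower than the loaded one,
which is the surprising behavior described in the posting.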
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Dmitry Lysenko @ 2013-11-26 14:21 UTC
To: Stefan Roese; +Cc: linux-rt-users

Hi!

On a Marvell Kirkwood at 1.2 GHz with a 3.2.51-rt72 kernel I have slightly
better results:

--- with hackbench:

root@debian:~/rt-tests# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
T: 0 (10268) P:80 I:1000 C: 10000 Min: 16 Act: 29 Avg: 29 Max: 46

--- w/o hackbench:

root@debian:~/rt-tests# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
T: 0 (12686) P:80 I:1000 C: 10000 Min: 5 Act: 6 Avg: 6 Max: 22

-- .config:
CONFIG_TICK_ONESHOT=y
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

# CONFIG_CPU_IDLE is not set

Try to play with:

# CONFIG_PM_RUNTIME is not set
# CONFIG_ARM_CPU_SUSPEND is not set

Best wishes,
Dmitry.
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Stefan Roese @ 2013-11-26 19:14 UTC
To: Dmitry Lysenko; +Cc: linux-rt-users

Hi Dmitry!

On 26.11.2013 15:21, Dmitry Lysenko wrote:
> On a Marvell Kirkwood at 1.2 GHz with a 3.2.51-rt72 kernel I have slightly
> better results:
>
> --- with hackbench:
>
> root@debian:~/rt-tests# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
> T: 0 (10268) P:80 I:1000 C: 10000 Min: 16 Act: 29 Avg: 29 Max: 46
>
> --- w/o hackbench:
>
> root@debian:~/rt-tests# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
> T: 0 (12686) P:80 I:1000 C: 10000 Min: 5 Act: 6 Avg: 6 Max: 22
>
> -- .config:
> CONFIG_TICK_ONESHOT=y
> # CONFIG_NO_HZ is not set
> CONFIG_HIGH_RES_TIMERS=y
>
> # CONFIG_CPU_IDLE is not set
>
> Try to play with:
>
> # CONFIG_PM_RUNTIME is not set
> # CONFIG_ARM_CPU_SUSPEND is not set

Thanks, this was helpful. After some tweaks I was able to disable
CONFIG_PM_RUNTIME and ARM_CPU_SUSPEND. And after disabling NO_HZ and
enabling fewer CONFIG_DEBUG_xxx options I now have the following results:

Idle:
# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
# /dev/cpu_dma_latency set to 0us
T: 0 ( 1382) P:80 I:1000 C: 10000 Min: 12 Act: 141 Avg: 127 Max: 202

Load:
# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
# /dev/cpu_dma_latency set to 0us
T: 0 ( 2777) P:80 I:1000 C: 10000 Min: 26 Act: 167 Avg: 152 Max: 229

So the test now finally has better results on an idle system than on
one with heavy system load. The numbers are still far away from your
latency values on the 1.2 GHz Kirkwood. Does anybody have OMAP3
values at hand to compare?

Thanks for your input,
Stefan
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Tim Sander @ 2013-11-26 20:14 UTC
To: Stefan Roese; +Cc: Dmitry Lysenko, linux-rt-users

Hi Stefan

> On 26.11.2013 15:21, Dmitry Lysenko wrote:
> > On a Marvell Kirkwood at 1.2 GHz with a 3.2.51-rt72 kernel I have slightly
> > better results:
> >
> > --- with hackbench:
> >
> > root@debian:~/rt-tests# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
> > T: 0 (10268) P:80 I:1000 C: 10000 Min: 16 Act: 29 Avg: 29 Max: 46
> >
> > --- w/o hackbench:
> >
> > root@debian:~/rt-tests# ./cyclictest -l 10000 -i 1000 -n -p 80 -q
> > T: 0 (12686) P:80 I:1000 C: 10000 Min: 5 Act: 6 Avg: 6 Max: 22
> >
> > -- .config:
> > CONFIG_TICK_ONESHOT=y
> > # CONFIG_NO_HZ is not set
> > CONFIG_HIGH_RES_TIMERS=y
> >
> > # CONFIG_CPU_IDLE is not set
> >
> > Try to play with:
> >
> > # CONFIG_PM_RUNTIME is not set
> > # CONFIG_ARM_CPU_SUSPEND is not set
>
> Thanks, this was helpful. After some tweaks I was able to disable
> CONFIG_PM_RUNTIME and ARM_CPU_SUSPEND. And after disabling NO_HZ and
> enabling fewer CONFIG_DEBUG_xxx options I now have the following results:
>
> Idle:
> # ./cyclictest -l 10000 -i 1000 -n -p 80 -q
> # /dev/cpu_dma_latency set to 0us
> T: 0 ( 1382) P:80 I:1000 C: 10000 Min: 12 Act: 141 Avg: 127 Max: 202
>
> Load:
> # ./cyclictest -l 10000 -i 1000 -n -p 80 -q
> # /dev/cpu_dma_latency set to 0us
> T: 0 ( 2777) P:80 I:1000 C: 10000 Min: 26 Act: 167 Avg: 152 Max: 229
>
> So the test now finally has better results on an idle system than on
> one with heavy system load. The numbers are still far away from your
> latency values on the 1.2 GHz Kirkwood. Does anybody have OMAP3
> values at hand to compare?

I have not used OMAP3, but I used Sitara a while ago. As far as I know
they are derived from the same die mask. I think the numbers were a
little better than yours, but they were still in the 100 µs range, which
was too large for my use case :-(. I haven't seen a Cortex-A8 with good
real-time performance. The A9 seems much better in this regard, in my
experience.

If you need better numbers, ping me again. I will probably find the
results again on the other machine, which is currently out of reach.

One thing that really bothers me is that ARM repurposed the FIQ for some
security features which are not that useful for real-time, and I think
most of the time Digital Restrictions Management does more harm than
good. But that's a little off-topic for this mailing list.

Best regards
Tim
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Clark Williams @ 2013-11-26 16:12 UTC
To: Stefan Roese; +Cc: linux-rt-users

In my experience (on x86_64 mainly), that behavior (worse times when
not under load) is due to the overhead of coming out of power-save/idle
states. When you've got a big load on the system and all the cores are
active, then the power-save logic and/or the idle logic doesn't kick in
and devices aren't being powered down.

Do you know if your OMAP has power-save logic available? Alternatively,
do you know how expensive the idle mechanism is? Have you tried booting
with idle=poll and then measuring without a load?

Clark
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Sebastian Andrzej Siewior @ 2013-11-29 12:56 UTC
To: Clark Williams; +Cc: Stefan Roese, linux-rt-users

* Clark Williams | 2013-11-26 10:12:32 [-0600]:

>In my experience (on x86_64 mainly), that behavior (worse times when
>not under load) is due to the overhead of coming out of power-save/idle
>states. When you've got a big load on the system and all the cores are
>active, then the power-save logic and/or the idle logic doesn't kick in
>and devices aren't being powered down.

This is the case here, too: the overhead of coming out of a deep power
state plus the invalidated caches.

>Do you know if your OMAP has power-save logic available? Alternatively,
>do you know how expensive the idle mechanism is? Have you tried booting
>with idle=poll and then measuring without a load?

idle=poll is x86 only. Disabling all of the PM options should solve the
issue. You could start from arch_cpu_idle() to check what is used. The
idle routine should be reached either via cpuidle_idle_call() or via
arm_pm_idle().

>Clark

Sebastian
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Carsten Emde @ 2013-11-29 15:10 UTC
To: Sebastian Andrzej Siewior, Clark Williams, Stefan Roese; +Cc: linux-rt-users

On 11/29/2013 01:56 PM, Sebastian Andrzej Siewior wrote:
> * Clark Williams | 2013-11-26 10:12:32 [-0600]:
>
>> In my experience (on x86_64 mainly), that behavior (worse times when
>> not under load) is due to the overhead of coming out of power-save/idle
>> states. When you've got a big load on the system and all the cores are
>> active, then the power-save logic and/or the idle logic doesn't kick in
>> and devices aren't being powered down.
> This is the case here, too: the overhead of coming out of a deep power
> state plus the invalidated caches.

Sorry, I feel that the discussion is somewhat out of sync with the
original posting. Let me explain.

Among other mechanisms, processors may use two completely different
interfaces to save power:
1. Sleep states aka C states, Linux interface cpuidle
2. Clock frequency modulation aka P states, Linux interface cpufreq

1. Sleep states
Processors may come with a number of C states from light sleep to deep
sleep to save power when idle. The longer a processor is idle, the
deeper normally is the sleep state the processor may enter. Sleep states
may be disabled
i) on a per-processor and per-state basis in
/sys/devices/system/cpu/cpuX/cpuidle/stateX/disable or
ii) altogether using the somewhat mislabeled /dev/cpu_dma_latency pseudo
device.
As far as cyclictest is concerned, sleep states normally are disabled
altogether. If this is the case, cyclictest prints the message:
# /dev/cpu_dma_latency set to 0us
The original posting contains this line. In consequence, sleep states
cannot be responsible for any observed latency prolongation.

To check whether sleep states are disabled, the command
# cat /sys/devices/system/cpu/cpu0/cpuidle/state?/time
may be used repeatedly for every CPU. If sleep states are disabled
correctly, only the time of the first state (poll state) may increase,
for example
# cat /sys/devices/system/cpu/cpu0/cpuidle/state?/time
444330737734
234393550
1760323375
1234658099
183251179053
and some time later
# cat /sys/devices/system/cpu/cpu0/cpuidle/state?/time
444417947595
234393550
1760323375
1234658099
183251179053

BTW: The cyclictest source contains a related comment:
/* Latency trick
 * if the file /dev/cpu_dma_latency exists,
 * open it and write a zero into it. This will tell
 * the power management system not to transition to
 * a high cstate (in fact, the system acts like idle=poll)
 * When the fd to /dev/cpu_dma_latency is closed, the behavior
 * goes back to the system default.
 *
 * Documentation/power/pm_qos_interface.txt
 */

2. Clock frequency modulation
This is an entirely different story, as cyclictest has no business with
it at all. On x86 processors, latency scales more or less inversely with
the clock frequency, e.g. a system running at 1 GHz will show a latency
that is twice as high as when running at 2 GHz. ARM processors, however,
behave differently. Many ARM cores do not provide acceptable latency
values unless running at full speed. It is, therefore, often mandatory
to switch to the performance CPU frequency governor before starting
cyclictest or before running a real-world user space application that
relies on minimum latency. The /sys/devices/system/cpu/cpu0/cpufreq
interface is available to manage P states:

Switch to maximum performance:
cd /sys/devices/system/cpu/
for i in cpu?/cpufreq/scaling_governor
do
  echo performance >$i
done

Switch to on-demand frequency modulation:
for i in cpu?/cpufreq/scaling_governor
do
  echo ondemand >$i
done

BTW: Power saving and real-time do not necessarily exclude each other.
If a - still deterministic - but a little longer latency is acceptable,
some light sleep states and a somewhat lower clock frequency may be
allowed which still may result in considerable energy saving. If,
however, the fastest possible real-time response is required, C states
and P states must be disabled (or set to polling and maximum speed,
respectively) and the power bill must be paid.

> So the test now finally has better results on an idle system than on
> one with heavy system load. The numbers are still far away from your
> latency values on the 1.2 GHz Kirkwood. Does anybody have OMAP3
> values at hand to compare?
This is why we run the OSADL QA Farm. An AM3359 system is in rack 7,
slot 5 -> https://www.osadl.org/?id=1590. We run 100 million cycles with
a 200 µs cycle interval (which takes about 5 hours and 33 minutes) to
obtain reliable data. In addition, the processor is not only idle but
also executes defined load scenarios during the recording. Please do the
same before you compare the results. To facilitate the comparison, the
cyclictest command line is given below every plot, and any other
relevant information (including the kernel command line) is available in
the systems' profiles.

Hope this helps,

-Carsten.
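The two mechanisms described in section 1 can be sketched in a few lines
of Python. This is only an illustration under stated assumptions, not
cyclictest's code: the helper names hold_cpu_dma_latency,
read_state_times and advanced_states are made up for this example, and
the cpuidle directory only exists on kernels built with CONFIG_CPU_IDLE.
The PM QoS device accepts a 32-bit integer (in microseconds) and keeps
the request only while the file descriptor stays open.

```python
import os
import struct
from contextlib import contextmanager


@contextmanager
def hold_cpu_dma_latency(target_us=0, path="/dev/cpu_dma_latency"):
    """Keep a PM QoS latency request active for the duration of the with-block.

    The request is dropped when the descriptor is closed, so it must stay
    open while the latency-sensitive code runs (usually requires root).
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # The device expects a 32-bit signed integer in native byte order.
        os.write(fd, struct.pack("=i", target_us))
        yield fd
    finally:
        os.close(fd)


def read_state_times(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Read the cumulative residency time of every cpuidle state of one CPU."""
    base = os.path.join(sysfs_root, f"cpu{cpu}", "cpuidle")
    names = sorted(
        (n for n in os.listdir(base) if n.startswith("state") and n[5:].isdigit()),
        key=lambda n: int(n[5:]),
    )
    times = []
    for name in names:
        with open(os.path.join(base, name, "time")) as handle:
            times.append(int(handle.read()))
    return times


def advanced_states(before, after):
    """Return the indices of the states whose residency time increased."""
    return [i for i, (b, a) in enumerate(zip(before, after)) if a > b]


# The two snapshots quoted in the message above.
before = [444330737734, 234393550, 1760323375, 1234658099, 183251179053]
after = [444417947595, 234393550, 1760323375, 1234658099, 183251179053]
print(advanced_states(before, after))
```

On a live system one would take two read_state_times() snapshots inside
a hold_cpu_dma_latency() block; if anything other than state 0 shows up
in the result, deeper sleep states are still being entered.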
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Gilles Chanteperdrix @ 2013-11-29 16:36 UTC
To: Carsten Emde
Cc: Sebastian Andrzej Siewior, Clark Williams, Stefan Roese, linux-rt-users

On 11/29/2013 04:10 PM, Carsten Emde wrote:
> BTW: Power saving and real-time do not necessarily exclude each other.
> If a - still deterministic - but a little longer latency is acceptable,
> some light sleep states and a somewhat lower clock frequency may be
> allowed which still may result in considerable energy saving. If,
> however, the fastest possible real-time response is required, C states
> and P states must be disabled (or set to polling and maximum speed,
> respectively) and the power bill must be paid.

Well, I do not fully agree. To be sure that you can clock down the
processor for executing a task which has sufficient time to meet its
deadline, your system must be "time triggered": all the timer events
must be known in advance. On a fully dynamic system, you may make that
decision, but a new timer may then be scheduled which causes the system
to miss its deadline, whereas it would not have missed it if it had run
at full speed.

--
Gilles.
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Gilles Chanteperdrix @ 2013-11-29 16:58 UTC
To: Carsten Emde
Cc: Sebastian Andrzej Siewior, Clark Williams, Stefan Roese, linux-rt-users

On 11/29/2013 05:36 PM, Gilles Chanteperdrix wrote:
> On 11/29/2013 04:10 PM, Carsten Emde wrote:
>> BTW: Power saving and real-time do not necessarily exclude each other.
>> If a - still deterministic - but a little longer latency is acceptable,
>> some light sleep states and a somewhat lower clock frequency may be
>> allowed which still may result in considerable energy saving. If,
>> however, the fastest possible real-time response is required, C states
>> and P states must be disabled (or set to polling and maximum speed,
>> respectively) and the power bill must be paid.
>
> Well, I do not fully agree. To be sure that you can clock down the
> processor for executing a task which has sufficient time to meet its
> deadline, your system must be "time triggered": all the timer events
> must be known in advance. On a fully dynamic system, you may make that
> decision, but a new timer may then be scheduled which causes the system
> to miss its deadline, whereas it would not have missed it if it had run
> at full speed.
>

And a second problem is that you must know the task WCET, which on a
modern system:
- depends on the task frequency;
- depends on the IRQ load.

Again, only a time triggered system seems to make this possible.

--
Gilles.
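The argument above can be made concrete with a small worked example in
Python. It is a deliberately crude sketch: it assumes execution time is
simply cycles divided by clock frequency (ignoring memory and cache
effects), and all cycle counts, the deadline and the function names are
hypothetical values chosen for illustration, not measurements.

```python
def exec_time_ns(cycles, freq_hz):
    """Nanoseconds needed to execute `cycles` CPU cycles at `freq_hz`, rounded up."""
    return -(-cycles * 1_000_000_000 // freq_hz)


def meets_deadline(task_cycles, irq_cycles, freq_hz, deadline_ns):
    """True if the task plus the interrupt work preempting it fits in the deadline."""
    return exec_time_ns(task_cycles + irq_cycles, freq_hz) <= deadline_ns


# Hypothetical numbers: a task with a worst case of 400,000 cycles and a
# 500 us deadline, plus a timer handler costing 150,000 cycles that was
# not known when the frequency was chosen.
TASK_CYCLES = 400_000
EXTRA_TIMER_CYCLES = 150_000
DEADLINE_NS = 500_000

# Clocked down to 1 GHz with no unexpected events: 400 us, deadline met.
print(meets_deadline(TASK_CYCLES, 0, 1_000_000_000, DEADLINE_NS))

# Same clock, but a new timer fires during the task: 550 us, deadline missed.
print(meets_deadline(TASK_CYCLES, EXTRA_TIMER_CYCLES, 1_000_000_000, DEADLINE_NS))

# At the full 2 GHz the same situation takes 275 us and the deadline holds.
print(meets_deadline(TASK_CYCLES, EXTRA_TIMER_CYCLES, 2_000_000_000, DEADLINE_NS))
```

The second case is the one described above: the decision to clock down
was safe when it was made and became unsafe because of an event that
was not known in advance.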
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Carsten Emde @ 2013-11-29 17:36 UTC
To: Gilles Chanteperdrix
Cc: Sebastian Andrzej Siewior, Clark Williams, Stefan Roese, linux-rt-users

Hi Gilles,

> On 11/29/2013 05:36 PM, Gilles Chanteperdrix wrote:
>> On 11/29/2013 04:10 PM, Carsten Emde wrote:
>>> BTW: Power saving and real-time do not necessarily exclude each other.
>>> If a - still deterministic - but a little longer latency is acceptable,
>>> some light sleep states and a somewhat lower clock frequency may be
>>> allowed which still may result in considerable energy saving. If,
>>> however, the fastest possible real-time response is required, C states
>>> and P states must be disabled (or set to polling and maximum speed,
>>> respectively) and the power bill must be paid.
>> Well, I do not fully agree. To be sure that you can clock down the
>> processor for executing a task which has sufficient time to meet its
>> deadline, your system must be "time triggered": all the timer events
>> must be known in advance. On a fully dynamic system, you may make that
>> decision, but a new timer may then be scheduled which causes the system
>> to miss its deadline, whereas it would not have missed it if it had run
>> at full speed.
> And a second problem is that you must know the task WCET, which on a
> modern system:
> - depends on the task frequency;
> - depends on the IRQ load.
> Again, only a time triggered system seems to make this possible.
Hmm, I'm not sure whether I correctly got your point.

Let me try an example: A 1-GHz CPU of a given system runs at full speed
with the frequency governor set to performance and provides the required
real-time capabilities. When a second system with the same capabilities
was needed, the 1-GHz CPU unfortunately was out of stock, and the
decision was made to buy the 2-GHz variant of the processor. To save
energy, however, the clock frequency of the second system was set to
1 GHz using the cpufreq interface. Are you arguing that the 2-GHz
processor that is throttled down to 1 GHz has a slower response time
than the 1-GHz processor that always runs at full speed?

-Carsten.
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Gilles Chanteperdrix @ 2013-11-29 19:34 UTC
To: Carsten Emde
Cc: Sebastian Andrzej Siewior, Clark Williams, Stefan Roese, linux-rt-users

On 11/29/2013 06:36 PM, Carsten Emde wrote:
> Hi Gilles,
>
>> On 11/29/2013 05:36 PM, Gilles Chanteperdrix wrote:
>>> On 11/29/2013 04:10 PM, Carsten Emde wrote:
>>>> BTW: Power saving and real-time do not necessarily exclude each other.
>>>> If a - still deterministic - but a little longer latency is acceptable,
>>>> some light sleep states and a somewhat lower clock frequency may be
>>>> allowed which still may result in considerable energy saving. If,
>>>> however, the fastest possible real-time response is required, C states
>>>> and P states must be disabled (or set to polling and maximum speed,
>>>> respectively) and the power bill must be paid.
>>> Well, I do not fully agree. To be sure that you can clock down the
>>> processor for executing a task which has sufficient time to meet its
>>> deadline, your system must be "time triggered": all the timer events
>>> must be known in advance. On a fully dynamic system, you may make that
>>> decision, but a new timer may then be scheduled which causes the system
>>> to miss its deadline, whereas it would not have missed it if it had run
>>> at full speed.
>> And a second problem is that you must know the task WCET, which on a
>> modern system:
>> - depends on the task frequency;
>> - depends on the IRQ load.
>> Again, only a time triggered system seems to make this possible.
> Hmm, I'm not sure whether I correctly got your point.
>
> Let me try an example: A 1-GHz CPU of a given system runs at full speed
> with the frequency governor set to performance and provides the required
> real-time capabilities. When a second system with the same capabilities
> was needed, the 1-GHz CPU unfortunately was out of stock, and the
> decision was made to buy the 2-GHz variant of the processor. To save
> energy, however, the clock frequency of the second system was set to
> 1 GHz using the cpufreq interface. Are you arguing that the 2-GHz
> processor that is throttled down to 1 GHz has a slower response time
> than the 1-GHz processor that always runs at full speed?

I probably misread what you were saying and thought you were talking
about dynamically changing the processor frequency when the WCET of a
task is known to allow running it at a lower frequency while still
meeting its deadline. The thing implemented here, for instance:
https://code.google.com/p/xenomaiote/

It is the so-called OTE algorithm (but I cannot find what this acronym
stands for exactly).

--
Gilles.
* Re: cyclictest better values with system load than without (OMAP3530 target)
From: Carsten Emde @ 2013-11-29 21:10 UTC
To: Gilles Chanteperdrix
Cc: Sebastian Andrzej Siewior, Clark Williams, Stefan Roese, linux-rt-users

On 11/29/2013 08:34 PM, Gilles Chanteperdrix wrote:
> On 11/29/2013 06:36 PM, Carsten Emde wrote:
>>> On 11/29/2013 05:36 PM, Gilles Chanteperdrix wrote:
>>>> On 11/29/2013 04:10 PM, Carsten Emde wrote:
>>>>> BTW: Power saving and real-time do not necessarily exclude each other.
>>>>> If a - still deterministic - but a little longer latency is acceptable,
>>>>> some light sleep states and a somewhat lower clock frequency may be
>>>>> allowed which still may result in considerable energy saving. If,
>>>>> however, the fastest possible real-time response is required, C states
>>>>> and P states must be disabled (or set to polling and maximum speed,
>>>>> respectively) and the power bill must be paid.
>>>> Well, I do not fully agree. To be sure that you can clock down the
>>>> processor for executing a task which has sufficient time to meet its
>>>> deadline, your system must be "time triggered": all the timer events
>>>> must be known in advance. On a fully dynamic system, you may make that
>>>> decision, but a new timer may then be scheduled which causes the system
>>>> to miss its deadline, whereas it would not have missed it if it had run
>>>> at full speed.
>>> And a second problem is that you must know the task WCET, which on a
>>> modern system:
>>> - depends on the task frequency;
>>> - depends on the IRQ load.
>>> Again, only a time triggered system seems to make this possible.
>> Hmm, I'm not sure whether I correctly got your point.
>> Let me try an example: A 1-GHz CPU of a given system runs at full speed
>> with the frequency governor set to performance and provides the required
>> real-time capabilities. When a second system with the same capabilities
>> was needed, the 1-GHz CPU unfortunately was out of stock, and the
>> decision was made to buy the 2-GHz variant of the processor. To save
>> energy, however, the clock frequency of the second system was set to
>> 1 GHz using the cpufreq interface. Are you arguing that the 2-GHz
>> processor that is throttled down to 1 GHz has a slower response time
>> than the 1-GHz processor that always runs at full speed?
> I probably misread what you were saying and thought you were talking
> about dynamically changing the processor frequency when the WCET of a
> task is known to allow running it at a lower frequency while still
> meeting its deadline. The thing implemented here, for instance:
> https://code.google.com/p/xenomaiote/
Ah, no, the whole thing was about a static setting. I misunderstood your
point since I did not know about the proposal to dynamically change the
clock frequency in relation to the WCET of a task. Sounds like a cool
feature - but maybe a little too early to appear on our PREEMPT_RT wish
list ...

Thanks,
-Carsten.