* 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
@ 2022-03-23 19:35 Gautam Thaker
  2022-03-23 20:03 ` John Ogness
  0 siblings, 1 reply; 9+ messages in thread
From: Gautam Thaker @ 2022-03-23 19:35 UTC
  To: linux-rt-users

I built 5.15.28-rt35 #2 SMP PREEMPT_RT with "Fully Preemptible
Kernel" option and otherwise using .config from a stock Ubuntu 20.04
5.4.0 kernel.

Quick question: I see scheduling/wake_up latencies around 800 usec and
this is confirmed by cyclictest. Is there a link that can point me to
systematic steps I can take to track down whether my problem is a wrong
config param (an unneeded, hurtful debug option) or whether this is the
best that I can do (on good modern HW)? I am hoping that with
PREEMPT-RT on an unloaded system running at SCHED_FIFO prio 99 I can
get max latencies of <= 100 usec. Is this possible?

Details:

node-0> grep -i preempt /boot/config-5.15.28-rt35
CONFIG_HAVE_PREEMPT_LAZY=y
CONFIG_PREEMPT_LAZY=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT_RCU=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set

node-0> grep -i debug /boot/config-5.15.28-rt35 |grep =y
CONFIG_SLUB_DEBUG=y
CONFIG_IOSF_MBI_DEBUG=y
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_PM_DEBUG=y
CONFIG_PM_ADVANCED_DEBUG=y
CONFIG_PM_SLEEP_DEBUG=y
CONFIG_ACPI_DEBUGGER=y
CONFIG_ACPI_DEBUGGER_USER=y
CONFIG_ACPI_DEBUG=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_BLK_DEBUG_FS=y
CONFIG_BLK_DEBUG_FS_ZONED=y
CONFIG_NFS_DEBUG=y
CONFIG_SUNRPC_DEBUG=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DYNAMIC_DEBUG_CORE=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_DWARF4=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ALLOW_ALL=y
CONFIG_ARCH_HAS_EARLY_DEBUG=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_MISC=y
CONFIG_ARCH_HAS_DEBUG_WX=y
CONFIG_DEBUG_WX=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
CONFIG_SCHED_DEBUG=y
CONFIG_LOCK_DEBUGGING_SUPPORT=y
CONFIG_X86_DEBUG_FPU=y

My test involves using "nanosleep(x)" with different values of "x" and
examining stats of actual sleep durations. I see:

#nanosleep_duration  minimum:  maximum:  mean:    num_points:  Range
100                  103       934       108.635  100000       831
500                  504       1377      543.272  100000       873
1000                 1010      1870      1053.24  100000       860
2000                 2010      2872      2052.46  100000       862
4000                 4011      4906      4046.66  100000       895
10000                10013     10738     10067.3  100000       725

I next ran cyclictest, which mostly agrees with the ~700-800 max latency
I see in my own tests:

node-0> sudo ./cyclictest --mlockall --smp --priority=80 --interval=1000 --distance=0
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 2.84 0.86 0.62 1/551 8688

T: 0 ( 8655) P:80 I:1000 C: 6656 Min: 3 Act: 3 Avg: 4 Max: 24
T: 1 ( 8656) P:80 I:1000 C: 6650 Min: 3 Act: 4 Avg: 5 Max: 68
T: 2 ( 8657) P:80 I:1000 C: 6645 Min: 3 Act: 4 Avg: 4 Max: 24
T: 3 ( 8658) P:80 I:1000 C: 6640 Min: 3 Act: 4 Avg: 4 Max: 28
T: 4 ( 8659) P:80 I:1000 C: 6635 Min: 3 Act: 3 Avg: 4 Max: 27
T: 5 ( 8660) P:80 I:1000 C: 6629 Min: 3 Act: 3 Avg: 4 Max: 26
T: 6 ( 8661) P:80 I:1000 C: 6625 Min: 3 Act: 3 Avg: 4 Max: 25
T: 7 ( 8662) P:80 I:1000 C: 6620 Min: 3 Act: 3 Avg: 4 Max: 98
T: 8 ( 8663) P:80 I:1000 C: 6615 Min: 3 Act: 4 Avg: 5 Max: 300
T: 9 ( 8664) P:80 I:1000 C: 6610 Min: 3 Act: 3 Avg: 5 Max: 610
T:10 ( 8665) P:80 I:1000 C: 6606 Min: 3 Act: 4 Avg: 4 Max: 16
T:11 ( 8666) P:80 I:1000 C: 6601 Min: 3 Act: 3 Avg: 5 Max: 319
T:12 ( 8667) P:80 I:1000 C: 6596 Min: 3 Act: 5 Avg: 5 Max: 696
T:13 ( 8668) P:80 I:1000 C: 6592 Min: 3 Act: 4 Avg: 5 Max: 149
T:14 ( 8669) P:80 I:1000 C: 6587 Min: 3 Act: 3 Avg: 5 Max: 633
T:15 ( 8670) P:80 I:1000 C: 6583 Min: 3 Act: 3 Avg: 4 Max: 159
T:16 ( 8671) P:80 I:1000 C: 6578 Min: 3 Act: 3 Avg: 5 Max: 661
T:17 ( 8672) P:80 I:1000 C: 6574 Min: 3 Act: 4 Avg: 5 Max: 170
T:18 ( 8673) P:80 I:1000 C: 6569 Min: 3 Act: 3 Avg: 4 Max: 13
T:19 ( 8674) P:80 I:1000 C: 6565 Min: 3 Act: 4 Avg: 4 Max: 348
T:20 ( 8675) P:80 I:1000 C: 6561 Min: 3 Act: 3 Avg: 4 Max: 23
T:21 ( 8676) P:80 I:1000 C: 6556 Min: 3 Act: 4 Avg: 5 Max: 635
T:22 ( 8677) P:80 I:1000 C: 6552 Min: 3 Act: 4 Avg: 5 Max: 349
T:23 ( 8678) P:80 I:1000 C: 6548 Min: 3 Act: 3 Avg: 4 Max: 16
T:24 ( 8679) P:80 I:1000 C: 6543 Min: 3 Act: 4 Avg: 6 Max: 610
T:25 ( 8680) P:80 I:1000 C: 6539 Min: 3 Act: 3 Avg: 4 Max: 99
T:26 ( 8681) P:80 I:1000 C: 6534 Min: 3 Act: 4 Avg: 4 Max: 547
T:27 ( 8682) P:80 I:1000 C: 6530 Min: 3 Act: 3 Avg: 5 Max: 157
T:28 ( 8683) P:80 I:1000 C: 6526 Min: 3 Act: 4 Avg: 4 Max: 26
T:29 ( 8684) P:80 I:1000 C: 6521 Min: 3 Act: 4 Avg: 5 Max: 382
T:30 ( 8685) P:80 I:1000 C: 6517 Min: 3 Act: 3 Avg: 4 Max: 17
T:31 ( 8686) P:80 I:1000 C: 6512 Min: 3 Act: 3 Avg: 5 Max: 639

My HW is a Dell d430 on which /proc/cpuinfo shows 32 processors of type:

processor : 31
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
stepping : 2
microcode : 0x46
cpu MHz : 1200.000
cache size : 20480 KB

Any directions on how one goes about reducing my max latencies will be
much appreciated.
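
Note on the nanosleep table above: the Range column is maximum minus
minimum, while the worst-case overshoot is maximum minus the requested
duration. The following is only a minimal sketch of that arithmetic. It
assumes the table rows are fed on stdin with whitespace-separated columns
in the order shown; the function name overshoot is made up for
illustration, and the units are whatever the table uses (the post does
not state them).

```shell
#!/bin/sh
# overshoot: read rows "requested min max mean num_points range" on stdin
# and print "requested worst_overshoot recomputed_range" per data row.
# Header lines starting with '#' are skipped.
# Hypothetical helper, not part of the original test program.
overshoot() {
    awk '!/^#/ && NF >= 3 { printf "%d %d %d\n", $1, $3 - $1, $3 - $2 }'
}
```

For the first row (100 103 934 ...) this prints "100 834 831": a worst
overshoot of 834 against the requested 100, and a range that matches the
831 reported in the table.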
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-23 19:35 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec? Gautam Thaker
@ 2022-03-23 20:03 ` John Ogness
  2022-03-23 20:16   ` Gautam Thaker
  0 siblings, 1 reply; 9+ messages in thread
From: John Ogness @ 2022-03-23 20:03 UTC
  To: Gautam Thaker, linux-rt-users

On 2022-03-23, Gautam Thaker <ghthaker@gmail.com> wrote:
> I built 5.15.28-rt35 #2 SMP PREEMPT_RT with "Fully Preemptible
> Kernel" option and otherwise using .config from a stock Ubuntu 20.04
> 5.4.0 kernel.
>
> Quick question: I see scheduling/wake_up latencies around 800 usec and
> this is confirmed by cyclictest.

[...]

> node-0> grep -i preempt /boot/config-5.15.28-rt35
> node-0> grep -i debug /boot/config-5.15.28-rt35 |grep =y

[...]

> processor : 31
> vendor_id : GenuineIntel
> cpu family : 6
> model : 63
> model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
> stepping : 2
> microcode : 0x46
> cpu MHz : 1200.000
> cache size : 20480 KB

Your CPU is running at 1.2GHz although it is capable of 2.4GHz. This
looks like you have CPU frequency scaling activated. Investigate:

    grep -i cpu_freq /boot/config-5.15.28-rt35

The configuration of your tick may also be interesting:

    grep -i hz /boot/config-5.15.28-rt35

John Ogness
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-23 20:03 ` John Ogness
@ 2022-03-23 20:16   ` Gautam Thaker
       [not found]     ` <CAMLffL9hqXcco9NCH1eGdzw4uWPSxPpLaO5fZWgNqS9moKE2HQ@mail.gmail.com>
  2022-03-24 8:55     ` John Ogness
  0 siblings, 2 replies; 9+ messages in thread
From: Gautam Thaker @ 2022-03-23 20:16 UTC
  To: John Ogness; +Cc: linux-rt-users

> > processor : 31
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 63
> > model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
> > stepping : 2
> > microcode : 0x46
> > cpu MHz : 1200.000
> > cache size : 20480 KB
>
> Your CPU is running at 1.2GHz although it is capable of 2.4GHz. This
> looks like you have CPU frequency scaling activated. Investigate:
>
> grep -i cpu_freq /boot/config-5.15.28-rt35

node-0> grep -i cpu_freq /boot/config-5.15.28-rt35
CONFIG_ACPI_CPU_FREQ_PSS=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

AND:

> The configuration of your tick may also be interesting:
>
> grep -i hz /boot/config-5.15.28-rt35
>

node-0> grep -i hz /boot/config-5.15.28-rt35
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_MACHZ_WDT is not set

> John Ogness

It seems you are saying that some of these settings, related to CPU freq
scaling and "tick" settings, may be the cause of the ~800 usec latencies
I see. As I am seeking the "best possible that I can achieve" number, I
am willing to set the configs any way needed. Any suggestions?

Thanks.

Gautam
[parent not found: <CAMLffL9hqXcco9NCH1eGdzw4uWPSxPpLaO5fZWgNqS9moKE2HQ@mail.gmail.com>]
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
       [not found]     ` <CAMLffL9hqXcco9NCH1eGdzw4uWPSxPpLaO5fZWgNqS9moKE2HQ@mail.gmail.com>
@ 2022-03-24 0:56       ` Gautam Thaker
  0 siblings, 0 replies; 9+ messages in thread
From: Gautam Thaker @ 2022-03-24 0:56 UTC
  To: Clark Williams; +Cc: John Ogness, linux-rt-users

On Wed, Mar 23, 2022 at 3:26 PM Clark Williams <williams@redhat.com> wrote:
>
> Something that's not always obvious is that BIOS settings can affect your latency numbers. Look in the rt-tests project for a script called 'hwlatdetect' and try running that to see if control is transferring to the BIOS via an SMI or MCE. Here's a sample run on an untuned Intel NUC:
>
> $ sudo hwlatdetect --duration=30s

OK, this was very useful. This has returned the following for me:

node-0> sudo hwlatdetect --duration=30s
hwlatdetect: test duration 30 seconds
   detector: tracer
   parameters:
        Latency threshold: 10us
        Sample window: 1000000us
        Sample width: 500000us
        Non-sampling period: 500000us
        Output File: None

Starting test
test finished
Max Latency: 811us
Samples recorded: 23
Samples exceeding threshold: 23
ts: 1648082981.696619376, inner:811, outer:19
ts: 1648082982.999941947, inner:545, outer:16
ts: 1648082985.249826218, inner:0, outer:426
ts: 1648082986.249810897, inner:409, outer:0
ts: 1648082987.249818630, inner:0, outer:415
ts: 1648082988.249822370, inner:417, outer:0
ts: 1648082989.249816168, inner:409, outer:0
ts: 1648082990.249820394, inner:412, outer:0
ts: 1648082991.249820365, inner:0, outer:410
ts: 1648082992.249825886, inner:0, outer:414
ts: 1648082993.249822956, inner:409, outer:0
ts: 1648082994.249822720, inner:407, outer:0
ts: 1648082995.249826191, inner:409, outer:5
ts: 1648082996.249832029, inner:413, outer:0
ts: 1648082997.249832007, inner:412, outer:0
ts: 1648082998.249842325, inner:0, outer:420
ts: 1648082999.249838335, inner:0, outer:415
ts: 1648083000.249836740, inner:412, outer:0
ts: 1648083001.249835774, inner:409, outer:0
ts: 1648083002.249832951, inner:0, outer:404
ts: 1648083003.249833781, inner:0, outer:404
ts: 1648083004.249837415, inner:0, outer:406
ts: 1648083005.249851829, inner:0, outer:418
node-0>

I see one inner value of > 800usec, and several cases where sum of
"outer" + (next) "inner" is > 800 usec. If I can reduce this to the 55
usec you see on your "untuned" Intel NUC (what is NUC BTW?) I would be
quite happy.

Is there a pointer to any documentation as to what I may try on the BIOS
settings? And I am not familiar w/ "...since transitioning between
c-states is another place you can see latency spikes", any pointers
regarding that also will be appreciated.

Gautam
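
Note on reading the samples above: the "outer + (next) inner" sums can be
checked mechanically. This is only a rough sketch, assuming the sample
lines have exactly the "ts: ..., inner:N, outer:M" shape shown in this
message; the function name hwlat_summary is made up for illustration.
Whether adding one sample's outer value to the next sample's inner value
is physically meaningful depends on how the tracer's windows line up, so
the sketch merely reproduces the arithmetic described in the message.

```shell
#!/bin/sh
# hwlat_summary: read hwlatdetect sample lines on stdin and print the
# largest inner value, the largest outer value, and the largest sum of
# one sample's outer with the following sample's inner.
# Hypothetical helper; not part of rt-tests.
hwlat_summary() {
    awk -F'[:, ]+' '
        /^ts:/ {
            # After splitting: $1="ts" $2=timestamp $3="inner" $4=N $5="outer" $6=M
            inner = $4 + 0
            outer = $6 + 0
            if (inner > max_inner) max_inner = inner
            if (outer > max_outer) max_outer = outer
            if (seen && prev_outer + inner > max_pair) max_pair = prev_outer + inner
            prev_outer = outer
            seen = 1
        }
        END {
            printf "max_inner=%d max_outer=%d max_outer_plus_next_inner=%d\n",
                   max_inner, max_outer, max_pair
        }
    '
}
```

Saving the hwlatdetect output to a file and running it through
hwlat_summary gives one summary line per run, which makes before/after
comparisons of BIOS or governor changes easier to eyeball.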
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-23 20:16   ` Gautam Thaker
       [not found]     ` <CAMLffL9hqXcco9NCH1eGdzw4uWPSxPpLaO5fZWgNqS9moKE2HQ@mail.gmail.com>
@ 2022-03-24 8:55     ` John Ogness
  2022-03-24 16:34       ` Gautam Thaker
  1 sibling, 1 reply; 9+ messages in thread
From: John Ogness @ 2022-03-24 8:55 UTC
  To: Gautam Thaker; +Cc: linux-rt-users

On 2022-03-23, Gautam Thaker <ghthaker@gmail.com> wrote:
>> > processor : 31
>> > vendor_id : GenuineIntel
>> > cpu family : 6
>> > model : 63
>> > model name : Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
>> > stepping : 2
>> > microcode : 0x46
>> > cpu MHz : 1200.000
>> > cache size : 20480 KB
>>
>> Your CPU is running at 1.2GHz although it is capable of 2.4GHz. This
>> looks like you have CPU frequency scaling activated. Investigate:
>>
>> grep -i cpu_freq /boot/config-5.15.28-rt35
>
> node-0> grep -i cpu_freq /boot/config-5.15.28-rt35
> CONFIG_ACPI_CPU_FREQ_PSS=y
> CONFIG_CPU_FREQ=y
> CONFIG_CPU_FREQ_GOV_ATTR_SET=y
> CONFIG_CPU_FREQ_GOV_COMMON=y
> CONFIG_CPU_FREQ_STAT=y
> CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
> CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> CONFIG_CPU_FREQ_GOV_POWERSAVE=y
> CONFIG_CPU_FREQ_GOV_USERSPACE=y
> CONFIG_CPU_FREQ_GOV_ONDEMAND=y
> CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
> CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

Your system is configured to support many different governors. For
minimal latency it is important that you are using the performance
governor. The frequency scaling might be the cause of the horrible
hwlatdetect numbers you are seeing.

If you have the cpufrequtils package installed, you can easily switch to
the performance governor with:

    cpufreq-set -g performance

With cpufreq-info you can see a nice summary of how your CPUs are
currently set. You should see it running full speed.

You can also configure the performance governor manually using
sysfs. You may want to read through the documentation [0] so that you
understand what you are doing.

[0] https://www.kernel.org/doc/html/latest/admin-guide/pm/cpufreq.html

>> The configuration of your tick may also be interesting:
>>
>> grep -i hz /boot/config-5.15.28-rt35
>>
>
> node-0> grep -i hz /boot/config-5.15.28-rt35
> CONFIG_NO_HZ_COMMON=y
> # CONFIG_HZ_PERIODIC is not set
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
> CONFIG_NO_HZ=y
> # CONFIG_HZ_100 is not set
> CONFIG_HZ_250=y
> # CONFIG_HZ_300 is not set
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=250
> # CONFIG_MACHZ_WDT is not set

The tick configuration won't be responsible for the huge latencies you
are seeing. But when you start getting your latency down, you may want
to consider using CONFIG_HZ_PERIODIC with CONFIG_HZ_100 (or
CONFIG_HZ_1000 if this is a machine with a huge network load). But these
settings can be very dependent on the types of work your system is
doing, so you will need to test the latency effects. Possibly you will
notice no effects.

Adding --secaligned to your cyclictest command can help to force
situations where your latency will be worsened because of the tick.
Optimally you want cyclictest to hit the worst case scenario.

John Ogness
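
Note on the sysfs route mentioned above: a minimal sketch of what that
could look like, assuming the standard cpufreq layout under
/sys/devices/system/cpu/cpuN/cpufreq/ described in the kernel
documentation at [0]. The function names and the CPU_SYSFS variable are
made up for illustration; writing the governor file needs root.

```shell
#!/bin/sh
# Root of the per-CPU sysfs tree. Overridable so the functions can be
# pointed at a scratch directory; the variable name is an assumption.
: "${CPU_SYSFS:=/sys/devices/system/cpu}"

# show_governors: print "cpuN <governor>" for every CPU exposing cpufreq.
show_governors() {
    for d in "$CPU_SYSFS"/cpu[0-9]*/cpufreq; do
        [ -r "$d/scaling_governor" ] || continue
        cpu=${d%/cpufreq}
        printf '%s %s\n' "${cpu##*/}" "$(cat "$d/scaling_governor")"
    done
}

# set_governor NAME: write NAME into every writable scaling_governor file.
set_governor() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -w "$f" ]; then
            printf '%s\n' "$1" > "$f"
        fi
    done
}
```

As root, "set_governor performance" followed by "show_governors" should
list performance for each CPU, provided that governor appears in
scaling_available_governors on the machine in question.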
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-24 8:55     ` John Ogness
@ 2022-03-24 16:34       ` Gautam Thaker
  2022-03-25 8:18         ` John Ogness
  0 siblings, 1 reply; 9+ messages in thread
From: Gautam Thaker @ 2022-03-24 16:34 UTC
  To: John Ogness; +Cc: linux-rt-users

On Thu, Mar 24, 2022 at 1:55 AM John Ogness <john.ogness@linutronix.de> wrote:
>
> Your system is configured to support many different governors. For
> minimal latency it is important that you are using the performance
> governor. The frequency scaling might be the cause of the horrible
> hwlatdetect numbers you are seeing.
>
> If you have the cpufrequtils package installed, you can easily switch to
> the performance governor with:
>
> cpufreq-set -g performance
>
> With cpufreq-info you can see a nice summary of how your CPUs are
> currently set. You should see it running full speed.
>
> You can also configure the performance governor manually using
> sysfs. You may want to read through the documentation [0] so that you
> understand what you are doing.
>
> [0] https://www.kernel.org/doc/html/latest/admin-guide/pm/cpufreq.html

I tried both

    cpufreq-set -g performance

and

    cpufreq-set -g schedutil

And also set the lower and upper freq to 3.2 GHz. Overall this has not
made much of a difference, I still see ~800 usec latency reports out
of 'hwlatdetect'. The best run was:

...
... [Same for CPUs 0-30 ...]
analyzing CPU 31:
  driver: intel_cpufreq
  CPUs which run at the same hardware frequency: 31
  CPUs which need to have their frequency coordinated by software: 31
  maximum transition latency: 20.0 us.
  hardware limits: 1.20 GHz - 3.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance, schedutil
  current policy: frequency should be within 3.20 GHz and 3.20 GHz.
                  The governor "schedutil" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.

node-0> sudo hwlatdetect --duration=30s
hwlatdetect: test duration 30 seconds
   detector: tracer
   parameters:
        Latency threshold: 10us
        Sample window: 1000000us
        Sample width: 500000us
        Non-sampling period: 500000us
        Output File: None

Starting test
test finished
Max Latency: 779us
Samples recorded: 8
Samples exceeding threshold: 8
ts: 1648139049.597634649, inner:779, outer:18
ts: 1648139050.597355299, inner:16, outer:498
ts: 1648139051.347299254, inner:441, outer:0
ts: 1648139074.347304238, inner:410, outer:0
ts: 1648139075.347310909, inner:414, outer:0
ts: 1648139076.347309378, inner:0, outer:412
ts: 1648139077.347313644, inner:0, outer:415
ts: 1648139078.347315084, inner:414, outer:0

I understand my issues may be related to SMI (system management
interrupt) which may cause the processor to enter the SMM (system
management mode) which locks out the OS. Are there techniques to
investigate (and bypass/avoid) these latency sources?

>
> John Ogness

Many Thanks, I finally feel I am making some progress after some period
of head scratching.

Gautam
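
Note on investigating SMIs: one possible way to check whether SMIs occur
during a measurement is to read the per-CPU SMI counter before and after
it. The following is only a sketch under several assumptions: the CPU
exposes MSR_SMI_COUNT at address 0x34 (as Intel CPUs of this generation
generally do), the msr kernel module is loaded (modprobe msr), the rdmsr
tool from msr-tools is installed, and the script runs as root. The
function names are made up for illustration.

```shell
#!/bin/sh
# read_smi_count CPU: print the SMI counter (MSR 0x34) of CPU in decimal.
# Assumes rdmsr from msr-tools and a loaded msr module.
read_smi_count() {
    rdmsr -p "$1" -d 0x34
}

# smi_during CPU COMMAND...: run COMMAND and print how many SMIs CPU
# counted while it ran. COMMAND's own stdout is sent to stderr so that
# the only thing on stdout is the count.
smi_during() {
    cpu=$1
    shift
    before=$(read_smi_count "$cpu") || return 1
    "$@" >&2
    after=$(read_smi_count "$cpu") || return 1
    echo $((after - before))
}
```

For example, "smi_during 31 sleep 30" would report the SMIs seen by CPU
31 over 30 idle seconds. A count that rises by roughly one per second
would be consistent with the one-second spacing of the hwlatdetect
samples above, though that alone does not prove the SMIs are the cause.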
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-24 16:34       ` Gautam Thaker
@ 2022-03-25 8:18         ` John Ogness
  2022-03-25 15:50           ` Gautam Thaker
  0 siblings, 1 reply; 9+ messages in thread
From: John Ogness @ 2022-03-25 8:18 UTC
  To: Gautam Thaker; +Cc: linux-rt-users

On 2022-03-24, Gautam Thaker <ghthaker@gmail.com> wrote:
> On Thu, Mar 24, 2022 at 1:55 AM John Ogness <john.ogness@linutronix.de> wrote:
> I tried both
> cpufreq-set -g performance
> and
> cpufreq-set -g schedutil
>
> And also set the lower and upper freq to 3.2 GHz. Overall this has not
> made much of a difference, I still see ~800 usec latency reports out
> of 'hwlatdetect'. The best run was:
>
> ...
> ... [Same for CPUs 0-30 ...]
> analyzing CPU 31:
> driver: intel_cpufreq
> CPUs which run at the same hardware frequency: 31
> CPUs which need to have their frequency coordinated by software: 31
> maximum transition latency: 20.0 us.
> hardware limits: 1.20 GHz - 3.20 GHz
> available cpufreq governors: conservative, ondemand, userspace,
> powersave, performance, schedutil
> current policy: frequency should be within 3.20 GHz and 3.20 GHz.
> The governor "schedutil" may decide which speed to use
> within this range.
> current CPU frequency is 1.20 GHz.

With the performance governor you will see that "current CPU frequency"
is _always_ the max. Without that, I cannot imagine eliminating >100us
latencies.

John Ogness
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-25 8:18         ` John Ogness
@ 2022-03-25 15:50           ` Gautam Thaker
  2022-03-25 19:27             ` John Ogness
  0 siblings, 1 reply; 9+ messages in thread
From: Gautam Thaker @ 2022-03-25 15:50 UTC
  To: John Ogness; +Cc: linux-rt-users

On Fri, Mar 25, 2022 at 1:18 AM John Ogness <john.ogness@linutronix.de> wrote:
> With the performance governor you will see that "current CPU frequency"
> is _always_ the max. Without that, I cannot imagine eliminating >100us
> latencies.

Even after I set to "-g performance" and set "-d 3.2GHz -u 3.2GHz" I get:

..[same for all CPUS 0-30 as well]
analyzing CPU 31:
  driver: intel_cpufreq
  CPUs which run at the same hardware frequency: 31
  CPUs which need to have their frequency coordinated by software: 31
  maximum transition latency: 20.0 us.
  hardware limits: 1.20 GHz - 3.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance, schedutil
  current policy: frequency should be within 3.20 GHz and 3.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz.

The current CPU freq is at 1.2 GHz. Should "current CPU frequency"
be at 3.2 GHz now all the time?

And even at 1.2GHz, if that freq is steady, why should 'hwlatdetect'
report the 800 usec latency? It seems my issues are something other
than the "cpufreq-set" policy I am using. (SMIs?)

Gautam

>
> John Ogness
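
Note on the mismatch reported above (policy 3.20 GHz, current frequency
1.20 GHz): the same comparison can be read straight from sysfs, where all
values are in kHz. This is a sketch only, assuming the standard cpufreq
files scaling_cur_freq and scaling_min_freq exist for each CPU; the
function name and the CPU_SYSFS variable are made up for illustration.
Depending on the driver, scaling_cur_freq may be an estimate, and a
reading taken on an idle CPU need not reflect what it runs at under
load.

```shell
#!/bin/sh
# Root of the per-CPU sysfs tree; overridable for experiments.
: "${CPU_SYSFS:=/sys/devices/system/cpu}"

# below_policy_min: list CPUs whose reported current frequency is lower
# than the policy minimum, as "cpuN cur=<kHz> min=<kHz>".
below_policy_min() {
    for d in "$CPU_SYSFS"/cpu[0-9]*/cpufreq; do
        [ -r "$d/scaling_cur_freq" ] && [ -r "$d/scaling_min_freq" ] || continue
        cur=$(cat "$d/scaling_cur_freq")
        min=$(cat "$d/scaling_min_freq")
        if [ "$cur" -lt "$min" ]; then
            cpu=${d%/cpufreq}
            printf '%s cur=%s min=%s\n' "${cpu##*/}" "$cur" "$min"
        fi
    done
}
```

An empty result means every CPU reports a frequency at or above its
policy minimum at the moment of sampling.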
* Re: 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec?
  2022-03-25 15:50           ` Gautam Thaker
@ 2022-03-25 19:27             ` John Ogness
  0 siblings, 0 replies; 9+ messages in thread
From: John Ogness @ 2022-03-25 19:27 UTC
  To: Gautam Thaker; +Cc: linux-rt-users

On 2022-03-25, Gautam Thaker <ghthaker@gmail.com> wrote:
> Even after I set to "-g performance" and set "-d 3.2GHz -u 3.2GHz" I get:
>
> ..[same for all CPUS 0-30 as well]
> analyzing CPU 31:
> driver: intel_cpufreq
> CPUs which run at the same hardware frequency: 31
> CPUs which need to have their frequency coordinated by software: 31
> maximum transition latency: 20.0 us.
> hardware limits: 1.20 GHz - 3.20 GHz
> available cpufreq governors: conservative, ondemand, userspace,
> powersave, performance, schedutil
> current policy: frequency should be within 3.20 GHz and 3.20 GHz.
> The governor "performance" may decide which speed to use
> within this range.
> current CPU frequency is 1.20 GHz.
>
> The current CPU freq is at 1.2 GHz. Should "current CPU frequency"
> be at 3.2 GHz now all the time?

Yes, it should.

> And even at 1.2GHz, if that freq is steady, why should 'hwlatdetect'
> report the 800 usec latency?

Why do you think it is steady?

> It seems my issues are something other
> than "cpufreq-set" policy I am using. (SMIs?)

Perhaps your BIOS is performing some scaling via SMIs.

It might be interesting to burn your CPUs and see if they ramp up. And
then perform your cyclictest and hwlatdetect tests while they burn. An
effective way to burn your CPUs is just send them into infinite loops:

    for c in $(seq $(nproc)); do ( while true; do echo -n; done & ); done

Be sure to check cpufreq-info while this is running.

John Ogness
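
Note on the burn loop above: as written, the one-liner leaves the busy
loops running until they are killed by hand. A variant that remembers the
PIDs so the loops can be stopped again might look like the sketch below.
The function names and the BURN_PIDS variable are made up for
illustration; the busy loop uses the shell no-op ":" instead of echo -n,
which is a substitution, not what the message suggests.

```shell
#!/bin/sh
# PIDs of the running busy loops, space separated.
BURN_PIDS=""

# start_burn N: start N background busy loops and remember their PIDs.
start_burn() {
    i=0
    while [ "$i" -lt "$1" ]; do
        ( while :; do :; done ) &
        BURN_PIDS="$BURN_PIDS $!"
        i=$((i + 1))
    done
}

# stop_burn: terminate all busy loops started by start_burn and reap them.
stop_burn() {
    for p in $BURN_PIDS; do
        kill "$p" 2>/dev/null || true
    done
    # Intentionally unquoted: one argument per PID.
    wait $BURN_PIDS 2>/dev/null || true
    BURN_PIDS=""
}
```

A session could then be: start_burn "$(nproc)", check cpufreq-info and
run cyclictest or hwlatdetect while the loops spin, and finish with
stop_burn.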
Thread overview: 9+ messages
2022-03-23 19:35 5.15.28-rt35 #2 SMP PREEMPT_RT: Should scheduling latency be as large as 800 usec? Gautam Thaker
2022-03-23 20:03 ` John Ogness
2022-03-23 20:16 ` Gautam Thaker
[not found] ` <CAMLffL9hqXcco9NCH1eGdzw4uWPSxPpLaO5fZWgNqS9moKE2HQ@mail.gmail.com>
2022-03-24 0:56 ` Gautam Thaker
2022-03-24 8:55 ` John Ogness
2022-03-24 16:34 ` Gautam Thaker
2022-03-25 8:18 ` John Ogness
2022-03-25 15:50 ` Gautam Thaker
2022-03-25 19:27 ` John Ogness