* [PATCH 0/3] LoongArch: Add PREEMPT_RT support
@ 2024-11-08 9:15 Huacai Chen
2024-11-08 9:15 ` [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device Huacai Chen
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Huacai Chen @ 2024-11-08 9:15 UTC (permalink / raw)
To: Huacai Chen
Cc: Xuerui Wang, loongarch, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel, Huacai Chen
This series adds PREEMPT_RT support for LoongArch.
With the recent printk changes, the last known road block has been
addressed. The main architectural preparation we need is selecting
HAVE_POSIX_CPU_TIMERS_TASK_WORK to allow PREEMPT_RT to coexist with KVM.
Then we can also enable PREEMPT_RT, as on x86, ARM64 and RISC-V.
Huacai Chen (3):
LoongArch: Reduce min_delta for the arch clockevent device.
LoongArch: Select HAVE_POSIX_CPU_TIMERS_TASK_WORK.
LoongArch: Allow to enable PREEMPT_RT.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
arch/loongarch/Kconfig | 2 ++
arch/loongarch/kernel/time.c | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
---
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device
2024-11-08 9:15 [PATCH 0/3] LoongArch: Add PREEMPT_RT support Huacai Chen
@ 2024-11-08 9:15 ` Huacai Chen
2024-11-14 10:21 ` Sebastian Andrzej Siewior
2024-11-08 9:15 ` [PATCH 2/3] LoongArch: Select HAVE_POSIX_CPU_TIMERS_TASK_WORK Huacai Chen
2024-11-08 9:15 ` [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT Huacai Chen
2 siblings, 1 reply; 19+ messages in thread
From: Huacai Chen @ 2024-11-08 9:15 UTC (permalink / raw)
To: Huacai Chen
Cc: Xuerui Wang, loongarch, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel, Huacai Chen
Now the min_delta is 0x600 (1536) for LoongArch's constant clockevent
device. For a 100MHz hardware timer this means ~15us. This is a little
big, especially for PREEMPT_RT enabled kernels. So reduce it to 1000
(we don't want too small values to affect performance).
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
arch/loongarch/kernel/time.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/loongarch/kernel/time.c b/arch/loongarch/kernel/time.c
index 46d7d40c87e3..e914b27a7c89 100644
--- a/arch/loongarch/kernel/time.c
+++ b/arch/loongarch/kernel/time.c
@@ -127,7 +127,7 @@ void sync_counter(void)
int constant_clockevent_init(void)
{
unsigned int cpu = smp_processor_id();
- unsigned long min_delta = 0x600;
+ unsigned long min_delta = 1000;
unsigned long max_delta = (1UL << 48) - 1;
struct clock_event_device *cd;
static int irq = 0, timer_irq_installed = 0;
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 2/3] LoongArch: Select HAVE_POSIX_CPU_TIMERS_TASK_WORK
2024-11-08 9:15 [PATCH 0/3] LoongArch: Add PREEMPT_RT support Huacai Chen
2024-11-08 9:15 ` [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device Huacai Chen
@ 2024-11-08 9:15 ` Huacai Chen
2024-11-08 9:15 ` [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT Huacai Chen
2 siblings, 0 replies; 19+ messages in thread
From: Huacai Chen @ 2024-11-08 9:15 UTC (permalink / raw)
To: Huacai Chen
Cc: Xuerui Wang, loongarch, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel, Huacai Chen
Move POSIX CPU timer expiry and signal delivery into task context to
allow PREEMPT_RT setups to coexist with KVM.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
arch/loongarch/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index bb35c34f86d2..3734f5dd9a57 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -155,6 +155,7 @@ config LOONGARCH
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
+ select HAVE_POSIX_CPU_TIMERS_TASK_WORK
select HAVE_PREEMPT_DYNAMIC_KEY
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE if UNWINDER_ORC
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-08 9:15 [PATCH 0/3] LoongArch: Add PREEMPT_RT support Huacai Chen
2024-11-08 9:15 ` [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device Huacai Chen
2024-11-08 9:15 ` [PATCH 2/3] LoongArch: Select HAVE_POSIX_CPU_TIMERS_TASK_WORK Huacai Chen
@ 2024-11-08 9:15 ` Huacai Chen
2024-11-08 15:05 ` Steven Rostedt
2024-11-14 10:31 ` Sebastian Andrzej Siewior
2 siblings, 2 replies; 19+ messages in thread
From: Huacai Chen @ 2024-11-08 9:15 UTC (permalink / raw)
To: Huacai Chen
Cc: Xuerui Wang, loongarch, Sebastian Andrzej Siewior, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel, Huacai Chen
It is really time.
LoongArch has all the required architecture related changes, that have
been identified over time, in order to enable PREEMPT_RT. With the recent
printk changes, the last known road block has been addressed.
Allow to enable PREEMPT_RT on LoongArch.
Below are the latency data from cyclictest on a 4-core Loongson-3A5000
machine, with a "make -j8" kernel building workload in the background.
1. PREEMPT kernel with default configuration:
./cyclictest -a -t -m -i200 -d0 -p99
policy: fifo: loadavg: 8.78 8.96 8.64 10/296 64800
T: 0 ( 4592) P:99 I:200 C:14838617 Min: 3 Act: 6 Avg: 8 Max: 844
T: 1 ( 4593) P:99 I:200 C:14838765 Min: 3 Act: 9 Avg: 8 Max: 909
T: 2 ( 4594) P:99 I:200 C:14838510 Min: 3 Act: 7 Avg: 8 Max: 832
T: 3 ( 4595) P:99 I:200 C:14838631 Min: 3 Act: 8 Avg: 8 Max: 931
2. PREEMPT_RT kernel with default configuration:
./cyclictest -a -t -m -i200 -d0 -p99
policy: fifo: loadavg: 10.38 10.47 10.35 9/336 77788
T: 0 ( 3941) P:99 I:200 C:19439626 Min: 3 Act: 12 Avg: 8 Max: 227
T: 1 ( 3942) P:99 I:200 C:19439624 Min: 2 Act: 11 Avg: 8 Max: 184
T: 2 ( 3943) P:99 I:200 C:19439623 Min: 3 Act: 4 Avg: 7 Max: 223
T: 3 ( 3944) P:99 I:200 C:19439623 Min: 2 Act: 10 Avg: 7 Max: 226
3. PREEMPT_RT kernel with tuned configuration:
./cyclictest -a -t -m -i200 -d0 -p99
policy: fifo: loadavg: 10.52 10.66 10.62 12/334 109397
T: 0 ( 4765) P:99 I:200 C:29335186 Min: 3 Act: 6 Avg: 8 Max: 62
T: 1 ( 4766) P:99 I:200 C:29335185 Min: 3 Act: 10 Avg: 8 Max: 52
T: 2 ( 4767) P:99 I:200 C:29335184 Min: 3 Act: 8 Avg: 8 Max: 64
T: 3 ( 4768) P:99 I:200 C:29335183 Min: 3 Act: 12 Avg: 8 Max: 53
Main instruments of tuned configuration include: Disable the boot rom
space in BIOS for kernel, in order to avoid speculative access to low-
speed memory; Disable CPUFreq scaling; Disable RTC synchronization in
the ntpd/chronyd service.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
arch/loongarch/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 3734f5dd9a57..26ed9d925e7f 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -66,6 +66,7 @@ config LOONGARCH
select ARCH_SUPPORTS_LTO_CLANG
select ARCH_SUPPORTS_LTO_CLANG_THIN
select ARCH_SUPPORTS_NUMA_BALANCING
+ select ARCH_SUPPORTS_RT
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF
select ARCH_USE_QUEUED_RWLOCKS
--
2.43.5
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-08 9:15 ` [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT Huacai Chen
@ 2024-11-08 15:05 ` Steven Rostedt
2024-11-14 10:31 ` Sebastian Andrzej Siewior
1 sibling, 0 replies; 19+ messages in thread
From: Steven Rostedt @ 2024-11-08 15:05 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Sebastian Andrzej Siewior,
Clark Williams, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On Fri, 8 Nov 2024 17:15:45 +0800
Huacai Chen <chenhuacai@loongson.cn> wrote:
> It is really time.
>
> LoongArch has all the required architecture related changes, that have
> been identified over time, in order to enable PREEMPT_RT. With the recent
> printk changes, the last known road block has been addressed.
>
> Allow to enable PREEMPT_RT on LoongArch.
>
> Below are the latency data from cyclictest on a 4-core Loongson-3A5000
> machine, with a "make -j8" kernel building workload in the background.
>
> 1. PREEMPT kernel with default configuration:
>
> ./cyclictest -a -t -m -i200 -d0 -p99
> policy: fifo: loadavg: 8.78 8.96 8.64 10/296 64800
> T: 0 ( 4592) P:99 I:200 C:14838617 Min: 3 Act: 6 Avg: 8 Max: 844
> T: 1 ( 4593) P:99 I:200 C:14838765 Min: 3 Act: 9 Avg: 8 Max: 909
> T: 2 ( 4594) P:99 I:200 C:14838510 Min: 3 Act: 7 Avg: 8 Max: 832
> T: 3 ( 4595) P:99 I:200 C:14838631 Min: 3 Act: 8 Avg: 8 Max: 931
>
> 2. PREEMPT_RT kernel with default configuration:
>
> ./cyclictest -a -t -m -i200 -d0 -p99
> policy: fifo: loadavg: 10.38 10.47 10.35 9/336 77788
> T: 0 ( 3941) P:99 I:200 C:19439626 Min: 3 Act: 12 Avg: 8 Max: 227
> T: 1 ( 3942) P:99 I:200 C:19439624 Min: 2 Act: 11 Avg: 8 Max: 184
> T: 2 ( 3943) P:99 I:200 C:19439623 Min: 3 Act: 4 Avg: 7 Max: 223
> T: 3 ( 3944) P:99 I:200 C:19439623 Min: 2 Act: 10 Avg: 7 Max: 226
>
> 3. PREEMPT_RT kernel with tuned configuration:
>
> ./cyclictest -a -t -m -i200 -d0 -p99
> policy: fifo: loadavg: 10.52 10.66 10.62 12/334 109397
> T: 0 ( 4765) P:99 I:200 C:29335186 Min: 3 Act: 6 Avg: 8 Max: 62
> T: 1 ( 4766) P:99 I:200 C:29335185 Min: 3 Act: 10 Avg: 8 Max: 52
> T: 2 ( 4767) P:99 I:200 C:29335184 Min: 3 Act: 8 Avg: 8 Max: 64
> T: 3 ( 4768) P:99 I:200 C:29335183 Min: 3 Act: 12 Avg: 8 Max: 53
>
> Main instruments of tuned configuration include: Disable the boot rom
> space in BIOS for kernel, in order to avoid speculative access to low-
> speed memory; Disable CPUFreq scaling; Disable RTC synchronization in
> the ntpd/chronyd service.
Nice! Looks good.
-- Steve
>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
> arch/loongarch/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 3734f5dd9a57..26ed9d925e7f 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -66,6 +66,7 @@ config LOONGARCH
> select ARCH_SUPPORTS_LTO_CLANG
> select ARCH_SUPPORTS_LTO_CLANG_THIN
> select ARCH_SUPPORTS_NUMA_BALANCING
> + select ARCH_SUPPORTS_RT
> select ARCH_USE_BUILTIN_BSWAP
> select ARCH_USE_CMPXCHG_LOCKREF
> select ARCH_USE_QUEUED_RWLOCKS
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device
2024-11-08 9:15 ` [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device Huacai Chen
@ 2024-11-14 10:21 ` Sebastian Andrzej Siewior
2024-11-14 11:46 ` Huacai Chen
0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-14 10:21 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-08 17:15:43 [+0800], Huacai Chen wrote:
> Now the min_delta is 0x600 (1536) for LoongArch's constant clockevent
> device. For a 100MHz hardware timer this means ~15us. This is a little
> big, especially for PREEMPT_RT enabled kernels. So reduce it to 1000
> (we don't want too small values to affect performance).
So this reduces it to 10us. Is anything lower than that bad performance
wise?
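For reference, the tick-to-time arithmetic behind the ~15us and 10us
figures can be sketched as below. This is only an illustration:
ticks_to_ns() is a hypothetical helper, not a kernel function, and the
100MHz frequency is the figure assumed in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): convert a clockevent
 * delta given in timer ticks to nanoseconds for a timer running at
 * freq_hz. The multiplication is done first to keep precision, so this
 * sketch is only meant for small tick counts that cannot overflow.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return ticks * 1000000000ULL / freq_hz;
}
```

With a 100MHz timer, ticks_to_ns(0x600, 100000000) gives 15360ns and
ticks_to_ns(1000, 100000000) gives 10000ns.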
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-08 9:15 ` [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT Huacai Chen
2024-11-08 15:05 ` Steven Rostedt
@ 2024-11-14 10:31 ` Sebastian Andrzej Siewior
2024-11-14 11:07 ` Huacai Chen
1 sibling, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-14 10:31 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-08 17:15:45 [+0800], Huacai Chen wrote:
> It is really time.
>
> LoongArch has all the required architecture related changes, that have
> been identified over time, in order to enable PREEMPT_RT. With the recent
> printk changes, the last known road block has been addressed.
>
> Allow to enable PREEMPT_RT on LoongArch.
>
> Below are the latency data from cyclictest on a 4-core Loongson-3A5000
> machine, with a "make -j8" kernel building workload in the background.
>
> 1. PREEMPT kernel with default configuration:
>
> ./cyclictest -a -t -m -i200 -d0 -p99
> policy: fifo: loadavg: 8.78 8.96 8.64 10/296 64800
> T: 0 ( 4592) P:99 I:200 C:14838617 Min: 3 Act: 6 Avg: 8 Max: 844
> T: 1 ( 4593) P:99 I:200 C:14838765 Min: 3 Act: 9 Avg: 8 Max: 909
> T: 2 ( 4594) P:99 I:200 C:14838510 Min: 3 Act: 7 Avg: 8 Max: 832
> T: 3 ( 4595) P:99 I:200 C:14838631 Min: 3 Act: 8 Avg: 8 Max: 931
>
> 2. PREEMPT_RT kernel with default configuration:
>
> ./cyclictest -a -t -m -i200 -d0 -p99
> policy: fifo: loadavg: 10.38 10.47 10.35 9/336 77788
> T: 0 ( 3941) P:99 I:200 C:19439626 Min: 3 Act: 12 Avg: 8 Max: 227
> T: 1 ( 3942) P:99 I:200 C:19439624 Min: 2 Act: 11 Avg: 8 Max: 184
> T: 2 ( 3943) P:99 I:200 C:19439623 Min: 3 Act: 4 Avg: 7 Max: 223
> T: 3 ( 3944) P:99 I:200 C:19439623 Min: 2 Act: 10 Avg: 7 Max: 226
>
> 3. PREEMPT_RT kernel with tuned configuration:
>
> ./cyclictest -a -t -m -i200 -d0 -p99
> policy: fifo: loadavg: 10.52 10.66 10.62 12/334 109397
> T: 0 ( 4765) P:99 I:200 C:29335186 Min: 3 Act: 6 Avg: 8 Max: 62
> T: 1 ( 4766) P:99 I:200 C:29335185 Min: 3 Act: 10 Avg: 8 Max: 52
> T: 2 ( 4767) P:99 I:200 C:29335184 Min: 3 Act: 8 Avg: 8 Max: 64
> T: 3 ( 4768) P:99 I:200 C:29335183 Min: 3 Act: 12 Avg: 8 Max: 53
>
> Main instruments of tuned configuration include: Disable the boot rom
> space in BIOS for kernel, in order to avoid speculative access to low-
> speed memory; Disable CPUFreq scaling; Disable RTC synchronization in
> the ntpd/chronyd service.
If "rom space in BIOS for kernel" is a thing you should document it
somewhere or issue a warning at boot. I don't know what the latency
impact is here and if this is needed at all during runtime.
Why is ntpd/chronyd service affecting this? Is it running at prio 99?
Otherwise it should not be noticed.
Is lockdep complaining in any workloads?
Is CONFIG_DEBUG_ATOMIC_SLEEP leading to any complaints?
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 10:31 ` Sebastian Andrzej Siewior
@ 2024-11-14 11:07 ` Huacai Chen
2024-11-14 11:14 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 19+ messages in thread
From: Huacai Chen @ 2024-11-14 11:07 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
Hi, Sebastian,
On Thu, Nov 14, 2024 at 6:31 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2024-11-08 17:15:45 [+0800], Huacai Chen wrote:
> > It is really time.
> >
> > LoongArch has all the required architecture related changes, that have
> > been identified over time, in order to enable PREEMPT_RT. With the recent
> > printk changes, the last known road block has been addressed.
> >
> > Allow to enable PREEMPT_RT on LoongArch.
> >
> > Below are the latency data from cyclictest on a 4-core Loongson-3A5000
> > machine, with a "make -j8" kernel building workload in the background.
> >
> > 1. PREEMPT kernel with default configuration:
> >
> > ./cyclictest -a -t -m -i200 -d0 -p99
> > policy: fifo: loadavg: 8.78 8.96 8.64 10/296 64800
> > T: 0 ( 4592) P:99 I:200 C:14838617 Min: 3 Act: 6 Avg: 8 Max: 844
> > T: 1 ( 4593) P:99 I:200 C:14838765 Min: 3 Act: 9 Avg: 8 Max: 909
> > T: 2 ( 4594) P:99 I:200 C:14838510 Min: 3 Act: 7 Avg: 8 Max: 832
> > T: 3 ( 4595) P:99 I:200 C:14838631 Min: 3 Act: 8 Avg: 8 Max: 931
> >
> > 2. PREEMPT_RT kernel with default configuration:
> >
> > ./cyclictest -a -t -m -i200 -d0 -p99
> > policy: fifo: loadavg: 10.38 10.47 10.35 9/336 77788
> > T: 0 ( 3941) P:99 I:200 C:19439626 Min: 3 Act: 12 Avg: 8 Max: 227
> > T: 1 ( 3942) P:99 I:200 C:19439624 Min: 2 Act: 11 Avg: 8 Max: 184
> > T: 2 ( 3943) P:99 I:200 C:19439623 Min: 3 Act: 4 Avg: 7 Max: 223
> > T: 3 ( 3944) P:99 I:200 C:19439623 Min: 2 Act: 10 Avg: 7 Max: 226
> >
> > 3. PREEMPT_RT kernel with tuned configuration:
> >
> > ./cyclictest -a -t -m -i200 -d0 -p99
> > policy: fifo: loadavg: 10.52 10.66 10.62 12/334 109397
> > T: 0 ( 4765) P:99 I:200 C:29335186 Min: 3 Act: 6 Avg: 8 Max: 62
> > T: 1 ( 4766) P:99 I:200 C:29335185 Min: 3 Act: 10 Avg: 8 Max: 52
> > T: 2 ( 4767) P:99 I:200 C:29335184 Min: 3 Act: 8 Avg: 8 Max: 64
> > T: 3 ( 4768) P:99 I:200 C:29335183 Min: 3 Act: 12 Avg: 8 Max: 53
> >
> > Main instruments of tuned configuration include: Disable the boot rom
> > space in BIOS for kernel, in order to avoid speculative access to low-
> > speed memory; Disable CPUFreq scaling; Disable RTC synchronization in
> > the ntpd/chronyd service.
>
> If "rom space in BIOS for kernel" is a thing you should document it
> somewhere or issue a warning at boot. I don't know what the latency
> impact is here and if this is needed at all during runtime.
Sorry for the confusion; this sentence should be reworded. What it
really means is: we should disable something in the BIOS configuration,
with the goal of avoiding the kernel code's speculative access to the
boot ROM (low-speed memory).
>
> Why is ntpd/chronyd service affecting this? Is it running at prio 99?
> Otherwise it should not be noticed.
No, ntpd/chronyd doesn't affect latency. But they may trigger RTC
synchronization every 11 minutes, and RTC synchronization affects
latency. We can keep ntpd/chronyd running but disable RTC
synchronization by configuration, this is the least aggressive method.
>
> Is lockdep complaining in any workloads?
> Is CONFIG_DEBUG_ATOMIC_SLEEP leading to any complaints?
This needs more tests because I haven't enabled them.
Huacai
>
>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> Sebastian
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 11:07 ` Huacai Chen
@ 2024-11-14 11:14 ` Sebastian Andrzej Siewior
2024-11-14 11:19 ` Huacai Chen
0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-14 11:14 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-14 19:07:37 [+0800], Huacai Chen wrote:
> Hi, Sebastian,
Hi,
> > Why is ntpd/chronyd service affecting this? Is it running at prio 99?
> > Otherwise it should not be noticed.
> No, ntpd/chronyd doesn't affect latency. But they may trigger RTC
> synchronization every 11 minutes, and RTC synchronization affects
> latency. We can keep ntpd/chronyd running but disable RTC
> synchronization by configuration, this is the least aggressive method.
What is "RTC synchronization" in this context?
> > Is lockdep complaining in any workloads?
> > Is CONFIG_DEBUG_ATOMIC_SLEEP leading to any complaints?
> This needs more tests because I haven't enabled them.
That would be good. It would show if there is anything that has not yet
been noticed.
> Huacai
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 11:14 ` Sebastian Andrzej Siewior
@ 2024-11-14 11:19 ` Huacai Chen
2024-11-14 11:30 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 19+ messages in thread
From: Huacai Chen @ 2024-11-14 11:19 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On Thu, Nov 14, 2024 at 7:14 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2024-11-14 19:07:37 [+0800], Huacai Chen wrote:
> > Hi, Sebastian,
> Hi,
>
> > > Why is ntpd/chronyd service affecting this? Is it running at prio 99?
> > > Otherwise it should not be noticed.
> > No, ntpd/chronyd doesn't affect latency. But they may trigger RTC
> > synchronization every 11 minutes, and RTC synchronization affects
> > latency. We can keep ntpd/chronyd running but disable RTC
> > synchronization by configuration, this is the least aggressive method.
>
> What is "RTC synchronization" in this context?
It means the sync_hw_clock() function in kernel/time/ntp.c, which can be
enabled/disabled by chronyd configuration:
/etc/chrony.conf
# Enable kernel synchronization of the real-time clock (RTC).
# rtcsync
Huacai
>
> > > Is lockdep complaining in any workloads?
> > > Is CONFIG_DEBUG_ATOMIC_SLEEP leading to any complaints?
> > This needs more tests because I haven't enabled them.
>
> That would be good. It would show if there is anything that has not yet
> been noticed.
>
> > Huacai
>
> Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 11:19 ` Huacai Chen
@ 2024-11-14 11:30 ` Sebastian Andrzej Siewior
2024-11-14 11:36 ` Huacai Chen
0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-14 11:30 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-14 19:19:26 [+0800], Huacai Chen wrote:
> > > > Why is ntpd/chronyd service affecting this? Is it running at prio 99?
> > > > Otherwise it should not be noticed.
> > > No, ntpd/chronyd doesn't affect latency. But they may trigger RTC
> > > synchronization every 11 minutes, and RTC synchronization affects
> > > latency. We can keep ntpd/chronyd running but disable RTC
> > > synchronization by configuration, this is the least aggressive method.
> >
> > What is "RTC synchronization" in this context?
> Means the sync_hw_clock() function in kernel/time/ntp.c, it can be
> enabled/disabled by chronyd configuration:
But what exactly is sync_hw_clock() doing that is causing a problem
here? The clock on HW is updated. The access to the RTC clock is
preemptible.
> Huacai
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 11:30 ` Sebastian Andrzej Siewior
@ 2024-11-14 11:36 ` Huacai Chen
2024-11-14 13:29 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 19+ messages in thread
From: Huacai Chen @ 2024-11-14 11:36 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On Thu, Nov 14, 2024 at 7:30 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2024-11-14 19:19:26 [+0800], Huacai Chen wrote:
> > > > > Why is ntpd/chronyd service affecting this? Is it running at prio 99?
> > > > > Otherwise it should not be noticed.
> > > > No, ntpd/chronyd doesn't affect latency. But they may trigger RTC
> > > > synchronization every 11 minutes, and RTC synchronization affects
> > > > latency. We can keep ntpd/chronyd running but disable RTC
> > > > synchronization by configuration, this is the least aggressive method.
> > >
> > > What is "RTC synchronization" in this context?
> > Means the sync_hw_clock() function in kernel/time/ntp.c, it can be
> > enabled/disabled by chronyd configuration:
>
> But what exactly is sync_hw_clock() doing that is causing a problem
> here? The clock on HW is updated. The access to the RTC clock is
> preemptible.
This is a platform-specific problem: our RTC driver is
drivers/rtc/rtc-loongson.c, and the write operation to the RTC register
is slow.
Huacai
>
> > Huacai
>
> Sebastian
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device
2024-11-14 10:21 ` Sebastian Andrzej Siewior
@ 2024-11-14 11:46 ` Huacai Chen
2024-11-14 13:27 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 19+ messages in thread
From: Huacai Chen @ 2024-11-14 11:46 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
Hi, Sebastian,
On Thu, Nov 14, 2024 at 6:21 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2024-11-08 17:15:43 [+0800], Huacai Chen wrote:
> > Now the min_delta is 0x600 (1536) for LoongArch's constant clockevent
> > device. For a 100MHz hardware timer this means ~15us. This is a little
> > big, especially for PREEMPT_RT enabled kernels. So reduce it to 1000
> > (we don't want too small values to affect performance).
>
> So this reduces it to 10us. Is anything lower than that bad performance
> wise?
Maybe I misunderstood the meaning of min_delta, but if I'm correct,
small min_delta may cause more timers to be triggered, because timers
are aligned by the granularity (min_delta). So I think min_delta
affects performance.
And I chose 10us simply because I saw latency improvements when I
reduced it from 15us to 10us, but no further effect when I reduced it
even lower.
Huacai
>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>
> Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device
2024-11-14 11:46 ` Huacai Chen
@ 2024-11-14 13:27 ` Sebastian Andrzej Siewior
2024-11-15 6:44 ` Huacai Chen
0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-14 13:27 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-14 19:46:39 [+0800], Huacai Chen wrote:
> Hi, Sebastian,
Hi,
> On Thu, Nov 14, 2024 at 6:21 PM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2024-11-08 17:15:43 [+0800], Huacai Chen wrote:
> > > Now the min_delta is 0x600 (1536) for LoongArch's constant clockevent
> > > device. For a 100MHz hardware timer this means ~15us. This is a little
> > > big, especially for PREEMPT_RT enabled kernels. So reduce it to 1000
> > > (we don't want too small values to affect performance).
> >
> > So this reduces it to 10us. Is anything lower than that bad performance
> > wise?
> Maybe I misunderstood the meaning of min_delta, but if I'm correct,
> small min_delta may cause more timers to be triggered, because timers
> are aligned by the granularity (min_delta). So I think min_delta
> affects performance.
They are not aligned. Well they get aligned due to the consequences.
In one-shot mode you program the device for the next timer to expire. It
computes the delta between expire-time and now. This delta is then
clamped between min & max delta. See clockevents_program_event().
This means if your timer is supposed to expire in 5us (from now) but
your min delta is set to 15us then the timer device will be programmed
to 15us from now. This is 10us after the expire time of your first
timer. Once the timer device fires, it will expire all hrtimers which
expired at this point. This includes the timer that should have fired
10us ago, plus everything else following in the 10us window.
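The clamping described above can be sketched roughly as follows. This
is a simplified model, not the kernel's clockevents_program_event():
clamp_delta_ns() and extra_delay_ns() are hypothetical names, and all
values are in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the clamping step: the requested delta
 * (expiry time minus now) is forced into [min_delta_ns, max_delta_ns]
 * before the one-shot device is programmed.
 */
static int64_t clamp_delta_ns(int64_t delta_ns, int64_t min_delta_ns,
			      int64_t max_delta_ns)
{
	assert(min_delta_ns <= max_delta_ns);
	if (delta_ns < min_delta_ns)
		return min_delta_ns;
	if (delta_ns > max_delta_ns)
		return max_delta_ns;
	return delta_ns;
}

/*
 * How much later than requested the device fires because the delta
 * was raised to min_delta_ns (0 if the delta was not raised).
 */
static int64_t extra_delay_ns(int64_t delta_ns, int64_t min_delta_ns,
			      int64_t max_delta_ns)
{
	int64_t programmed = clamp_delta_ns(delta_ns, min_delta_ns,
					    max_delta_ns);

	return programmed > delta_ns ? programmed - delta_ns : 0;
}
```

For the example in this mail, a timer due in 5000ns with a 15000ns
minimum delta is programmed 15000ns ahead, i.e. 10000ns late; with a
10000ns minimum delta the same timer is 5000ns late.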
> Huacai
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 11:36 ` Huacai Chen
@ 2024-11-14 13:29 ` Sebastian Andrzej Siewior
2024-11-14 14:43 ` Clark Williams
0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-14 13:29 UTC (permalink / raw)
To: Huacai Chen
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-14 19:36:38 [+0800], Huacai Chen wrote:
> On Thu, Nov 14, 2024 at 7:30 PM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2024-11-14 19:19:26 [+0800], Huacai Chen wrote:
> > > > > > Why is ntpd/chronyd service affecting this? Is it running at prio 99?
> > > > > > Otherwise it should not be noticed.
> > > > > No, ntpd/chronyd doesn't affect latency. But they may trigger RTC
> > > > > synchronization every 11 minutes, and RTC synchronization affects
> > > > > latency. We can keep ntpd/chronyd running but disable RTC
> > > > > synchronization by configuration, this is the least aggressive method.
> > > >
> > > > What is "RTC synchronization" in this context?
> > > Means the sync_hw_clock() function in kernel/time/ntp.c, it can be
> > > enabled/disabled by chronyd configuration:
> >
> > But what exactly is sync_hw_clock() doing that is causing a problem
> > here? The clock on HW is updated. The access to the RTC clock is
> > preemptible.
> This is a platform-specific problem, our RTC driver is
> drivers/rtc/rtc-loongson.c, the write operation to RTC register is
> slow.
Ach okay. So the pure read on the slow bus causes a delay because the
CPU is stalled. That is not limited to chrony but should also have an
effect if the user uses hwclock to read/write the time.
> Huacai
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 13:29 ` Sebastian Andrzej Siewior
@ 2024-11-14 14:43 ` Clark Williams
2024-11-18 7:36 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 19+ messages in thread
From: Clark Williams @ 2024-11-14 14:43 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Huacai Chen, Huacai Chen, Xuerui Wang, loongarch, Steven Rostedt,
linux-rt-devel, Guo Ren, Jiaxun Yang, linux-kernel
On Thu, Nov 14, 2024 at 02:29:56PM +0100, Sebastian Andrzej Siewior wrote:
>
> Ach okay. So the pure read on the slow bus causes a delay because the
> CPU is stalled. That is not limited to chrony but should also have an
> effect if the user uses hwclock to read/write the time.
>
> Sebastian
>
We see similar problems with chronyd accessing the RTC on aarch64
systems that use UEFI. Accessing anything via the EFI Runtime is very
slow. Probably going to turn off 'rtcsync' in chronyd when running
low-latency workloads.
Clark
--
The United States Coast Guard
Ruining Natural Selection since 1790
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device
2024-11-14 13:27 ` Sebastian Andrzej Siewior
@ 2024-11-15 6:44 ` Huacai Chen
0 siblings, 0 replies; 19+ messages in thread
From: Huacai Chen @ 2024-11-15 6:44 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Huacai Chen, Xuerui Wang, loongarch, Clark Williams,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On Thu, Nov 14, 2024 at 9:27 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2024-11-14 19:46:39 [+0800], Huacai Chen wrote:
> > Hi, Sebastian,
> Hi,
>
> > On Thu, Nov 14, 2024 at 6:21 PM Sebastian Andrzej Siewior
> > <bigeasy@linutronix.de> wrote:
> > >
> > > On 2024-11-08 17:15:43 [+0800], Huacai Chen wrote:
> > > > Now the min_delta is 0x600 (1536) for LoongArch's constant clockevent
> > > > device. For a 100MHz hardware timer this means ~15us. This is a little
> > > > big, especially for PREEMPT_RT enabled kernels. So reduce it to 1000
> > > > (we don't want too small values to affect performance).
> > >
> > > So this reduces it to 10us. Is anything lower than that bad performance
> > > wise?
> > Maybe I misunderstood the meaning of min_delta, but if I'm correct,
> > small min_delta may cause more timers to be triggered, because timers
> > are aligned by the granularity (min_delta). So I think min_delta
> > affects performance.
>
> They are not aligned. Well they get aligned due to the consequences.
Then I still think it affects performance (and power consumption),
because firing a timer every 1us is different from firing 10 timers
together at the end of 10us.
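To make the trade-off concrete, here is a small model that counts how
often a one-shot device fires for a given min_delta. It is a sketch
under simplifying assumptions (hypothetical function name, timers
sorted by expiry, time starting at 0, no max_delta and no reprogramming
cost modelled), not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: count device interrupts needed to expire n
 * timers whose expiry times (in ns, sorted ascending) are given in
 * expiry_ns. Each round programs the device for the next pending
 * timer, raises the delta to min_delta_ns if it is smaller, and then
 * expires every timer that is due when the device fires.
 */
static size_t count_device_interrupts(const int64_t *expiry_ns, size_t n,
				      int64_t min_delta_ns)
{
	int64_t now = 0;
	size_t next = 0;
	size_t interrupts = 0;

	assert(min_delta_ns > 0);
	while (next < n) {
		int64_t delta = expiry_ns[next] - now;

		if (delta < min_delta_ns)
			delta = min_delta_ns;
		now += delta;		/* the device fires here */
		interrupts++;
		while (next < n && expiry_ns[next] <= now)
			next++;		/* expire everything that is due */
	}
	return interrupts;
}
```

For ten timers spaced 1us apart, this model yields ten interrupts with a
1us minimum delta and a single interrupt with a 10us minimum delta,
which is the batching effect discussed in this mail.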
Huacai
>
> In one-shot mode you program the device for the next timer to expire. It
> computes the delta between expire-time and now. This delta is then
> clamped between min & max delta. See clockevents_program_event().
>
> This means if your timer is supposed to expire in 5us (from now) but
> your min delta is set to 15us then the timer device will be programmed
> to 15us from now. This is 10us after the expire time of your first
> timer. Once the timer devices fires, it will expire all hrtimers which
> expired at this point. This includes that timer, that should have fired
> 10us ago, plus everything else following in the 10us window.
>
> > Huacai
>
> Sebastian
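[Editorial sketch, not part of the original messages.] The clamping
described above can be modelled in a few lines of C. This is only a
simplified illustration of the idea behind clockevents_program_event(),
not the kernel's code: the helper names ticks_to_ns(), clamp_delta_ns()
and lateness_ns() are hypothetical, and the 100MHz frequency is the
example value from the patch description.

```c
#include <assert.h>
#include <stdint.h>

/* Convert hardware timer ticks to nanoseconds (hypothetical helper). */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	return ticks * 1000000000ULL / freq_hz;
}

/*
 * Simplified model of the clamping step: the requested delta is limited
 * to [min_ns, max_ns] before the device is programmed. This is an
 * assumption-level sketch, not the kernel implementation.
 */
static uint64_t clamp_delta_ns(uint64_t delta_ns, uint64_t min_ns,
			       uint64_t max_ns)
{
	assert(min_ns <= max_ns);
	if (delta_ns < min_ns)
		return min_ns;
	if (delta_ns > max_ns)
		return max_ns;
	return delta_ns;
}

/*
 * How long after its expiry time a timer fires once the delta has been
 * clamped. Returns 0 when the device is programmed at or before the
 * expiry time (the max_ns case, where the device would be reprogrammed).
 */
static uint64_t lateness_ns(uint64_t delta_ns, uint64_t min_ns,
			    uint64_t max_ns)
{
	uint64_t programmed = clamp_delta_ns(delta_ns, min_ns, max_ns);

	return programmed > delta_ns ? programmed - delta_ns : 0;
}
```

Under these assumptions, ticks_to_ns(0x600, 100000000) gives 15360ns
and ticks_to_ns(1000, 100000000) gives 10000ns. A timer due in 5us is
10us late with a 15us min delta (lateness_ns(5000, 15000, ...)) and
5us late with a 10us min delta.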
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
2024-11-14 14:43 ` Clark Williams
@ 2024-11-18 7:36 ` Sebastian Andrzej Siewior
[not found] ` <CAPAFJkp_MQ8rNsTTY3xfYMhdtiWQunN65Yfft1SqZLptG2J5cw@mail.gmail.com>
0 siblings, 1 reply; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-18 7:36 UTC (permalink / raw)
To: Clark Williams
Cc: Huacai Chen, Huacai Chen, Xuerui Wang, loongarch, Steven Rostedt,
linux-rt-devel, Guo Ren, Jiaxun Yang, linux-kernel
On 2024-11-14 08:43:21 [-0600], Clark Williams wrote:
> We see similar problems with chronyd accessing the RTC on aarch64
> systems that use UEFI. Accessing anything via the EFI Runtime is very
> slow. Probably going to turn off 'rtcsync' in chronyd when running
> low-latency workloads.
But isn't "we call into EFI and have no clue what happens" exactly the
reason why we disable EFI runtime services?
> Clark
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT
[not found] ` <CAPAFJkp_MQ8rNsTTY3xfYMhdtiWQunN65Yfft1SqZLptG2J5cw@mail.gmail.com>
@ 2024-11-20 7:36 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 19+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-20 7:36 UTC (permalink / raw)
To: Clark Williams
Cc: Clark Williams, Huacai Chen, Huacai Chen, Xuerui Wang, loongarch,
Steven Rostedt, linux-rt-devel, Guo Ren, Jiaxun Yang,
linux-kernel
On 2024-11-20 02:17:53 [+0000], Clark Williams wrote:
> On Mon, Nov 18, 2024 at 7:37 AM Sebastian Andrzej Siewior <
> bigeasy@linutronix.de> wrote:
>
> > On 2024-11-14 08:43:21 [-0600], Clark Williams wrote:
> > > We see similar problems with chronyd accessing the RTC on aarch64
> > > systems that use UEFI. Accessing anything via the EFI Runtime is very
> > > slow. Probably going to turn off 'rtcsync' in chronyd when running
> > > low-latency workloads.
> >
> > But isn't "we call into EFI and have no clue what happens" exactly the
> > reason why we disable EFI runtime services?
> >
> >
> I've had customers want access to EFI variables. I believe we default EFI
> runtime to be off and allow it to be turned on.
The efi-rtc is accessed via function calls into EFI. You call in
there and they (probably) access the RTC via i2c and remain in EFI
(blocking) for the entire process. That is why the access is disabled.
The difference here is that the bus access is slow, so every read/write
seems to take a while.
The EFI variable access by itself is fine; however, the variables might
be saved in NAND flash. A write access may trigger an erase process and
relocate the data, and so may a read if too many bit flips were
detected.
> Clark
Sebastian
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread (newest message: 2024-11-20 7:36 UTC)
Thread overview: 19+ messages
2024-11-08 9:15 [PATCH 0/3] LoongArch: Add PREEMPT_RT support Huacai Chen
2024-11-08 9:15 ` [PATCH 1/3] LoongArch: Reduce min_delta for the arch clockevent device Huacai Chen
2024-11-14 10:21 ` Sebastian Andrzej Siewior
2024-11-14 11:46 ` Huacai Chen
2024-11-14 13:27 ` Sebastian Andrzej Siewior
2024-11-15 6:44 ` Huacai Chen
2024-11-08 9:15 ` [PATCH 2/3] LoongArch: Select HAVE_POSIX_CPU_TIMERS_TASK_WORK Huacai Chen
2024-11-08 9:15 ` [PATCH 3/3] LoongArch: Allow to enable PREEMPT_RT Huacai Chen
2024-11-08 15:05 ` Steven Rostedt
2024-11-14 10:31 ` Sebastian Andrzej Siewior
2024-11-14 11:07 ` Huacai Chen
2024-11-14 11:14 ` Sebastian Andrzej Siewior
2024-11-14 11:19 ` Huacai Chen
2024-11-14 11:30 ` Sebastian Andrzej Siewior
2024-11-14 11:36 ` Huacai Chen
2024-11-14 13:29 ` Sebastian Andrzej Siewior
2024-11-14 14:43 ` Clark Williams
2024-11-18 7:36 ` Sebastian Andrzej Siewior
[not found] ` <CAPAFJkp_MQ8rNsTTY3xfYMhdtiWQunN65Yfft1SqZLptG2J5cw@mail.gmail.com>
2024-11-20 7:36 ` Sebastian Andrzej Siewior