Linux-mm Archive on lore.kernel.org
* [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks
@ 2026-05-20 13:07 Xiao Feng via B4 Relay
  2026-05-20 14:57 ` K Prateek Nayak
  2026-05-28  4:50 ` kernel test robot
  0 siblings, 2 replies; 4+ messages in thread
From: Xiao Feng via B4 Relay @ 2026-05-20 13:07 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, Andrew Morton,
	David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Kees Cook, Felix Moessbauer, Thomas Gleixner
  Cc: linux-kernel, linux-mm, Xiao Feng

From: Xiao Feng <xiaofeng5@xiaomi.com>

Since commit ed4fb6d7ef68 ("hrtimer: Use and report correct timerslack
values for realtime tasks"), RT tasks have their timer_slack_ns set to 0
in __setscheduler_params(). This introduces two related problems when RT
tasks fork children that end up running as CFS:

Problem 1: sched_reset_on_fork

When an RT task with sched_reset_on_fork set forks a child:

1. dup_task_struct() copies timer_slack_ns (0) from the RT parent.
2. copy_process() sets:
     p->default_timer_slack_ns = current->timer_slack_ns (= 0)
3. sched_fork() demotes the child to SCHED_NORMAL but does not
   restore timer_slack_ns.

Result: CFS child has timer_slack_ns = 0 and default_timer_slack_ns = 0
permanently.

Problem 2: RT fork followed by later policy change

When an RT task forks a child (without reset_on_fork), the child inherits
RT policy with timer_slack_ns = 0. copy_process() sets
default_timer_slack_ns = current->timer_slack_ns (= 0). If the child is
later demoted to CFS via sched_setscheduler(), __setscheduler_params()
tries to restore timer_slack_ns from default_timer_slack_ns, but it is 0.

Result: same as above.

Both problems prevent timer coalescing for these CFS tasks, causing
unnecessary wakeups and increased power consumption. Writing 0 to
/proc/pid/timerslack_ns also cannot restore a proper default.

Fix both issues:

1. In copy_process(), inherit default_timer_slack_ns from the parent's
   default_timer_slack_ns (which is preserved across RT transitions)
   instead of timer_slack_ns (which is 0 for RT tasks).

2. In sched_fork(), when sched_reset_on_fork demotes RT/DL to CFS,
   explicitly restore timer_slack_ns from the parent's
   default_timer_slack_ns, falling back to 50us if it is also 0.

Fixes: ed4fb6d7ef68 ("hrtimer: Use and report correct timerslack values for realtime tasks")
Signed-off-by: Xiao Feng <xiaofeng5@xiaomi.com>
---
 kernel/fork.c       | 2 +-
 kernel/sched/core.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 5f3fdfdb14c7..c3ef5ebb3037 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2174,7 +2174,7 @@ __latent_entropy struct task_struct *copy_process(
 	retval = -EAGAIN;
 #endif
 
-	p->default_timer_slack_ns = current->timer_slack_ns;
+	p->default_timer_slack_ns = current->default_timer_slack_ns;
 
 #ifdef CONFIG_PSI
 	p->psi_flags = 0;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b8871449d3c6..9b63560a45de 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4710,6 +4710,8 @@ int sched_fork(u64 clone_flags, struct task_struct *p)
 			p->policy = SCHED_NORMAL;
 			p->static_prio = NICE_TO_PRIO(0);
 			p->rt_priority = 0;
+			p->timer_slack_ns = current->default_timer_slack_ns ?: 50000;
+			p->default_timer_slack_ns = p->timer_slack_ns;
 		} else if (PRIO_TO_NICE(p->static_prio) < 0)
 			p->static_prio = NICE_TO_PRIO(0);
 

---
base-commit: 27fa82620cbaa89a7fc11ac3057701d598813e87
change-id: 20260520-sched-fork-fix-timer-slack-6dfacb033d43

Best regards,
--  
Xiao Feng <xiaofeng5@xiaomi.com>





* Re: [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks
  2026-05-20 13:07 [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks Xiao Feng via B4 Relay
@ 2026-05-20 14:57 ` K Prateek Nayak
  2026-05-28  4:50 ` kernel test robot
  1 sibling, 0 replies; 4+ messages in thread
From: K Prateek Nayak @ 2026-05-20 14:57 UTC (permalink / raw)
  To: xiaofeng5, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Andrew Morton, David Hildenbrand,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Kees Cook, Felix Moessbauer,
	Thomas Gleixner
  Cc: linux-kernel, linux-mm

Hello Xiao,

On 5/20/2026 6:37 PM, Xiao Feng via B4 Relay wrote:
> Both problems prevent timer coalescing for these CFS tasks, causing
> unnecessary wakeups and increased power consumption. Writing 0 to
> /proc/pid/timerslack_ns also cannot restore a proper default.
> 
> Fix both issues:
> 
> 1. In copy_process(), inherit default_timer_slack_ns from the parent's
>    default_timer_slack_ns (which is preserved across RT transitions)
>    instead of timer_slack_ns (which is 0 for RT tasks).

The man page for fork() [1] reads:

    The default timer slack value is set to the parent's current
    timer slack value.  See the description of PR_SET_TIMERSLACK in
    prctl(2).

And the description in the man page for PR_SET_TIMERSLACK [2] reads:

    When a new thread is created, the two timer slack values are made
    the same as the "current" value of the creating thread.


The two timer slack values that the man page refers to above are the
"default" and the "current" values, as it describes in its opening
statement.

From a documentation standpoint, it is doing the right thing. An
RT thread returns 0 for PR_GET_TIMERSLACK and the same is set as
"default" and "current" for its children.

[1] https://man7.org/linux/man-pages/man2/fork.2.html
[2] https://man7.org/linux/man-pages/man2/pr_set_timerslack.2const.html

> 
> 2. In sched_fork(), when sched_reset_on_fork demotes RT/DL to CFS,
>    explicitly restore timer_slack_ns from the parent's
>    default_timer_slack_ns, falling back to 50us if it is also 0.

As for SCHED_FLAG_RESET_ON_FORK, the sched(7) man page [3] reads:

    More precisely, if the reset-on-fork flag is set, the following
    rules apply for subsequently created children:

    -  If the calling thread has a scheduling policy of SCHED_FIFO or
        SCHED_RR, the policy is reset to SCHED_OTHER in child
        processes.

    -  If the calling process has a negative nice value, the nice
        value is reset to zero in child processes.

    After the reset-on-fork flag has been enabled, it can be reset
    only if the thread has the CAP_SYS_NICE capability.  This flag is
    disabled in child processes created by fork(2).


Nowhere does it say that anything other than the scheduling policy is
affected by this flag. What makes timer_slack special?

[3] https://man7.org/linux/man-pages/man7/sched.7.html

> 
> Fixes: ed4fb6d7ef68 ("hrtimer: Use and report correct timerslack values for realtime tasks")

Based on my reading, this is not fixing anything but instead introducing
a behavior change contrary to what is currently documented.

If it is acceptable, at the very least, the man pages need to be updated
stating this new behavior and the kernel version that introduces it.

I'll let others comment since they know these bits better than me.

-- 
Thanks and Regards,
Prateek




* Re: [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks
  2026-05-20 13:07 [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks Xiao Feng via B4 Relay
  2026-05-20 14:57 ` K Prateek Nayak
@ 2026-05-28  4:50 ` kernel test robot
  2026-05-28  5:24   ` K Prateek Nayak
  1 sibling, 1 reply; 4+ messages in thread
From: kernel test robot @ 2026-05-28  4:50 UTC (permalink / raw)
  To: Xiao Feng via B4 Relay
  Cc: oe-lkp, lkp, linux-kernel, aubrey.li, yu.c.chen, ltp, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	K Prateek Nayak, Andrew Morton, David Hildenbrand,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Kees Cook, Felix Moessbauer,
	Thomas Gleixner, linux-mm, Xiao Feng, oliver.sang



Hello,

kernel test robot noticed "ltp.prctl08.fail" on:

commit: 2aa2f0a955c7459279e861f62c770f5f81d64f4f ("[PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks")
url: https://github.com/intel-lab-lkp/linux/commits/Xiao-Feng-via-B4-Relay/sched-fork-Fix-timer_slack_ns-inheritance-for-RT-tasks/20260520-210956
patch link: https://lore.kernel.org/all/20260520-sched-fork-fix-timer-slack-v1-1-d4fc937245b2@xiaomi.com/
patch subject: [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks

in testcase: ltp
version: 
with following parameters:

	disk: 1HDD
	fs: ext4
	test: syscalls-05



config: x86_64-rhel-9.4-ltp
compiler: gcc-14
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202605281011.2ba77cc3-lkp@intel.com


2026-05-29 16:17:35 kirk -f syscalls-05 --tmp-dir /fs/sda1/tmpdir
Host information
	Hostname:   lkp-csl-d02
	Python:     3.13.5 (main, Jun 25 2025, 18:55:22) [GCC 14.2.0]
	Directory:  /fs/sda1/tmpdir/kirk.root/tmp0lg24sa6

Connecting to SUT: default

Suite: syscalls-05
──────────────────

...

prctl02: pass | tainted  (0.011s)
prctl08: fail | tainted  (0.011s)   <---
preadv203: skip | tainted  (47.775s)
profil01: pass | tainted  (5.005s)

...


Execution time: 9m 8s

Disconnecting from SUT: default

Target information
──────────────────
Kernel:   Linux 7.1.0-rc4+ #1 SMP PREEMPT_DYNAMIC Tue May 26 16:01:55 CST 2026
Cmdline:  ip=::::lkp-csl-d02::dhcp
          root=/dev/ram0
          RESULT_ROOT=/result/ltp/1HDD-ext4-syscalls-05/lkp-csl-d02/debian-13-x86_64-20250902.cgz/x86_64-rhel-9.4-ltp/gcc-14/2aa2f0a955c7459279e861f62c770f5f81d64f4f/0
          BOOT_IMAGE=/pkg/linux/x86_64-rhel-9.4-ltp/gcc-14/2aa2f0a955c7459279e861f62c770f5f81d64f4f/vmlinuz-7.1.0-rc4+
          branch=linux-devel/devel-hourly-20260521-050335
          job=/lkp/jobs/scheduled/lkp-csl-d02/ltp-1HDD-ext4-syscalls-05-debian-13-x86_64-20250902.cgz-2aa2f0a955c7-20260526-26280-5dbkeb-0.yaml
          user=lkp
          ARCH=x86_64
          kconfig=x86_64-rhel-9.4-ltp
          commit=2aa2f0a955c7459279e861f62c770f5f81d64f4f
          intremap=posted_msi
          acpi_rsdp=0x36983000
          max_uptime=14400
          LKP_SERVER=10.239.97.5
          nokaslr
          selinux=0
          debug
          apic=debug
          sysrq_always_enabled
          rcupdate.rcu_cpu_stall_timeout=100
          net.ifnames=0
          printk.devkmsg=on
          panic=-1
          softlockup_panic=1
          nmi_watchdog=panic
          oops=panic
          load_ramdisk=2
          prompt_ramdisk=0
          drbd.minor_count=8
          systemd.log_level=err
          ignore_loglevel
          console=tty0
          earlyprintk=ttyS0,115200
          console=ttyS0,115200
          vga=normal
          rw
          keep_initrds=/osimage/pkg/debian-13-x86_64-20250902.cgz/ltp-x86_64-8cd7644a5-1_20260514.cgz
          acpi_rsdp=0x36983000
Machine:  unknown
Arch:     x86_64
RAM:      114660756 kB
Swap:     0 kB
Distro:   debian 13

────────────────────────
      TEST SUMMARY
────────────────────────
Suite:   syscalls-05
Runtime: 9m 4s
Runs:    189

Results:
    Passed:   1293
    Failed:   3
    Broken:   0
    Skipped:  38
    Warnings: 0

Failures:
    • prctl08

Session stopped



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260528/202605281011.2ba77cc3-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




* Re: [PATCH] sched/fork: Fix timer_slack_ns inheritance for RT tasks
  2026-05-28  4:50 ` kernel test robot
@ 2026-05-28  5:24   ` K Prateek Nayak
  0 siblings, 0 replies; 4+ messages in thread
From: K Prateek Nayak @ 2026-05-28  5:24 UTC (permalink / raw)
  To: kernel test robot, Xiao Feng via B4 Relay
  Cc: oe-lkp, lkp, linux-kernel, aubrey.li, yu.c.chen, ltp, Ingo Molnar,
	Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Kees Cook, Felix Moessbauer,
	Thomas Gleixner, linux-mm, Xiao Feng

Hello Oliver,

Thank you for the test!

On 5/28/2026 10:20 AM, kernel test robot wrote:
> Suite: syscalls-05
> ──────────────────
> 
> ...
> 
> prctl02: pass | tainted  (0.011s)
> prctl08: fail | tainted  (0.011s)   <---

I think we are failing here
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/prctl/prctl08.c#L86

for:

    {check_inherit_timerslack, 70000, 70000, "Child process"}

This checks if the default timerslack is set to the parent's timerslack
at the time of fork: prctl(PR_SET_TIMERSLACK, 0) in the child process
is expected to take the timer slack back to that inherited value.

With this patch, instead of inheriting it, we are resetting it to 50us
at the time of fork. Since this is documented in the man page, I believe
the test is correct and this will indeed be a behavior change.

-- 
Thanks and Regards,
Prateek



