Hi Cheng-Yang,

On 2026-04-13T02:16:58+0800, Cheng-Yang Chou wrote:
> Add the sched_ext(7) manual page and update existing scheduling
> documentation to include the SCHED_EXT policy.
>
> Signed-off-by: Cheng-Yang Chou
> ---
>  man/man2/sched_setattr.2      |  11 +++-
>  man/man2/sched_setscheduler.2 |   4 ++
>  man/man7/sched.7              |  13 +++++
>  man/man7/sched_ext.7          | 100 ++++++++++++++++++++++++++++++++++
>  4 files changed, 126 insertions(+), 2 deletions(-)
>  create mode 100644 man/man7/sched_ext.7
>
> diff --git a/man/man2/sched_setattr.2 b/man/man2/sched_setattr.2
> index 80a0ac726dcf..d60678f00e72 100644
> --- a/man/man2/sched_setattr.2
> +++ b/man/man2/sched_setattr.2
> @@ -81,6 +81,10 @@ a deadline scheduling policy;
>  see
>  .BR sched (7)
>  for details.
> +.TP 14
> +.B SCHED_EXT
> +for extensible scheduling policies implemented via BPF
> +(see \fBsched_ext\fR(7)).

Please follow the style within that manual page.  We avoid \f unless
truly necessary.

>  .P
>  The
>  .I attr
> @@ -95,7 +99,8 @@ struct sched_attr {
>      u32 sched_policy;       /* Policy (SCHED_*) */
>      u64 sched_flags;        /* Flags */
>      s32 sched_nice;         /* Nice value (SCHED_OTHER,
> -                               SCHED_BATCH) */
> +                               SCHED_BATCH,
> +                               SCHED_EXT) */

Why break the line?

>      u32 sched_priority;     /* Static priority (SCHED_FIFO,
>                                 SCHED_RR) */
>      /* For SCHED_DEADLINE */
> @@ -218,8 +223,10 @@ This field specifies the nice value to be set when specifying
>  .I sched_policy
>  as
>  .B SCHED_OTHER
> +,

What's the reason for this weird formatting of the source code?  At this
point I wonder if this was generated by AI.  Please take into account
'CONTRIBUTING.d/ai'.

> +.BR SCHED_BATCH ,
>  or
> -.BR SCHED_BATCH .
> +.BR SCHED_EXT .
>  The nice value is a number in the range \-20 (high priority)
>  to +19 (low priority);
>  see
> diff --git a/man/man2/sched_setscheduler.2 b/man/man2/sched_setscheduler.2
> index b4c35543e5bf..825eb7290ee7 100644
> --- a/man/man2/sched_setscheduler.2
> +++ b/man/man2/sched_setscheduler.2
> @@ -67,6 +67,10 @@ and
>  for running
>  .I very
>  low priority background jobs.
> +.TP
> +.B SCHED_EXT
> +for extensible scheduling policies implemented via BPF
> +(see \fBsched_ext\fR(7)).

Please check formatting.

>  .P
>  For each of the above policies,
>  .I param\->sched_priority
> diff --git a/man/man7/sched.7 b/man/man7/sched.7
> index 00926cd34ecf..2e73a4c716b9 100644
> --- a/man/man7/sched.7
> +++ b/man/man7/sched.7
> @@ -116,6 +116,13 @@ and
>  .BR sched_get_priority_max (2)
>  to find the range of priorities supported for a particular policy.
>  .P
> +Since Linux 6.12, there is an extensible BPF scheduling policy
> +.RB ( SCHED_EXT ),
> +which allows for custom scheduling algorithms to be implemented as BPF
> +programs.
> +See
> +.BR sched_ext (7).
> +.P
>  Conceptually,
>  the scheduler maintains a list of runnable threads for each possible
>  .I sched_priority
> @@ -529,6 +536,12 @@ priority (lower even than a +19 nice value with the
>  or
>  .B SCHED_BATCH
>  policies).
> +.SS SCHED_EXT: Extensible BPF Scheduling
> +Tasks with this policy are managed by an extensible scheduler class,
> +which allows for custom scheduling algorithms to be implemented as
> +BPF programs.
> +See
> +.BR sched_ext (7).
>  .\"
>  .SS Resetting scheduling policy for child processes
>  Each thread has a reset-on-fork scheduling flag.
> diff --git a/man/man7/sched_ext.7 b/man/man7/sched_ext.7
> new file mode 100644
> index 000000000000..7ea467e18b84
> --- /dev/null
> +++ b/man/man7/sched_ext.7
> @@ -0,0 +1,100 @@
> +.TH SCHED_EXT 7 2024-04-13 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +sched_ext \- Extensible BPF Scheduler Class
> +.SH SYNOPSIS
> +.B #include
> +.PP

The use of P and PP seems very inconsistent.


Have a lovely day!
Alex

> +.B #define SCHED_EXT 7
> +.SH DESCRIPTION
> +.B sched_ext
> +is a scheduling class whose behavior can be defined by a set of BPF
> +programs, known as the BPF scheduler. It allows for the implementation
> +of custom scheduling algorithms that can be loaded and unloaded
> +dynamically.
> +.PP
> +When a BPF scheduler is loaded, it can take over the scheduling of
> +tasks that use the
> +.B SCHED_EXT
> +policy, as well as tasks using standard policies like
> +.BR SCHED_NORMAL ,
> +.BR SCHED_BATCH ,
> +and
> +.B SCHED_IDLE ,
> +depending on how the BPF scheduler is configured.
> +.SS Switching to and from sched_ext
> +The feature is enabled via the
> +.B CONFIG_SCHED_CLASS_EXT
> +kernel configuration option.
> +.PP
> +A task can explicitly request the
> +.B SCHED_EXT
> +policy using system calls such as
> +.BR sched_setscheduler (2)
> +or
> +.BR sched_setattr (2).
> +If no BPF scheduler is currently loaded, tasks with the
> +.B SCHED_EXT
> +policy are treated as
> +.B SCHED_NORMAL
> +and scheduled by the default fair-class scheduler (CFS/EEVDF).
> +.PP
> +When a BPF scheduler is loaded:
> +.IP \(bu 3
> +If
> +.B SCX_OPS_SWITCH_PARTIAL
> +is NOT set in the scheduler's flags, ALL tasks with policies
> +.BR SCHED_NORMAL ,
> +.BR SCHED_BATCH ,
> +.BR SCHED_IDLE ,
> +and
> +.B SCHED_EXT
> +are scheduled by
> +.BR sched_ext .
> +.IP \(bu 3
> +If
> +.B SCX_OPS_SWITCH_PARTIAL
> +IS set, only tasks with the
> +.B SCHED_EXT
> +policy are scheduled by
> +.BR sched_ext .
> +Tasks with other policies remain under the control of the fair-class scheduler.
> +.PP
> +If the BPF scheduler terminates (either normally, due to an error, or
> +via a SysRq command), all tasks are automatically reverted to the
> +fair-class scheduler.
> +.SS System Interfaces
> +.B sched_ext
> +exposes several interfaces in sysfs for monitoring and control:
> +.TP
> +.I /sys/kernel/sched_ext/state
> +Shows the current state of the BPF scheduler (\fBenabled\fR, \fBdisabled\fR, etc.).
> +.TP
> +.I /sys/kernel/sched_ext/root/ops
> +Shows the name of the currently loaded BPF scheduler.
> +.TP
> +.I /sys/kernel/sched_ext/enable_seq
> +A monotonically incrementing counter that tracks how many times a BPF
> +scheduler has been enabled since boot.
> +.SS Safety and Debugging
> +System integrity is maintained regardless of the BPF scheduler's
> +behavior. If a runnable task stalls or an internal error is detected,
> +the BPF scheduler is aborted.
> +.PP
> +The following SysRq sequences are available for emergency management:
> +.TP
> +.B SysRq-S
> +Aborts the current BPF scheduler and reverts all tasks to the fair-class
> +scheduler.
> +.TP
> +.B SysRq-D
> +Triggers a debug dump of the current scheduler state to the
> +.B sched_ext_dump
> +tracepoint.
> +.SH SEE ALSO
> +.BR sched (7),
> +.BR sched_setscheduler (2),
> +.BR sched_setattr (2),
> +.BR bpf (2)
> +.PP
> +.I Documentation/scheduler/sched-ext.rst
> +in the Linux kernel source tree.
> --
> 2.48.1
>
--