public inbox for linux-pm@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Yipeng Zou <zouyipeng@huawei.com>,
	Linux Power Management <linux-pm@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Eddy Z <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>,
	liaochang1@huawei.com,
	Daniel Hodges <hodges.daniel.scott@gmail.com>
Subject: Re: [RFC PATCH 0/2] cpufreq_ext: Introduce cpufreq ext governor
Date: Mon, 30 Sep 2024 08:22:15 -1000	[thread overview]
Message-ID: <ZvrsV-A1Jizokuef@slm.duckdns.org> (raw)
In-Reply-To: <CAADnVQJmVo4BU345irnnLNxQ_sT1cOEx8ky4T2iH_ZKpAyFfww@mail.gmail.com>

(cc'ing Daniel Hodges and quoting the whole body)

On Sun, Sep 29, 2024 at 09:56:02AM -0700, Alexei Starovoitov wrote:
> On Fri, Sep 27, 2024 at 3:03 AM Yipeng Zou <zouyipeng@huawei.com> wrote:
> >
> > Hi everyone,
> >
> > I am currently working on a patch for a CPU frequency governor based on
> > BPF, which can use BPF to customize and implement various frequency
> > scaling strategies.
> >
> > If you have any feedback or suggestions, please do let me know.
> >
> > Motivation
> > ----------
> >
> > 1. Customization
> >
> > Existing cpufreq governors in the kernel are designed for general
> > scenarios, which may not always be optimal for specific or specialized
> > workloads.
> >
> > The userspace governor allows direct control over cpufreq, but users
> > often require guidance from the kernel to achieve the desired frequency.
> >
> > Cpufreq_ext aims to address this by providing a customizable framework that
> > can be tailored to the unique needs of different systems and applications.
> >
> > While cpufreq governors can be implemented within a kernel module,
> > maintaining a ko tailored for specific scenarios can be challenging.
> > The complexity and overhead associated with kernel modules make it
> > difficult to quickly adapt and deploy custom frequency scaling strategies.
> >
> > Cpufreq_ext leverages BPF to offer a more lightweight and flexible approach
> > to implementing customized strategies, allowing for easier maintenance and
> > deployment.
> >
> > 2. Integration with sched_ext:
> >
> > sched_ext is a scheduler class whose behavior can be defined by a set of
> > BPF programs - the BPF scheduler.
> >
> > See [1] for more about sched_ext:
> >
> >         [1] https://www.kernel.org/doc/html/next/scheduler/sched-ext.html
> >
> > The interaction between CPU frequency scaling and task scheduling is
> > critical for performance.
> >
> > cpufreq_ext can work with sched_ext to ensure that both scheduling
> > decisions and frequency adjustments are made in a coordinated manner,
> > optimizing system responsiveness and power consumption.
> 
> I think sched-ext already has a mechanism to influence cpufreq.
> How is this different ?

FWIW, sched_ext's cpufreq implementation goes through the schedutil governor.
All that the BPF scheduler does is provide a utilization signal to the
governor. This seems to work fine for sched_ext schedulers (this doesn't
preclude a more direct BPF governor).
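
To illustrate what "providing a utilization signal" amounts to, here is a
rough user-space model of how a governor in the style of schedutil could turn
such a signal into a frequency. It is only a sketch: the 0..1024 scale, the
25% headroom and the helper names (PERF_ONE, util_with_headroom,
util_to_freq_khz) are assumptions for illustration, not kernel code.

```c
#include <stdint.h>

/* Assumption: utilization is expressed on a 0..1024 scale. */
#define PERF_ONE 1024u

/* Hypothetical helper: add 25% headroom to the signal, clamped to full scale. */
static uint32_t util_with_headroom(uint32_t util)
{
	uint32_t boosted = util + (util >> 2);

	return boosted > PERF_ONE ? PERF_ONE : boosted;
}

/* Hypothetical helper: scale the maximum frequency (kHz) by the boosted signal. */
static uint32_t util_to_freq_khz(uint32_t util, uint32_t max_freq_khz)
{
	uint64_t scaled = (uint64_t)max_freq_khz * util_with_headroom(util);

	return (uint32_t)(scaled / PERF_ONE);
}
```

With a 2 GHz maximum, a half-scale signal (512) maps to 1.25 GHz in this
model, and anything at or above roughly 80% of full scale saturates at the
maximum.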

> Pls cc sched-ext folks in the future.

Yeah, it'd be great if you can cc Daniel, me and sched-ext@meta.com.

> > Overview
> > --------
> >
> > cpufreq ext is a BPF-based cpufreq governor; the governor logic can be
> > customized in a BPF program.
> >
> > cpufreq ext works as a common cpufreq governor with a cpufreq policy.
> >
> >                    --------------------------
> >                   |        BPF governor      |
> >                    --------------------------
> >                                |
> >                                v
> >                           BPF Register
> >                                |
> >                                v
> >             --------------------------------------
> >            |             CPUFreq ext              |
> >             --------------------------------------
> >               ^                ^               ^
> >               |                |               |
> >            ---------       ---------       ---------
> >           | policy0 | ... | policy1 | ... | policyn |
> >            ---------       ---------       ---------
> >
> > We can register several function hooks with cpufreq ext through BPF
> > struct_ops.
> >
> > The first patch defines a dbs_governor that works like the other
> > governors.
> >
> > The second patch gives a sample of how to use it: it implements one
> > typical cpufreq governor that switches to the maximum frequency when a
> > VIP task is running on the target CPU.
> >
> > Detail
> > ------
> >
> > cpufreq ext uses bpf_struct_ops to register several function hooks.
> >
> >         struct cpufreq_governor_ext_ops {
> >                 ...
> >         }
> >
> > struct cpufreq_governor_ext_ops defines all the functions that BPF
> > programs can customize.
> >
> > If you need to add a custom function, you only need to define it in this
> > struct.
> >
> > At the moment we have defined the basic functions.
> >
> > 1. unsigned long (*get_next_freq)(struct cpufreq_policy *policy)
> >
> >         Decide how to adjust the CPU frequency here.
> >         The return value is the CPU frequency that will be applied.
> >
> > 2. unsigned int (*get_sampling_rate)(struct cpufreq_policy *policy)
> >
> >         Decide how to adjust the sampling rate here.
> >         The return value is the governor sampling rate that will be
> >         applied.
> >
> > 3. unsigned int (*init)(void)
> >
> >         BPF governor init callback; returning 0 means success.
> >
> > 4. void (*exit)(void)
> >
> >         BPF governor exit callback.
> >
> > 5. char name[CPUFREQ_EXT_NAME_LEN]
> >
> >         BPF governor name.
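
For readers following along, the five members listed above can be mocked up
in plain user-space C roughly as below. This is a sketch under assumptions,
not the code from the patch: struct cpufreq_policy is a cut-down stand-in
with only a few fields, the value of CPUFREQ_EXT_NAME_LEN is made up, the
real struct may have more members, and vip_running and the vip_* callbacks
are hypothetical names modelled on the "VIP task" sample described earlier.

```c
#include <stddef.h>

/* Assumption: the real length is not given in the cover letter. */
#define CPUFREQ_EXT_NAME_LEN 16

/* Cut-down stand-in for the kernel's struct cpufreq_policy. */
struct cpufreq_policy {
	unsigned int cpu;
	unsigned int min;	/* kHz */
	unsigned int max;	/* kHz */
	unsigned int cur;	/* kHz */
};

/* Only the five members named in the cover letter. */
struct cpufreq_governor_ext_ops {
	unsigned long (*get_next_freq)(struct cpufreq_policy *policy);
	unsigned int (*get_sampling_rate)(struct cpufreq_policy *policy);
	unsigned int (*init)(void);
	void (*exit)(void);
	char name[CPUFREQ_EXT_NAME_LEN];
};

/* Hypothetical flag standing in for "a VIP task is running on this CPU". */
static int vip_running;

/* Go to the policy maximum while a VIP task runs, else to the minimum. */
static unsigned long vip_get_next_freq(struct cpufreq_policy *policy)
{
	return vip_running ? policy->max : policy->min;
}

/* Fixed sampling rate; the unit (microseconds) is an assumption. */
static unsigned int vip_get_sampling_rate(struct cpufreq_policy *policy)
{
	(void)policy;
	return 10000;
}

static unsigned int vip_init(void)
{
	return 0;	/* 0 means success */
}

static void vip_exit(void)
{
	vip_running = 0;
}

static struct cpufreq_governor_ext_ops vip_ops = {
	.get_next_freq		= vip_get_next_freq,
	.get_sampling_rate	= vip_get_sampling_rate,
	.init			= vip_init,
	.exit			= vip_exit,
	.name			= "vip",
};
```

In the actual proposal these callbacks would be BPF programs attached through
struct_ops rather than ordinary C functions; the mock only shows the shape of
the interface.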
> >
> > cpufreq_ext also adds a sysfs interface that reports the governor status.
> >
> > 1. ext/stat attribute:
> >
> >         Access to current BPF governor status.
> >
> >         # cat /sys/devices/system/cpu/cpufreq/ext/stat
> >         Stat: CPUFREQ_EXT_INIT
> >         BPF governor: performance
> >
> > There are a number of constraints on cpufreq_ext:
> >
> > 1. Only one ext governor can be registered at a time.
> >
> > 2. By default, it operates as a performance governor when no BPF
> >    governor is registered.
> >
> > 3. The cpufreq_ext governor must be selected before loading a BPF
> >    governor; otherwise, the installation of the BPF governor will fail.
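
The three constraints above can be read as a small registration state
machine. The following user-space sketch models them under assumptions: the
names (ext_gov, ext_selected, ext_register, ext_unregister, ext_next_freq,
powersave_like) and the error codes are hypothetical and are not taken from
the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified governor descriptor. */
struct ext_gov {
	const char *name;
	unsigned long (*get_next_freq)(unsigned long min_khz, unsigned long max_khz);
};

static bool ext_selected;		/* has the ext governor been selected? */
static const struct ext_gov *active;	/* at most one registered BPF governor */

static int ext_register(const struct ext_gov *gov)
{
	if (!ext_selected)
		return -ENODEV;	/* constraint 3; the error code is an assumption */
	if (active)
		return -EBUSY;	/* constraint 1; the error code is an assumption */
	active = gov;
	return 0;
}

static void ext_unregister(void)
{
	active = NULL;
}

static unsigned long ext_next_freq(unsigned long min_khz, unsigned long max_khz)
{
	if (!active)
		return max_khz;	/* constraint 2: behave like performance */
	return active->get_next_freq(min_khz, max_khz);
}

/* Sample governor that always asks for the minimum frequency. */
static unsigned long always_min(unsigned long min_khz, unsigned long max_khz)
{
	(void)max_khz;
	return min_khz;
}

static const struct ext_gov powersave_like = { "powersave_like", always_min };
```

In this model, registering before the ext governor is selected fails, a
second registration is rejected while one is active, and unregistering falls
back to the maximum frequency.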
> >
> > TODO
> > ----
> >
> > The current patch is a starting point, and future work will focus on
> > expanding its capabilities.
> >
> > I plan to leverage the BPF ecosystem to introduce innovative features,
> > such as real-time adjustments and optimizations based on system-wide
> > observations and analytics.
> >
> > And I am looking forward to any insights, critiques, or suggestions you
> > may have.
> >
> > Yipeng Zou (2):
> >   cpufreq_ext: Introduce cpufreq ext governor
> >   cpufreq_ext: Add bpf sample
> >
> >  drivers/cpufreq/Kconfig        |  23 ++
> >  drivers/cpufreq/Makefile       |   1 +
> >  drivers/cpufreq/cpufreq_ext.c  | 525 +++++++++++++++++++++++++++++++++
> >  samples/bpf/.gitignore         |   1 +
> >  samples/bpf/Makefile           |   8 +-
> >  samples/bpf/cpufreq_ext.bpf.c  | 113 +++++++
> >  samples/bpf/cpufreq_ext_user.c |  48 +++
> >  7 files changed, 718 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/cpufreq/cpufreq_ext.c
> >  create mode 100644 samples/bpf/cpufreq_ext.bpf.c
> >  create mode 100644 samples/bpf/cpufreq_ext_user.c
> >
> > --
> > 2.34.1
> >

-- 
tejun

Thread overview: 7+ messages
2024-09-27 10:13 [RFC PATCH 0/2] cpufreq_ext: Introduce cpufreq ext governor Yipeng Zou
2024-09-27 10:13 ` [RFC PATCH 1/2] " Yipeng Zou
2024-09-27 10:13 ` [RFC PATCH 2/2] cpufreq_ext: Add bpf sample Yipeng Zou
2024-09-29 16:56 ` [RFC PATCH 0/2] cpufreq_ext: Introduce cpufreq ext governor Alexei Starovoitov
2024-09-30 18:22   ` Tejun Heo [this message]
2024-09-29 23:23 ` Daniel Hodges
2025-09-18 12:52 ` [PATCH] cpufreq: ext: fix NULL deref in ext_gov_update() Jinghao Zhou
