* [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL
@ 2025-04-21 7:37 Penny Zheng
2025-04-21 7:37 ` [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE" Penny Zheng
` (19 more replies)
0 siblings, 20 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel, xen-devel
Cc: ray.huang, Penny Zheng, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, Daniel P. Smith, Dario Faggioli,
Juergen Gross, George Dunlap, Nathan Studer, Stewart Hildebrand,
Bertrand Marquis, Volodymyr Babchuk, Alistair Francis,
Bob Eshleman, Connor Davis, Oleksii Kurochko
It can be beneficial for some dom0less systems to further reduce the Xen
footprint by disabling hypercall handling code that is neither used nor
required in such systems. Each hypercall has a separate option to keep the
configuration flexible.
Options to disable hypercalls:
- sysctl
- domctl
- hvm
- physdev
- platform
This patch series focuses only on introducing CONFIG_SYSCTL. The other
options will be covered in separate patch series.
Features such as LIVEPATCH and Overlay DTB, which rely entirely on sysctl ops,
are also guarded by CONFIG_SYSCTL, to reduce the Xen footprint as much as
possible.
It is based on Stefano Stabellini's commit "xen: introduce kconfig options to
disable hypercalls"
(https://lore.kernel.org/xen-devel/20241219092917.3006174-1-Sergiy_Kibrik@epam.com)
Penny Zheng (18):
xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE"
xen/xsm: wrap around xsm_sysctl with CONFIG_SYSCTL
xen/sysctl: wrap around XEN_SYSCTL_readconsole
xen/sysctl: make CONFIG_TRACEBUFFER depend on CONFIG_SYSCTL
xen/sysctl: wrap around XEN_SYSCTL_sched_id
xen/sysctl: wrap around XEN_SYSCTL_perfc_op
xen/sysctl: wrap around XEN_SYSCTL_lockprof_op
xen/pmstat: consolidate code into pmstat.c
xen/pmstat: introduce CONFIG_PM_OP
xen/sysctl: introduce CONFIG_PM_STATS
xen/sysctl: wrap around XEN_SYSCTL_page_offline_op
xen/sysctl: wrap around XEN_SYSCTL_cpupool_op
xen/sysctl: wrap around XEN_SYSCTL_scheduler_op
xen: make avail_domheap_pages() inlined into get_outstanding_claims()
xen/sysctl: wrap around XEN_SYSCTL_physinfo
xen/sysctl: make CONFIG_COVERAGE depend on CONFIG_SYSCTL
xen/sysctl: make CONFIG_LIVEPATCH depend on CONFIG_SYSCTL
xen/sysctl: wrap around arch-specific arch_do_sysctl
Stefano Stabellini (2):
xen: introduce CONFIG_SYSCTL
xen/sysctl: wrap around sysctl hypercall
xen/Kconfig.debug | 2 +-
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/Makefile | 2 +-
xen/arch/riscv/stubs.c | 2 +
xen/arch/x86/Kconfig | 4 -
xen/arch/x86/Makefile | 2 +-
xen/arch/x86/acpi/cpu_idle.c | 2 +
xen/arch/x86/acpi/cpufreq/hwp.c | 6 +
xen/arch/x86/acpi/cpufreq/powernow.c | 4 +
xen/arch/x86/hvm/Kconfig | 1 -
xen/arch/x86/psr.c | 18 +
xen/common/Kconfig | 29 +-
xen/common/Makefile | 2 +-
xen/common/page_alloc.c | 55 +-
xen/common/perfc.c | 2 +
xen/common/sched/arinc653.c | 6 +
xen/common/sched/core.c | 4 +
xen/common/sched/cpupool.c | 8 +
xen/common/sched/credit.c | 4 +
xen/common/sched/credit2.c | 4 +
xen/common/sched/private.h | 4 +
xen/common/spinlock.c | 2 +
xen/common/sysctl.c | 7 +-
xen/drivers/acpi/Makefile | 3 +-
xen/drivers/acpi/pm_op.c | 409 ++++++++++++++
xen/drivers/acpi/pmstat.c | 559 ++++++-------------
xen/drivers/char/console.c | 2 +
xen/drivers/cpufreq/cpufreq.c | 31 +
xen/drivers/cpufreq/cpufreq_misc_governors.c | 2 +
xen/drivers/cpufreq/cpufreq_ondemand.c | 2 +
xen/drivers/cpufreq/utility.c | 203 -------
xen/drivers/video/Kconfig | 4 +-
xen/include/acpi/cpufreq/cpufreq.h | 5 -
xen/include/acpi/cpufreq/processor_perf.h | 14 +-
xen/include/hypercall-defs.c | 8 +-
xen/include/xen/mm.h | 1 -
xen/include/xsm/xsm.h | 18 +
xen/xsm/dummy.c | 6 +
xen/xsm/flask/hooks.c | 14 +
39 files changed, 804 insertions(+), 648 deletions(-)
create mode 100644 xen/drivers/acpi/pm_op.c
--
2.34.1
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE"
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-30 15:16 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (18 subsequent siblings)
19 siblings, 1 reply; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini
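Remove all "depends on !PV_SHIM_EXCLUSIVE" (also the functionally
equivalent "if !...") in Kconfig files, since negative dependencies badly
affect allyesconfig: allyesconfig enables PV_SHIM_EXCLUSIVE, which in turn
turns off every option that depends on its negation.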
This commit is based on "x86: provide an inverted Kconfig control for
shim-exclusive mode"[1]
[1] https://lists.xen.org/archives/html/xen-devel/2023-03/msg00040.html
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v2 -> v3:
- remove comment for PV_SHIM_EXCLUSIVE
---
xen/arch/x86/Kconfig | 4 ----
xen/arch/x86/hvm/Kconfig | 1 -
xen/drivers/video/Kconfig | 4 ++--
3 files changed, 2 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index de2fa37f08..29a8b4b35c 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -138,7 +138,6 @@ config XEN_IBT
config SHADOW_PAGING
bool "Shadow Paging"
- default !PV_SHIM_EXCLUSIVE
depends on PV || HVM
help
@@ -170,7 +169,6 @@ config BIGMEM
config TBOOT
bool "Xen tboot support (UNSUPPORTED)"
depends on INTEL && UNSUPPORTED
- default !PV_SHIM_EXCLUSIVE
select CRYPTO
help
Allows support for Trusted Boot using the Intel(R) Trusted Execution
@@ -283,7 +281,6 @@ config PV_SHIM_EXCLUSIVE
If unsure, say N.
-if !PV_SHIM_EXCLUSIVE
config HYPERV_GUEST
bool "Hyper-V Guest"
@@ -293,7 +290,6 @@ config HYPERV_GUEST
If unsure, say N.
-endif
config REQUIRE_NX
bool "Require NX (No eXecute) support"
diff --git a/xen/arch/x86/hvm/Kconfig b/xen/arch/x86/hvm/Kconfig
index 2def0f98e2..b903764bda 100644
--- a/xen/arch/x86/hvm/Kconfig
+++ b/xen/arch/x86/hvm/Kconfig
@@ -1,6 +1,5 @@
menuconfig HVM
bool "HVM support"
- depends on !PV_SHIM_EXCLUSIVE
default !PV_SHIM
select COMPAT
select IOREQ_SERVER
diff --git a/xen/drivers/video/Kconfig b/xen/drivers/video/Kconfig
index 245030beea..66ee1e7c9c 100644
--- a/xen/drivers/video/Kconfig
+++ b/xen/drivers/video/Kconfig
@@ -3,10 +3,10 @@ config VIDEO
bool
config VGA
- bool "VGA support" if !PV_SHIM_EXCLUSIVE
+ bool "VGA support"
select VIDEO
depends on X86
- default y if !PV_SHIM_EXCLUSIVE
+ default y
help
Enable VGA output for the Xen hypervisor.
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
2025-04-21 7:37 ` [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE" Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 20:54 ` Stefano Stabellini
` (2 more replies)
2025-04-21 7:37 ` [PATCH v3 03/20] xen/xsm: wrap around xsm_sysctl with CONFIG_SYSCTL Penny Zheng
` (17 subsequent siblings)
19 siblings, 3 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Sergiy Kibrik, Penny Zheng
From: Stefano Stabellini <stefano.stabellini@amd.com>
Introduce a new Kconfig option, CONFIG_SYSCTL, which shall only be disabled
on some dom0less systems, to reduce the Xen footprint.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
---
v2 -> v3:
- remove "intend to" in commit message
---
xen/common/Kconfig | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index be28060716..d89e9ede77 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -581,4 +581,15 @@ config BUDDY_ALLOCATOR_SIZE
Amount of memory reserved for the buddy allocator to serve Xen heap,
working alongside the colored one.
+menu "Supported hypercall interfaces"
+ visible if EXPERT
+
+config SYSCTL
+ bool "Enable sysctl hypercall"
+ default y
+ help
+ This option shall only be disabled on some dom0less systems,
+ to reduce Xen footprint.
+endmenu
+
endmenu
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 03/20] xen/xsm: wrap around xsm_sysctl with CONFIG_SYSCTL
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
2025-04-21 7:37 ` [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE" Penny Zheng
2025-04-21 7:37 ` [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 04/20] xen/sysctl: wrap around XEN_SYSCTL_readconsole Penny Zheng
` (16 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel; +Cc: ray.huang, Penny Zheng, Daniel P. Smith, Stefano Stabellini
As xsm_sysctl() is invoked only in sysctl.c, wrap it with CONFIG_SYSCTL.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
xen/include/xsm/xsm.h | 4 ++++
xen/xsm/dummy.c | 2 ++
xen/xsm/flask/hooks.c | 4 ++++
3 files changed, 10 insertions(+)
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 24acc16125..22e2429f52 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -261,7 +261,11 @@ static inline int xsm_domctl(xsm_default_t def, struct domain *d,
static inline int xsm_sysctl(xsm_default_t def, int cmd)
{
+#ifdef CONFIG_SYSCTL
return alternative_call(xsm_ops.sysctl, cmd);
+#else
+ return -EOPNOTSUPP;
+#endif
}
static inline int xsm_readconsole(xsm_default_t def, uint32_t clear)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 93fbfc43cc..93a0665ecc 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -22,7 +22,9 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
.sysctl_scheduler_op = xsm_sysctl_scheduler_op,
.set_target = xsm_set_target,
.domctl = xsm_domctl,
+#ifdef CONFIG_SYSCTL
.sysctl = xsm_sysctl,
+#endif
.readconsole = xsm_readconsole,
.evtchn_unbound = xsm_evtchn_unbound,
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 6a53487ea4..3a43e5a1d6 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -856,6 +856,7 @@ static int cf_check flask_domctl(struct domain *d, unsigned int cmd,
}
}
+#ifdef CONFIG_SYSCTL
static int cf_check flask_sysctl(int cmd)
{
switch ( cmd )
@@ -933,6 +934,7 @@ static int cf_check flask_sysctl(int cmd)
return avc_unknown_permission("sysctl", cmd);
}
}
+#endif /* CONFIG_SYSCTL */
static int cf_check flask_readconsole(uint32_t clear)
{
@@ -1884,7 +1886,9 @@ static const struct xsm_ops __initconst_cf_clobber flask_ops = {
.sysctl_scheduler_op = flask_sysctl_scheduler_op,
.set_target = flask_set_target,
.domctl = flask_domctl,
+#ifdef CONFIG_SYSCTL
.sysctl = flask_sysctl,
+#endif
.readconsole = flask_readconsole,
.evtchn_unbound = flask_evtchn_unbound,
--
2.34.1
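
The wrapping pattern used in this patch can be illustrated outside of Xen. The
following is only a minimal sketch, not the Xen code: DEMO_CONFIG_SYSCTL,
struct demo_ops, demo_hooks, demo_sysctl_hook() and demo_check_sysctl() are
all hypothetical names, and the macro is defined by hand here where a real
build would take it from Kconfig. It shows a hook that is only built and
registered when the option is on, and an inline wrapper that returns
-EOPNOTSUPP otherwise.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a Kconfig-generated macro. Remove this line to
 * model a build with the option disabled.
 */
#define DEMO_CONFIG_SYSCTL 1

/* Hypothetical hook table, loosely modelled on an ops structure. */
struct demo_ops {
    const char *name;
    int (*sysctl)(int cmd);
};

#ifdef DEMO_CONFIG_SYSCTL
/* The hook is only compiled when the option is enabled. */
static int demo_sysctl_hook(int cmd)
{
    /* Arbitrary demo policy: allow non-negative commands only. */
    return cmd >= 0 ? 0 : -EPERM;
}
#endif

static const struct demo_ops demo_hooks = {
    .name = "demo",
#ifdef DEMO_CONFIG_SYSCTL
    .sysctl = demo_sysctl_hook,
#endif
};

/* Callers always use the wrapper, so they need no #ifdef of their own. */
static inline int demo_check_sysctl(int cmd)
{
#ifdef DEMO_CONFIG_SYSCTL
    return demo_hooks.sysctl(cmd);
#else
    (void)cmd;
    return -EOPNOTSUPP;
#endif
}
```

With the option disabled the function pointer stays NULL, but the wrapper
never dereferences it, which is why the wrapper and the initializer have to be
guarded together.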
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 04/20] xen/sysctl: wrap around XEN_SYSCTL_readconsole
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (2 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 03/20] xen/xsm: wrap around xsm_sysctl with CONFIG_SYSCTL Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 05/20] xen/sysctl: make CONFIG_TRACEBUFFER depend on CONFIG_SYSCTL Penny Zheng
` (15 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Daniel P. Smith
The following functions handle the XEN_SYSCTL_readconsole sub-op and
shall be wrapped:
- xsm_readconsole
- read_console_ring
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v2 -> v3:
- move #endif up ahead of the blank line
---
xen/common/sysctl.c | 2 ++
xen/drivers/char/console.c | 2 ++
xen/include/xsm/xsm.h | 4 ++++
xen/xsm/dummy.c | 2 +-
xen/xsm/flask/hooks.c | 4 ++--
5 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index c2d99ae12e..814f153a23 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -58,6 +58,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
switch ( op->cmd )
{
+#ifdef CONFIG_SYSCTL
case XEN_SYSCTL_readconsole:
ret = xsm_readconsole(XSM_HOOK, op->u.readconsole.clear);
if ( ret )
@@ -65,6 +66,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = read_console_ring(&op->u.readconsole);
break;
+#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_tbuf_op:
ret = tb_control(&op->u.tbuf_op);
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index c3150fbdb7..64f7e146a7 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -336,6 +336,7 @@ static void conring_puts(const char *str, size_t len)
conringc = conringp - conring_size;
}
+#ifdef CONFIG_SYSCTL
long read_console_ring(struct xen_sysctl_readconsole *op)
{
XEN_GUEST_HANDLE_PARAM(char) str;
@@ -378,6 +379,7 @@ long read_console_ring(struct xen_sysctl_readconsole *op)
return 0;
}
+#endif /* CONFIG_SYSCTL */
/*
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 22e2429f52..042a99449f 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -270,7 +270,11 @@ static inline int xsm_sysctl(xsm_default_t def, int cmd)
static inline int xsm_readconsole(xsm_default_t def, uint32_t clear)
{
+#ifdef CONFIG_SYSCTL
return alternative_call(xsm_ops.readconsole, clear);
+#else
+ return -EOPNOTSUPP;
+#endif
}
static inline int xsm_evtchn_unbound(
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 93a0665ecc..cd0e844fcf 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -24,8 +24,8 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
.domctl = xsm_domctl,
#ifdef CONFIG_SYSCTL
.sysctl = xsm_sysctl,
-#endif
.readconsole = xsm_readconsole,
+#endif
.evtchn_unbound = xsm_evtchn_unbound,
.evtchn_interdomain = xsm_evtchn_interdomain,
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 3a43e5a1d6..df7e10775b 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -934,7 +934,6 @@ static int cf_check flask_sysctl(int cmd)
return avc_unknown_permission("sysctl", cmd);
}
}
-#endif /* CONFIG_SYSCTL */
static int cf_check flask_readconsole(uint32_t clear)
{
@@ -945,6 +944,7 @@ static int cf_check flask_readconsole(uint32_t clear)
return domain_has_xen(current->domain, perms);
}
+#endif /* CONFIG_SYSCTL */
static inline uint32_t resource_to_perm(uint8_t access)
{
@@ -1888,8 +1888,8 @@ static const struct xsm_ops __initconst_cf_clobber flask_ops = {
.domctl = flask_domctl,
#ifdef CONFIG_SYSCTL
.sysctl = flask_sysctl,
-#endif
.readconsole = flask_readconsole,
+#endif
.evtchn_unbound = flask_evtchn_unbound,
.evtchn_interdomain = flask_evtchn_interdomain,
--
2.34.1
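
As a rough illustration of why the handler and its case label are wrapped
together, the sketch below guards both with the same macro, so a build
without the option neither defines the handler nor references it. This is
only a sketch under stated assumptions: DEMO_CONFIG_SYSCTL, the DEMO_CMD_*
values, demo_read_console() and demo_dispatch() are hypothetical, and the
-ENOSYS fallback is chosen for the demo rather than taken from do_sysctl().

```c
#include <errno.h>

/* Hypothetical stand-in for a Kconfig-generated macro. */
#define DEMO_CONFIG_SYSCTL 1

/* Hypothetical sub-op numbers. */
enum demo_cmd {
    DEMO_CMD_READCONSOLE = 1,
    DEMO_CMD_DEBUG_KEYS = 2,
};

#ifdef DEMO_CONFIG_SYSCTL
/* Handler that only exists when the option is enabled. */
static int demo_read_console(void)
{
    return 0;
}
#endif

static int demo_dispatch(int cmd)
{
    int ret;

    switch ( cmd )
    {
#ifdef DEMO_CONFIG_SYSCTL
    case DEMO_CMD_READCONSOLE:
        ret = demo_read_console();
        break;
#endif /* DEMO_CONFIG_SYSCTL */

    case DEMO_CMD_DEBUG_KEYS:
        ret = 0;
        break;

    default:
        /* Guarded-out sub-ops end up here, like any unknown command. */
        ret = -ENOSYS;
        break;
    }

    return ret;
}
```

If only the handler were guarded, the remaining case label would still
reference it and the build would fail, so the two guards have to move
together.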
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 05/20] xen/sysctl: make CONFIG_TRACEBUFFER depend on CONFIG_SYSCTL
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (3 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 04/20] xen/sysctl: wrap around XEN_SYSCTL_readconsole Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 06/20] xen/sysctl: wrap around XEN_SYSCTL_sched_id Penny Zheng
` (14 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Users can only access trace buffers via the XEN_SYSCTL_tbuf_op hypercall,
so this commit makes CONFIG_TRACEBUFFER depend on CONFIG_SYSCTL.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
xen/common/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index d89e9ede77..9cccc37232 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -549,6 +549,7 @@ config DTB_FILE
config TRACEBUFFER
bool "Enable tracing infrastructure" if EXPERT
default y
+ depends on SYSCTL
help
Enable tracing infrastructure and pre-defined tracepoints within Xen.
This will allow live information about Xen's execution and performance
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 06/20] xen/sysctl: wrap around XEN_SYSCTL_sched_id
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (4 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 05/20] xen/sysctl: make CONFIG_TRACEBUFFER depend on CONFIG_SYSCTL Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 07/20] xen/sysctl: wrap around XEN_SYSCTL_perfc_op Penny Zheng
` (13 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Dario Faggioli, Juergen Gross,
George Dunlap, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
The following function shall be wrapped:
- scheduler_id
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v2 -> v3:
- move #endif up ahead of the blank line
---
xen/common/sched/core.c | 2 ++
xen/common/sysctl.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 9043414290..13fdf57e57 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2069,11 +2069,13 @@ long do_set_timer_op(s_time_t timeout)
return 0;
}
+#ifdef CONFIG_SYSCTL
/* scheduler_id - fetch ID of current scheduler */
int scheduler_id(void)
{
return operations.sched_id;
}
+#endif
/* Adjust scheduling parameter for a given domain. */
long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 814f153a23..b644347b40 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -72,9 +72,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = tb_control(&op->u.tbuf_op);
break;
+#ifdef CONFIG_SYSCTL
case XEN_SYSCTL_sched_id:
op->u.sched_id.sched_id = scheduler_id();
break;
+#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_getdomaininfolist:
{
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 07/20] xen/sysctl: wrap around XEN_SYSCTL_perfc_op
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (5 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 06/20] xen/sysctl: wrap around XEN_SYSCTL_sched_id Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 08/20] xen/sysctl: wrap around XEN_SYSCTL_lockprof_op Penny Zheng
` (12 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
perfc_control() and perfc_copy_info() are responsible for providing control
of perf counters via XEN_SYSCTL_perfc_op in dom0, so both shall be wrapped.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
xen/common/perfc.c | 2 ++
xen/common/sysctl.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/xen/common/perfc.c b/xen/common/perfc.c
index 8302b7cf6d..0f3b89af2c 100644
--- a/xen/common/perfc.c
+++ b/xen/common/perfc.c
@@ -149,6 +149,7 @@ void cf_check perfc_reset(unsigned char key)
}
}
+#ifdef CONFIG_SYSCTL
static struct xen_sysctl_perfc_desc perfc_d[NR_PERFCTRS];
static xen_sysctl_perfc_val_t *perfc_vals;
static unsigned int perfc_nbr_vals;
@@ -265,6 +266,7 @@ int perfc_control(struct xen_sysctl_perfc_op *pc)
return rc;
}
+#endif /* CONFIG_SYSCTL */
/*
* Local variables:
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index b644347b40..608e159571 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -117,11 +117,13 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
}
break;
+#ifdef CONFIG_SYSCTL
#ifdef CONFIG_PERF_COUNTERS
case XEN_SYSCTL_perfc_op:
ret = perfc_control(&op->u.perfc_op);
break;
#endif
+#endif /* CONFIG_SYSCTL */
#ifdef CONFIG_DEBUG_LOCK_PROFILE
case XEN_SYSCTL_lockprof_op:
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 08/20] xen/sysctl: wrap around XEN_SYSCTL_lockprof_op
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (6 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 07/20] xen/sysctl: wrap around XEN_SYSCTL_perfc_op Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 09/20] xen/pmstat: consolidate code into pmstat.c Penny Zheng
` (11 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
The following function only serves spinlock profiling via
XEN_SYSCTL_lockprof_op, so it shall be wrapped:
- spinlock_profile_control
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v2 -> v3:
- add the blank line
---
xen/common/spinlock.c | 2 ++
xen/common/sysctl.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 38caa10a2e..0389293b09 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -690,6 +690,7 @@ void cf_check spinlock_profile_reset(unsigned char key)
spinlock_profile_iterate(spinlock_profile_reset_elem, NULL);
}
+#ifdef CONFIG_SYSCTL
typedef struct {
struct xen_sysctl_lockprof_op *pc;
int rc;
@@ -749,6 +750,7 @@ int spinlock_profile_control(struct xen_sysctl_lockprof_op *pc)
return rc;
}
+#endif /* CONFIG_SYSCTL */
void _lock_profile_register_struct(
int32_t type, struct lock_profile_qhead *qhead, int32_t idx)
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 608e159571..2fe76362b1 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -123,13 +123,14 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = perfc_control(&op->u.perfc_op);
break;
#endif
-#endif /* CONFIG_SYSCTL */
#ifdef CONFIG_DEBUG_LOCK_PROFILE
case XEN_SYSCTL_lockprof_op:
ret = spinlock_profile_control(&op->u.lockprof_op);
break;
#endif
+#endif /* CONFIG_SYSCTL */
+
case XEN_SYSCTL_debug_keys:
{
char c;
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 09/20] xen/pmstat: consolidate code into pmstat.c
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (7 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 08/20] xen/sysctl: wrap around XEN_SYSCTL_lockprof_op Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-30 15:24 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP Penny Zheng
` (10 subsequent siblings)
19 siblings, 1 reply; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel; +Cc: ray.huang, Penny Zheng, Jan Beulich, Stefano Stabellini
We move the following functions into drivers/acpi/pmstat.c, as they
are all designed for performance statistics:
- cpufreq_residency_update
- cpufreq_statistic_reset
- cpufreq_statistic_update
- cpufreq_statistic_init
- cpufreq_statistic_exit
Consequently, the variables "cpufreq_statistic_data" and
"cpufreq_statistic_lock" shall become static.
We also move acpi_set_pdc_bits() out, as it is the handler for the XEN_PM_PDC
sub-hypercall and shall live together with the other handlers in
drivers/cpufreq/cpufreq.c.
Various style corrections are applied while moving these functions,
including:
- braces for if() and for() shall live on a separate line
- add extra spaces before and after the brackets of if() and for()
- use array notation
- convert uint32_t into unsigned int
- convert u32 into uint32_t
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v2 -> v3:
- braces for if() and for() shall live on a separate line
- use array notation
- convert uint32_t into unsigned int
---
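
For readers unfamiliar with the statistics being moved, here is a minimal
standalone sketch of the bookkeeping: busy time in the old state is the
elapsed wall time minus the idle time accumulated over the same interval, and
transitions are counted in a flat count * count array using array notation.
This is not the Xen code; struct demo_px_stats and the demo_stats_*()
helpers are hypothetical, timestamps are passed in by the caller, and locking
and per-CPU storage are left out.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified per-CPU statistics record. */
struct demo_px_stats {
    unsigned int count;     /* number of performance states */
    uint64_t *trans;        /* count * count transition counters */
    uint64_t *residency;    /* busy time spent in each state */
    uint64_t prev_wall;     /* wall clock at the last update */
    uint64_t prev_idle;     /* accumulated idle time at the last update */
};

static int demo_stats_init(struct demo_px_stats *s, unsigned int count,
                           uint64_t now, uint64_t idle)
{
    s->count = count;
    s->trans = calloc((size_t)count * count, sizeof(*s->trans));
    s->residency = calloc(count, sizeof(*s->residency));
    if ( !s->trans || !s->residency )
    {
        free(s->trans);
        free(s->residency);
        s->trans = NULL;
        s->residency = NULL;
        return -ENOMEM;
    }

    s->prev_wall = now;
    s->prev_idle = idle;

    return 0;
}

static void demo_stats_update(struct demo_px_stats *s, unsigned int from,
                              unsigned int to, uint64_t now, uint64_t idle)
{
    int64_t delta;

    if ( from >= s->count || to >= s->count )
        return;

    /* Busy time in "from": elapsed wall time minus elapsed idle time. */
    delta = (int64_t)((now - s->prev_wall) - (idle - s->prev_idle));
    if ( delta >= 0 )
        s->residency[from] += (uint64_t)delta;

    s->prev_wall = now;
    s->prev_idle = idle;

    /* Row "from", column "to" of the flattened transition matrix. */
    s->trans[(size_t)from * s->count + to]++;
}

static void demo_stats_exit(struct demo_px_stats *s)
{
    free(s->trans);
    free(s->residency);
    s->trans = NULL;
    s->residency = NULL;
}
```

For example, with three states, initial timestamps of 100 (wall) and 10
(idle), and an update from state 0 to state 2 at 200 (wall) and 40 (idle),
state 0 is credited with 100 - 30 = 70 units of residency.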
xen/drivers/acpi/pmstat.c | 202 ++++++++++++++++++----
xen/drivers/cpufreq/cpufreq.c | 31 ++++
xen/drivers/cpufreq/utility.c | 162 -----------------
xen/include/acpi/cpufreq/cpufreq.h | 2 -
xen/include/acpi/cpufreq/processor_perf.h | 4 -
5 files changed, 201 insertions(+), 200 deletions(-)
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
index c51b9ca358..abfdc45cc2 100644
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -41,7 +41,176 @@
#include <acpi/cpufreq/cpufreq.h>
#include <xen/pmstat.h>
-DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
+static DEFINE_PER_CPU_READ_MOSTLY(struct pm_px *, cpufreq_statistic_data);
+
+static DEFINE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
+
+/*********************************************************************
+ * Px STATISTIC INFO *
+ *********************************************************************/
+
+static void cpufreq_residency_update(unsigned int cpu, uint8_t state)
+{
+ uint64_t now, total_idle_ns;
+ int64_t delta;
+ struct pm_px *pxpt = per_cpu(cpufreq_statistic_data, cpu);
+
+ total_idle_ns = get_cpu_idle_time(cpu);
+ now = NOW();
+
+ delta = (now - pxpt->prev_state_wall) -
+ (total_idle_ns - pxpt->prev_idle_wall);
+
+ if ( likely(delta >= 0) )
+ pxpt->u.pt[state].residency += delta;
+
+ pxpt->prev_state_wall = now;
+ pxpt->prev_idle_wall = total_idle_ns;
+}
+
+void cpufreq_statistic_update(unsigned int cpu, uint8_t from, uint8_t to)
+{
+ struct pm_px *pxpt;
+ const struct processor_pminfo *pmpt = processor_pminfo[cpu];
+ spinlock_t *cpufreq_statistic_lock =
+ &per_cpu(cpufreq_statistic_lock, cpu);
+
+ spin_lock(cpufreq_statistic_lock);
+
+ pxpt = per_cpu(cpufreq_statistic_data, cpu);
+ if ( !pxpt || !pmpt )
+ {
+ spin_unlock(cpufreq_statistic_lock);
+ return;
+ }
+
+ pxpt->u.last = from;
+ pxpt->u.cur = to;
+ pxpt->u.pt[to].count++;
+
+ cpufreq_residency_update(cpu, from);
+
+ pxpt->u.trans_pt[from * pmpt->perf.state_count + to]++;
+
+ spin_unlock(cpufreq_statistic_lock);
+}
+
+int cpufreq_statistic_init(unsigned int cpu)
+{
+ unsigned int i, count;
+ struct pm_px *pxpt;
+ const struct processor_pminfo *pmpt = processor_pminfo[cpu];
+ spinlock_t *cpufreq_statistic_lock = &per_cpu(cpufreq_statistic_lock, cpu);
+
+ spin_lock_init(cpufreq_statistic_lock);
+
+ if ( !pmpt )
+ return -EINVAL;
+
+ spin_lock(cpufreq_statistic_lock);
+
+ pxpt = per_cpu(cpufreq_statistic_data, cpu);
+ if ( pxpt )
+ {
+ spin_unlock(cpufreq_statistic_lock);
+ return 0;
+ }
+
+ count = pmpt->perf.state_count;
+
+ pxpt = xzalloc(struct pm_px);
+ if ( !pxpt )
+ {
+ spin_unlock(cpufreq_statistic_lock);
+ return -ENOMEM;
+ }
+ per_cpu(cpufreq_statistic_data, cpu) = pxpt;
+
+ pxpt->u.trans_pt = xzalloc_array(uint64_t, count * count);
+ if ( !pxpt->u.trans_pt )
+ {
+ xfree(pxpt);
+ spin_unlock(cpufreq_statistic_lock);
+ return -ENOMEM;
+ }
+
+ pxpt->u.pt = xzalloc_array(struct pm_px_val, count);
+ if ( !pxpt->u.pt )
+ {
+ xfree(pxpt->u.trans_pt);
+ xfree(pxpt);
+ spin_unlock(cpufreq_statistic_lock);
+ return -ENOMEM;
+ }
+
+ pxpt->u.total = pmpt->perf.state_count;
+ pxpt->u.usable = pmpt->perf.state_count - pmpt->perf.platform_limit;
+
+ for ( i = 0; i < pmpt->perf.state_count; i++ )
+ pxpt->u.pt[i].freq = pmpt->perf.states[i].core_frequency;
+
+ pxpt->prev_state_wall = NOW();
+ pxpt->prev_idle_wall = get_cpu_idle_time(cpu);
+
+ spin_unlock(cpufreq_statistic_lock);
+
+ return 0;
+}
+
+void cpufreq_statistic_exit(unsigned int cpu)
+{
+ struct pm_px *pxpt;
+ spinlock_t *cpufreq_statistic_lock = &per_cpu(cpufreq_statistic_lock, cpu);
+
+ spin_lock(cpufreq_statistic_lock);
+
+ pxpt = per_cpu(cpufreq_statistic_data, cpu);
+ if ( !pxpt )
+ {
+ spin_unlock(cpufreq_statistic_lock);
+ return;
+ }
+
+ xfree(pxpt->u.trans_pt);
+ xfree(pxpt->u.pt);
+ xfree(pxpt);
+ per_cpu(cpufreq_statistic_data, cpu) = NULL;
+
+ spin_unlock(cpufreq_statistic_lock);
+}
+
+static void cpufreq_statistic_reset(unsigned int cpu)
+{
+ unsigned int i, j, count;
+ struct pm_px *pxpt;
+ const struct processor_pminfo *pmpt = processor_pminfo[cpu];
+ spinlock_t *cpufreq_statistic_lock = &per_cpu(cpufreq_statistic_lock, cpu);
+
+ spin_lock(cpufreq_statistic_lock);
+
+ pxpt = per_cpu(cpufreq_statistic_data, cpu);
+ if ( !pmpt || !pxpt || !pxpt->u.pt || !pxpt->u.trans_pt )
+ {
+ spin_unlock(cpufreq_statistic_lock);
+ return;
+ }
+
+ count = pmpt->perf.state_count;
+
+ for ( i = 0; i < count; i++ )
+ {
+ pxpt->u.pt[i].residency = 0;
+ pxpt->u.pt[i].count = 0;
+
+ for ( j = 0; j < count; j++ )
+ pxpt->u.trans_pt[i * count + j] = 0;
+ }
+
+ pxpt->prev_state_wall = NOW();
+ pxpt->prev_idle_wall = get_cpu_idle_time(cpu);
+
+ spin_unlock(cpufreq_statistic_lock);
+}
/*
* Get PM statistic info
@@ -518,34 +687,3 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
return ret;
}
-
-int acpi_set_pdc_bits(uint32_t acpi_id, XEN_GUEST_HANDLE(uint32) pdc)
-{
- u32 bits[3];
- int ret;
-
- if ( copy_from_guest(bits, pdc, 2) )
- ret = -EFAULT;
- else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
- ret = -EINVAL;
- else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
- ret = -EFAULT;
- else
- {
- u32 mask = 0;
-
- if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
- mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
- if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
- mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
- if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
- mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
- bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
- ACPI_PDC_SMP_C1PT) & ~mask;
- ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
- }
- if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
- ret = -EFAULT;
-
- return ret;
-}
diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index 19e2992335..c2d777e0ec 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -588,6 +588,37 @@ out:
return ret;
}
+int acpi_set_pdc_bits(unsigned int acpi_id, XEN_GUEST_HANDLE(uint32) pdc)
+{
+ uint32_t bits[3];
+ int ret;
+
+ if ( copy_from_guest(bits, pdc, 2) )
+ ret = -EFAULT;
+ else if ( bits[0] != ACPI_PDC_REVISION_ID || !bits[1] )
+ ret = -EINVAL;
+ else if ( copy_from_guest_offset(bits + 2, pdc, 2, 1) )
+ ret = -EFAULT;
+ else
+ {
+ uint32_t mask = 0;
+
+ if ( xen_processor_pmbits & XEN_PROCESSOR_PM_CX )
+ mask |= ACPI_PDC_C_MASK | ACPI_PDC_SMP_C1PT;
+ if ( xen_processor_pmbits & XEN_PROCESSOR_PM_PX )
+ mask |= ACPI_PDC_P_MASK | ACPI_PDC_SMP_C1PT;
+ if ( xen_processor_pmbits & XEN_PROCESSOR_PM_TX )
+ mask |= ACPI_PDC_T_MASK | ACPI_PDC_SMP_C1PT;
+ bits[2] &= (ACPI_PDC_C_MASK | ACPI_PDC_P_MASK | ACPI_PDC_T_MASK |
+ ACPI_PDC_SMP_C1PT) & ~mask;
+ ret = arch_acpi_set_pdc_bits(acpi_id, bits, mask);
+ }
+ if ( !ret && __copy_to_guest_offset(pdc, 2, bits + 2, 1) )
+ ret = -EFAULT;
+
+ return ret;
+}
+
static void cpufreq_cmdline_common_para(struct cpufreq_policy *new_policy)
{
if (usr_max_freq)
diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index e690a484f1..723045b240 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -35,168 +35,6 @@ struct cpufreq_driver __read_mostly cpufreq_driver;
struct processor_pminfo *__read_mostly processor_pminfo[NR_CPUS];
DEFINE_PER_CPU_READ_MOSTLY(struct cpufreq_policy *, cpufreq_cpu_policy);
-DEFINE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
-
-/*********************************************************************
- * Px STATISTIC INFO *
- *********************************************************************/
-
-void cpufreq_residency_update(unsigned int cpu, uint8_t state)
-{
- uint64_t now, total_idle_ns;
- int64_t delta;
- struct pm_px *pxpt = per_cpu(cpufreq_statistic_data, cpu);
-
- total_idle_ns = get_cpu_idle_time(cpu);
- now = NOW();
-
- delta = (now - pxpt->prev_state_wall) -
- (total_idle_ns - pxpt->prev_idle_wall);
-
- if ( likely(delta >= 0) )
- pxpt->u.pt[state].residency += delta;
-
- pxpt->prev_state_wall = now;
- pxpt->prev_idle_wall = total_idle_ns;
-}
-
-void cpufreq_statistic_update(unsigned int cpu, uint8_t from, uint8_t to)
-{
- struct pm_px *pxpt;
- struct processor_pminfo *pmpt = processor_pminfo[cpu];
- spinlock_t *cpufreq_statistic_lock =
- &per_cpu(cpufreq_statistic_lock, cpu);
-
- spin_lock(cpufreq_statistic_lock);
-
- pxpt = per_cpu(cpufreq_statistic_data, cpu);
- if ( !pxpt || !pmpt ) {
- spin_unlock(cpufreq_statistic_lock);
- return;
- }
-
- pxpt->u.last = from;
- pxpt->u.cur = to;
- pxpt->u.pt[to].count++;
-
- cpufreq_residency_update(cpu, from);
-
- (*(pxpt->u.trans_pt + from * pmpt->perf.state_count + to))++;
-
- spin_unlock(cpufreq_statistic_lock);
-}
-
-int cpufreq_statistic_init(unsigned int cpu)
-{
- uint32_t i, count;
- struct pm_px *pxpt;
- const struct processor_pminfo *pmpt = processor_pminfo[cpu];
- spinlock_t *cpufreq_statistic_lock = &per_cpu(cpufreq_statistic_lock, cpu);
-
- spin_lock_init(cpufreq_statistic_lock);
-
- if ( !pmpt )
- return -EINVAL;
-
- spin_lock(cpufreq_statistic_lock);
-
- pxpt = per_cpu(cpufreq_statistic_data, cpu);
- if ( pxpt ) {
- spin_unlock(cpufreq_statistic_lock);
- return 0;
- }
-
- count = pmpt->perf.state_count;
-
- pxpt = xzalloc(struct pm_px);
- if ( !pxpt ) {
- spin_unlock(cpufreq_statistic_lock);
- return -ENOMEM;
- }
- per_cpu(cpufreq_statistic_data, cpu) = pxpt;
-
- pxpt->u.trans_pt = xzalloc_array(uint64_t, count * count);
- if (!pxpt->u.trans_pt) {
- xfree(pxpt);
- spin_unlock(cpufreq_statistic_lock);
- return -ENOMEM;
- }
-
- pxpt->u.pt = xzalloc_array(struct pm_px_val, count);
- if (!pxpt->u.pt) {
- xfree(pxpt->u.trans_pt);
- xfree(pxpt);
- spin_unlock(cpufreq_statistic_lock);
- return -ENOMEM;
- }
-
- pxpt->u.total = pmpt->perf.state_count;
- pxpt->u.usable = pmpt->perf.state_count - pmpt->perf.platform_limit;
-
- for (i=0; i < pmpt->perf.state_count; i++)
- pxpt->u.pt[i].freq = pmpt->perf.states[i].core_frequency;
-
- pxpt->prev_state_wall = NOW();
- pxpt->prev_idle_wall = get_cpu_idle_time(cpu);
-
- spin_unlock(cpufreq_statistic_lock);
-
- return 0;
-}
-
-void cpufreq_statistic_exit(unsigned int cpu)
-{
- struct pm_px *pxpt;
- spinlock_t *cpufreq_statistic_lock = &per_cpu(cpufreq_statistic_lock, cpu);
-
- spin_lock(cpufreq_statistic_lock);
-
- pxpt = per_cpu(cpufreq_statistic_data, cpu);
- if (!pxpt) {
- spin_unlock(cpufreq_statistic_lock);
- return;
- }
-
- xfree(pxpt->u.trans_pt);
- xfree(pxpt->u.pt);
- xfree(pxpt);
- per_cpu(cpufreq_statistic_data, cpu) = NULL;
-
- spin_unlock(cpufreq_statistic_lock);
-}
-
-void cpufreq_statistic_reset(unsigned int cpu)
-{
- uint32_t i, j, count;
- struct pm_px *pxpt;
- const struct processor_pminfo *pmpt = processor_pminfo[cpu];
- spinlock_t *cpufreq_statistic_lock = &per_cpu(cpufreq_statistic_lock, cpu);
-
- spin_lock(cpufreq_statistic_lock);
-
- pxpt = per_cpu(cpufreq_statistic_data, cpu);
- if ( !pmpt || !pxpt || !pxpt->u.pt || !pxpt->u.trans_pt ) {
- spin_unlock(cpufreq_statistic_lock);
- return;
- }
-
- count = pmpt->perf.state_count;
-
- for (i=0; i < count; i++) {
- pxpt->u.pt[i].residency = 0;
- pxpt->u.pt[i].count = 0;
-
- for (j=0; j < count; j++)
- *(pxpt->u.trans_pt + i*count + j) = 0;
- }
-
- pxpt->prev_state_wall = NOW();
- pxpt->prev_idle_wall = get_cpu_idle_time(cpu);
-
- spin_unlock(cpufreq_statistic_lock);
-}
-
-
/*********************************************************************
* FREQUENCY TABLE HELPERS *
*********************************************************************/
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index a3c84143af..241117a9af 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -20,8 +20,6 @@
#include "processor_perf.h"
-DECLARE_PER_CPU(spinlock_t, cpufreq_statistic_lock);
-
extern bool cpufreq_verbose;
enum cpufreq_xen_opt {
diff --git a/xen/include/acpi/cpufreq/processor_perf.h b/xen/include/acpi/cpufreq/processor_perf.h
index 301104e16f..6de43f8602 100644
--- a/xen/include/acpi/cpufreq/processor_perf.h
+++ b/xen/include/acpi/cpufreq/processor_perf.h
@@ -9,11 +9,9 @@
unsigned int powernow_register_driver(void);
unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
-void cpufreq_residency_update(unsigned int cpu, uint8_t state);
void cpufreq_statistic_update(unsigned int cpu, uint8_t from, uint8_t to);
int cpufreq_statistic_init(unsigned int cpu);
void cpufreq_statistic_exit(unsigned int cpu);
-void cpufreq_statistic_reset(unsigned int cpu);
int cpufreq_limit_change(unsigned int cpu);
@@ -56,7 +54,5 @@ struct pm_px {
uint64_t prev_idle_wall;
};
-DECLARE_PER_CPU(struct pm_px *, cpufreq_statistic_data);
-
int cpufreq_cpu_init(unsigned int cpu);
#endif /* __XEN_PROCESSOR_PM_H__ */
--
2.34.1
* [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (8 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 09/20] xen/pmstat: consolidate code into pmstat.c Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 21:09 ` Stefano Stabellini
2025-04-30 15:32 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS Penny Zheng
` (9 subsequent siblings)
19 siblings, 2 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini
We move the following functions into a new file drivers/acpi/pm_op.c, as
they all deal with performance control and are only called by
do_pm_op():
- get_cpufreq_para()
- set_cpufreq_para()
- set_cpufreq_gov()
- set_cpufreq_cppc()
- cpufreq_driver_getavg()
- cpufreq_update_turbo()
- cpufreq_get_turbo_status()
We introduce a new Kconfig CONFIG_PM_OP to wrap the new file.
Also, although the following helpers are only called by do_pm_op(), they
depend on variables local to their files, so we wrap them with CONFIG_PM_OP
in place:
- write_userspace_scaling_setspeed()
- write_ondemand_sampling_rate()
- write_ondemand_up_threshold()
- get_cpufreq_ondemand_para()
- cpufreq_driver.update()
- get_hwp_para()
Various style corrections are applied while moving these functions,
including:
- add the missing spaces inside the parentheses of if() and switch()
- fix indentation
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
---
v2 -> v3
- new commit
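For reviewers, here is a minimal standalone sketch of the guarding pattern this
patch applies: an optional driver hook is compiled only under the config
symbol, the caller checks the hook for NULL, and the dispatcher drops the case
when the option is off. All demo_* and DEMO_* names are hypothetical and exist
only for illustration; CONFIG_PM_OP is defined by hand here instead of coming
from Kconfig, and the -ENOSYS fallback is an assumption of the sketch, not a
statement about do_sysctl().

```c
#include <errno.h>
#include <stddef.h>

/* Hand-defined stand-in for the Kconfig symbol; remove to model PM_OP=n. */
#define CONFIG_PM_OP 1

/* Hypothetical turbo states, for illustration only. */
enum { DEMO_TURBO_DISABLED = -1, DEMO_TURBO_ENABLED = 1 };

struct demo_policy {
    int turbo;
};

struct demo_driver {
    int (*target)(struct demo_policy *policy, unsigned int freq);
    /* Optional hook: stays NULL when the option is compiled out. */
    int (*update)(unsigned int cpu, struct demo_policy *policy);
};

static int demo_target(struct demo_policy *policy, unsigned int freq)
{
    (void)policy;
    (void)freq;
    return 0;
}

#ifdef CONFIG_PM_OP
static int demo_update(unsigned int cpu, struct demo_policy *policy)
{
    (void)cpu;
    (void)policy;
    return 0;
}
#endif /* CONFIG_PM_OP */

static const struct demo_driver demo_drv = {
    .target = demo_target,
#ifdef CONFIG_PM_OP
    .update = demo_update,
#endif
};

#ifdef CONFIG_PM_OP
/* Only reachable from the PM_OP case below, so it is guarded as well. */
static int demo_set_turbo(struct demo_policy *policy, int new_state)
{
    int old_state = policy->turbo;
    int ret = 0;

    if ( old_state == new_state )
        return 0;

    policy->turbo = new_state;
    if ( demo_drv.update )
    {
        ret = demo_drv.update(0, policy);
        if ( ret )
            policy->turbo = old_state; /* roll back on failure */
    }

    return ret;
}
#endif /* CONFIG_PM_OP */

enum demo_cmd { DEMO_CMD_GET_STATS, DEMO_CMD_PM_OP };

int demo_dispatch(enum demo_cmd cmd, struct demo_policy *policy)
{
    switch ( cmd )
    {
    case DEMO_CMD_GET_STATS:
        return 0;

#ifdef CONFIG_PM_OP
    case DEMO_CMD_PM_OP:
        return demo_set_turbo(policy, DEMO_TURBO_ENABLED);
#endif

    default:
        /* With the option off, DEMO_CMD_PM_OP falls through to here. */
        return -ENOSYS;
    }
}
```

With the define removed, both the hook and its only caller disappear from the
build, which is the footprint reduction the series is after.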
---
xen/arch/x86/acpi/cpufreq/hwp.c | 6 +
xen/arch/x86/acpi/cpufreq/powernow.c | 4 +
xen/common/Kconfig | 7 +
xen/common/sysctl.c | 4 +-
xen/drivers/acpi/Makefile | 1 +
xen/drivers/acpi/pm_op.c | 409 +++++++++++++++++++
xen/drivers/acpi/pmstat.c | 357 ----------------
xen/drivers/cpufreq/cpufreq_misc_governors.c | 2 +
xen/drivers/cpufreq/cpufreq_ondemand.c | 2 +
xen/drivers/cpufreq/utility.c | 41 --
xen/include/acpi/cpufreq/cpufreq.h | 3 -
11 files changed, 434 insertions(+), 402 deletions(-)
create mode 100644 xen/drivers/acpi/pm_op.c
diff --git a/xen/arch/x86/acpi/cpufreq/hwp.c b/xen/arch/x86/acpi/cpufreq/hwp.c
index d5fa3d47ca..e4c09244ab 100644
--- a/xen/arch/x86/acpi/cpufreq/hwp.c
+++ b/xen/arch/x86/acpi/cpufreq/hwp.c
@@ -466,6 +466,7 @@ static int cf_check hwp_cpufreq_cpu_exit(struct cpufreq_policy *policy)
return 0;
}
+#ifdef CONFIG_PM_OP
/*
* The SDM reads like turbo should be disabled with MSR_IA32_PERF_CTL and
* PERF_CTL_TURBO_DISENGAGE, but that does not seem to actually work, at least
@@ -508,6 +509,7 @@ static int cf_check hwp_cpufreq_update(unsigned int cpu, struct cpufreq_policy *
return per_cpu(hwp_drv_data, cpu)->ret;
}
+#endif /* CONFIG_PM_OP */
static const struct cpufreq_driver __initconst_cf_clobber
hwp_cpufreq_driver = {
@@ -516,9 +518,12 @@ hwp_cpufreq_driver = {
.target = hwp_cpufreq_target,
.init = hwp_cpufreq_cpu_init,
.exit = hwp_cpufreq_cpu_exit,
+#ifdef CONFIG_PM_OP
.update = hwp_cpufreq_update,
+#endif
};
+#ifdef CONFIG_PM_OP
int get_hwp_para(unsigned int cpu,
struct xen_cppc_para *cppc_para)
{
@@ -639,6 +644,7 @@ int set_hwp_para(struct cpufreq_policy *policy,
return hwp_cpufreq_target(policy, 0, 0);
}
+#endif /* CONFIG_PM_OP */
int __init hwp_register_driver(void)
{
diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
index 69364e1855..12fca45b45 100644
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -49,6 +49,7 @@ static void cf_check transition_pstate(void *pstate)
wrmsrl(MSR_PSTATE_CTRL, *(unsigned int *)pstate);
}
+#ifdef CONFIG_PM_OP
static void cf_check update_cpb(void *data)
{
struct cpufreq_policy *policy = data;
@@ -77,6 +78,7 @@ static int cf_check powernow_cpufreq_update(
return 0;
}
+#endif /* CONFIG_PM_OP */
static int cf_check powernow_cpufreq_target(
struct cpufreq_policy *policy,
@@ -324,7 +326,9 @@ powernow_cpufreq_driver = {
.target = powernow_cpufreq_target,
.init = powernow_cpufreq_cpu_init,
.exit = powernow_cpufreq_cpu_exit,
+#ifdef CONFIG_PM_OP
.update = powernow_cpufreq_update
+#endif
};
unsigned int __init powernow_register_driver(void)
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 9cccc37232..ca1f692487 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -593,4 +593,11 @@ config SYSCTL
to reduce Xen footprint.
endmenu
+config PM_OP
+ bool "Enable Performance Management Operation"
+ depends on ACPI && HAS_CPUFREQ && SYSCTL
+ default y
+ help
+ This option enables userspace performance management control
+ for power/performance analysis and tuning.
endmenu
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 2fe76362b1..4ab827b694 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -181,13 +181,15 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
case XEN_SYSCTL_get_pmstat:
ret = do_get_pm_info(&op->u.get_pmstat);
break;
+#endif
+#ifdef CONFIG_PM_OP
case XEN_SYSCTL_pm_op:
ret = do_pm_op(&op->u.pm_op);
if ( ret == -EAGAIN )
copyback = 1;
break;
-#endif
+#endif /* CONFIG_PM_OP */
case XEN_SYSCTL_page_offline_op:
{
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 2fc5230253..e1f84a4468 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -6,6 +6,7 @@ obj-bin-y += tables.init.o
obj-$(CONFIG_ACPI_NUMA) += numa.o
obj-y += osl.o
obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
+obj-$(CONFIG_PM_OP) += pm_op.o
obj-$(CONFIG_X86) += hwregs.o
obj-$(CONFIG_X86) += reboot.o
diff --git a/xen/drivers/acpi/pm_op.c b/xen/drivers/acpi/pm_op.c
new file mode 100644
index 0000000000..3123cb9556
--- /dev/null
+++ b/xen/drivers/acpi/pm_op.c
@@ -0,0 +1,409 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include <xen/acpi.h>
+#include <xen/domain.h>
+#include <xen/errno.h>
+#include <xen/guest_access.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+
+#include <acpi/cpufreq/cpufreq.h>
+#include <public/platform.h>
+#include <public/sysctl.h>
+
+/*
+ * 1. Get PM parameter
+ * 2. Provide user PM control
+ */
+static int cpufreq_update_turbo(unsigned int cpu, int new_state)
+{
+ struct cpufreq_policy *policy;
+ int curr_state;
+ int ret = 0;
+
+ if ( new_state != CPUFREQ_TURBO_ENABLED &&
+ new_state != CPUFREQ_TURBO_DISABLED )
+ return -EINVAL;
+
+ policy = per_cpu(cpufreq_cpu_policy, cpu);
+ if ( !policy )
+ return -EACCES;
+
+ if ( policy->turbo == CPUFREQ_TURBO_UNSUPPORTED )
+ return -EOPNOTSUPP;
+
+ curr_state = policy->turbo;
+ if ( curr_state == new_state )
+ return 0;
+
+ policy->turbo = new_state;
+ if ( cpufreq_driver.update )
+ {
+ ret = alternative_call(cpufreq_driver.update, cpu, policy);
+ if ( ret )
+ policy->turbo = curr_state;
+ }
+
+ return ret;
+}
+
+static int cpufreq_get_turbo_status(unsigned int cpu)
+{
+ struct cpufreq_policy *policy;
+
+ policy = per_cpu(cpufreq_cpu_policy, cpu);
+ return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
+}
+
+static int read_scaling_available_governors(char *scaling_available_governors,
+ unsigned int size)
+{
+ unsigned int i = 0;
+ struct cpufreq_governor *t;
+
+ if ( !scaling_available_governors )
+ return -EINVAL;
+
+ list_for_each_entry(t, &cpufreq_governor_list, governor_list)
+ {
+ i += scnprintf(&scaling_available_governors[i],
+ CPUFREQ_NAME_LEN, "%s ", t->name);
+ if ( i > size )
+ return -EINVAL;
+ }
+ scaling_available_governors[i-1] = '\0';
+
+ return 0;
+}
+
+static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
+{
+ uint32_t ret = 0;
+ const struct processor_pminfo *pmpt;
+ struct cpufreq_policy *policy;
+ uint32_t gov_num = 0;
+ uint32_t *data;
+ char *scaling_available_governors;
+ struct list_head *pos;
+ unsigned int cpu, i = 0;
+
+ pmpt = processor_pminfo[op->cpuid];
+ policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+
+ if ( !pmpt || !pmpt->perf.states ||
+ !policy || !policy->governor )
+ return -EINVAL;
+
+ list_for_each(pos, &cpufreq_governor_list)
+ gov_num++;
+
+ if ( (op->u.get_para.cpu_num != cpumask_weight(policy->cpus)) ||
+ (op->u.get_para.freq_num != pmpt->perf.state_count) ||
+ (op->u.get_para.gov_num != gov_num) )
+ {
+ op->u.get_para.cpu_num = cpumask_weight(policy->cpus);
+ op->u.get_para.freq_num = pmpt->perf.state_count;
+ op->u.get_para.gov_num = gov_num;
+ return -EAGAIN;
+ }
+
+ if ( !(data = xzalloc_array(uint32_t,
+ max(op->u.get_para.cpu_num,
+ op->u.get_para.freq_num))) )
+ return -ENOMEM;
+
+ for_each_cpu(cpu, policy->cpus)
+ data[i++] = cpu;
+ ret = copy_to_guest(op->u.get_para.affected_cpus,
+ data, op->u.get_para.cpu_num);
+
+ for ( i = 0; i < op->u.get_para.freq_num; i++ )
+ data[i] = pmpt->perf.states[i].core_frequency * 1000;
+ ret += copy_to_guest(op->u.get_para.scaling_available_frequencies,
+ data, op->u.get_para.freq_num);
+
+ xfree(data);
+ if ( ret )
+ return -EFAULT;
+
+ op->u.get_para.cpuinfo_cur_freq =
+ cpufreq_driver.get ? alternative_call(cpufreq_driver.get, op->cpuid)
+ : policy->cur;
+ op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
+ op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
+ op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
+
+ if ( cpufreq_driver.name[0] )
+ strlcpy(op->u.get_para.scaling_driver,
+ cpufreq_driver.name, CPUFREQ_NAME_LEN);
+ else
+ strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
+
+ if ( IS_ENABLED(CONFIG_INTEL) &&
+ !strncmp(op->u.get_para.scaling_driver, XEN_HWP_DRIVER_NAME,
+ CPUFREQ_NAME_LEN) )
+ ret = get_hwp_para(policy->cpu, &op->u.get_para.u.cppc_para);
+ else
+ {
+ if ( !(scaling_available_governors =
+ xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
+ return -ENOMEM;
+ if ( (ret = read_scaling_available_governors(
+ scaling_available_governors,
+ (gov_num * CPUFREQ_NAME_LEN *
+ sizeof(*scaling_available_governors)))) )
+ {
+ xfree(scaling_available_governors);
+ return ret;
+ }
+ ret = copy_to_guest(op->u.get_para.scaling_available_governors,
+ scaling_available_governors,
+ gov_num * CPUFREQ_NAME_LEN);
+ xfree(scaling_available_governors);
+ if ( ret )
+ return -EFAULT;
+
+ op->u.get_para.u.s.scaling_cur_freq = policy->cur;
+ op->u.get_para.u.s.scaling_max_freq = policy->max;
+ op->u.get_para.u.s.scaling_min_freq = policy->min;
+
+ if ( policy->governor->name[0] )
+ strlcpy(op->u.get_para.u.s.scaling_governor,
+ policy->governor->name, CPUFREQ_NAME_LEN);
+ else
+ strlcpy(op->u.get_para.u.s.scaling_governor, "Unknown",
+ CPUFREQ_NAME_LEN);
+
+ /* governor specific para */
+ if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
+ "userspace", CPUFREQ_NAME_LEN) )
+ op->u.get_para.u.s.u.userspace.scaling_setspeed = policy->cur;
+
+ if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
+ "ondemand", CPUFREQ_NAME_LEN) )
+ ret = get_cpufreq_ondemand_para(
+ &op->u.get_para.u.s.u.ondemand.sampling_rate_max,
+ &op->u.get_para.u.s.u.ondemand.sampling_rate_min,
+ &op->u.get_para.u.s.u.ondemand.sampling_rate,
+ &op->u.get_para.u.s.u.ondemand.up_threshold);
+ }
+
+ return ret;
+}
+
+static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
+{
+ struct cpufreq_policy new_policy, *old_policy;
+
+ old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+ if ( !old_policy )
+ return -EINVAL;
+
+ memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
+
+ new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
+ if ( new_policy.governor == NULL )
+ return -EINVAL;
+
+ return __cpufreq_set_policy(old_policy, &new_policy);
+}
+
+static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
+{
+ int ret = 0;
+ struct cpufreq_policy *policy;
+
+ policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+
+ if ( !policy || !policy->governor )
+ return -EINVAL;
+
+ if ( hwp_active() )
+ return -EOPNOTSUPP;
+
+ switch ( op->u.set_para.ctrl_type )
+ {
+ case SCALING_MAX_FREQ:
+ {
+ struct cpufreq_policy new_policy;
+
+ memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
+ new_policy.max = op->u.set_para.ctrl_value;
+ ret = __cpufreq_set_policy(policy, &new_policy);
+
+ break;
+ }
+
+ case SCALING_MIN_FREQ:
+ {
+ struct cpufreq_policy new_policy;
+
+ memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
+ new_policy.min = op->u.set_para.ctrl_value;
+ ret = __cpufreq_set_policy(policy, &new_policy);
+
+ break;
+ }
+
+ case SCALING_SETSPEED:
+ {
+ unsigned int freq = op->u.set_para.ctrl_value;
+
+ if ( !strncasecmp(policy->governor->name,
+ "userspace", CPUFREQ_NAME_LEN) )
+ ret = write_userspace_scaling_setspeed(op->cpuid, freq);
+ else
+ ret = -EINVAL;
+
+ break;
+ }
+
+ case SAMPLING_RATE:
+ {
+ unsigned int sampling_rate = op->u.set_para.ctrl_value;
+
+ if ( !strncasecmp(policy->governor->name,
+ "ondemand", CPUFREQ_NAME_LEN) )
+ ret = write_ondemand_sampling_rate(sampling_rate);
+ else
+ ret = -EINVAL;
+
+ break;
+ }
+
+ case UP_THRESHOLD:
+ {
+ unsigned int up_threshold = op->u.set_para.ctrl_value;
+
+ if ( !strncasecmp(policy->governor->name,
+ "ondemand", CPUFREQ_NAME_LEN) )
+ ret = write_ondemand_up_threshold(up_threshold);
+ else
+ ret = -EINVAL;
+
+ break;
+ }
+
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+static int set_cpufreq_cppc(struct xen_sysctl_pm_op *op)
+{
+ struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+
+ if ( !policy || !policy->governor )
+ return -ENOENT;
+
+ if ( !hwp_active() )
+ return -EOPNOTSUPP;
+
+ return set_hwp_para(policy, &op->u.set_cppc);
+}
+
+int do_pm_op(struct xen_sysctl_pm_op *op)
+{
+ int ret = 0;
+ const struct processor_pminfo *pmpt;
+
+ switch ( op->cmd )
+ {
+ case XEN_SYSCTL_pm_op_set_sched_opt_smt:
+ {
+ uint32_t saved_value = sched_smt_power_savings;
+
+ if ( op->cpuid != 0 )
+ return -EINVAL;
+ sched_smt_power_savings = !!op->u.set_sched_opt_smt;
+ op->u.set_sched_opt_smt = saved_value;
+ return 0;
+ }
+
+ case XEN_SYSCTL_pm_op_get_max_cstate:
+ BUILD_BUG_ON(XEN_SYSCTL_CX_UNLIMITED != UINT_MAX);
+ if ( op->cpuid == 0 )
+ op->u.get_max_cstate = acpi_get_cstate_limit();
+ else if ( op->cpuid == 1 )
+ op->u.get_max_cstate = acpi_get_csubstate_limit();
+ else
+ ret = -EINVAL;
+ return ret;
+
+ case XEN_SYSCTL_pm_op_set_max_cstate:
+ if ( op->cpuid == 0 )
+ acpi_set_cstate_limit(op->u.set_max_cstate);
+ else if ( op->cpuid == 1 )
+ acpi_set_csubstate_limit(op->u.set_max_cstate);
+ else
+ ret = -EINVAL;
+ return ret;
+ }
+
+ if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
+ return -EINVAL;
+ pmpt = processor_pminfo[op->cpuid];
+
+ switch ( op->cmd & PM_PARA_CATEGORY_MASK )
+ {
+ case CPUFREQ_PARA:
+ if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
+ return -ENODEV;
+ if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
+ return -EINVAL;
+ break;
+ }
+
+ switch ( op->cmd )
+ {
+ case GET_CPUFREQ_PARA:
+ {
+ ret = get_cpufreq_para(op);
+ break;
+ }
+
+ case SET_CPUFREQ_GOV:
+ {
+ ret = set_cpufreq_gov(op);
+ break;
+ }
+
+ case SET_CPUFREQ_PARA:
+ {
+ ret = set_cpufreq_para(op);
+ break;
+ }
+
+ case SET_CPUFREQ_CPPC:
+ ret = set_cpufreq_cppc(op);
+ break;
+
+ case GET_CPUFREQ_AVGFREQ:
+ {
+ op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
+ break;
+ }
+
+ case XEN_SYSCTL_pm_op_enable_turbo:
+ {
+ ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
+ break;
+ }
+
+ case XEN_SYSCTL_pm_op_disable_turbo:
+ {
+ ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
+ break;
+ }
+
+ default:
+ printk("not defined sub-hypercall @ do_pm_op\n");
+ ret = -ENOSYS;
+ break;
+ }
+
+ return ret;
+}
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
index abfdc45cc2..61b60e59a2 100644
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -330,360 +330,3 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
return ret;
}
-
-/*
- * 1. Get PM parameter
- * 2. Provide user PM control
- */
-static int read_scaling_available_governors(char *scaling_available_governors,
- unsigned int size)
-{
- unsigned int i = 0;
- struct cpufreq_governor *t;
-
- if ( !scaling_available_governors )
- return -EINVAL;
-
- list_for_each_entry(t, &cpufreq_governor_list, governor_list)
- {
- i += scnprintf(&scaling_available_governors[i],
- CPUFREQ_NAME_LEN, "%s ", t->name);
- if ( i > size )
- return -EINVAL;
- }
- scaling_available_governors[i-1] = '\0';
-
- return 0;
-}
-
-static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
-{
- uint32_t ret = 0;
- const struct processor_pminfo *pmpt;
- struct cpufreq_policy *policy;
- uint32_t gov_num = 0;
- uint32_t *data;
- char *scaling_available_governors;
- struct list_head *pos;
- unsigned int cpu, i = 0;
-
- pmpt = processor_pminfo[op->cpuid];
- policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
-
- if ( !pmpt || !pmpt->perf.states ||
- !policy || !policy->governor )
- return -EINVAL;
-
- list_for_each(pos, &cpufreq_governor_list)
- gov_num++;
-
- if ( (op->u.get_para.cpu_num != cpumask_weight(policy->cpus)) ||
- (op->u.get_para.freq_num != pmpt->perf.state_count) ||
- (op->u.get_para.gov_num != gov_num) )
- {
- op->u.get_para.cpu_num = cpumask_weight(policy->cpus);
- op->u.get_para.freq_num = pmpt->perf.state_count;
- op->u.get_para.gov_num = gov_num;
- return -EAGAIN;
- }
-
- if ( !(data = xzalloc_array(uint32_t,
- max(op->u.get_para.cpu_num,
- op->u.get_para.freq_num))) )
- return -ENOMEM;
-
- for_each_cpu(cpu, policy->cpus)
- data[i++] = cpu;
- ret = copy_to_guest(op->u.get_para.affected_cpus,
- data, op->u.get_para.cpu_num);
-
- for ( i = 0; i < op->u.get_para.freq_num; i++ )
- data[i] = pmpt->perf.states[i].core_frequency * 1000;
- ret += copy_to_guest(op->u.get_para.scaling_available_frequencies,
- data, op->u.get_para.freq_num);
-
- xfree(data);
- if ( ret )
- return -EFAULT;
-
- op->u.get_para.cpuinfo_cur_freq =
- cpufreq_driver.get ? alternative_call(cpufreq_driver.get, op->cpuid)
- : policy->cur;
- op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
- op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
- op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
-
- if ( cpufreq_driver.name[0] )
- strlcpy(op->u.get_para.scaling_driver,
- cpufreq_driver.name, CPUFREQ_NAME_LEN);
- else
- strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
-
- if ( IS_ENABLED(CONFIG_INTEL) &&
- !strncmp(op->u.get_para.scaling_driver, XEN_HWP_DRIVER_NAME,
- CPUFREQ_NAME_LEN) )
- ret = get_hwp_para(policy->cpu, &op->u.get_para.u.cppc_para);
- else
- {
- if ( !(scaling_available_governors =
- xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
- return -ENOMEM;
- if ( (ret = read_scaling_available_governors(
- scaling_available_governors,
- (gov_num * CPUFREQ_NAME_LEN *
- sizeof(*scaling_available_governors)))) )
- {
- xfree(scaling_available_governors);
- return ret;
- }
- ret = copy_to_guest(op->u.get_para.scaling_available_governors,
- scaling_available_governors,
- gov_num * CPUFREQ_NAME_LEN);
- xfree(scaling_available_governors);
- if ( ret )
- return -EFAULT;
-
- op->u.get_para.u.s.scaling_cur_freq = policy->cur;
- op->u.get_para.u.s.scaling_max_freq = policy->max;
- op->u.get_para.u.s.scaling_min_freq = policy->min;
-
- if ( policy->governor->name[0] )
- strlcpy(op->u.get_para.u.s.scaling_governor,
- policy->governor->name, CPUFREQ_NAME_LEN);
- else
- strlcpy(op->u.get_para.u.s.scaling_governor, "Unknown",
- CPUFREQ_NAME_LEN);
-
- /* governor specific para */
- if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
- "userspace", CPUFREQ_NAME_LEN) )
- op->u.get_para.u.s.u.userspace.scaling_setspeed = policy->cur;
-
- if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
- "ondemand", CPUFREQ_NAME_LEN) )
- ret = get_cpufreq_ondemand_para(
- &op->u.get_para.u.s.u.ondemand.sampling_rate_max,
- &op->u.get_para.u.s.u.ondemand.sampling_rate_min,
- &op->u.get_para.u.s.u.ondemand.sampling_rate,
- &op->u.get_para.u.s.u.ondemand.up_threshold);
- }
-
- return ret;
-}
-
-static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
-{
- struct cpufreq_policy new_policy, *old_policy;
-
- old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
- if ( !old_policy )
- return -EINVAL;
-
- memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
-
- new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
- if (new_policy.governor == NULL)
- return -EINVAL;
-
- return __cpufreq_set_policy(old_policy, &new_policy);
-}
-
-static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
-{
- int ret = 0;
- struct cpufreq_policy *policy;
-
- policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
-
- if ( !policy || !policy->governor )
- return -EINVAL;
-
- if ( hwp_active() )
- return -EOPNOTSUPP;
-
- switch(op->u.set_para.ctrl_type)
- {
- case SCALING_MAX_FREQ:
- {
- struct cpufreq_policy new_policy;
-
- memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
- new_policy.max = op->u.set_para.ctrl_value;
- ret = __cpufreq_set_policy(policy, &new_policy);
-
- break;
- }
-
- case SCALING_MIN_FREQ:
- {
- struct cpufreq_policy new_policy;
-
- memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
- new_policy.min = op->u.set_para.ctrl_value;
- ret = __cpufreq_set_policy(policy, &new_policy);
-
- break;
- }
-
- case SCALING_SETSPEED:
- {
- unsigned int freq =op->u.set_para.ctrl_value;
-
- if ( !strncasecmp(policy->governor->name,
- "userspace", CPUFREQ_NAME_LEN) )
- ret = write_userspace_scaling_setspeed(op->cpuid, freq);
- else
- ret = -EINVAL;
-
- break;
- }
-
- case SAMPLING_RATE:
- {
- unsigned int sampling_rate = op->u.set_para.ctrl_value;
-
- if ( !strncasecmp(policy->governor->name,
- "ondemand", CPUFREQ_NAME_LEN) )
- ret = write_ondemand_sampling_rate(sampling_rate);
- else
- ret = -EINVAL;
-
- break;
- }
-
- case UP_THRESHOLD:
- {
- unsigned int up_threshold = op->u.set_para.ctrl_value;
-
- if ( !strncasecmp(policy->governor->name,
- "ondemand", CPUFREQ_NAME_LEN) )
- ret = write_ondemand_up_threshold(up_threshold);
- else
- ret = -EINVAL;
-
- break;
- }
-
- default:
- ret = -EINVAL;
- break;
- }
-
- return ret;
-}
-
-static int set_cpufreq_cppc(struct xen_sysctl_pm_op *op)
-{
- struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
-
- if ( !policy || !policy->governor )
- return -ENOENT;
-
- if ( !hwp_active() )
- return -EOPNOTSUPP;
-
- return set_hwp_para(policy, &op->u.set_cppc);
-}
-
-int do_pm_op(struct xen_sysctl_pm_op *op)
-{
- int ret = 0;
- const struct processor_pminfo *pmpt;
-
- switch ( op->cmd )
- {
- case XEN_SYSCTL_pm_op_set_sched_opt_smt:
- {
- uint32_t saved_value = sched_smt_power_savings;
-
- if ( op->cpuid != 0 )
- return -EINVAL;
- sched_smt_power_savings = !!op->u.set_sched_opt_smt;
- op->u.set_sched_opt_smt = saved_value;
- return 0;
- }
-
- case XEN_SYSCTL_pm_op_get_max_cstate:
- BUILD_BUG_ON(XEN_SYSCTL_CX_UNLIMITED != UINT_MAX);
- if ( op->cpuid == 0 )
- op->u.get_max_cstate = acpi_get_cstate_limit();
- else if ( op->cpuid == 1 )
- op->u.get_max_cstate = acpi_get_csubstate_limit();
- else
- ret = -EINVAL;
- return ret;
-
- case XEN_SYSCTL_pm_op_set_max_cstate:
- if ( op->cpuid == 0 )
- acpi_set_cstate_limit(op->u.set_max_cstate);
- else if ( op->cpuid == 1 )
- acpi_set_csubstate_limit(op->u.set_max_cstate);
- else
- ret = -EINVAL;
- return ret;
- }
-
- if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
- return -EINVAL;
- pmpt = processor_pminfo[op->cpuid];
-
- switch ( op->cmd & PM_PARA_CATEGORY_MASK )
- {
- case CPUFREQ_PARA:
- if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
- return -ENODEV;
- if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
- return -EINVAL;
- break;
- }
-
- switch ( op->cmd )
- {
- case GET_CPUFREQ_PARA:
- {
- ret = get_cpufreq_para(op);
- break;
- }
-
- case SET_CPUFREQ_GOV:
- {
- ret = set_cpufreq_gov(op);
- break;
- }
-
- case SET_CPUFREQ_PARA:
- {
- ret = set_cpufreq_para(op);
- break;
- }
-
- case SET_CPUFREQ_CPPC:
- ret = set_cpufreq_cppc(op);
- break;
-
- case GET_CPUFREQ_AVGFREQ:
- {
- op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
- break;
- }
-
- case XEN_SYSCTL_pm_op_enable_turbo:
- {
- ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
- break;
- }
-
- case XEN_SYSCTL_pm_op_disable_turbo:
- {
- ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
- break;
- }
-
- default:
- printk("not defined sub-hypercall @ do_pm_op\n");
- ret = -ENOSYS;
- break;
- }
-
- return ret;
-}
diff --git a/xen/drivers/cpufreq/cpufreq_misc_governors.c b/xen/drivers/cpufreq/cpufreq_misc_governors.c
index 0327fad23b..e5cb9ab02f 100644
--- a/xen/drivers/cpufreq/cpufreq_misc_governors.c
+++ b/xen/drivers/cpufreq/cpufreq_misc_governors.c
@@ -64,6 +64,7 @@ static int cf_check cpufreq_governor_userspace(
return ret;
}
+#ifdef CONFIG_PM_OP
int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq)
{
struct cpufreq_policy *policy;
@@ -80,6 +81,7 @@ int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq)
return __cpufreq_driver_target(policy, freq, CPUFREQ_RELATION_L);
}
+#endif /* CONFIG_PM_OP */
static bool __init cf_check
cpufreq_userspace_handle_option(const char *name, const char *val)
diff --git a/xen/drivers/cpufreq/cpufreq_ondemand.c b/xen/drivers/cpufreq/cpufreq_ondemand.c
index 06cfc88d30..0126a3f5d9 100644
--- a/xen/drivers/cpufreq/cpufreq_ondemand.c
+++ b/xen/drivers/cpufreq/cpufreq_ondemand.c
@@ -57,6 +57,7 @@ static struct dbs_tuners {
static DEFINE_PER_CPU(struct timer, dbs_timer);
+#ifdef CONFIG_PM_OP
int write_ondemand_sampling_rate(unsigned int sampling_rate)
{
if ( (sampling_rate > MAX_SAMPLING_RATE / MICROSECS(1)) ||
@@ -93,6 +94,7 @@ int get_cpufreq_ondemand_para(uint32_t *sampling_rate_max,
return 0;
}
+#endif /* CONFIG_PM_OP */
static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
{
diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index 723045b240..987c3b5929 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -224,47 +224,6 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
return policy->cur;
}
-int cpufreq_update_turbo(unsigned int cpu, int new_state)
-{
- struct cpufreq_policy *policy;
- int curr_state;
- int ret = 0;
-
- if (new_state != CPUFREQ_TURBO_ENABLED &&
- new_state != CPUFREQ_TURBO_DISABLED)
- return -EINVAL;
-
- policy = per_cpu(cpufreq_cpu_policy, cpu);
- if (!policy)
- return -EACCES;
-
- if (policy->turbo == CPUFREQ_TURBO_UNSUPPORTED)
- return -EOPNOTSUPP;
-
- curr_state = policy->turbo;
- if (curr_state == new_state)
- return 0;
-
- policy->turbo = new_state;
- if (cpufreq_driver.update)
- {
- ret = alternative_call(cpufreq_driver.update, cpu, policy);
- if (ret)
- policy->turbo = curr_state;
- }
-
- return ret;
-}
-
-
-int cpufreq_get_turbo_status(unsigned int cpu)
-{
- struct cpufreq_policy *policy;
-
- policy = per_cpu(cpufreq_cpu_policy, cpu);
- return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
-}
-
/*********************************************************************
* POLICY *
*********************************************************************/
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index 241117a9af..0742aa9f44 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -143,9 +143,6 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
#define CPUFREQ_TURBO_UNSUPPORTED 0
#define CPUFREQ_TURBO_ENABLED 1
-int cpufreq_update_turbo(unsigned int cpu, int new_state);
-int cpufreq_get_turbo_status(unsigned int cpu);
-
static inline int
__cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
{
--
2.34.1
* [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (9 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 21:12 ` Stefano Stabellini
2025-04-21 7:37 ` [PATCH v3 12/20] xen/sysctl: wrap around XEN_SYSCTL_page_offline_op Penny Zheng
` (8 subsequent siblings)
19 siblings, 1 reply; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini
Introduce a new Kconfig option, CONFIG_PM_STATS, to wrap all operations
related to performance management statistics.
Most of the code resides in xen/drivers/acpi/pmstat.c, including the
handler for the pm-statistics sysctl op: do_get_pm_info().
CONFIG_PM_STATS also depends on CONFIG_SYSCTL.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
---
v1 -> v2:
- rename to CONFIG_PM_STATS
- fix indentation and stray semicolon
- make code movements into a new commit
- No need to wrap inline functions and declarations
---
v2 -> v3:
- separate functions related to do_pm_op() into a new commit
- both braces shall be moved to the line with the closing parenthesis
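Illustration only, not part of the patch: the stub approach taken in
processor_perf.h can be sketched in isolation as below. The function names
mirror the ones touched here, but the bodies, the demo_transitions counter and
the demo_change_state() caller are hypothetical simplifications, and defining
the Kconfig symbol by hand is an assumption of the sketch.

```c
#include <stdint.h>

/*
 * Uncomment to model a build with the option enabled. In Xen the symbol
 * comes from Kconfig; defining it by hand is an assumption of this sketch.
 */
/* #define CONFIG_PM_STATS 1 */

#ifdef CONFIG_PM_STATS
/* Simplified stand-in for the real statistics code. */
static unsigned int demo_transitions;

static inline int cpufreq_statistic_init(unsigned int cpu)
{
    (void)cpu;
    demo_transitions = 0;
    return 0;
}

static inline void cpufreq_statistic_update(unsigned int cpu, uint8_t from,
                                            uint8_t to)
{
    (void)cpu;
    if ( from != to )
        demo_transitions++;
}
#else
/* Empty stubs with the same signatures, so callers need no #ifdef. */
static inline int cpufreq_statistic_init(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

static inline void cpufreq_statistic_update(unsigned int cpu, uint8_t from,
                                            uint8_t to)
{
    (void)cpu;
    (void)from;
    (void)to;
}
#endif /* CONFIG_PM_STATS */

/* Hypothetical caller: the source is identical in both configurations. */
static int demo_change_state(unsigned int cpu, uint8_t from, uint8_t to)
{
    int rc = cpufreq_statistic_init(cpu);

    if ( rc )
        return rc;

    cpufreq_statistic_update(cpu, from, to);
    return 0;
}
```

With the option off, the compiler can drop the stub calls entirely, which is
why the call sites are left unguarded.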
---
xen/arch/x86/acpi/cpu_idle.c | 2 ++
xen/common/Kconfig | 8 ++++++++
xen/common/sysctl.c | 4 ++--
xen/drivers/acpi/Makefile | 2 +-
xen/include/acpi/cpufreq/processor_perf.h | 10 ++++++++++
5 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index 420198406d..b537ac4cd6 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -1487,6 +1487,7 @@ static void amd_cpuidle_init(struct acpi_processor_power *power)
vendor_override = -1;
}
+#ifdef CONFIG_PM_STATS
uint32_t pmstat_get_cx_nr(unsigned int cpu)
{
return processor_powers[cpu] ? processor_powers[cpu]->count : 0;
@@ -1606,6 +1607,7 @@ int pmstat_reset_cx_stat(unsigned int cpu)
{
return 0;
}
+#endif /* CONFIG_PM_STATS */
void cpuidle_disable_deep_cstate(void)
{
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index ca1f692487..d8e242eebc 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -600,4 +600,12 @@ config PM_OP
help
This option shall enable userspace performance management control
to do power/performance analyzing and tuning.
+
+config PM_STATS
+ bool "Enable Performance Management Statistics"
+ depends on ACPI && HAS_CPUFREQ && SYSCTL
+ default y
+ help
+ Enable collection of performance management statistics to aid in
+ analyzing and tuning power/performance characteristics of the system
endmenu
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 4ab827b694..baaad3bd42 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -177,11 +177,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
op->u.availheap.avail_bytes <<= PAGE_SHIFT;
break;
-#if defined (CONFIG_ACPI) && defined (CONFIG_HAS_CPUFREQ)
+#ifdef CONFIG_PM_STATS
case XEN_SYSCTL_get_pmstat:
ret = do_get_pm_info(&op->u.get_pmstat);
break;
-#endif
+#endif /* CONFIG_PM_STATS */
#ifdef CONFIG_PM_OP
case XEN_SYSCTL_pm_op:
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index e1f84a4468..b52b006100 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -5,7 +5,7 @@ obj-$(CONFIG_X86) += apei/
obj-bin-y += tables.init.o
obj-$(CONFIG_ACPI_NUMA) += numa.o
obj-y += osl.o
-obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
+obj-$(CONFIG_PM_STATS) += pmstat.o
obj-$(CONFIG_PM_OP) += pm_op.o
obj-$(CONFIG_X86) += hwregs.o
diff --git a/xen/include/acpi/cpufreq/processor_perf.h b/xen/include/acpi/cpufreq/processor_perf.h
index 6de43f8602..a9a3b7a372 100644
--- a/xen/include/acpi/cpufreq/processor_perf.h
+++ b/xen/include/acpi/cpufreq/processor_perf.h
@@ -9,9 +9,19 @@
unsigned int powernow_register_driver(void);
unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
+#ifdef CONFIG_PM_STATS
void cpufreq_statistic_update(unsigned int cpu, uint8_t from, uint8_t to);
int cpufreq_statistic_init(unsigned int cpu);
void cpufreq_statistic_exit(unsigned int cpu);
+#else
+static inline void cpufreq_statistic_update(unsigned int cpu, uint8_t from,
+ uint8_t to) {}
+static inline int cpufreq_statistic_init(unsigned int cpu)
+{
+ return 0;
+}
+static inline void cpufreq_statistic_exit(unsigned int cpu) {}
+#endif /* CONFIG_PM_STATS */
int cpufreq_limit_change(unsigned int cpu);
--
2.34.1
* [PATCH v3 12/20] xen/sysctl: wrap around XEN_SYSCTL_page_offline_op
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (10 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 13/20] xen/sysctl: wrap around XEN_SYSCTL_cpupool_op Penny Zheng
` (7 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Daniel P. Smith
The following functions are only used to handle XEN_SYSCTL_page_offline_op,
so they shall be wrapped:
- xsm_page_offline
- online_page
- query_page_offline
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v1 -> v2:
- add transient #ifdef in sysctl.c for correct compilation
- no need to wrap declarations
- place the #ifdef inside the function body to have less redundancy
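Illustration only, not part of the patch: a minimal sketch of placing the
#ifdef inside the wrapper body, so that the wrapper exists in every
configuration while the hook slot itself is compiled out. All demo_* names are
hypothetical, and a plain indirect call stands in for alternative_call().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical ops table; the hook slot only exists with CONFIG_SYSCTL. */
struct demo_xsm_ops {
#ifdef CONFIG_SYSCTL
    int (*page_offline)(uint32_t cmd);
#endif
    int (*hypfs_op)(void);
};

#ifdef CONFIG_SYSCTL
static int demo_page_offline_hook(uint32_t cmd)
{
    (void)cmd;
    return 0;
}
#endif

static int demo_hypfs_hook(void)
{
    return 0;
}

static const struct demo_xsm_ops demo_ops = {
#ifdef CONFIG_SYSCTL
    .page_offline = demo_page_offline_hook,
#endif
    .hypfs_op = demo_hypfs_hook,
};

/*
 * The wrapper is always defined, so callers compile unchanged. Only its
 * body differs: without CONFIG_SYSCTL it reports the op as unsupported.
 */
static inline int demo_xsm_page_offline(uint32_t cmd)
{
#ifdef CONFIG_SYSCTL
    return demo_ops.page_offline(cmd);
#else
    (void)cmd;
    return -EOPNOTSUPP;
#endif
}
```

Keeping a single definition of the wrapper avoids duplicating its signature in
an #else branch, which is the redundancy the v2 note refers to.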
---
xen/common/page_alloc.c | 2 ++
xen/common/sysctl.c | 2 ++
xen/include/xsm/xsm.h | 6 ++++++
xen/xsm/dummy.c | 2 ++
xen/xsm/flask/hooks.c | 6 ++++++
5 files changed, 18 insertions(+)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index bd4538c28d..cc2ad4423a 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1758,6 +1758,7 @@ int offline_page(mfn_t mfn, int broken, uint32_t *status)
return 0;
}
+#ifdef CONFIG_SYSCTL
/*
* Online the memory.
* The caller should make sure end_pfn <= max_page,
@@ -1842,6 +1843,7 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
return 0;
}
+#endif /* CONFIG_SYSCTL */
/*
* This function should only be called with valid pages from the same NUMA
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index baaad3bd42..504e3516c3 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -191,6 +191,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
break;
#endif /* CONFIG_PM_OP */
+#ifdef CONFIG_SYSCTL
case XEN_SYSCTL_page_offline_op:
{
uint32_t *status, *ptr;
@@ -251,6 +252,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
copyback = 0;
}
break;
+#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_cpupool_op:
ret = cpupool_do_sysctl(&op->u.cpupool_op);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 042a99449f..5ac99904c4 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -138,7 +138,9 @@ struct xsm_ops {
int (*resource_setup_gsi)(int gsi);
int (*resource_setup_misc)(void);
+#ifdef CONFIG_SYSCTL
int (*page_offline)(uint32_t cmd);
+#endif
int (*hypfs_op)(void);
long (*do_xsm_op)(XEN_GUEST_HANDLE_PARAM(void) op);
@@ -597,7 +599,11 @@ static inline int xsm_resource_setup_misc(xsm_default_t def)
static inline int xsm_page_offline(xsm_default_t def, uint32_t cmd)
{
+#ifdef CONFIG_SYSCTL
return alternative_call(xsm_ops.page_offline, cmd);
+#else
+ return -EOPNOTSUPP;
+#endif
}
static inline int xsm_hypfs_op(xsm_default_t def)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index cd0e844fcf..d46413ad8c 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -96,7 +96,9 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
.resource_setup_gsi = xsm_resource_setup_gsi,
.resource_setup_misc = xsm_resource_setup_misc,
+#ifdef CONFIG_SYSCTL
.page_offline = xsm_page_offline,
+#endif
.hypfs_op = xsm_hypfs_op,
.hvm_param = xsm_hvm_param,
.hvm_param_altp2mhvm = xsm_hvm_param_altp2mhvm,
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index df7e10775b..45c12aa662 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1206,10 +1206,12 @@ static int cf_check flask_resource_unplug_core(void)
return avc_current_has_perm(SECINITSID_DOMXEN, SECCLASS_RESOURCE, RESOURCE__UNPLUG, NULL);
}
+#ifdef CONFIG_SYSCTL
static int flask_resource_use_core(void)
{
return avc_current_has_perm(SECINITSID_DOMXEN, SECCLASS_RESOURCE, RESOURCE__USE, NULL);
}
+#endif /* CONFIG_SYSCTL */
static int cf_check flask_resource_plug_pci(uint32_t machine_bdf)
{
@@ -1274,6 +1276,7 @@ static int cf_check flask_resource_setup_misc(void)
return avc_current_has_perm(SECINITSID_XEN, SECCLASS_RESOURCE, RESOURCE__SETUP, NULL);
}
+#ifdef CONFIG_SYSCTL
static inline int cf_check flask_page_offline(uint32_t cmd)
{
switch ( cmd )
@@ -1288,6 +1291,7 @@ static inline int cf_check flask_page_offline(uint32_t cmd)
return avc_unknown_permission("page_offline", cmd);
}
}
+#endif /* CONFIG_SYSCTL */
static inline int cf_check flask_hypfs_op(void)
{
@@ -1948,7 +1952,9 @@ static const struct xsm_ops __initconst_cf_clobber flask_ops = {
.resource_setup_gsi = flask_resource_setup_gsi,
.resource_setup_misc = flask_resource_setup_misc,
+#ifdef CONFIG_SYSCTL
.page_offline = flask_page_offline,
+#endif
.hypfs_op = flask_hypfs_op,
.hvm_param = flask_hvm_param,
.hvm_param_altp2mhvm = flask_hvm_param_altp2mhvm,
--
2.34.1
* [PATCH v3 13/20] xen/sysctl: wrap around XEN_SYSCTL_cpupool_op
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (11 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 12/20] xen/sysctl: wrap around XEN_SYSCTL_page_offline_op Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 14/20] xen/sysctl: wrap around XEN_SYSCTL_scheduler_op Penny Zheng
` (6 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Dario Faggioli, Juergen Gross,
George Dunlap, Andrew Cooper, Anthony PERARD, Michal Orzel,
Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Function cpupool_do_sysctl() performs cpupool-related sysctl operations, so
it shall be wrapped.
The following static functions are only called by cpupool_do_sysctl(), so
they shall be wrapped too:
- cpupool_get_next_by_id
- cpupool_destroy
- cpupool_unassign_cpu_helper
- cpupool_unassign_cpu
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v1 -> v2:
- no need to wrap declaration
- add transient #ifdef in sysctl.c for correct compilation
---
v2 -> v3:
- move #endif up ahead of the blank line
---
xen/common/sched/cpupool.c | 8 ++++++++
xen/common/sysctl.c | 2 +-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/xen/common/sched/cpupool.c b/xen/common/sched/cpupool.c
index 3d02c7b706..f5459c2779 100644
--- a/xen/common/sched/cpupool.c
+++ b/xen/common/sched/cpupool.c
@@ -241,10 +241,12 @@ struct cpupool *cpupool_get_by_id(unsigned int poolid)
return __cpupool_get_by_id(poolid, true);
}
+#ifdef CONFIG_SYSCTL
static struct cpupool *cpupool_get_next_by_id(unsigned int poolid)
{
return __cpupool_get_by_id(poolid, false);
}
+#endif /* CONFIG_SYSCTL */
void cpupool_put(struct cpupool *pool)
{
@@ -352,6 +354,7 @@ static struct cpupool *cpupool_create(unsigned int poolid,
return ERR_PTR(ret);
}
+#ifdef CONFIG_SYSCTL
/*
* destroys the given cpupool
* returns 0 on success, 1 else
@@ -379,6 +382,7 @@ static int cpupool_destroy(struct cpupool *c)
debugtrace_printk("cpupool_destroy(pool=%u)\n", c->cpupool_id);
return 0;
}
+#endif /* CONFIG_SYSCTL */
/*
* Move domain to another cpupool
@@ -568,6 +572,7 @@ static int cpupool_unassign_cpu_start(struct cpupool *c, unsigned int cpu)
return ret;
}
+#ifdef CONFIG_SYSCTL
static long cf_check cpupool_unassign_cpu_helper(void *info)
{
struct cpupool *c = info;
@@ -633,6 +638,7 @@ static int cpupool_unassign_cpu(struct cpupool *c, unsigned int cpu)
}
return continue_hypercall_on_cpu(work_cpu, cpupool_unassign_cpu_helper, c);
}
+#endif /* CONFIG_SYSCTL */
/*
* add a new domain to a cpupool
@@ -810,6 +816,7 @@ static void cpupool_cpu_remove_forced(unsigned int cpu)
rcu_read_unlock(&sched_res_rculock);
}
+#ifdef CONFIG_SYSCTL
/*
* do cpupool related sysctl operations
*/
@@ -975,6 +982,7 @@ int cpupool_do_sysctl(struct xen_sysctl_cpupool_op *op)
return ret;
}
+#endif /* CONFIG_SYSCTL */
unsigned int cpupool_get_id(const struct domain *d)
{
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 504e3516c3..767e0b7389 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -252,11 +252,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
copyback = 0;
}
break;
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_cpupool_op:
ret = cpupool_do_sysctl(&op->u.cpupool_op);
break;
+#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_scheduler_op:
ret = sched_adjust_global(&op->u.scheduler_op);
--
2.34.1
* [PATCH v3 14/20] xen/sysctl: wrap around XEN_SYSCTL_scheduler_op
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (12 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 13/20] xen/sysctl: wrap around XEN_SYSCTL_cpupool_op Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 15/20] xen: make avail_domheap_pages() inlined into get_outstanding_claims() Penny Zheng
` (5 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel, xen-devel
Cc: ray.huang, Penny Zheng, Nathan Studer, Stewart Hildebrand,
Dario Faggioli, Juergen Gross, George Dunlap, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Daniel P. Smith
Function sched_adjust_global() is only used for XEN_SYSCTL_scheduler_op, so
it and everything on its call path, such as the .adjust_global hook, shall
be wrapped.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Stewart Hildebrand <stewart@stew.dk> #a653
---
v1 -> v2:
- no need to wrap declarations
- add transient #ifdef in sysctl.c for correct compilation
---
v2 -> v3:
- move #endif up ahead of the blank line
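Illustration only, not part of the patch: a sketch of an optional scheduler
hook that is guarded at all three levels touched here (struct member,
initializer, wrapper). The demo_* names and the int-typed op are hypothetical.
The sketch defines CONFIG_SYSCTL by hand to model the enabled build.

```c
#include <stddef.h>

#ifndef CONFIG_SYSCTL
#define CONFIG_SYSCTL 1 /* assumption: model the enabled configuration */
#endif

/* Hypothetical, trimmed-down scheduler descriptor. */
struct demo_scheduler {
    const char *name;
#ifdef CONFIG_SYSCTL
    int (*adjust_global)(const struct demo_scheduler *ops, int op);
#endif
};

#ifdef CONFIG_SYSCTL
/* Stand-in hook: simply echoes the op it was given. */
static int demo_sys_cntl(const struct demo_scheduler *ops, int op)
{
    (void)ops;
    return op;
}

/* A scheduler without the hook is treated as a successful no-op. */
static inline int demo_adjust_cpupool(const struct demo_scheduler *s, int op)
{
    return s->adjust_global ? s->adjust_global(s, op) : 0;
}
#endif /* CONFIG_SYSCTL */

static const struct demo_scheduler demo_with_hook = {
    .name = "with-hook",
#ifdef CONFIG_SYSCTL
    .adjust_global = demo_sys_cntl,
#endif
};

/* Members left out of a designated initializer are zero, i.e. NULL here. */
static const struct demo_scheduler demo_without_hook = {
    .name = "without-hook",
};
```

Because the member, its initializers and the wrapper all disappear together,
no scheduler can end up referencing a slot that is not there.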
---
xen/common/sched/arinc653.c | 6 ++++++
xen/common/sched/core.c | 2 ++
xen/common/sched/credit.c | 4 ++++
xen/common/sched/credit2.c | 4 ++++
xen/common/sched/private.h | 4 ++++
xen/common/sysctl.c | 2 +-
xen/include/xsm/xsm.h | 4 ++++
xen/xsm/dummy.c | 2 ++
xen/xsm/flask/hooks.c | 4 ++++
9 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index 432ccfe662..3c014c9934 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -220,6 +220,7 @@ static void update_schedule_units(const struct scheduler *ops)
SCHED_PRIV(ops)->schedule[i].unit_id);
}
+#ifdef CONFIG_SYSCTL
/**
* This function is called by the adjust_global scheduler hook to put
* in place a new ARINC653 schedule.
@@ -334,6 +335,7 @@ arinc653_sched_get(
return 0;
}
+#endif /* CONFIG_SYSCTL */
/**************************************************************************
* Scheduler callback functions *
@@ -653,6 +655,7 @@ a653_switch_sched(struct scheduler *new_ops, unsigned int cpu,
return &sr->_lock;
}
+#ifdef CONFIG_SYSCTL
/**
* Xen scheduler callback function to perform a global (not domain-specific)
* adjustment. It is used by the ARINC 653 scheduler to put in place a new
@@ -692,6 +695,7 @@ a653sched_adjust_global(const struct scheduler *ops,
return rc;
}
+#endif /* CONFIG_SYSCTL */
/**
* This structure defines our scheduler for Xen.
@@ -726,7 +730,9 @@ static const struct scheduler sched_arinc653_def = {
.switch_sched = a653_switch_sched,
.adjust = NULL,
+#ifdef CONFIG_SYSCTL
.adjust_global = a653sched_adjust_global,
+#endif
.dump_settings = NULL,
.dump_cpu_state = NULL,
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 13fdf57e57..ea95dea65a 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -2112,6 +2112,7 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
return ret;
}
+#ifdef CONFIG_SYSCTL
long sched_adjust_global(struct xen_sysctl_scheduler_op *op)
{
struct cpupool *pool;
@@ -2140,6 +2141,7 @@ long sched_adjust_global(struct xen_sysctl_scheduler_op *op)
return rc;
}
+#endif /* CONFIG_SYSCTL */
static void vcpu_periodic_timer_work_locked(struct vcpu *v)
{
diff --git a/xen/common/sched/credit.c b/xen/common/sched/credit.c
index a6bb321e7d..6dcf6b2c8b 100644
--- a/xen/common/sched/credit.c
+++ b/xen/common/sched/credit.c
@@ -1256,6 +1256,7 @@ __csched_set_tslice(struct csched_private *prv, unsigned int timeslice_ms)
prv->credit = prv->credits_per_tslice * prv->ncpus;
}
+#ifdef CONFIG_SYSCTL
static int cf_check
csched_sys_cntl(const struct scheduler *ops,
struct xen_sysctl_scheduler_op *sc)
@@ -1298,6 +1299,7 @@ csched_sys_cntl(const struct scheduler *ops,
out:
return rc;
}
+#endif /* CONFIG_SYSCTL */
static void *cf_check
csched_alloc_domdata(const struct scheduler *ops, struct domain *dom)
@@ -2288,7 +2290,9 @@ static const struct scheduler sched_credit_def = {
.adjust = csched_dom_cntl,
.adjust_affinity= csched_aff_cntl,
+#ifdef CONFIG_SYSCTL
.adjust_global = csched_sys_cntl,
+#endif
.pick_resource = csched_res_pick,
.do_schedule = csched_schedule,
diff --git a/xen/common/sched/credit2.c b/xen/common/sched/credit2.c
index 0a83f23725..0b3b61df57 100644
--- a/xen/common/sched/credit2.c
+++ b/xen/common/sched/credit2.c
@@ -3131,6 +3131,7 @@ csched2_aff_cntl(const struct scheduler *ops, struct sched_unit *unit,
__clear_bit(__CSFLAG_pinned, &svc->flags);
}
+#ifdef CONFIG_SYSCTL
static int cf_check csched2_sys_cntl(
const struct scheduler *ops, struct xen_sysctl_scheduler_op *sc)
{
@@ -3162,6 +3163,7 @@ static int cf_check csched2_sys_cntl(
return 0;
}
+#endif /* CONFIG_SYSCTL */
static void *cf_check
csched2_alloc_domdata(const struct scheduler *ops, struct domain *dom)
@@ -4232,7 +4234,9 @@ static const struct scheduler sched_credit2_def = {
.adjust = csched2_dom_cntl,
.adjust_affinity= csched2_aff_cntl,
+#ifdef CONFIG_SYSCTL
.adjust_global = csched2_sys_cntl,
+#endif
.pick_resource = csched2_res_pick,
.migrate = csched2_unit_migrate,
diff --git a/xen/common/sched/private.h b/xen/common/sched/private.h
index c0e7c96d24..d6884550cd 100644
--- a/xen/common/sched/private.h
+++ b/xen/common/sched/private.h
@@ -356,8 +356,10 @@ struct scheduler {
struct sched_unit *unit,
const struct cpumask *hard,
const struct cpumask *soft);
+#ifdef CONFIG_SYSCTL
int (*adjust_global) (const struct scheduler *ops,
struct xen_sysctl_scheduler_op *sc);
+#endif
void (*dump_settings) (const struct scheduler *ops);
void (*dump_cpu_state) (const struct scheduler *ops, int cpu);
void (*move_timers) (const struct scheduler *ops,
@@ -510,11 +512,13 @@ static inline int sched_adjust_dom(const struct scheduler *s, struct domain *d,
return s->adjust ? s->adjust(s, d, op) : 0;
}
+#ifdef CONFIG_SYSCTL
static inline int sched_adjust_cpupool(const struct scheduler *s,
struct xen_sysctl_scheduler_op *op)
{
return s->adjust_global ? s->adjust_global(s, op) : 0;
}
+#endif
static inline void sched_move_timers(const struct scheduler *s,
struct sched_resource *sr)
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 767e0b7389..200e0a0488 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -256,11 +256,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
case XEN_SYSCTL_cpupool_op:
ret = cpupool_do_sysctl(&op->u.cpupool_op);
break;
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_scheduler_op:
ret = sched_adjust_global(&op->u.scheduler_op);
break;
+#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_physinfo:
{
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 5ac99904c4..6e1789c314 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -57,7 +57,9 @@ struct xsm_ops {
int (*domain_create)(struct domain *d, uint32_t ssidref);
int (*getdomaininfo)(struct domain *d);
int (*domctl_scheduler_op)(struct domain *d, int op);
+#ifdef CONFIG_SYSCTL
int (*sysctl_scheduler_op)(int op);
+#endif
int (*set_target)(struct domain *d, struct domain *e);
int (*domctl)(struct domain *d, unsigned int cmd, uint32_t ssidref);
int (*sysctl)(int cmd);
@@ -244,10 +246,12 @@ static inline int xsm_domctl_scheduler_op(
return alternative_call(xsm_ops.domctl_scheduler_op, d, cmd);
}
+#ifdef CONFIG_SYSCTL
static inline int xsm_sysctl_scheduler_op(xsm_default_t def, int cmd)
{
return alternative_call(xsm_ops.sysctl_scheduler_op, cmd);
}
+#endif
static inline int xsm_set_target(
xsm_default_t def, struct domain *d, struct domain *e)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index d46413ad8c..8d44f5bfb6 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -19,7 +19,9 @@ static const struct xsm_ops __initconst_cf_clobber dummy_ops = {
.domain_create = xsm_domain_create,
.getdomaininfo = xsm_getdomaininfo,
.domctl_scheduler_op = xsm_domctl_scheduler_op,
+#ifdef CONFIG_SYSCTL
.sysctl_scheduler_op = xsm_sysctl_scheduler_op,
+#endif
.set_target = xsm_set_target,
.domctl = xsm_domctl,
#ifdef CONFIG_SYSCTL
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 45c12aa662..a7cb33a718 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -626,6 +626,7 @@ static int cf_check flask_domctl_scheduler_op(struct domain *d, int op)
}
}
+#ifdef CONFIG_SYSCTL
static int cf_check flask_sysctl_scheduler_op(int op)
{
switch ( op )
@@ -640,6 +641,7 @@ static int cf_check flask_sysctl_scheduler_op(int op)
return avc_unknown_permission("sysctl_scheduler_op", op);
}
}
+#endif /* CONFIG_SYSCTL */
static int cf_check flask_set_target(struct domain *d, struct domain *t)
{
@@ -1887,7 +1889,9 @@ static const struct xsm_ops __initconst_cf_clobber flask_ops = {
.domain_create = flask_domain_create,
.getdomaininfo = flask_getdomaininfo,
.domctl_scheduler_op = flask_domctl_scheduler_op,
+#ifdef CONFIG_SYSCTL
.sysctl_scheduler_op = flask_sysctl_scheduler_op,
+#endif
.set_target = flask_set_target,
.domctl = flask_domctl,
#ifdef CONFIG_SYSCTL
--
2.34.1
* [PATCH v3 15/20] xen: make avail_domheap_pages() inlined into get_outstanding_claims()
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (13 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 14/20] xen/sysctl: wrap around XEN_SYSCTL_scheduler_op Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 16/20] xen/sysctl: wrap around XEN_SYSCTL_physinfo Penny Zheng
` (4 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
Function avail_domheap_pages() is only invoked by get_outstanding_claims(),
so it can be inlined into its sole caller.
Move avail_heap_pages() up, ahead of get_outstanding_claims(), so that no
forward declaration is needed.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v1 -> v2:
- inline avail_domheap_pages() into its sole caller
- move up avail_heap_pages()
---
v2 -> v3:
- change the title
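Illustration only, not part of the patch: a self-contained sketch of the
zone/node summation that get_outstanding_claims() now calls directly. The
demo_* names, the table contents and the DEMO_ALL_NODES constant (standing in
for the -1 argument) are hypothetical. The heap lock and the online-node
iteration of the real code are left out.

```c
#define DEMO_NR_NODES    2
#define DEMO_NR_ZONES    4
#define DEMO_MEMZONE_XEN 0
#define DEMO_ALL_NODES   (~0U) /* what -1 becomes as an unsigned int */

/* Hypothetical per-node, per-zone free page counts. */
static const unsigned long demo_avail[DEMO_NR_NODES][DEMO_NR_ZONES] = {
    {  1,  2,  3,  4 },
    { 10, 20, 30, 40 },
};

/* Sum free pages over zones [zone_lo, zone_hi] on one node or on all. */
static unsigned long demo_avail_heap_pages(
    unsigned int zone_lo, unsigned int zone_hi, unsigned int node)
{
    unsigned int i, zone;
    unsigned long free_pages = 0;

    /* Clamp the upper bound so an oversized range stays in the table. */
    if ( zone_hi >= DEMO_NR_ZONES )
        zone_hi = DEMO_NR_ZONES - 1;

    for ( i = 0; i < DEMO_NR_NODES; i++ )
    {
        if ( (node != DEMO_ALL_NODES) && (node != i) )
            continue;
        for ( zone = zone_lo; zone <= zone_hi; zone++ )
            free_pages += demo_avail[i][zone];
    }

    return free_pages;
}

/* What the removed one-line wrapper computed: all zones above MEMZONE_XEN. */
static unsigned long demo_free_domheap_pages(void)
{
    return demo_avail_heap_pages(DEMO_MEMZONE_XEN + 1, DEMO_NR_ZONES - 1,
                                 DEMO_ALL_NODES);
}
```

Since the wrapper only fixed three arguments, folding it into its single
caller removes one global symbol and one declaration from xen/mm.h.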
---
xen/common/page_alloc.c | 51 ++++++++++++++++++-----------------------
xen/include/xen/mm.h | 1 -
2 files changed, 22 insertions(+), 30 deletions(-)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index cc2ad4423a..5803a1ef4e 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -488,6 +488,27 @@ static long total_avail_pages;
static DEFINE_SPINLOCK(heap_lock);
static long outstanding_claims; /* total outstanding claims by all domains */
+static unsigned long avail_heap_pages(
+ unsigned int zone_lo, unsigned int zone_hi, unsigned int node)
+{
+ unsigned int i, zone;
+ unsigned long free_pages = 0;
+
+ if ( zone_hi >= NR_ZONES )
+ zone_hi = NR_ZONES - 1;
+
+ for_each_online_node(i)
+ {
+ if ( !avail[i] )
+ continue;
+ for ( zone = zone_lo; zone <= zone_hi; zone++ )
+ if ( (node == -1) || (node == i) )
+ free_pages += avail[i][zone];
+ }
+
+ return free_pages;
+}
+
unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
{
ASSERT(rspin_is_locked(&d->page_alloc_lock));
@@ -584,7 +605,7 @@ void get_outstanding_claims(uint64_t *free_pages, uint64_t *outstanding_pages)
{
spin_lock(&heap_lock);
*outstanding_pages = outstanding_claims;
- *free_pages = avail_domheap_pages();
+ *free_pages = avail_heap_pages(MEMZONE_XEN + 1, NR_ZONES - 1, -1);
spin_unlock(&heap_lock);
}
@@ -1964,27 +1985,6 @@ static void init_heap_pages(
}
}
-static unsigned long avail_heap_pages(
- unsigned int zone_lo, unsigned int zone_hi, unsigned int node)
-{
- unsigned int i, zone;
- unsigned long free_pages = 0;
-
- if ( zone_hi >= NR_ZONES )
- zone_hi = NR_ZONES - 1;
-
- for_each_online_node(i)
- {
- if ( !avail[i] )
- continue;
- for ( zone = zone_lo; zone <= zone_hi; zone++ )
- if ( (node == -1) || (node == i) )
- free_pages += avail[i][zone];
- }
-
- return free_pages;
-}
-
/*************************
* COLORED SIDE-ALLOCATOR
*
@@ -2795,13 +2795,6 @@ unsigned long avail_domheap_pages_region(
return avail_heap_pages(zone_lo, zone_hi, node);
}
-unsigned long avail_domheap_pages(void)
-{
- return avail_heap_pages(MEMZONE_XEN + 1,
- NR_ZONES - 1,
- -1);
-}
-
unsigned long avail_node_heap_pages(unsigned int nodeid)
{
return avail_heap_pages(MEMZONE_XEN, NR_ZONES -1, nodeid);
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index ae1c48a615..eda57486cf 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -140,7 +140,6 @@ struct page_info *alloc_domheap_pages(
void free_domheap_pages(struct page_info *pg, unsigned int order);
unsigned long avail_domheap_pages_region(
unsigned int node, unsigned int min_width, unsigned int max_width);
-unsigned long avail_domheap_pages(void);
unsigned long avail_node_heap_pages(unsigned int nodeid);
#define alloc_domheap_page(d,f) (alloc_domheap_pages(d,0,f))
#define free_domheap_page(p) (free_domheap_pages(p,0))
--
2.34.1
* [PATCH v3 16/20] xen/sysctl: wrap around XEN_SYSCTL_physinfo
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (14 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 15/20] xen: make avail_domheap_pages() inlined into get_outstanding_claims() Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 17/20] xen/sysctl: make CONFIG_COVERAGE depend on CONFIG_SYSCTL Penny Zheng
` (3 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné,
Alistair Francis, Bob Eshleman, Connor Davis, Oleksii Kurochko
The following functions are only used to handle XEN_SYSCTL_physinfo,
so they shall be wrapped:
- arch_do_physinfo
- get_outstanding_claims
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v1 -> v2:
- no need to wrap declaration
- add transient #ifdef in sysctl.c for correct compilation
---
v2 -> v3:
- move #endif up ahead of the blank line
---
xen/arch/arm/sysctl.c | 2 ++
xen/arch/riscv/stubs.c | 2 ++
xen/arch/x86/sysctl.c | 2 ++
xen/common/page_alloc.c | 2 ++
xen/common/sysctl.c | 2 +-
5 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/sysctl.c b/xen/arch/arm/sysctl.c
index 32cab4feff..2d350b700a 100644
--- a/xen/arch/arm/sysctl.c
+++ b/xen/arch/arm/sysctl.c
@@ -15,6 +15,7 @@
#include <asm/arm64/sve.h>
#include <public/sysctl.h>
+#ifdef CONFIG_SYSCTL
void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
{
pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm | XEN_SYSCTL_PHYSCAP_hap;
@@ -22,6 +23,7 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
pi->arch_capabilities |= MASK_INSR(sve_encode_vl(get_sys_vl_len()),
XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
}
+#endif
long arch_do_sysctl(struct xen_sysctl *sysctl,
XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 83416d3350..295456d0c8 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -321,10 +321,12 @@ long arch_do_sysctl(struct xen_sysctl *sysctl,
BUG_ON("unimplemented");
}
+#ifdef CONFIG_SYSCTL
void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
{
BUG_ON("unimplemented");
}
+#endif /* CONFIG_SYSCTL */
/* p2m.c */
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 1b04947516..f64addbe2b 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -91,6 +91,7 @@ static long cf_check smt_up_down_helper(void *data)
return ret;
}
+#ifdef CONFIG_SYSCTL
void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
{
memcpy(pi->hw_cap, boot_cpu_data.x86_capability,
@@ -104,6 +105,7 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
if ( IS_ENABLED(CONFIG_SHADOW_PAGING) )
pi->capabilities |= XEN_SYSCTL_PHYSCAP_shadow;
}
+#endif /* CONFIG_SYSCTL */
long arch_do_sysctl(
struct xen_sysctl *sysctl, XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 5803a1ef4e..36424a9245 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -601,6 +601,7 @@ out:
return ret;
}
+#ifdef CONFIG_SYSCTL
void get_outstanding_claims(uint64_t *free_pages, uint64_t *outstanding_pages)
{
spin_lock(&heap_lock);
@@ -608,6 +609,7 @@ void get_outstanding_claims(uint64_t *free_pages, uint64_t *outstanding_pages)
*free_pages = avail_heap_pages(MEMZONE_XEN + 1, NR_ZONES - 1, -1);
spin_unlock(&heap_lock);
}
+#endif /* CONFIG_SYSCTL */
static bool __read_mostly first_node_initialised;
#ifndef CONFIG_SEPARATE_XENHEAP
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 200e0a0488..b4feb07e60 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -260,7 +260,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
case XEN_SYSCTL_scheduler_op:
ret = sched_adjust_global(&op->u.scheduler_op);
break;
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_physinfo:
{
@@ -303,6 +302,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = -EFAULT;
}
break;
+#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_numainfo:
{
--
2.34.1
* [PATCH v3 17/20] xen/sysctl: make CONFIG_COVERAGE depend on CONFIG_SYSCTL
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (15 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 16/20] xen/sysctl: wrap around XEN_SYSCTL_physinfo Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 18/20] xen/sysctl: make CONFIG_LIVEPATCH " Penny Zheng
` (2 subsequent siblings)
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
All coverage-related ops shall be wrapped with CONFIG_SYSCTL,
so this commit makes CONFIG_COVERAGE depend on CONFIG_SYSCTL.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v1 -> v2:
- commit message refactor
---
xen/Kconfig.debug | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
index f7cc5ffaab..84d26b3f46 100644
--- a/xen/Kconfig.debug
+++ b/xen/Kconfig.debug
@@ -37,7 +37,7 @@ config SELF_TESTS
config COVERAGE
bool "Code coverage support"
- depends on !LIVEPATCH
+ depends on !LIVEPATCH && SYSCTL
select SUPPRESS_DUPLICATE_SYMBOL_WARNINGS if !ENFORCE_UNIQUE_SYMBOLS
help
Enable code coverage support.
--
2.34.1
* [PATCH v3 18/20] xen/sysctl: make CONFIG_LIVEPATCH depend on CONFIG_SYSCTL
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (16 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 17/20] xen/sysctl: make CONFIG_COVERAGE depend on CONFIG_SYSCTL Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 7:37 ` [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl Penny Zheng
2025-04-21 7:37 ` [PATCH v3 20/20] xen/sysctl: wrap around sysctl hypercall Penny Zheng
19 siblings, 0 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini
The LIVEPATCH mechanism relies on the LIVEPATCH_SYSCTL hypercall, so
CONFIG_LIVEPATCH shall depend on CONFIG_SYSCTL.
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
v1 -> v2:
- commit message refactor
---
xen/common/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index d8e242eebc..db6f75fae5 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -457,7 +457,7 @@ config CRYPTO
config LIVEPATCH
bool "Live patching support"
default X86
- depends on "$(XEN_HAS_BUILD_ID)" = "y"
+ depends on "$(XEN_HAS_BUILD_ID)" = "y" && SYSCTL
select CC_SPLIT_SECTIONS
help
Allows a running Xen hypervisor to be dynamically patched using
--
2.34.1
* [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (17 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 18/20] xen/sysctl: make CONFIG_LIVEPATCH " Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 21:19 ` Stefano Stabellini
2025-04-30 15:45 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 20/20] xen/sysctl: wrap around sysctl hypercall Penny Zheng
19 siblings, 2 replies; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Penny Zheng, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné,
Alistair Francis, Bob Eshleman, Connor Davis, Oleksii Kurochko,
Stefano Stabellini, Sergiy Kibrik
Function arch_do_sysctl() performs arch-specific sysctl ops.
Some functions, like psr_get_info() for x86 and DTB overlay support for Arm,
are solely available through sysctl ops, so they shall all be wrapped
with CONFIG_SYSCTL.
Also, remove all #ifdef CONFIG_SYSCTL-s in the arch-specific sysctl.c files,
as the whole file is now guarded in the Makefile.
Since PV_SHIM_EXCLUSIVE needs sorting as a prereq in the future, we move
obj-$(CONFIG_SYSCTL) += sysctl.o out of the PV_SHIM_EXCLUSIVE condition.
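As a rough illustration of the transient fallback added to the default label
of do_sysctl(), the standalone sketch below models the dispatch outside of Xen.
The names demo_do_sysctl() and demo_arch_do_sysctl(), the command numbers, and
the local CONFIG_SYSCTL define are all hypothetical stand-ins, not Xen code:

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the Kconfig symbol. Remove this define to
 * model a build with CONFIG_SYSCTL=n.
 */
#define CONFIG_SYSCTL 1

#ifdef CONFIG_SYSCTL
/*
 * Stand-in for arch_do_sysctl(). It is only compiled when the option is
 * set, mirroring a sysctl.o that is built via obj-$(CONFIG_SYSCTL).
 */
static long demo_arch_do_sysctl(unsigned int cmd)
{
    /* Pretend command 100 is the only arch-specific sub-op. */
    return cmd == 100 ? 0 : -ENOSYS;
}
#endif

/* Stand-in for the tail of do_sysctl(). */
static long demo_do_sysctl(unsigned int cmd)
{
    long ret;

    switch ( cmd )
    {
    case 1: /* a common (non-arch) sub-op */
        ret = 0;
        break;

    default:
#ifdef CONFIG_SYSCTL
        /* Unknown commands are handed to the arch handler. */
        ret = demo_arch_do_sysctl(cmd);
#else
        /* No arch handler is linked in, so report "not supported". */
        ret = -EOPNOTSUPP;
#endif
        break;
    }

    return ret;
}
```

With the define removed, demo_arch_do_sysctl() is never referenced, so the
sketch still links, which is the property the transient #ifdef is meant to
preserve until the whole file is guarded.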
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
---
- use "depends on" for config OVERLAY_DTB
- no need to wrap declaration
- add transient #ifdef in sysctl.c for correct compilation
---
v2 -> v3
- move obj-$(CONFIG_SYSCTL) += sysctl.o out of PV_SHIM_EXCLUSIVE condition
- move copyback out of #ifdef
- add #else process for default label
---
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/Makefile | 2 +-
xen/arch/arm/sysctl.c | 2 --
xen/arch/riscv/stubs.c | 2 +-
xen/arch/x86/Makefile | 2 +-
xen/arch/x86/psr.c | 18 ++++++++++++++++++
xen/arch/x86/sysctl.c | 2 --
xen/common/sysctl.c | 4 ++++
8 files changed, 26 insertions(+), 7 deletions(-)
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index da8a406f5a..9ac015b0cd 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -144,6 +144,7 @@ config HAS_ITS
config OVERLAY_DTB
bool "DTB overlay support (UNSUPPORTED)" if UNSUPPORTED
+ depends on SYSCTL
help
Dynamic addition/removal of Xen device tree nodes using a dtbo.
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 4837ad467a..7c6015b84d 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -54,7 +54,7 @@ obj-y += smpboot.o
obj-$(CONFIG_STATIC_EVTCHN) += static-evtchn.init.o
obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o
obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o
-obj-y += sysctl.o
+obj-$(CONFIG_SYSCTL) += sysctl.o
obj-y += time.o
obj-y += traps.o
obj-y += vcpreg.o
diff --git a/xen/arch/arm/sysctl.c b/xen/arch/arm/sysctl.c
index 2d350b700a..32cab4feff 100644
--- a/xen/arch/arm/sysctl.c
+++ b/xen/arch/arm/sysctl.c
@@ -15,7 +15,6 @@
#include <asm/arm64/sve.h>
#include <public/sysctl.h>
-#ifdef CONFIG_SYSCTL
void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
{
pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm | XEN_SYSCTL_PHYSCAP_hap;
@@ -23,7 +22,6 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
pi->arch_capabilities |= MASK_INSR(sve_encode_vl(get_sys_vl_len()),
XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
}
-#endif
long arch_do_sysctl(struct xen_sysctl *sysctl,
XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 295456d0c8..cb9b90591a 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -315,13 +315,13 @@ unsigned long raw_copy_from_guest(void *to, const void __user *from,
/* sysctl.c */
+#ifdef CONFIG_SYSCTL
long arch_do_sysctl(struct xen_sysctl *sysctl,
XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
{
BUG_ON("unimplemented");
}
-#ifdef CONFIG_SYSCTL
void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
{
BUG_ON("unimplemented");
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index f59c9665fd..1602f4fd21 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -64,6 +64,7 @@ obj-y += smpboot.o
obj-y += spec_ctrl.o
obj-y += srat.o
obj-y += string.o
+obj-$(CONFIG_SYSCTL) += sysctl.o
obj-y += time.o
obj-y += traps-setup.o
obj-y += traps.o
@@ -79,7 +80,6 @@ ifneq ($(CONFIG_PV_SHIM_EXCLUSIVE),y)
obj-y += domctl.o
obj-y += platform_hypercall.o
obj-$(CONFIG_COMPAT) += x86_64/platform_hypercall.o
-obj-y += sysctl.o
endif
extra-y += asm-macros.i
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 5815a35335..499d320e61 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -133,9 +133,11 @@ static const struct feat_props {
*/
enum psr_type alt_type;
+#ifdef CONFIG_SYSCTL
/* get_feat_info is used to return feature HW info through sysctl. */
bool (*get_feat_info)(const struct feat_node *feat,
uint32_t data[], unsigned int array_len);
+#endif
/* write_msr is used to write out feature MSR register. */
void (*write_msr)(unsigned int cos, uint32_t val, enum psr_type type);
@@ -418,6 +420,7 @@ static bool mba_init_feature(const struct cpuid_leaf *regs,
return true;
}
+#ifdef CONFIG_SYSCTL
static bool cf_check cat_get_feat_info(
const struct feat_node *feat, uint32_t data[], unsigned int array_len)
{
@@ -430,6 +433,7 @@ static bool cf_check cat_get_feat_info(
return true;
}
+#endif /* CONFIG_SYSCTL */
/* L3 CAT props */
static void cf_check l3_cat_write_msr(
@@ -442,11 +446,14 @@ static const struct feat_props l3_cat_props = {
.cos_num = 1,
.type[0] = PSR_TYPE_L3_CBM,
.alt_type = PSR_TYPE_UNKNOWN,
+#ifdef CONFIG_SYSCTL
.get_feat_info = cat_get_feat_info,
+#endif
.write_msr = l3_cat_write_msr,
.sanitize = cat_check_cbm,
};
+#ifdef CONFIG_SYSCTL
/* L3 CDP props */
static bool cf_check l3_cdp_get_feat_info(
const struct feat_node *feat, uint32_t data[], uint32_t array_len)
@@ -458,6 +465,7 @@ static bool cf_check l3_cdp_get_feat_info(
return true;
}
+#endif /* CONFIG_SYSCTL */
static void cf_check l3_cdp_write_msr(
unsigned int cos, uint32_t val, enum psr_type type)
@@ -473,7 +481,9 @@ static const struct feat_props l3_cdp_props = {
.type[0] = PSR_TYPE_L3_DATA,
.type[1] = PSR_TYPE_L3_CODE,
.alt_type = PSR_TYPE_L3_CBM,
+#ifdef CONFIG_SYSCTL
.get_feat_info = l3_cdp_get_feat_info,
+#endif
.write_msr = l3_cdp_write_msr,
.sanitize = cat_check_cbm,
};
@@ -489,11 +499,14 @@ static const struct feat_props l2_cat_props = {
.cos_num = 1,
.type[0] = PSR_TYPE_L2_CBM,
.alt_type = PSR_TYPE_UNKNOWN,
+#ifdef CONFIG_SYSCTL
.get_feat_info = cat_get_feat_info,
+#endif
.write_msr = l2_cat_write_msr,
.sanitize = cat_check_cbm,
};
+#ifdef CONFIG_SYSCTL
/* MBA props */
static bool cf_check mba_get_feat_info(
const struct feat_node *feat, uint32_t data[], unsigned int array_len)
@@ -508,6 +521,7 @@ static bool cf_check mba_get_feat_info(
return true;
}
+#endif /* CONFIG_SYSCTL */
static void cf_check mba_write_msr(
unsigned int cos, uint32_t val, enum psr_type type)
@@ -545,7 +559,9 @@ static const struct feat_props mba_props = {
.cos_num = 1,
.type[0] = PSR_TYPE_MBA_THRTL,
.alt_type = PSR_TYPE_UNKNOWN,
+#ifdef CONFIG_SYSCTL
.get_feat_info = mba_get_feat_info,
+#endif
.write_msr = mba_write_msr,
.sanitize = mba_sanitize_thrtl,
};
@@ -808,6 +824,7 @@ static struct psr_socket_info *get_socket_info(unsigned int socket)
return socket_info + socket;
}
+#ifdef CONFIG_SYSCTL
int psr_get_info(unsigned int socket, enum psr_type type,
uint32_t data[], unsigned int array_len)
{
@@ -839,6 +856,7 @@ int psr_get_info(unsigned int socket, enum psr_type type,
return -EINVAL;
}
+#endif /* CONFIG_SYSCTL */
int psr_get_val(struct domain *d, unsigned int socket,
uint32_t *val, enum psr_type type)
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index f64addbe2b..1b04947516 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -91,7 +91,6 @@ static long cf_check smt_up_down_helper(void *data)
return ret;
}
-#ifdef CONFIG_SYSCTL
void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
{
memcpy(pi->hw_cap, boot_cpu_data.x86_capability,
@@ -105,7 +104,6 @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
if ( IS_ENABLED(CONFIG_SHADOW_PAGING) )
pi->capabilities |= XEN_SYSCTL_PHYSCAP_shadow;
}
-#endif /* CONFIG_SYSCTL */
long arch_do_sysctl(
struct xen_sysctl *sysctl, XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index b4feb07e60..85a1adacdd 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -493,7 +493,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
break;
default:
+#ifdef CONFIG_SYSCTL
ret = arch_do_sysctl(op, u_sysctl);
+#else
+ ret = -EOPNOTSUPP;
+#endif
copyback = 0;
break;
}
--
2.34.1
* [PATCH v3 20/20] xen/sysctl: wrap around sysctl hypercall
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
` (18 preceding siblings ...)
2025-04-21 7:37 ` [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl Penny Zheng
@ 2025-04-21 7:37 ` Penny Zheng
2025-04-21 21:21 ` Stefano Stabellini
19 siblings, 1 reply; 35+ messages in thread
From: Penny Zheng @ 2025-04-21 7:37 UTC (permalink / raw)
To: xen-devel
Cc: ray.huang, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Sergiy Kibrik, Penny Zheng
From: Stefano Stabellini <stefano.stabellini@amd.com>
Wrap the sysctl hypercall def and sysctl.o with CONFIG_SYSCTL, and since
PV_SHIM_EXCLUSIVE needs sorting as a prereq in the future, we move
them out of the PV_SHIM_EXCLUSIVE condition at the same time.
We also need to remove all transient "#ifdef CONFIG_SYSCTL"-s in sysctl.c.
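To illustrate the general effect of compiling a hypercall out, here is a
minimal standalone sketch. It uses a hand-written function-pointer table,
which is an assumption made for brevity and is not how Xen generates its
dispatch from hypercall-defs.c. All demo_* names, the DEMO_HC_* numbers, and
the local CONFIG_SYSCTL define are hypothetical:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the Kconfig symbol. */
#define CONFIG_SYSCTL 1

enum { DEMO_HC_VERSION, DEMO_HC_SYSCTL, DEMO_HC_MAX };

typedef long (*demo_hypercall_fn)(void *arg);

/* A hypercall that is always built. */
static long demo_version(void *arg)
{
    (void)arg;
    return 4;
}

#ifdef CONFIG_SYSCTL
/* Only built when the option is enabled. */
static long demo_sysctl(void *arg)
{
    (void)arg;
    return 0;
}
#endif

/* Slots without an initializer stay NULL. */
static const demo_hypercall_fn demo_table[DEMO_HC_MAX] = {
    [DEMO_HC_VERSION] = demo_version,
#ifdef CONFIG_SYSCTL
    [DEMO_HC_SYSCTL]  = demo_sysctl,
#endif
};

/* Out-of-range or compiled-out hypercalls yield -ENOSYS. */
static long demo_dispatch(unsigned int nr, void *arg)
{
    if ( nr >= DEMO_HC_MAX || !demo_table[nr] )
        return -ENOSYS;

    return demo_table[nr](arg);
}
```

In this sketch a caller cannot tell a compiled-out hypercall from one that
never existed, since both paths return -ENOSYS.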
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
---
v1 -> v2:
- remove all transient "#ifdef CONFIG_SYSCTL"-s in sysctl.c
---
v2 -> v3:
- move out of CONFIG_PV_SHIM_EXCLUSIVE condition
---
xen/common/Makefile | 2 +-
xen/common/sysctl.c | 12 ------------
xen/include/hypercall-defs.c | 8 ++++++--
3 files changed, 7 insertions(+), 15 deletions(-)
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 98f0873056..15ab048244 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -49,6 +49,7 @@ obj-y += spinlock.o
obj-$(CONFIG_STACK_PROTECTOR) += stack-protector.o
obj-y += stop_machine.o
obj-y += symbols.o
+obj-$(CONFIG_SYSCTL) += sysctl.o
obj-y += tasklet.o
obj-y += time.o
obj-y += timer.o
@@ -70,7 +71,6 @@ obj-$(CONFIG_COMPAT) += $(addprefix compat/,domain.o memory.o multicall.o xlat.o
ifneq ($(CONFIG_PV_SHIM_EXCLUSIVE),y)
obj-y += domctl.o
obj-$(CONFIG_VM_EVENT) += monitor.o
-obj-y += sysctl.o
endif
extra-y := symbols-dummy.o
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 85a1adacdd..08174a924d 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -58,7 +58,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
switch ( op->cmd )
{
-#ifdef CONFIG_SYSCTL
case XEN_SYSCTL_readconsole:
ret = xsm_readconsole(XSM_HOOK, op->u.readconsole.clear);
if ( ret )
@@ -66,17 +65,14 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = read_console_ring(&op->u.readconsole);
break;
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_tbuf_op:
ret = tb_control(&op->u.tbuf_op);
break;
-#ifdef CONFIG_SYSCTL
case XEN_SYSCTL_sched_id:
op->u.sched_id.sched_id = scheduler_id();
break;
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_getdomaininfolist:
{
@@ -117,7 +113,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
}
break;
-#ifdef CONFIG_SYSCTL
#ifdef CONFIG_PERF_COUNTERS
case XEN_SYSCTL_perfc_op:
ret = perfc_control(&op->u.perfc_op);
@@ -129,7 +124,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = spinlock_profile_control(&op->u.lockprof_op);
break;
#endif
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_debug_keys:
{
@@ -191,7 +185,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
break;
#endif /* CONFIG_PM_OP */
-#ifdef CONFIG_SYSCTL
case XEN_SYSCTL_page_offline_op:
{
uint32_t *status, *ptr;
@@ -302,7 +295,6 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
ret = -EFAULT;
}
break;
-#endif /* CONFIG_SYSCTL */
case XEN_SYSCTL_numainfo:
{
@@ -493,11 +485,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
break;
default:
-#ifdef CONFIG_SYSCTL
ret = arch_do_sysctl(op, u_sysctl);
-#else
- ret = -EOPNOTSUPP;
-#endif
copyback = 0;
break;
}
diff --git a/xen/include/hypercall-defs.c b/xen/include/hypercall-defs.c
index 7720a29ade..c1081d87a2 100644
--- a/xen/include/hypercall-defs.c
+++ b/xen/include/hypercall-defs.c
@@ -194,8 +194,10 @@ kexec_op(unsigned long op, void *uarg)
#ifdef CONFIG_IOREQ_SERVER
dm_op(domid_t domid, unsigned int nr_bufs, xen_dm_op_buf_t *bufs)
#endif
-#ifndef CONFIG_PV_SHIM_EXCLUSIVE
+#ifdef CONFIG_SYSCTL
sysctl(xen_sysctl_t *u_sysctl)
+#endif
+#ifndef CONFIG_PV_SHIM_EXCLUSIVE
domctl(xen_domctl_t *u_domctl)
paging_domctl_cont(xen_domctl_t *u_domctl)
platform_op(xen_platform_op_t *u_xenpf_op)
@@ -273,8 +275,10 @@ physdev_op compat do hvm hvm do_arm
#ifdef CONFIG_HVM
hvm_op do do do do do
#endif
-#ifndef CONFIG_PV_SHIM_EXCLUSIVE
+#ifdef CONFIG_SYSCTL
sysctl do do do do do
+#endif
+#ifndef CONFIG_PV_SHIM_EXCLUSIVE
domctl do do do do do
#endif
#ifdef CONFIG_KEXEC
--
2.34.1
* Re: [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL
2025-04-21 7:37 ` [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL Penny Zheng
@ 2025-04-21 20:54 ` Stefano Stabellini
2025-04-30 15:05 ` Jan Beulich
2025-04-30 15:20 ` Jan Beulich
2 siblings, 0 replies; 35+ messages in thread
From: Stefano Stabellini @ 2025-04-21 20:54 UTC (permalink / raw)
To: Penny Zheng
Cc: xen-devel, ray.huang, Stefano Stabellini, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Sergiy Kibrik
On Mon, 21 Apr 2025, Penny Zheng wrote:
> From: Stefano Stabellini <stefano.stabellini@amd.com>
>
> We introduce a new Kconfig CONFIG_SYSCTL, which shall only be disabled
> on some dom0less systems, to reduce Xen footprint.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
> Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> v2 -> v3:
> - remove "intend to" in commit message
> ---
> xen/common/Kconfig | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index be28060716..d89e9ede77 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -581,4 +581,15 @@ config BUDDY_ALLOCATOR_SIZE
> Amount of memory reserved for the buddy allocator to serve Xen heap,
> working alongside the colored one.
>
> +menu "Supported hypercall interfaces"
> + visible if EXPERT
> +
> +config SYSCTL
> + bool "Enable sysctl hypercall"
> + default y
> + help
> + This option shall only be disabled on some dom0less systems,
> + to reduce Xen footprint.
> +endmenu
> +
> endmenu
> --
> 2.34.1
>
* Re: [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP
2025-04-21 7:37 ` [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP Penny Zheng
@ 2025-04-21 21:09 ` Stefano Stabellini
2025-04-30 15:32 ` Jan Beulich
1 sibling, 0 replies; 35+ messages in thread
From: Stefano Stabellini @ 2025-04-21 21:09 UTC (permalink / raw)
To: Penny Zheng
Cc: xen-devel, ray.huang, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini
On Mon, 21 Apr 2025, Penny Zheng wrote:
> We move the following functions into a new file drivers/acpi/pm_op.c, as
> they all relate more to performance control and are only called by
> do_pm_op():
> - get_cpufreq_para()
> - set_cpufreq_para()
> - set_cpufreq_gov()
> - set_cpufreq_cppc()
> - cpufreq_driver_getavg()
> - cpufreq_update_turbo()
> - cpufreq_get_turbo_status()
> We introduce a new Kconfig CONFIG_PM_OP to wrap the new file.
>
> Also, although the following helpers are only called by do_pm_op(), they
> depend on local variables, so we wrap them with CONFIG_PM_OP in place:
> - write_userspace_scaling_setspeed()
> - write_ondemand_sampling_rate()
> - write_ondemand_up_threshold()
> - get_cpufreq_ondemand_para()
> - cpufreq_driver.update()
> - get_hwp_para()
> Various style corrections shall be applied at the same time while moving these
> functions, including:
> - add extra space before and after bracket of if() and switch()
> - fix indentation
>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
I manually checked the code movement for correctness.
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
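For readers following the .update hook changes, the pattern of guarding an
optional driver callback and its initializer with the same #ifdef can be
sketched standalone as below. This is only an illustration under assumptions:
struct demo_driver, the demo_* functions, and the local CONFIG_PM_OP define
are hypothetical and simplified, not the real cpufreq_driver interface:

```c
#include <stddef.h>

/* Hypothetical stand-in for the Kconfig symbol. */
#define CONFIG_PM_OP 1

struct demo_driver {
    const char *name;
    int (*target)(unsigned int cpu, unsigned int freq);
    /* Optional hook; stays NULL when no initializer is compiled in. */
    int (*update)(unsigned int cpu, int turbo);
};

static int demo_last_turbo = -1;

static int demo_target(unsigned int cpu, unsigned int freq)
{
    (void)cpu;
    (void)freq;
    return 0;
}

#ifdef CONFIG_PM_OP
/* Only built when the option is enabled, like the hook it implements. */
static int demo_update(unsigned int cpu, int turbo)
{
    (void)cpu;
    demo_last_turbo = turbo;
    return 0;
}
#endif

static const struct demo_driver demo_drv = {
    .name   = "demo",
    .target = demo_target,
#ifdef CONFIG_PM_OP
    .update = demo_update,
#endif
};

/* The caller tolerates a missing hook and then reports success. */
static int demo_set_turbo(const struct demo_driver *drv,
                          unsigned int cpu, int turbo)
{
    if ( !drv->update )
        return 0;

    return drv->update(cpu, turbo);
}
```

Guarding both the function and the designated initializer avoids an unused
static function warning while leaving the struct layout unchanged.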
> ---
> v2 -> v3
> - new commit
> ---
> xen/arch/x86/acpi/cpufreq/hwp.c | 6 +
> xen/arch/x86/acpi/cpufreq/powernow.c | 4 +
> xen/common/Kconfig | 7 +
> xen/common/sysctl.c | 4 +-
> xen/drivers/acpi/Makefile | 1 +
> xen/drivers/acpi/pm_op.c | 409 +++++++++++++++++++
> xen/drivers/acpi/pmstat.c | 357 ----------------
> xen/drivers/cpufreq/cpufreq_misc_governors.c | 2 +
> xen/drivers/cpufreq/cpufreq_ondemand.c | 2 +
> xen/drivers/cpufreq/utility.c | 41 --
> xen/include/acpi/cpufreq/cpufreq.h | 3 -
> 11 files changed, 434 insertions(+), 402 deletions(-)
> create mode 100644 xen/drivers/acpi/pm_op.c
>
> diff --git a/xen/arch/x86/acpi/cpufreq/hwp.c b/xen/arch/x86/acpi/cpufreq/hwp.c
> index d5fa3d47ca..e4c09244ab 100644
> --- a/xen/arch/x86/acpi/cpufreq/hwp.c
> +++ b/xen/arch/x86/acpi/cpufreq/hwp.c
> @@ -466,6 +466,7 @@ static int cf_check hwp_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> return 0;
> }
>
> +#ifdef CONFIG_PM_OP
> /*
> * The SDM reads like turbo should be disabled with MSR_IA32_PERF_CTL and
> * PERF_CTL_TURBO_DISENGAGE, but that does not seem to actually work, at least
> @@ -508,6 +509,7 @@ static int cf_check hwp_cpufreq_update(unsigned int cpu, struct cpufreq_policy *
>
> return per_cpu(hwp_drv_data, cpu)->ret;
> }
> +#endif /* CONFIG_PM_OP */
>
> static const struct cpufreq_driver __initconst_cf_clobber
> hwp_cpufreq_driver = {
> @@ -516,9 +518,12 @@ hwp_cpufreq_driver = {
> .target = hwp_cpufreq_target,
> .init = hwp_cpufreq_cpu_init,
> .exit = hwp_cpufreq_cpu_exit,
> +#ifdef CONFIG_PM_OP
> .update = hwp_cpufreq_update,
> +#endif
> };
>
> +#ifdef CONFIG_PM_OP
> int get_hwp_para(unsigned int cpu,
> struct xen_cppc_para *cppc_para)
> {
> @@ -639,6 +644,7 @@ int set_hwp_para(struct cpufreq_policy *policy,
>
> return hwp_cpufreq_target(policy, 0, 0);
> }
> +#endif /* CONFIG_PM_OP */
>
> int __init hwp_register_driver(void)
> {
> diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
> index 69364e1855..12fca45b45 100644
> --- a/xen/arch/x86/acpi/cpufreq/powernow.c
> +++ b/xen/arch/x86/acpi/cpufreq/powernow.c
> @@ -49,6 +49,7 @@ static void cf_check transition_pstate(void *pstate)
> wrmsrl(MSR_PSTATE_CTRL, *(unsigned int *)pstate);
> }
>
> +#ifdef CONFIG_PM_OP
> static void cf_check update_cpb(void *data)
> {
> struct cpufreq_policy *policy = data;
> @@ -77,6 +78,7 @@ static int cf_check powernow_cpufreq_update(
>
> return 0;
> }
> +#endif /* CONFIG_PM_OP */
>
> static int cf_check powernow_cpufreq_target(
> struct cpufreq_policy *policy,
> @@ -324,7 +326,9 @@ powernow_cpufreq_driver = {
> .target = powernow_cpufreq_target,
> .init = powernow_cpufreq_cpu_init,
> .exit = powernow_cpufreq_cpu_exit,
> +#ifdef CONFIG_PM_OP
> .update = powernow_cpufreq_update
> +#endif
> };
>
> unsigned int __init powernow_register_driver(void)
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index 9cccc37232..ca1f692487 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -593,4 +593,11 @@ config SYSCTL
> to reduce Xen footprint.
> endmenu
>
> +config PM_OP
> + bool "Enable Performance Management Operation"
> + depends on ACPI && HAS_CPUFREQ && SYSCTL
> + default y
> + help
> + This option shall enable userspace performance management control
> + to do power/performance analyzing and tuning.
> endmenu
> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index 2fe76362b1..4ab827b694 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -181,13 +181,15 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
> case XEN_SYSCTL_get_pmstat:
> ret = do_get_pm_info(&op->u.get_pmstat);
> break;
> +#endif
>
> +#ifdef CONFIG_PM_OP
> case XEN_SYSCTL_pm_op:
> ret = do_pm_op(&op->u.pm_op);
> if ( ret == -EAGAIN )
> copyback = 1;
> break;
> -#endif
> +#endif /* CONFIG_PM_OP */
>
> case XEN_SYSCTL_page_offline_op:
> {
> diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
> index 2fc5230253..e1f84a4468 100644
> --- a/xen/drivers/acpi/Makefile
> +++ b/xen/drivers/acpi/Makefile
> @@ -6,6 +6,7 @@ obj-bin-y += tables.init.o
> obj-$(CONFIG_ACPI_NUMA) += numa.o
> obj-y += osl.o
> obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
> +obj-$(CONFIG_PM_OP) += pm_op.o
>
> obj-$(CONFIG_X86) += hwregs.o
> obj-$(CONFIG_X86) += reboot.o
> diff --git a/xen/drivers/acpi/pm_op.c b/xen/drivers/acpi/pm_op.c
> new file mode 100644
> index 0000000000..3123cb9556
> --- /dev/null
> +++ b/xen/drivers/acpi/pm_op.c
> @@ -0,0 +1,409 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +
> +#include <xen/acpi.h>
> +#include <xen/domain.h>
> +#include <xen/errno.h>
> +#include <xen/guest_access.h>
> +#include <xen/lib.h>
> +#include <xen/sched.h>
> +
> +#include <acpi/cpufreq/cpufreq.h>
> +#include <public/platform.h>
> +#include <public/sysctl.h>
> +
> +/*
> + * 1. Get PM parameter
> + * 2. Provide user PM control
> + */
> +static int cpufreq_update_turbo(unsigned int cpu, int new_state)
> +{
> + struct cpufreq_policy *policy;
> + int curr_state;
> + int ret = 0;
> +
> + if ( new_state != CPUFREQ_TURBO_ENABLED &&
> + new_state != CPUFREQ_TURBO_DISABLED )
> + return -EINVAL;
> +
> + policy = per_cpu(cpufreq_cpu_policy, cpu);
> + if ( !policy )
> + return -EACCES;
> +
> + if ( policy->turbo == CPUFREQ_TURBO_UNSUPPORTED )
> + return -EOPNOTSUPP;
> +
> + curr_state = policy->turbo;
> + if ( curr_state == new_state )
> + return 0;
> +
> + policy->turbo = new_state;
> + if ( cpufreq_driver.update )
> + {
> + ret = alternative_call(cpufreq_driver.update, cpu, policy);
> + if ( ret )
> + policy->turbo = curr_state;
> + }
> +
> + return ret;
> +}
> +
> +static int cpufreq_get_turbo_status(unsigned int cpu)
> +{
> + struct cpufreq_policy *policy;
> +
> + policy = per_cpu(cpufreq_cpu_policy, cpu);
> + return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
> +}
> +
> +static int read_scaling_available_governors(char *scaling_available_governors,
> + unsigned int size)
> +{
> + unsigned int i = 0;
> + struct cpufreq_governor *t;
> +
> + if ( !scaling_available_governors )
> + return -EINVAL;
> +
> + list_for_each_entry(t, &cpufreq_governor_list, governor_list)
> + {
> + i += scnprintf(&scaling_available_governors[i],
> + CPUFREQ_NAME_LEN, "%s ", t->name);
> + if ( i > size )
> + return -EINVAL;
> + }
> + scaling_available_governors[i-1] = '\0';
> +
> + return 0;
> +}
> +
> +static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
> +{
> + uint32_t ret = 0;
> + const struct processor_pminfo *pmpt;
> + struct cpufreq_policy *policy;
> + uint32_t gov_num = 0;
> + uint32_t *data;
> + char *scaling_available_governors;
> + struct list_head *pos;
> + unsigned int cpu, i = 0;
> +
> + pmpt = processor_pminfo[op->cpuid];
> + policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +
> + if ( !pmpt || !pmpt->perf.states ||
> + !policy || !policy->governor )
> + return -EINVAL;
> +
> + list_for_each(pos, &cpufreq_governor_list)
> + gov_num++;
> +
> + if ( (op->u.get_para.cpu_num != cpumask_weight(policy->cpus)) ||
> + (op->u.get_para.freq_num != pmpt->perf.state_count) ||
> + (op->u.get_para.gov_num != gov_num) )
> + {
> + op->u.get_para.cpu_num = cpumask_weight(policy->cpus);
> + op->u.get_para.freq_num = pmpt->perf.state_count;
> + op->u.get_para.gov_num = gov_num;
> + return -EAGAIN;
> + }
> +
> + if ( !(data = xzalloc_array(uint32_t,
> + max(op->u.get_para.cpu_num,
> + op->u.get_para.freq_num))) )
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, policy->cpus)
> + data[i++] = cpu;
> + ret = copy_to_guest(op->u.get_para.affected_cpus,
> + data, op->u.get_para.cpu_num);
> +
> + for ( i = 0; i < op->u.get_para.freq_num; i++ )
> + data[i] = pmpt->perf.states[i].core_frequency * 1000;
> + ret += copy_to_guest(op->u.get_para.scaling_available_frequencies,
> + data, op->u.get_para.freq_num);
> +
> + xfree(data);
> + if ( ret )
> + return -EFAULT;
> +
> + op->u.get_para.cpuinfo_cur_freq =
> + cpufreq_driver.get ? alternative_call(cpufreq_driver.get, op->cpuid)
> + : policy->cur;
> + op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
> + op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
> + op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
> +
> + if ( cpufreq_driver.name[0] )
> + strlcpy(op->u.get_para.scaling_driver,
> + cpufreq_driver.name, CPUFREQ_NAME_LEN);
> + else
> + strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
> +
> + if ( IS_ENABLED(CONFIG_INTEL) &&
> + !strncmp(op->u.get_para.scaling_driver, XEN_HWP_DRIVER_NAME,
> + CPUFREQ_NAME_LEN) )
> + ret = get_hwp_para(policy->cpu, &op->u.get_para.u.cppc_para);
> + else
> + {
> + if ( !(scaling_available_governors =
> + xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
> + return -ENOMEM;
> + if ( (ret = read_scaling_available_governors(
> + scaling_available_governors,
> + (gov_num * CPUFREQ_NAME_LEN *
> + sizeof(*scaling_available_governors)))) )
> + {
> + xfree(scaling_available_governors);
> + return ret;
> + }
> + ret = copy_to_guest(op->u.get_para.scaling_available_governors,
> + scaling_available_governors,
> + gov_num * CPUFREQ_NAME_LEN);
> + xfree(scaling_available_governors);
> + if ( ret )
> + return -EFAULT;
> +
> + op->u.get_para.u.s.scaling_cur_freq = policy->cur;
> + op->u.get_para.u.s.scaling_max_freq = policy->max;
> + op->u.get_para.u.s.scaling_min_freq = policy->min;
> +
> + if ( policy->governor->name[0] )
> + strlcpy(op->u.get_para.u.s.scaling_governor,
> + policy->governor->name, CPUFREQ_NAME_LEN);
> + else
> + strlcpy(op->u.get_para.u.s.scaling_governor, "Unknown",
> + CPUFREQ_NAME_LEN);
> +
> + /* governor specific para */
> + if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
> + "userspace", CPUFREQ_NAME_LEN) )
> + op->u.get_para.u.s.u.userspace.scaling_setspeed = policy->cur;
> +
> + if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
> + "ondemand", CPUFREQ_NAME_LEN) )
> + ret = get_cpufreq_ondemand_para(
> + &op->u.get_para.u.s.u.ondemand.sampling_rate_max,
> + &op->u.get_para.u.s.u.ondemand.sampling_rate_min,
> + &op->u.get_para.u.s.u.ondemand.sampling_rate,
> + &op->u.get_para.u.s.u.ondemand.up_threshold);
> + }
> +
> + return ret;
> +}
> +
> +static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
> +{
> + struct cpufreq_policy new_policy, *old_policy;
> +
> + old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> + if ( !old_policy )
> + return -EINVAL;
> +
> + memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
> +
> + new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
> + if ( new_policy.governor == NULL )
> + return -EINVAL;
> +
> + return __cpufreq_set_policy(old_policy, &new_policy);
> +}
> +
> +static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
> +{
> + int ret = 0;
> + struct cpufreq_policy *policy;
> +
> + policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +
> + if ( !policy || !policy->governor )
> + return -EINVAL;
> +
> + if ( hwp_active() )
> + return -EOPNOTSUPP;
> +
> + switch( op->u.set_para.ctrl_type )
> + {
> + case SCALING_MAX_FREQ:
> + {
> + struct cpufreq_policy new_policy;
> +
> + memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> + new_policy.max = op->u.set_para.ctrl_value;
> + ret = __cpufreq_set_policy(policy, &new_policy);
> +
> + break;
> + }
> +
> + case SCALING_MIN_FREQ:
> + {
> + struct cpufreq_policy new_policy;
> +
> + memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> + new_policy.min = op->u.set_para.ctrl_value;
> + ret = __cpufreq_set_policy(policy, &new_policy);
> +
> + break;
> + }
> +
> + case SCALING_SETSPEED:
> + {
> + unsigned int freq =op->u.set_para.ctrl_value;
> +
> + if ( !strncasecmp(policy->governor->name,
> + "userspace", CPUFREQ_NAME_LEN) )
> + ret = write_userspace_scaling_setspeed(op->cpuid, freq);
> + else
> + ret = -EINVAL;
> +
> + break;
> + }
> +
> + case SAMPLING_RATE:
> + {
> + unsigned int sampling_rate = op->u.set_para.ctrl_value;
> +
> + if ( !strncasecmp(policy->governor->name,
> + "ondemand", CPUFREQ_NAME_LEN) )
> + ret = write_ondemand_sampling_rate(sampling_rate);
> + else
> + ret = -EINVAL;
> +
> + break;
> + }
> +
> + case UP_THRESHOLD:
> + {
> + unsigned int up_threshold = op->u.set_para.ctrl_value;
> +
> + if ( !strncasecmp(policy->governor->name,
> + "ondemand", CPUFREQ_NAME_LEN) )
> + ret = write_ondemand_up_threshold(up_threshold);
> + else
> + ret = -EINVAL;
> +
> + break;
> + }
> +
> + default:
> + ret = -EINVAL;
> + break;
> + }
> +
> + return ret;
> +}
> +
> +static int set_cpufreq_cppc(struct xen_sysctl_pm_op *op)
> +{
> + struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +
> + if ( !policy || !policy->governor )
> + return -ENOENT;
> +
> + if ( !hwp_active() )
> + return -EOPNOTSUPP;
> +
> + return set_hwp_para(policy, &op->u.set_cppc);
> +}
> +
> +int do_pm_op(struct xen_sysctl_pm_op *op)
> +{
> + int ret = 0;
> + const struct processor_pminfo *pmpt;
> +
> + switch ( op->cmd )
> + {
> + case XEN_SYSCTL_pm_op_set_sched_opt_smt:
> + {
> + uint32_t saved_value = sched_smt_power_savings;
> +
> + if ( op->cpuid != 0 )
> + return -EINVAL;
> + sched_smt_power_savings = !!op->u.set_sched_opt_smt;
> + op->u.set_sched_opt_smt = saved_value;
> + return 0;
> + }
> +
> + case XEN_SYSCTL_pm_op_get_max_cstate:
> + BUILD_BUG_ON(XEN_SYSCTL_CX_UNLIMITED != UINT_MAX);
> + if ( op->cpuid == 0 )
> + op->u.get_max_cstate = acpi_get_cstate_limit();
> + else if ( op->cpuid == 1 )
> + op->u.get_max_cstate = acpi_get_csubstate_limit();
> + else
> + ret = -EINVAL;
> + return ret;
> +
> + case XEN_SYSCTL_pm_op_set_max_cstate:
> + if ( op->cpuid == 0 )
> + acpi_set_cstate_limit(op->u.set_max_cstate);
> + else if ( op->cpuid == 1 )
> + acpi_set_csubstate_limit(op->u.set_max_cstate);
> + else
> + ret = -EINVAL;
> + return ret;
> + }
> +
> + if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
> + return -EINVAL;
> + pmpt = processor_pminfo[op->cpuid];
> +
> + switch ( op->cmd & PM_PARA_CATEGORY_MASK )
> + {
> + case CPUFREQ_PARA:
> + if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> + return -ENODEV;
> + if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> + return -EINVAL;
> + break;
> + }
> +
> + switch ( op->cmd )
> + {
> + case GET_CPUFREQ_PARA:
> + {
> + ret = get_cpufreq_para(op);
> + break;
> + }
> +
> + case SET_CPUFREQ_GOV:
> + {
> + ret = set_cpufreq_gov(op);
> + break;
> + }
> +
> + case SET_CPUFREQ_PARA:
> + {
> + ret = set_cpufreq_para(op);
> + break;
> + }
> +
> + case SET_CPUFREQ_CPPC:
> + ret = set_cpufreq_cppc(op);
> + break;
> +
> + case GET_CPUFREQ_AVGFREQ:
> + {
> + op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
> + break;
> + }
> +
> + case XEN_SYSCTL_pm_op_enable_turbo:
> + {
> + ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
> + break;
> + }
> +
> + case XEN_SYSCTL_pm_op_disable_turbo:
> + {
> + ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
> + break;
> + }
> +
> + default:
> + printk("not defined sub-hypercall @ do_pm_op\n");
> + ret = -ENOSYS;
> + break;
> + }
> +
> + return ret;
> +}
> diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
> index abfdc45cc2..61b60e59a2 100644
> --- a/xen/drivers/acpi/pmstat.c
> +++ b/xen/drivers/acpi/pmstat.c
> @@ -330,360 +330,3 @@ int do_get_pm_info(struct xen_sysctl_get_pmstat *op)
>
> return ret;
> }
> -
> -/*
> - * 1. Get PM parameter
> - * 2. Provide user PM control
> - */
> -static int read_scaling_available_governors(char *scaling_available_governors,
> - unsigned int size)
> -{
> - unsigned int i = 0;
> - struct cpufreq_governor *t;
> -
> - if ( !scaling_available_governors )
> - return -EINVAL;
> -
> - list_for_each_entry(t, &cpufreq_governor_list, governor_list)
> - {
> - i += scnprintf(&scaling_available_governors[i],
> - CPUFREQ_NAME_LEN, "%s ", t->name);
> - if ( i > size )
> - return -EINVAL;
> - }
> - scaling_available_governors[i-1] = '\0';
> -
> - return 0;
> -}
> -
> -static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
> -{
> - uint32_t ret = 0;
> - const struct processor_pminfo *pmpt;
> - struct cpufreq_policy *policy;
> - uint32_t gov_num = 0;
> - uint32_t *data;
> - char *scaling_available_governors;
> - struct list_head *pos;
> - unsigned int cpu, i = 0;
> -
> - pmpt = processor_pminfo[op->cpuid];
> - policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> -
> - if ( !pmpt || !pmpt->perf.states ||
> - !policy || !policy->governor )
> - return -EINVAL;
> -
> - list_for_each(pos, &cpufreq_governor_list)
> - gov_num++;
> -
> - if ( (op->u.get_para.cpu_num != cpumask_weight(policy->cpus)) ||
> - (op->u.get_para.freq_num != pmpt->perf.state_count) ||
> - (op->u.get_para.gov_num != gov_num) )
> - {
> - op->u.get_para.cpu_num = cpumask_weight(policy->cpus);
> - op->u.get_para.freq_num = pmpt->perf.state_count;
> - op->u.get_para.gov_num = gov_num;
> - return -EAGAIN;
> - }
> -
> - if ( !(data = xzalloc_array(uint32_t,
> - max(op->u.get_para.cpu_num,
> - op->u.get_para.freq_num))) )
> - return -ENOMEM;
> -
> - for_each_cpu(cpu, policy->cpus)
> - data[i++] = cpu;
> - ret = copy_to_guest(op->u.get_para.affected_cpus,
> - data, op->u.get_para.cpu_num);
> -
> - for ( i = 0; i < op->u.get_para.freq_num; i++ )
> - data[i] = pmpt->perf.states[i].core_frequency * 1000;
> - ret += copy_to_guest(op->u.get_para.scaling_available_frequencies,
> - data, op->u.get_para.freq_num);
> -
> - xfree(data);
> - if ( ret )
> - return -EFAULT;
> -
> - op->u.get_para.cpuinfo_cur_freq =
> - cpufreq_driver.get ? alternative_call(cpufreq_driver.get, op->cpuid)
> - : policy->cur;
> - op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
> - op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
> - op->u.get_para.turbo_enabled = cpufreq_get_turbo_status(op->cpuid);
> -
> - if ( cpufreq_driver.name[0] )
> - strlcpy(op->u.get_para.scaling_driver,
> - cpufreq_driver.name, CPUFREQ_NAME_LEN);
> - else
> - strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
> -
> - if ( IS_ENABLED(CONFIG_INTEL) &&
> - !strncmp(op->u.get_para.scaling_driver, XEN_HWP_DRIVER_NAME,
> - CPUFREQ_NAME_LEN) )
> - ret = get_hwp_para(policy->cpu, &op->u.get_para.u.cppc_para);
> - else
> - {
> - if ( !(scaling_available_governors =
> - xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
> - return -ENOMEM;
> - if ( (ret = read_scaling_available_governors(
> - scaling_available_governors,
> - (gov_num * CPUFREQ_NAME_LEN *
> - sizeof(*scaling_available_governors)))) )
> - {
> - xfree(scaling_available_governors);
> - return ret;
> - }
> - ret = copy_to_guest(op->u.get_para.scaling_available_governors,
> - scaling_available_governors,
> - gov_num * CPUFREQ_NAME_LEN);
> - xfree(scaling_available_governors);
> - if ( ret )
> - return -EFAULT;
> -
> - op->u.get_para.u.s.scaling_cur_freq = policy->cur;
> - op->u.get_para.u.s.scaling_max_freq = policy->max;
> - op->u.get_para.u.s.scaling_min_freq = policy->min;
> -
> - if ( policy->governor->name[0] )
> - strlcpy(op->u.get_para.u.s.scaling_governor,
> - policy->governor->name, CPUFREQ_NAME_LEN);
> - else
> - strlcpy(op->u.get_para.u.s.scaling_governor, "Unknown",
> - CPUFREQ_NAME_LEN);
> -
> - /* governor specific para */
> - if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
> - "userspace", CPUFREQ_NAME_LEN) )
> - op->u.get_para.u.s.u.userspace.scaling_setspeed = policy->cur;
> -
> - if ( !strncasecmp(op->u.get_para.u.s.scaling_governor,
> - "ondemand", CPUFREQ_NAME_LEN) )
> - ret = get_cpufreq_ondemand_para(
> - &op->u.get_para.u.s.u.ondemand.sampling_rate_max,
> - &op->u.get_para.u.s.u.ondemand.sampling_rate_min,
> - &op->u.get_para.u.s.u.ondemand.sampling_rate,
> - &op->u.get_para.u.s.u.ondemand.up_threshold);
> - }
> -
> - return ret;
> -}
> -
> -static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
> -{
> - struct cpufreq_policy new_policy, *old_policy;
> -
> - old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> - if ( !old_policy )
> - return -EINVAL;
> -
> - memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
> -
> - new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
> - if (new_policy.governor == NULL)
> - return -EINVAL;
> -
> - return __cpufreq_set_policy(old_policy, &new_policy);
> -}
> -
> -static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
> -{
> - int ret = 0;
> - struct cpufreq_policy *policy;
> -
> - policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> -
> - if ( !policy || !policy->governor )
> - return -EINVAL;
> -
> - if ( hwp_active() )
> - return -EOPNOTSUPP;
> -
> - switch(op->u.set_para.ctrl_type)
> - {
> - case SCALING_MAX_FREQ:
> - {
> - struct cpufreq_policy new_policy;
> -
> - memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> - new_policy.max = op->u.set_para.ctrl_value;
> - ret = __cpufreq_set_policy(policy, &new_policy);
> -
> - break;
> - }
> -
> - case SCALING_MIN_FREQ:
> - {
> - struct cpufreq_policy new_policy;
> -
> - memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
> - new_policy.min = op->u.set_para.ctrl_value;
> - ret = __cpufreq_set_policy(policy, &new_policy);
> -
> - break;
> - }
> -
> - case SCALING_SETSPEED:
> - {
> - unsigned int freq =op->u.set_para.ctrl_value;
> -
> - if ( !strncasecmp(policy->governor->name,
> - "userspace", CPUFREQ_NAME_LEN) )
> - ret = write_userspace_scaling_setspeed(op->cpuid, freq);
> - else
> - ret = -EINVAL;
> -
> - break;
> - }
> -
> - case SAMPLING_RATE:
> - {
> - unsigned int sampling_rate = op->u.set_para.ctrl_value;
> -
> - if ( !strncasecmp(policy->governor->name,
> - "ondemand", CPUFREQ_NAME_LEN) )
> - ret = write_ondemand_sampling_rate(sampling_rate);
> - else
> - ret = -EINVAL;
> -
> - break;
> - }
> -
> - case UP_THRESHOLD:
> - {
> - unsigned int up_threshold = op->u.set_para.ctrl_value;
> -
> - if ( !strncasecmp(policy->governor->name,
> - "ondemand", CPUFREQ_NAME_LEN) )
> - ret = write_ondemand_up_threshold(up_threshold);
> - else
> - ret = -EINVAL;
> -
> - break;
> - }
> -
> - default:
> - ret = -EINVAL;
> - break;
> - }
> -
> - return ret;
> -}
> -
> -static int set_cpufreq_cppc(struct xen_sysctl_pm_op *op)
> -{
> - struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> -
> - if ( !policy || !policy->governor )
> - return -ENOENT;
> -
> - if ( !hwp_active() )
> - return -EOPNOTSUPP;
> -
> - return set_hwp_para(policy, &op->u.set_cppc);
> -}
> -
> -int do_pm_op(struct xen_sysctl_pm_op *op)
> -{
> - int ret = 0;
> - const struct processor_pminfo *pmpt;
> -
> - switch ( op->cmd )
> - {
> - case XEN_SYSCTL_pm_op_set_sched_opt_smt:
> - {
> - uint32_t saved_value = sched_smt_power_savings;
> -
> - if ( op->cpuid != 0 )
> - return -EINVAL;
> - sched_smt_power_savings = !!op->u.set_sched_opt_smt;
> - op->u.set_sched_opt_smt = saved_value;
> - return 0;
> - }
> -
> - case XEN_SYSCTL_pm_op_get_max_cstate:
> - BUILD_BUG_ON(XEN_SYSCTL_CX_UNLIMITED != UINT_MAX);
> - if ( op->cpuid == 0 )
> - op->u.get_max_cstate = acpi_get_cstate_limit();
> - else if ( op->cpuid == 1 )
> - op->u.get_max_cstate = acpi_get_csubstate_limit();
> - else
> - ret = -EINVAL;
> - return ret;
> -
> - case XEN_SYSCTL_pm_op_set_max_cstate:
> - if ( op->cpuid == 0 )
> - acpi_set_cstate_limit(op->u.set_max_cstate);
> - else if ( op->cpuid == 1 )
> - acpi_set_csubstate_limit(op->u.set_max_cstate);
> - else
> - ret = -EINVAL;
> - return ret;
> - }
> -
> - if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
> - return -EINVAL;
> - pmpt = processor_pminfo[op->cpuid];
> -
> - switch ( op->cmd & PM_PARA_CATEGORY_MASK )
> - {
> - case CPUFREQ_PARA:
> - if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> - return -ENODEV;
> - if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> - return -EINVAL;
> - break;
> - }
> -
> - switch ( op->cmd )
> - {
> - case GET_CPUFREQ_PARA:
> - {
> - ret = get_cpufreq_para(op);
> - break;
> - }
> -
> - case SET_CPUFREQ_GOV:
> - {
> - ret = set_cpufreq_gov(op);
> - break;
> - }
> -
> - case SET_CPUFREQ_PARA:
> - {
> - ret = set_cpufreq_para(op);
> - break;
> - }
> -
> - case SET_CPUFREQ_CPPC:
> - ret = set_cpufreq_cppc(op);
> - break;
> -
> - case GET_CPUFREQ_AVGFREQ:
> - {
> - op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
> - break;
> - }
> -
> - case XEN_SYSCTL_pm_op_enable_turbo:
> - {
> - ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
> - break;
> - }
> -
> - case XEN_SYSCTL_pm_op_disable_turbo:
> - {
> - ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
> - break;
> - }
> -
> - default:
> - printk("not defined sub-hypercall @ do_pm_op\n");
> - ret = -ENOSYS;
> - break;
> - }
> -
> - return ret;
> -}
> diff --git a/xen/drivers/cpufreq/cpufreq_misc_governors.c b/xen/drivers/cpufreq/cpufreq_misc_governors.c
> index 0327fad23b..e5cb9ab02f 100644
> --- a/xen/drivers/cpufreq/cpufreq_misc_governors.c
> +++ b/xen/drivers/cpufreq/cpufreq_misc_governors.c
> @@ -64,6 +64,7 @@ static int cf_check cpufreq_governor_userspace(
> return ret;
> }
>
> +#ifdef CONFIG_PM_OP
> int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq)
> {
> struct cpufreq_policy *policy;
> @@ -80,6 +81,7 @@ int write_userspace_scaling_setspeed(unsigned int cpu, unsigned int freq)
>
> return __cpufreq_driver_target(policy, freq, CPUFREQ_RELATION_L);
> }
> +#endif /* CONFIG_PM_OP */
>
> static bool __init cf_check
> cpufreq_userspace_handle_option(const char *name, const char *val)
> diff --git a/xen/drivers/cpufreq/cpufreq_ondemand.c b/xen/drivers/cpufreq/cpufreq_ondemand.c
> index 06cfc88d30..0126a3f5d9 100644
> --- a/xen/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/xen/drivers/cpufreq/cpufreq_ondemand.c
> @@ -57,6 +57,7 @@ static struct dbs_tuners {
>
> static DEFINE_PER_CPU(struct timer, dbs_timer);
>
> +#ifdef CONFIG_PM_OP
> int write_ondemand_sampling_rate(unsigned int sampling_rate)
> {
> if ( (sampling_rate > MAX_SAMPLING_RATE / MICROSECS(1)) ||
> @@ -93,6 +94,7 @@ int get_cpufreq_ondemand_para(uint32_t *sampling_rate_max,
>
> return 0;
> }
> +#endif /* CONFIG_PM_OP */
>
> static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
> {
> diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
> index 723045b240..987c3b5929 100644
> --- a/xen/drivers/cpufreq/utility.c
> +++ b/xen/drivers/cpufreq/utility.c
> @@ -224,47 +224,6 @@ int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag)
> return policy->cur;
> }
>
> -int cpufreq_update_turbo(unsigned int cpu, int new_state)
> -{
> - struct cpufreq_policy *policy;
> - int curr_state;
> - int ret = 0;
> -
> - if (new_state != CPUFREQ_TURBO_ENABLED &&
> - new_state != CPUFREQ_TURBO_DISABLED)
> - return -EINVAL;
> -
> - policy = per_cpu(cpufreq_cpu_policy, cpu);
> - if (!policy)
> - return -EACCES;
> -
> - if (policy->turbo == CPUFREQ_TURBO_UNSUPPORTED)
> - return -EOPNOTSUPP;
> -
> - curr_state = policy->turbo;
> - if (curr_state == new_state)
> - return 0;
> -
> - policy->turbo = new_state;
> - if (cpufreq_driver.update)
> - {
> - ret = alternative_call(cpufreq_driver.update, cpu, policy);
> - if (ret)
> - policy->turbo = curr_state;
> - }
> -
> - return ret;
> -}
> -
> -
> -int cpufreq_get_turbo_status(unsigned int cpu)
> -{
> - struct cpufreq_policy *policy;
> -
> - policy = per_cpu(cpufreq_cpu_policy, cpu);
> - return policy && policy->turbo == CPUFREQ_TURBO_ENABLED;
> -}
> -
> /*********************************************************************
> * POLICY *
> *********************************************************************/
> diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
> index 241117a9af..0742aa9f44 100644
> --- a/xen/include/acpi/cpufreq/cpufreq.h
> +++ b/xen/include/acpi/cpufreq/cpufreq.h
> @@ -143,9 +143,6 @@ extern int cpufreq_driver_getavg(unsigned int cpu, unsigned int flag);
> #define CPUFREQ_TURBO_UNSUPPORTED 0
> #define CPUFREQ_TURBO_ENABLED 1
>
> -int cpufreq_update_turbo(unsigned int cpu, int new_state);
> -int cpufreq_get_turbo_status(unsigned int cpu);
> -
> static inline int
> __cpufreq_governor(struct cpufreq_policy *policy, unsigned int event)
> {
> --
> 2.34.1
>
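For readers following the quoted do_pm_op() above, its control flow can be sketched in isolation. This is only an illustration under stated assumptions: every demo_* and DEMO_* name, the command encoding, and the per-CPU tables are hypothetical and are not part of the patch. What the sketch keeps from the original is the ordering: commands that reuse cpuid as a selector return early, the CPU index is validated next, the category mask gates a shared precondition, and only then does the second switch dispatch the command itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: the high nibble is a category, the low nibble a command. */
#define DEMO_CATEGORY_MASK 0xf0u
#define DEMO_FREQ_CATEGORY 0x10u
#define DEMO_GET_FREQ      (DEMO_FREQ_CATEGORY | 0x01u)
#define DEMO_SET_FREQ      (DEMO_FREQ_CATEGORY | 0x02u)
#define DEMO_GLOBAL_QUERY  0x21u /* here cpuid is a selector, not a CPU index */

#define DEMO_NR_CPUS 4u

struct demo_op {
    uint32_t cmd;
    uint32_t cpuid;
    uint32_t value;
};

/* Hypothetical per-CPU state standing in for the policy/pminfo lookups. */
static bool demo_freq_ready[DEMO_NR_CPUS] = { true, true, false, false };
static uint32_t demo_freq[DEMO_NR_CPUS] = { 1000, 1000, 0, 0 };

int demo_pm_op(struct demo_op *op)
{
    /* Step 1: handle commands that must not go through the CPU checks. */
    switch ( op->cmd )
    {
    case DEMO_GLOBAL_QUERY:
        if ( op->cpuid != 0 )
            return -EINVAL;
        op->value = DEMO_NR_CPUS;
        return 0;
    }

    /* Step 2: everything below indexes per-CPU data, so validate first. */
    if ( op->cpuid >= DEMO_NR_CPUS )
        return -EINVAL;

    /* Step 3: one precondition check shared by a whole command category. */
    switch ( op->cmd & DEMO_CATEGORY_MASK )
    {
    case DEMO_FREQ_CATEGORY:
        if ( !demo_freq_ready[op->cpuid] )
            return -ENODEV;
        break;
    }

    /* Step 4: dispatch the individual command. */
    switch ( op->cmd )
    {
    case DEMO_GET_FREQ:
        op->value = demo_freq[op->cpuid];
        return 0;

    case DEMO_SET_FREQ:
        demo_freq[op->cpuid] = op->value;
        return 0;

    default:
        return -ENOSYS;
    }
}
```

With this layout an unknown command on a valid CPU falls through to the final default, while a known command on a CPU that is not ready is rejected by the category check before any handler runs.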
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS
2025-04-21 7:37 ` [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS Penny Zheng
@ 2025-04-21 21:12 ` Stefano Stabellini
2025-04-30 15:35 ` Jan Beulich
0 siblings, 1 reply; 35+ messages in thread
From: Stefano Stabellini @ 2025-04-21 21:12 UTC (permalink / raw)
To: Penny Zheng
Cc: xen-devel, ray.huang, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini
On Mon, 21 Apr 2025, Penny Zheng wrote:
> We introduce a new Kconfig CONFIG_PM_STATS for wrapping all operations
> regarding performance management statistics.
> Most of the code resides in xen/drivers/acpi/pmstat.c, including the
> pm-statistic-related sysctl op: do_get_pm_info().
> CONFIG_PM_STATS shall also depend on CONFIG_SYSCTL.
>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> v1 -> v2:
> - rename to CONFIG_PM_STATS
> - fix indentation and stray semicolon
> - make code movements into a new commit
> - No need to wrap inline functions and declarations
> ---
> v2 -> v3:
> - separate functions related to do_pm_op() into a new commit
> - both braces shall be moved to the line with the closing parenthesis
> ---
> xen/arch/x86/acpi/cpu_idle.c | 2 ++
> xen/common/Kconfig | 8 ++++++++
> xen/common/sysctl.c | 4 ++--
> xen/drivers/acpi/Makefile | 2 +-
> xen/include/acpi/cpufreq/processor_perf.h | 10 ++++++++++
> 5 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
> index 420198406d..b537ac4cd6 100644
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -1487,6 +1487,7 @@ static void amd_cpuidle_init(struct acpi_processor_power *power)
> vendor_override = -1;
> }
>
> +#ifdef CONFIG_PM_STATS
> uint32_t pmstat_get_cx_nr(unsigned int cpu)
> {
> return processor_powers[cpu] ? processor_powers[cpu]->count : 0;
> @@ -1606,6 +1607,7 @@ int pmstat_reset_cx_stat(unsigned int cpu)
> {
> return 0;
> }
> +#endif /* CONFIG_PM_STATS */
>
> void cpuidle_disable_deep_cstate(void)
> {
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index ca1f692487..d8e242eebc 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -600,4 +600,12 @@ config PM_OP
> help
> This option shall enable userspace performance management control
> to do power/performance analyzing and tuning.
> +
> +config PM_STATS
> + bool "Enable Performance Management Statistics"
> + depends on ACPI && HAS_CPUFREQ && SYSCTL
> + default y
> + help
> + Enable collection of performance management statistics to aid in
> + analyzing and tuning power/performance characteristics of the system
> endmenu
> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index 4ab827b694..baaad3bd42 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -177,11 +177,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
> op->u.availheap.avail_bytes <<= PAGE_SHIFT;
> break;
>
> -#if defined (CONFIG_ACPI) && defined (CONFIG_HAS_CPUFREQ)
> +#ifdef CONFIG_PM_STATS
> case XEN_SYSCTL_get_pmstat:
> ret = do_get_pm_info(&op->u.get_pmstat);
> break;
> -#endif
> +#endif /* CONFIG_PM_STATS */
>
> #ifdef CONFIG_PM_OP
> case XEN_SYSCTL_pm_op:
> diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
> index e1f84a4468..b52b006100 100644
> --- a/xen/drivers/acpi/Makefile
> +++ b/xen/drivers/acpi/Makefile
> @@ -5,7 +5,7 @@ obj-$(CONFIG_X86) += apei/
> obj-bin-y += tables.init.o
> obj-$(CONFIG_ACPI_NUMA) += numa.o
> obj-y += osl.o
> -obj-$(CONFIG_HAS_CPUFREQ) += pmstat.o
> +obj-$(CONFIG_PM_STATS) += pmstat.o
> obj-$(CONFIG_PM_OP) += pm_op.o
>
> obj-$(CONFIG_X86) += hwregs.o
> diff --git a/xen/include/acpi/cpufreq/processor_perf.h b/xen/include/acpi/cpufreq/processor_perf.h
> index 6de43f8602..a9a3b7a372 100644
> --- a/xen/include/acpi/cpufreq/processor_perf.h
> +++ b/xen/include/acpi/cpufreq/processor_perf.h
> @@ -9,9 +9,19 @@
>
> unsigned int powernow_register_driver(void);
> unsigned int get_measured_perf(unsigned int cpu, unsigned int flag);
> +#ifdef CONFIG_PM_STATS
> void cpufreq_statistic_update(unsigned int cpu, uint8_t from, uint8_t to);
> int cpufreq_statistic_init(unsigned int cpu);
> void cpufreq_statistic_exit(unsigned int cpu);
> +#else
> +static inline void cpufreq_statistic_update(unsigned int cpu, uint8_t from,
> + uint8_t to) {}
> +static inline int cpufreq_statistic_init(unsigned int cpu)
> +{
> + return 0;
> +}
> +static inline void cpufreq_statistic_exit(unsigned int cpu) {}
> +#endif /* CONFIG_PM_STATS */
>
> int cpufreq_limit_change(unsigned int cpu);
>
> --
> 2.34.1
>
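The processor_perf.h hunk above relies on a common pattern: real declarations when the option is on, empty static inline stubs when it is off, so callers need no #ifdef of their own. A minimal standalone sketch of that pattern follows. It is an assumption-laden illustration, not code from the patch: DEMO_PM_STATS stands in for CONFIG_PM_STATS, and all demo_* names are hypothetical.

```c
#include <stdint.h>

/*
 * DEMO_PM_STATS is a hypothetical stand-in for CONFIG_PM_STATS; build with
 * -DDEMO_PM_STATS to select the real implementation.
 */
#ifdef DEMO_PM_STATS

static unsigned int demo_transitions;

/* Real implementation: count every actual state change. */
static inline void demo_statistic_update(unsigned int cpu, uint8_t from,
                                         uint8_t to)
{
    (void)cpu;
    if ( from != to )
        demo_transitions++;
}

static inline unsigned int demo_statistic_count(void)
{
    return demo_transitions;
}

#else /* !DEMO_PM_STATS */

/* Stubs: same signatures, no work, so callers compile unchanged. */
static inline void demo_statistic_update(unsigned int cpu, uint8_t from,
                                         uint8_t to)
{
    (void)cpu;
    (void)from;
    (void)to;
}

static inline unsigned int demo_statistic_count(void)
{
    return 0;
}

#endif /* DEMO_PM_STATS */

/* The caller is identical in both configurations. */
unsigned int demo_change_state(unsigned int cpu, uint8_t from, uint8_t to)
{
    demo_statistic_update(cpu, from, to);
    return demo_statistic_count();
}
```

Because the stubs are static inline and empty, a compiler would normally drop the calls entirely when the option is off, which is why the declarations themselves do not need wrapping at each call site.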
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl
2025-04-21 7:37 ` [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl Penny Zheng
@ 2025-04-21 21:19 ` Stefano Stabellini
2025-04-30 15:45 ` Jan Beulich
1 sibling, 0 replies; 35+ messages in thread
From: Stefano Stabellini @ 2025-04-21 21:19 UTC (permalink / raw)
To: Penny Zheng
Cc: xen-devel, ray.huang, Stefano Stabellini, Julien Grall,
Bertrand Marquis, Michal Orzel, Volodymyr Babchuk, Andrew Cooper,
Anthony PERARD, Jan Beulich, Roger Pau Monné,
Alistair Francis, Bob Eshleman, Connor Davis, Oleksii Kurochko,
Stefano Stabellini, Sergiy Kibrik
On Mon, 21 Apr 2025, Penny Zheng wrote:
> Function arch_do_sysctl is to perform arch-specific sysctl op.
> Some functions, like psr_get_info for x86, DTB overlay support for arm,
> are solely available through sysctl op, then they all shall be wrapped
> with CONFIG_SYSCTL
>
> Also, remove all #ifdef CONFIG_SYSCTL-s in arch-specific sysctl.c, as
> we put the guardian in Makefile for the whole file.
> Since PV_SHIM_EXCLUSIVE needs sorting as a prereq in the future, we move
> obj-$(CONFIG_SYSCTL) += sysctl.o out of PV_SHIM_EXCLUSIVE condition.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
> Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> - use "depends on" for config OVERLAY_DTB
> - no need to wrap declaration
> - add transient #ifdef in sysctl.c for correct compilation
> ---
> v2 -> v3
> - move obj-$(CONFIG_SYSCTL) += sysctl.o out of PV_SHIM_EXCLUSIVE condition
> - move copyback out of #ifdef
> - add #else process for default label
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 20/20] xen/sysctl: wrap around sysctl hypercall
2025-04-21 7:37 ` [PATCH v3 20/20] xen/sysctl: wrap around sysctl hypercall Penny Zheng
@ 2025-04-21 21:21 ` Stefano Stabellini
0 siblings, 0 replies; 35+ messages in thread
From: Stefano Stabellini @ 2025-04-21 21:21 UTC (permalink / raw)
To: Penny Zheng
Cc: xen-devel, ray.huang, Stefano Stabellini, Andrew Cooper,
Anthony PERARD, Michal Orzel, Jan Beulich, Julien Grall,
Roger Pau Monné, Stefano Stabellini, Sergiy Kibrik
On Mon, 21 Apr 2025, Penny Zheng wrote:
> From: Stefano Stabellini <stefano.stabellini@amd.com>
>
> Wrap sysctl hypercall def and sysctl.o with CONFIG_SYSCTL, and since
> PV_SHIM_EXCLUSIVE needs sorting as a prereq in the future, we move
> them out of PV_SHIM_EXCLUSIVE condition at the same time.
>
> We also need to remove all transient "#ifdef CONFIG_SYSCTL"-s in sysctl.c.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
> Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> v1 -> v2:
> - remove all transient "#ifdef CONFIG_SYSCTL"-s in sysctl.c
> ---
> v2 -> v3:
> - move out of CONFIG_PV_SHIM_EXCLUSIVE condition
> ---
> xen/common/Makefile | 2 +-
> xen/common/sysctl.c | 12 ------------
> xen/include/hypercall-defs.c | 8 ++++++--
> 3 files changed, 7 insertions(+), 15 deletions(-)
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL
2025-04-21 7:37 ` [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL Penny Zheng
2025-04-21 20:54 ` Stefano Stabellini
@ 2025-04-30 15:05 ` Jan Beulich
2025-04-30 15:20 ` Jan Beulich
2 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:05 UTC (permalink / raw)
To: Penny Zheng
Cc: ray.huang, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Sergiy Kibrik, xen-devel
On 21.04.2025 09:37, Penny Zheng wrote:
> From: Stefano Stabellini <stefano.stabellini@amd.com>
>
> We introduce a new Kconfig CONFIG_SYSCTL, which shall only be disabled
> on some dom0less systems, to reduce Xen footprint.
What about the PV shim on x86? That's also relevant ...
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -581,4 +581,15 @@ config BUDDY_ALLOCATOR_SIZE
> Amount of memory reserved for the buddy allocator to serve Xen heap,
> working alongside the colored one.
>
> +menu "Supported hypercall interfaces"
> + visible if EXPERT
> +
> +config SYSCTL
> + bool "Enable sysctl hypercall"
> + default y
> + help
> + This option shall only be disabled on some dom0less systems,
> + to reduce Xen footprint.
... for the help text here then.
Jan
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE"
2025-04-21 7:37 ` [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE" Penny Zheng
@ 2025-04-30 15:16 ` Jan Beulich
2025-05-21 3:32 ` Penny, Zheng
0 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:16 UTC (permalink / raw)
To: Penny Zheng
Cc: ray.huang, Andrew Cooper, Roger Pau Monné, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel
On 21.04.2025 09:37, Penny Zheng wrote:
> Remove all "depends on !PV_SHIM_EXCLUSIVE" (also the functionally
> equivalent "if !...") in Kconfig file, since a negative dependency will badly
> affect allyesconfig.
> This commit is based on "x86: provide an inverted Kconfig control for
> shim-exclusive mode"[1]
Recall me asking to avoid wording like "This commit" in commit messages?
Also personally I consider "is based on" ambiguous: It could also mean the
one here needs to go on top of that other one. It's not entirely clear to
me what kind of (relevant) information you're trying to convey with this
sentence. Surely you didn't really need to even look at that patch of mine
to find all the !PV_SHIM_EXCLUSIVE; that's a matter of a simple grep.
> ---
> xen/arch/x86/Kconfig | 4 ----
> xen/arch/x86/hvm/Kconfig | 1 -
> xen/drivers/video/Kconfig | 4 ++--
> 3 files changed, 2 insertions(+), 7 deletions(-)
With the changes here, what does this mean for the in-tree shim build, or
any others using xen/arch/x86/configs/pvshim_defconfig as the basis? You
aren't altering that file, so I expect the binary produced will change
significantly (when it shouldn't, unless explicitly stated otherwise in
the description, which may be warranted for SHADOW_PAGING).
Jan
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL
2025-04-21 7:37 ` [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL Penny Zheng
2025-04-21 20:54 ` Stefano Stabellini
2025-04-30 15:05 ` Jan Beulich
@ 2025-04-30 15:20 ` Jan Beulich
2025-04-30 15:48 ` Jan Beulich
2 siblings, 1 reply; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:20 UTC (permalink / raw)
To: Penny Zheng
Cc: ray.huang, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Sergiy Kibrik, xen-devel
On 21.04.2025 09:37, Penny Zheng wrote:
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -581,4 +581,15 @@ config BUDDY_ALLOCATOR_SIZE
> Amount of memory reserved for the buddy allocator to serve Xen heap,
> working alongside the colored one.
>
> +menu "Supported hypercall interfaces"
> + visible if EXPERT
> +
> +config SYSCTL
> + bool "Enable sysctl hypercall"
> + default y
Oh, and - just to re-iterate what I said earlier in the context of another patch:
Imo you would better introduce the option without prompt (simply "def_bool y"),
and make it user selectable only in the final patch. That'll eliminate the need
for transient "#ifdef CONFIG_SYSCTL", i.e. reduce overall code churn.
Jan
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 09/20] xen/pmstat: consolidate code into pmstat.c
2025-04-21 7:37 ` [PATCH v3 09/20] xen/pmstat: consolidate code into pmstat.c Penny Zheng
@ 2025-04-30 15:24 ` Jan Beulich
0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:24 UTC (permalink / raw)
To: Penny Zheng; +Cc: ray.huang, Stefano Stabellini, xen-devel
On 21.04.2025 09:37, Penny Zheng wrote:
> We move the following functions into drivers/acpi/pmstat.c, as they
> are all designed for performance statistics:
> - cpufreq_residency_update
> - cpufreq_statistic_reset
> - cpufreq_statistic_update
> - cpufreq_statistic_init
> - cpufreq_statistic_exit
> Consequently, variable "cpufreq_statistic_data" and "cpufreq_statistic_lock"
> shall become static.
> We also move out acpi_set_pdc_bits(), as it is the handler for sub-hypercall
> XEN_PM_PDC, and shall stay with the other handlers together in
> drivers/cpufreq/cpufreq.c.
>
> Various style corrections shall be applied at the same time while moving these
> functions, including:
> - brace for if() and for() shall live on a separate line
> - add extra space before and after bracket of if() and for()
> - use array notation
> - convert uint32_t into unsigned int
> - convert u32 into uint32_t
>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
This again looks like it might be independent of earlier patches, and hence
might be able to go in right away. Please may I remind you to clarify such
aspects while submitting?
Jan
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP
2025-04-21 7:37 ` [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP Penny Zheng
2025-04-21 21:09 ` Stefano Stabellini
@ 2025-04-30 15:32 ` Jan Beulich
1 sibling, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:32 UTC (permalink / raw)
To: Penny Zheng
Cc: ray.huang, Andrew Cooper, Roger Pau Monné, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, xen-devel
On 21.04.2025 09:37, Penny Zheng wrote:
> We move the following functions into a new file drivers/acpi/pm_op.c, as
> they are all better suited to performance control and are only called by
> do_pm_op():
> - get_cpufreq_para()
> - set_cpufreq_para()
> - set_cpufreq_gov()
> - set_cpufreq_cppc()
> - cpufreq_driver_getavg()
> - cpufreq_update_turbo()
> - cpufreq_get_turbo_status()
> We introduce a new Kconfig CONFIG_PM_OP to wrap the new file.
>
> Also, although the following helpers are only called by do_pm_op(), they
> depend on local variables, so we wrap them with CONFIG_PM_OP in place:
> - write_userspace_scaling_setspeed()
> - write_ondemand_sampling_rate()
> - write_ondemand_up_threshold()
> - get_cpufreq_ondemand_para()
> - cpufreq_driver.update()
> - get_hwp_para()
> Various style corrections shall be applied at the same time while moving these
> functions, including:
> - add extra space before and after bracket of if() and switch()
> - fix indentation
>
> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
> ---
> v2 -> v3
> - new commit
> ---
> xen/arch/x86/acpi/cpufreq/hwp.c | 6 +
> xen/arch/x86/acpi/cpufreq/powernow.c | 4 +
> xen/common/Kconfig | 7 +
> xen/common/sysctl.c | 4 +-
> xen/drivers/acpi/Makefile | 1 +
> xen/drivers/acpi/pm_op.c | 409 +++++++++++++++++++
> xen/drivers/acpi/pmstat.c | 357 ----------------
> xen/drivers/cpufreq/cpufreq_misc_governors.c | 2 +
> xen/drivers/cpufreq/cpufreq_ondemand.c | 2 +
> xen/drivers/cpufreq/utility.c | 41 --
> xen/include/acpi/cpufreq/cpufreq.h | 3 -
> 11 files changed, 434 insertions(+), 402 deletions(-)
> create mode 100644 xen/drivers/acpi/pm_op.c
I'm pretty sure I said "pm-op.c" in replies, maybe even more than once. Now
you still used an underscore instead of the dash that's preferred.
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -181,13 +181,15 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
> case XEN_SYSCTL_get_pmstat:
> ret = do_get_pm_info(&op->u.get_pmstat);
> break;
> +#endif
>
> +#ifdef CONFIG_PM_OP
> case XEN_SYSCTL_pm_op:
> ret = do_pm_op(&op->u.pm_op);
> if ( ret == -EAGAIN )
> copyback = 1;
> break;
> -#endif
> +#endif /* CONFIG_PM_OP */
Please can you be consistent here with the comment (or not) on the #endif?
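To illustrate the request, here is a minimal sketch in which every #endif in the switch carries the same style of trailing comment. The CONFIG_DEMO_* macros, the enum and demo_dispatch() are hypothetical stand-ins, not names from the patch:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical feature switches, standing in for Kconfig-generated macros. */
#define CONFIG_DEMO_STATS 1
#define CONFIG_DEMO_OP 1

/* Hypothetical command codes, used only for this sketch. */
enum demo_cmd { DEMO_GET_STATS, DEMO_PM_OP, DEMO_OTHER };

static int demo_dispatch(enum demo_cmd cmd)
{
    int ret;

    switch ( cmd )
    {
#ifdef CONFIG_DEMO_STATS
    case DEMO_GET_STATS:
        ret = 1;
        break;
#endif /* CONFIG_DEMO_STATS */

#ifdef CONFIG_DEMO_OP
    case DEMO_PM_OP:
        ret = 2;
        break;
#endif /* CONFIG_DEMO_OP */

    default:
        /* Commands compiled out (or unknown) end up here. */
        ret = -ENOSYS;
        break;
    }

    return ret;
}
```

Dropping the comment from both #endif lines would be equally consistent; the point is only that adjacent blocks follow the same convention.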
> +int do_pm_op(struct xen_sysctl_pm_op *op)
> +{
> + int ret = 0;
> + const struct processor_pminfo *pmpt;
> +
> + switch ( op->cmd )
> + {
> + case XEN_SYSCTL_pm_op_set_sched_opt_smt:
> + {
> + uint32_t saved_value = sched_smt_power_savings;
> +
> + if ( op->cpuid != 0 )
> + return -EINVAL;
> + sched_smt_power_savings = !!op->u.set_sched_opt_smt;
> + op->u.set_sched_opt_smt = saved_value;
> + return 0;
> + }
> +
> + case XEN_SYSCTL_pm_op_get_max_cstate:
> + BUILD_BUG_ON(XEN_SYSCTL_CX_UNLIMITED != UINT_MAX);
> + if ( op->cpuid == 0 )
> + op->u.get_max_cstate = acpi_get_cstate_limit();
> + else if ( op->cpuid == 1 )
> + op->u.get_max_cstate = acpi_get_csubstate_limit();
> + else
> + ret = -EINVAL;
> + return ret;
> +
> + case XEN_SYSCTL_pm_op_set_max_cstate:
> + if ( op->cpuid == 0 )
> + acpi_set_cstate_limit(op->u.set_max_cstate);
> + else if ( op->cpuid == 1 )
> + acpi_set_csubstate_limit(op->u.set_max_cstate);
> + else
> + ret = -EINVAL;
> + return ret;
> + }
> +
> + if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
> + return -EINVAL;
> + pmpt = processor_pminfo[op->cpuid];
> +
> + switch ( op->cmd & PM_PARA_CATEGORY_MASK )
> + {
> + case CPUFREQ_PARA:
> + if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
> + return -ENODEV;
> + if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
> + return -EINVAL;
> + break;
> + }
> +
> + switch ( op->cmd )
> + {
> + case GET_CPUFREQ_PARA:
> + {
> + ret = get_cpufreq_para(op);
> + break;
> + }
> +
> + case SET_CPUFREQ_GOV:
> + {
> + ret = set_cpufreq_gov(op);
> + break;
> + }
> +
> + case SET_CPUFREQ_PARA:
> + {
> + ret = set_cpufreq_para(op);
> + break;
> + }
> +
> + case SET_CPUFREQ_CPPC:
> + ret = set_cpufreq_cppc(op);
> + break;
> +
> + case GET_CPUFREQ_AVGFREQ:
> + {
> + op->u.get_avgfreq = cpufreq_driver_getavg(op->cpuid, USR_GETAVG);
> + break;
> + }
> +
> + case XEN_SYSCTL_pm_op_enable_turbo:
> + {
> + ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
> + break;
> + }
> +
> + case XEN_SYSCTL_pm_op_disable_turbo:
> + {
> + ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_DISABLED);
> + break;
> + }
Please can you drop all the unnecessary inner curly braces here? They hamper
readability without - imo - providing any gain at all.
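As a rough sketch of the difference (all names here are hypothetical and not taken from the patch): case blocks that only assign and break need no braces, while a case that declares a local variable, like the set_sched_opt_smt one further up, still wants them.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command codes, used only for this sketch. */
enum demo_cmd { DEMO_GET, DEMO_SET, DEMO_SWAP };

static int demo_op(enum demo_cmd cmd, int *value, int arg)
{
    int ret = 0;

    switch ( cmd )
    {
    case DEMO_GET:
        /* Single statement plus break: no inner braces needed. */
        ret = *value;
        break;

    case DEMO_SET:
        *value = arg;
        break;

    case DEMO_SWAP:
    {
        /* Braces are warranted here: the case declares a local variable. */
        int saved = *value;

        *value = arg;
        ret = saved;
        break;
    }

    default:
        ret = -EINVAL;
        break;
    }

    return ret;
}
```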
With all of the adjustments:
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan
* Re: [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS
2025-04-21 21:12 ` Stefano Stabellini
@ 2025-04-30 15:35 ` Jan Beulich
0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:35 UTC (permalink / raw)
To: Penny Zheng
Cc: xen-devel, ray.huang, Andrew Cooper, Roger Pau Monné,
Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini
On 21.04.2025 23:12, Stefano Stabellini wrote:
> On Mon, 21 Apr 2025, Penny Zheng wrote:
>> We introduce a new Kconfig CONFIG_PM_STATS for wrapping all operations
>> regarding performance management statistics.
>> Most of the code resides in xen/drivers/acpi/pmstat.c, including the
>> pm-statistic-related sysctl op: do_get_pm_info().
>> CONFIG_PM_STATS shall also depend on CONFIG_SYSCTL.
>>
>> Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
* Re: [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl
2025-04-21 7:37 ` [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl Penny Zheng
2025-04-21 21:19 ` Stefano Stabellini
@ 2025-04-30 15:45 ` Jan Beulich
1 sibling, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:45 UTC (permalink / raw)
To: Penny Zheng
Cc: ray.huang, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Michal Orzel, Volodymyr Babchuk, Andrew Cooper, Anthony PERARD,
Roger Pau Monné, Alistair Francis, Bob Eshleman,
Connor Davis, Oleksii Kurochko, Stefano Stabellini, Sergiy Kibrik,
xen-devel
On 21.04.2025 09:37, Penny Zheng wrote:
> --- a/xen/arch/riscv/stubs.c
> +++ b/xen/arch/riscv/stubs.c
> @@ -315,13 +315,13 @@ unsigned long raw_copy_from_guest(void *to, const void __user *from,
>
> /* sysctl.c */
>
> +#ifdef CONFIG_SYSCTL
> long arch_do_sysctl(struct xen_sysctl *sysctl,
> XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
> {
> BUG_ON("unimplemented");
> }
>
> -#ifdef CONFIG_SYSCTL
> void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
> {
> BUG_ON("unimplemented");
Looks like the #ifdef would better move ahead of the comment, too.
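A minimal sketch of the suggested layout, assuming the section comment describes only code that lives inside the conditional. CONFIG_DEMO_SYSCTL and both function names are hypothetical placeholders, and the closing #endif position is an assumption as well:

```c
#include <assert.h>

/* Hypothetical switch, standing in for the Kconfig-generated macro. */
#define CONFIG_DEMO_SYSCTL 1

#ifdef CONFIG_DEMO_SYSCTL

/* sysctl.c */

/* The section comment above now sits inside the guard it belongs to. */
static long demo_arch_do_sysctl(void)
{
    return 0;
}

static int demo_arch_do_physinfo(void)
{
    return 1;
}

#endif /* CONFIG_DEMO_SYSCTL */
```

With the option disabled, neither the comment nor the stubs remain, so no orphaned section header is left behind.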
Preferably with that (and notwithstanding any changes resulting from
the comment on patch 02):
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan
* Re: [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL
2025-04-30 15:20 ` Jan Beulich
@ 2025-04-30 15:48 ` Jan Beulich
0 siblings, 0 replies; 35+ messages in thread
From: Jan Beulich @ 2025-04-30 15:48 UTC (permalink / raw)
To: Penny Zheng
Cc: ray.huang, Stefano Stabellini, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Roger Pau Monné,
Stefano Stabellini, Sergiy Kibrik, xen-devel
On 30.04.2025 17:20, Jan Beulich wrote:
> On 21.04.2025 09:37, Penny Zheng wrote:
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -581,4 +581,15 @@ config BUDDY_ALLOCATOR_SIZE
>> Amount of memory reserved for the buddy allocator to serve Xen heap,
>> working alongside the colored one.
>>
>> +menu "Supported hypercall interfaces"
>> + visible if EXPERT
>> +
>> +config SYSCTL
>> + bool "Enable sysctl hypercall"
>> + default y
>
> Oh, and - just to re-iterate what I said earlier in the context of another patch:
> Imo you would better introduce the option without prompt (simply "def_bool y"),
> and make it user selectable only in the final patch. That'll eliminate the need
> for transient "#ifdef CONFIG_SYSCTL", i.e. reduce overall code churn.
Moving further through the series I noticed that omitting the prompt here will
have an effect on the shim: it'll transiently (until the final patch of the
series) be built with all of the sysctl code again. I think that's tolerable
as long as that is then "cured" again in the final patch. But it of course
needs calling out, to make people explicitly aware.
Jan
* RE: [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE"
2025-04-30 15:16 ` Jan Beulich
@ 2025-05-21 3:32 ` Penny, Zheng
0 siblings, 0 replies; 35+ messages in thread
From: Penny, Zheng @ 2025-05-21 3:32 UTC (permalink / raw)
To: Jan Beulich
Cc: Huang, Ray, Andrew Cooper, Roger Pau Monné, Anthony PERARD,
Orzel, Michal, Julien Grall, Stefano Stabellini,
xen-devel@lists.xenproject.org
[Public]
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Wednesday, April 30, 2025 11:17 PM
> To: Penny, Zheng <penny.zheng@amd.com>
> Cc: Huang, Ray <Ray.Huang@amd.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>;
> Anthony PERARD <anthony.perard@vates.tech>; Orzel, Michal
> <Michal.Orzel@amd.com>; Julien Grall <julien@xen.org>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v3 01/20] xen/x86: remove "depends
> on !PV_SHIM_EXCLUSIVE"
>
> On 21.04.2025 09:37, Penny Zheng wrote:
> > Remove all "depends on !PV_SHIM_EXCLUSIVE" (also the functionally
> > equivalent "if !...") in Kconfig file, since negative dependency will
> > badly affect allyesconfig.
> > This commit is based on "x86: provide an inverted Kconfig control for
> > shim-exclusive mode"[1]
>
> Recall me asking to avoid wording like "This commit" in commit messages?
> Also personally I consider "is based on" ambiguous: It could also mean the one here
> needs to go on top of that other one. It's not entirely clear to me what kind of
> (relevant) information you're trying to convey with this sentence. Surely you didn't
> really need to even look at that patch of mine to find all the !PV_SHIM_EXCLUSIVE;
> that's a matter of a simple grep.
>
> > ---
> > xen/arch/x86/Kconfig | 4 ----
> > xen/arch/x86/hvm/Kconfig | 1 -
> > xen/drivers/video/Kconfig | 4 ++--
> > 3 files changed, 2 insertions(+), 7 deletions(-)
>
> With the changes here, what does this mean for the in-tree shim build, or any others
> using xen/arch/x86/configs/pvshim_defconfig as the basis? You aren't altering that
> file, so I expect the binary produced will change significantly (when it shouldn't,
> unless explicitly stated otherwise in the description, which may be warranted for
> SHADOW_PAGING).
>
Yes, I missed the changes in the defconfig.
I'll explicitly set the above options in xen/arch/x86/configs/pvshim_defconfig.
For SHADOW_PAGING and TBOOT, maybe we should add back "default y", otherwise
x86_64_defconfig will change...
> Jan
Thread overview: 35+ messages
2025-04-21 7:37 [PATCH v3 00/20] xen: introduce CONFIG_SYSCTL Penny Zheng
2025-04-21 7:37 ` [PATCH v3 01/20] xen/x86: remove "depends on !PV_SHIM_EXCLUSIVE" Penny Zheng
2025-04-30 15:16 ` Jan Beulich
2025-05-21 3:32 ` Penny, Zheng
2025-04-21 7:37 ` [PATCH v3 02/20] xen: introduce CONFIG_SYSCTL Penny Zheng
2025-04-21 20:54 ` Stefano Stabellini
2025-04-30 15:05 ` Jan Beulich
2025-04-30 15:20 ` Jan Beulich
2025-04-30 15:48 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 03/20] xen/xsm: wrap around xsm_sysctl with CONFIG_SYSCTL Penny Zheng
2025-04-21 7:37 ` [PATCH v3 04/20] xen/sysctl: wrap around XEN_SYSCTL_readconsole Penny Zheng
2025-04-21 7:37 ` [PATCH v3 05/20] xen/sysctl: make CONFIG_TRACEBUFFER depend on CONFIG_SYSCTL Penny Zheng
2025-04-21 7:37 ` [PATCH v3 06/20] xen/sysctl: wrap around XEN_SYSCTL_sched_id Penny Zheng
2025-04-21 7:37 ` [PATCH v3 07/20] xen/sysctl: wrap around XEN_SYSCTL_perfc_op Penny Zheng
2025-04-21 7:37 ` [PATCH v3 08/20] xen/sysctl: wrap around XEN_SYSCTL_lockprof_op Penny Zheng
2025-04-21 7:37 ` [PATCH v3 09/20] xen/pmstat: consolidate code into pmstat.c Penny Zheng
2025-04-30 15:24 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 10/20] xen/pmstat: introduce CONFIG_PM_OP Penny Zheng
2025-04-21 21:09 ` Stefano Stabellini
2025-04-30 15:32 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 11/20] xen/sysctl: introduce CONFIG_PM_STATS Penny Zheng
2025-04-21 21:12 ` Stefano Stabellini
2025-04-30 15:35 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 12/20] xen/sysctl: wrap around XEN_SYSCTL_page_offline_op Penny Zheng
2025-04-21 7:37 ` [PATCH v3 13/20] xen/sysctl: wrap around XEN_SYSCTL_cpupool_op Penny Zheng
2025-04-21 7:37 ` [PATCH v3 14/20] xen/sysctl: wrap around XEN_SYSCTL_scheduler_op Penny Zheng
2025-04-21 7:37 ` [PATCH v3 15/20] xen: make avail_domheap_pages() inlined into get_outstanding_claims() Penny Zheng
2025-04-21 7:37 ` [PATCH v3 16/20] xen/sysctl: wrap around XEN_SYSCTL_physinfo Penny Zheng
2025-04-21 7:37 ` [PATCH v3 17/20] xen/sysctl: make CONFIG_COVERAGE depend on CONFIG_SYSCTL Penny Zheng
2025-04-21 7:37 ` [PATCH v3 18/20] xen/sysctl: make CONFIG_LIVEPATCH " Penny Zheng
2025-04-21 7:37 ` [PATCH v3 19/20] xen/sysctl: wrap around arch-specific arch_do_sysctl Penny Zheng
2025-04-21 21:19 ` Stefano Stabellini
2025-04-30 15:45 ` Jan Beulich
2025-04-21 7:37 ` [PATCH v3 20/20] xen/sysctl: wrap around sysctl hypercall Penny Zheng
2025-04-21 21:21 ` Stefano Stabellini