* [RFC V3 0/4] arm_scmi: vendors: ARM SCMI Qualcomm Vendor Protocol
@ 2024-07-02 19:14 Sibi Sankar
2024-07-02 19:14 ` [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM " Sibi Sankar
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-02 19:14 UTC (permalink / raw)
To: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, quic_sibis, conor+dt
The SCMI QCOM vendor protocol provides a generic way of exposing a
number of Qualcomm SoC specific features (like memory bus scaling)
through a mixture of pre-determined algorithm strings and param_id
pairs hosted on the SCMI controller. Introduce a client driver that
uses the memlat algorithm string hosted on QCOM SCMI Vendor Protocol
to detect memory latency workloads and control frequency/level of
the various memory buses (DDR/LLCC/DDR_QOS).
Generic QCOM Vendor protocol background:
It was found that a lot of the vendor protocol used internally was
for debug/internal development purposes that would either be super
SoC specific or had to be disabled because of some features being
fused out during production. This lead to a large number of vendor
protocol numbers being quickly consumed and were never released
either. Using a generic vendor protocol with functionality abstracted
behind algorithm strings gave us the flexibility of allowing such
functionality exist during initial development/debugging while
still being able to expose functionality like memlat once they have
matured enough. The param-ids are certainly expected to act as ABI
for algorithms strings like MEMLAT.
Thanks in advance for taking time to review the series.
Opens:
* opp-tables are used but they don't get to be added to the scmi device (thus we
rely on a lot of manual parsing) because the memlat client driver doesn't vote
on these resources clocks/interconnects/power-domain from the kernel and some
of the resources aren't modeled in the first place like DDR_QOS.
To Dos:
* More documentation [Sudeep/Dmitry/Konrad]
* Use alternate bindings instead of freq-table-Hz. [Dmitry]
* Prevent duplication of code using vendor protocol driver. [Dmitry]
V2:
* Drop container dvfs memlat container node. [Rob]
* Move scmi-memlat.yaml to protocol level given that a lot of vendors might end up
using the same protocol number. [Rob]
* Replace qcom,cpulist with the standard "cpus" property. [Rob]
* Fix up compute-type/ipm-ceil required. [Rob]
* Make driver changes to the accommodate bindings changes. [Rob]
* Minor fixups in subjects/coverletter.
* Minor style fixes in dts.
V1:
* Add missing bindings for the protocol. [Konrad/Dmitry]
* Use alternate bindings. [Dmitry/Konrad]
* Rebase on top of Cristian's "SCMI multiple vendor protocol support" series. [Cristian]
* Add more documentation wherever possible. [Sudeep]
* Replace pr_err/info with it's dev equivalents.
* Mixed tabs and initialization cleanups in the memlat driver. [Konrad]
* Commit message update for the memlat driver. [Dmitry]
* Cleanups/Fixes suggested for the client driver. [Dmitry/Konrad/Cristian]
* Use opp-tables instead of memfreq-tbl. [Dmitry/Konrad]
* Detect physical cpu to deal with variants with reduced cpu count.
* Add support for DDR_QOS mem_type.
Sibi Sankar (4):
dt-bindings: firmware: Document bindings for ARM SCMI QCOM Vendor
Protocol
firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0
soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
arm64: dts: qcom: x1e80100: Enable LLCC/DDR/DDR_QOS dvfs
.../bindings/firmware/arm,scmi.yaml | 15 +
.../bindings/soc/qcom/qcom,scmi-memlat.yaml | 242 ++++++++
arch/arm64/boot/dts/qcom/x1e80100.dtsi | 138 +++++
drivers/firmware/arm_scmi/vendors/Kconfig | 12 +
drivers/firmware/arm_scmi/vendors/Makefile | 2 +-
.../arm_scmi/vendors/qcom_scmi_vendor.c | 184 ++++++
drivers/soc/qcom/Kconfig | 12 +
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/qcom_scmi_client.c | 583 ++++++++++++++++++
include/dt-bindings/soc/qcom,scmi-vendor.h | 22 +
include/linux/qcom_scmi_vendor.h | 39 ++
11 files changed, 1249 insertions(+), 1 deletion(-)
create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
create mode 100644 drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
create mode 100644 drivers/soc/qcom/qcom_scmi_client.c
create mode 100644 include/dt-bindings/soc/qcom,scmi-vendor.h
create mode 100644 include/linux/qcom_scmi_vendor.h
--
2.34.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM Vendor Protocol
2024-07-02 19:14 [RFC V3 0/4] arm_scmi: vendors: ARM SCMI Qualcomm Vendor Protocol Sibi Sankar
@ 2024-07-02 19:14 ` Sibi Sankar
2024-07-09 16:32 ` Cristian Marussi
2024-07-02 19:14 ` [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0 Sibi Sankar
` (2 subsequent siblings)
3 siblings, 1 reply; 17+ messages in thread
From: Sibi Sankar @ 2024-07-02 19:14 UTC (permalink / raw)
To: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, quic_sibis, conor+dt
Document the various memory buses that can be monitored and scaled by the
memory latency governor hosted by the ARM SCMI QCOM Vendor protocol v1.0.
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
---
Adding a reg property in scmi-memlat.yaml seems incorrect/superfluous
but without it I see the following errors:
Err Logs:
protocol@80: 'reg' does not match any of the regexes: '^memory-[0-9]$', 'pinctrl-[0-9]+'
protocol@80: Unevaluated properties are not allowed ('memory-0', 'memory-1', 'memory-2' were unexpected)
v2:
* Drop container dvfs memlat container node. [Rob]
* Move scmi-memlat.yaml to protocol level given that a lot of vendors might end up
using the same protocol number. [Rob]
* Replace qcom,cpulist with the standard "cpus" property. [Rob]
* Fix up compute-type/ipm-ceil required. [Rob]
.../bindings/firmware/arm,scmi.yaml | 15 ++
.../bindings/soc/qcom/qcom,scmi-memlat.yaml | 242 ++++++++++++++++++
include/dt-bindings/soc/qcom,scmi-vendor.h | 22 ++
3 files changed, 279 insertions(+)
create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
create mode 100644 include/dt-bindings/soc/qcom,scmi-vendor.h
diff --git a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
index 4d823f3b1f0e..a4022682e5ca 100644
--- a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
+++ b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
@@ -284,6 +284,21 @@ properties:
required:
- reg
+ protocol@80:
+ type: object
+ allOf:
+ - $ref: '#/$defs/protocol-node'
+ - $ref: /schemas/soc/qcom/qcom,scmi-memlat.yaml#
+
+ unevaluatedProperties: false
+
+ properties:
+ reg:
+ const: 0x80
+
+ required:
+ - reg
+
additionalProperties: false
$defs:
diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
new file mode 100644
index 000000000000..915a6bf5697f
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
@@ -0,0 +1,242 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/qcom/qcom,scmi-memlat.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm SCMI Memory Bus nodes
+
+maintainers:
+ - Sibi Sankar <quic_sibis@quicinc.com>
+
+description:
+ This binding describes the various memory buses that can be monitored and scaled
+ by memory latency governor running on the CPU Control Processor (SCMI controller).
+
+properties:
+ reg:
+ maxItems: 1
+
+patternProperties:
+ '^memory-[0-9]$':
+ type: object
+ description:
+ The list of all memory buses that can be monitored and scaled by the
+ memory latency governor running on the SCMI controller.
+
+ unevaluatedProperties: false
+
+ properties:
+ qcom,memory-type:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1, 2]
+ description: |
+ Memory Bus Identifier
+ 0 = QCOM_MEM_TYPE_DDR
+ 1 = QCOM_MEM_TYPE_LLCC
+ 2 = QCOM_MEM_TYPE_DDR_QOS
+
+ freq-table-hz:
+ items:
+ items:
+ - description: Minimum frequency of the memory bus in Hz
+ - description: Maximum frequency of the memory bus in Hz
+
+ patternProperties:
+ '^monitor-[0-9]$':
+ type: object
+ unevaluatedProperties: false
+ description:
+ The list of all monitors detecting the memory latency bound workloads using
+ various counters.
+
+ properties:
+ qcom,compute-type:
+ description:
+ Monitors of type compute perform bus dvfs based on a rudimentary CPU
+ frequency to memory frequency map.
+ type: boolean
+
+ qcom,ipm-ceil:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description:
+ Monitors having this property perform bus dvfs based on the same
+ rudimentary table but the scaling is performed only if the calculated
+ IPM (Instruction Per Misses) exceeds the given ceiling.
+
+ cpus:
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ description:
+ Should be a list of phandles to CPU nodes (as described in
+ Documentation/devicetree/bindings/arm/cpus.yaml).
+
+ operating-points-v2: true
+ opp-table:
+ type: object
+
+ required:
+ - cpus
+ - operating-points-v2
+
+ oneOf:
+ - required: [ 'qcom,compute-type' ]
+ - required: [ 'qcom,ipm-ceil' ]
+
+ required:
+ - qcom,memory-type
+ - freq-table-hz
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/soc/qcom,scmi-vendor.h>
+
+ firmware {
+ scmi {
+ compatible = "arm,scmi";
+ mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
+ mbox-names = "tx", "rx";
+ shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ protocol@80 {
+ reg = <0x80>;
+
+ memory-0 {
+ qcom,memory-type = <QCOM_MEM_TYPE_DDR>;
+ freq-table-hz = /bits/ 64 <200000000 4224000000>;
+
+ monitor-0 {
+ qcom,ipm-ceil = <20000000>;
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6 &CPU7
+ &CPU8 &CPU9 &CPU10 &CPU11>;
+ operating-points-v2 = <&memory0_monitor0_opp_table>;
+
+ memory0_monitor0_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-999000000 {
+ opp-hz = /bits/ 64 <999000000 547000000>;
+ };
+
+ opp-1440000000 {
+ opp-hz = /bits/ 64 <1440000000 768000000>;
+ };
+
+ opp-1671000000 {
+ opp-hz = /bits/ 64 <1671000000 1555000000>;
+ };
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000 2092000000>;
+ };
+
+ opp-2516000000 {
+ opp-hz = /bits/ 64 <2516000000 3187000000>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000 4224000000>;
+ };
+ };
+ };
+
+ monitor-1 {
+ qcom,compute-type;
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6 &CPU7
+ &CPU8 &CPU9 &CPU10 &CPU11>;
+ operating-points-v2 = <&memory0_monitor1_opp_table>;
+
+ memory0_monitor1_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-1440000000 {
+ opp-hz = /bits/ 64 <1440000000 200000000>;
+ };
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000 768000000>;
+ };
+
+ opp-2516000000 {
+ opp-hz = /bits/ 64 <2516000000 1555000000>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000 4224000000>;
+ };
+ };
+ };
+ };
+
+ memory-1 {
+ qcom,memory-type = <QCOM_MEM_TYPE_LLCC>;
+ freq-table-hz = /bits/ 64 <300000000 1067000000>;
+
+ monitor-0 {
+ qcom,ipm-ceil = <20000000>;
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6 &CPU7
+ &CPU8 &CPU9 &CPU10 &CPU11>;
+ operating-points-v2 = <&memory1_monitor0_opp_table>;
+
+ memory1_monitor0_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-999000000 {
+ opp-hz = /bits/ 64 <999000000 300000000>;
+ };
+
+ opp-1440000000 {
+ opp-hz = /bits/ 64 <1440000000 466000000>;
+ };
+
+ opp-1671000000 {
+ opp-hz = /bits/ 64 <1671000000 600000000>;
+ };
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000 806000000>;
+ };
+
+ opp-2516000000 {
+ opp-hz = /bits/ 64 <2516000000 933000000>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000 1066000000>;
+ };
+ };
+ };
+ };
+
+ memory-2 {
+ qcom,memory-type = <QCOM_MEM_TYPE_DDR_QOS>;
+ freq-table-hz = /bits/ 64 <QCOM_DDR_LEVEL_AUTO QCOM_DDR_LEVEL_PERF>;
+
+ monitor-0 {
+ qcom,ipm-ceil = <20000000>;
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6 &CPU7
+ &CPU8 &CPU9 &CPU10 &CPU11>;
+ operating-points-v2 = <&memory2_monitor0_opp_table>;
+
+ memory2_monitor0_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000>;
+ opp-level = <QCOM_DDR_LEVEL_AUTO>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000>;
+ opp-level = <QCOM_DDR_LEVEL_PERF>;
+ };
+ };
+ };
+ };
+ };
+ };
+ };
diff --git a/include/dt-bindings/soc/qcom,scmi-vendor.h b/include/dt-bindings/soc/qcom,scmi-vendor.h
new file mode 100644
index 000000000000..7ae8d8d5623b
--- /dev/null
+++ b/include/dt-bindings/soc/qcom,scmi-vendor.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/*
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+#ifndef __DT_BINDINGS_QCOM_SCMI_VENDOR_H
+#define __DT_BINDINGS_QCOM_SCMI_VENDOR
+
+/* Memory IDs */
+#define QCOM_MEM_TYPE_DDR 0x0
+#define QCOM_MEM_TYPE_LLCC 0x1
+#define QCOM_MEM_TYPE_DDR_QOS 0x2
+
+/*
+ * QCOM_MEM_TYPE_DDR_QOS supports the following states.
+ *
+ * %QCOM_DDR_LEVEL_AUTO: DDR operates with LPM enabled
+ * %QCOM_DDR_LEVEL_PERF: DDR operates with LPM disabled
+ */
+#define QCOM_DDR_LEVEL_AUTO 0x0
+#define QCOM_DDR_LEVEL_PERF 0x1
+
+#endif /* __DT_BINDINGS_QCOM_SCMI_VENDOR_H */
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0
2024-07-02 19:14 [RFC V3 0/4] arm_scmi: vendors: ARM SCMI Qualcomm Vendor Protocol Sibi Sankar
2024-07-02 19:14 ` [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM " Sibi Sankar
@ 2024-07-02 19:14 ` Sibi Sankar
2024-07-09 10:10 ` Konrad Dybcio
2024-07-09 17:52 ` Cristian Marussi
2024-07-02 19:14 ` [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor Sibi Sankar
2024-07-02 19:14 ` [RFC V3 4/4] arm64: dts: qcom: x1e80100: Enable LLCC/DDR/DDR_QOS dvfs Sibi Sankar
3 siblings, 2 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-02 19:14 UTC (permalink / raw)
To: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, quic_sibis, conor+dt, Amir Vajid
The ARM SCMI QCOM vendor protocol provides a generic way of exposing a
number of Qualcomm SoC specific features (like memory bus scaling) through
a mixture of pre-determined algorithm strings and param_id pairs hosted on
the SCMI controller.
Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
Co-developed-by: Amir Vajid <avajid@quicinc.com>
Signed-off-by: Amir Vajid <avajid@quicinc.com>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
---
drivers/firmware/arm_scmi/vendors/Kconfig | 12 ++
drivers/firmware/arm_scmi/vendors/Makefile | 2 +-
.../arm_scmi/vendors/qcom_scmi_vendor.c | 184 ++++++++++++++++++
include/linux/qcom_scmi_vendor.h | 39 ++++
4 files changed, 236 insertions(+), 1 deletion(-)
create mode 100644 drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
create mode 100644 include/linux/qcom_scmi_vendor.h
diff --git a/drivers/firmware/arm_scmi/vendors/Kconfig b/drivers/firmware/arm_scmi/vendors/Kconfig
index 7c1ca7a12603..6bff4550fa25 100644
--- a/drivers/firmware/arm_scmi/vendors/Kconfig
+++ b/drivers/firmware/arm_scmi/vendors/Kconfig
@@ -1,4 +1,16 @@
# SPDX-License-Identifier: GPL-2.0-only
menu "ARM SCMI Vendor Protocols"
+config ARM_SCMI_PROTOCOL_VENDOR_QCOM
+ tristate "Qualcomm Technologies, Inc. Qcom SCMI vendor Protocol"
+ depends on ARM_SCMI_PROTOCOL || COMPILE_TEST
+ help
+ The SCMI QCOM vendor protocol provides a generic way of exposing a
+ number of Qualcomm SoC specific features (like memory bus scaling)
+ through a mixture of pre-determined algorithm strings and param_id
+ pairs hosted on the SCMI controller.
+
+ This driver defines/documents the message ID's used for this
+ communication and also exposes the ops used by the clients.
+
endmenu
diff --git a/drivers/firmware/arm_scmi/vendors/Makefile b/drivers/firmware/arm_scmi/vendors/Makefile
index c6c214158dd8..c1d6a355f579 100644
--- a/drivers/firmware/arm_scmi/vendors/Makefile
+++ b/drivers/firmware/arm_scmi/vendors/Makefile
@@ -1,2 +1,2 @@
# SPDX-License-Identifier: GPL-2.0-only
-# obj-$(CONFIG_ARM_SCMI_PROTOCOL_<your_vendor_proto>) += <your_vendor_proto>.o
+obj-$(CONFIG_ARM_SCMI_PROTOCOL_VENDOR_QCOM) += qcom_scmi_vendor.o
diff --git a/drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c b/drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
new file mode 100644
index 000000000000..e02163381d4b
--- /dev/null
+++ b/drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/qcom_scmi_vendor.h>
+
+#include "../common.h"
+
+/**
+ * qcom_scmi_vendor_protocol_cmd - vendor specific commands supported by Qualcomm SCMI
+ * vendor protocol.
+ *
+ * This protocol is intended as a generic way of exposing a number of Qualcomm SoC
+ * specific features through a mixture of pre-determined algorithm string and param_id
+ * pairs hosted on the SCMI controller.
+ *
+ * The QCOM SCMI Vendor Protocol has the protocol id as 0x80 and vendor id set to
+ * Qualcomm and the implementation version set to 0x20000. The PROTOCOL_VERSION command
+ * returns version 1.0.
+ *
+ * @QCOM_SCMI_SET_PARAM: message_id: 0x10 is used to set the parameter of a specific algo_str
+ * hosted on QCOM SCMI Vendor Protocol. The tx len depends on the
+ * algo_str used.
+ * @QCOM_SCMI_GET_PARAM: message_id: 0x11 is used to get parameter information of a specific
+ * algo_str hosted on QCOM SCMI Vendor Protocol. The tx and rx len
+ * depends on the algo_str used.
+ * @QCOM_SCMI_START_ACTIVITY: message_id: 0x12 is used to start the activity performed by
+ * the algo_str.
+ * @QCOM_SCMI_STOP_ACTIVITY: message_id: 0x13 is used to stop a pre-existing activity
+ * performed by the algo_str.
+ */
+enum qcom_scmi_vendor_protocol_cmd {
+ QCOM_SCMI_SET_PARAM = 0x10,
+ QCOM_SCMI_GET_PARAM = 0x11,
+ QCOM_SCMI_START_ACTIVITY = 0x12,
+ QCOM_SCMI_STOP_ACTIVITY = 0x13,
+};
+
+/**
+ * struct qcom_scmi_msg - represents the various parameters to be populated
+ * for using the QCOM SCMI Vendor Protocol
+ *
+ * @ext_id: reserved, must be zero
+ * @algo_low: lower 32 bits of the algo_str
+ * @algo_high: upper 32 bits of the algo_str
+ * @param_id: serves as token message id to the specific algo_str
+ * @buf: serves as the payload to the specified param_id and algo_str pair
+ */
+struct qcom_scmi_msg {
+ __le32 ext_id;
+ __le32 algo_low;
+ __le32 algo_high;
+ __le32 param_id;
+ __le32 buf[];
+};
+
+static int qcom_scmi_set_param(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t size)
+{
+ struct scmi_xfer *t;
+ struct qcom_scmi_msg *msg;
+ int ret;
+
+ ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_SET_PARAM, size + sizeof(*msg), 0, &t);
+ if (ret)
+ return ret;
+
+ msg = t->tx.buf;
+ msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
+ msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
+ msg->param_id = cpu_to_le32(param_id);
+
+ memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
+
+ ret = ph->xops->do_xfer(ph, t);
+ ph->xops->xfer_put(ph, t);
+
+ return ret;
+}
+
+static int qcom_scmi_get_param(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t tx_size, size_t rx_size)
+{
+ struct scmi_xfer *t;
+ struct qcom_scmi_msg *msg;
+ int ret;
+
+ ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_GET_PARAM, tx_size + sizeof(*msg), rx_size, &t);
+ if (ret)
+ return ret;
+
+ msg = t->tx.buf;
+ msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
+ msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
+ msg->param_id = cpu_to_le32(param_id);
+ memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
+
+ ret = ph->xops->do_xfer(ph, t);
+ memcpy(buf, t->rx.buf, t->rx.len);
+ ph->xops->xfer_put(ph, t);
+
+ return ret;
+}
+
+static int qcom_scmi_start_activity(const struct scmi_protocol_handle *ph,
+ void *buf, u64 algo_str, u32 param_id, size_t size)
+{
+ struct scmi_xfer *t;
+ struct qcom_scmi_msg *msg;
+ int ret;
+
+ ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_START_ACTIVITY, size + sizeof(*msg), 0, &t);
+ if (ret)
+ return ret;
+
+ msg = t->tx.buf;
+ msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
+ msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
+ msg->param_id = cpu_to_le32(param_id);
+
+ memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
+
+ ret = ph->xops->do_xfer(ph, t);
+ ph->xops->xfer_put(ph, t);
+
+ return ret;
+}
+
+static int qcom_scmi_stop_activity(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t size)
+{
+ struct scmi_xfer *t;
+ struct qcom_scmi_msg *msg;
+ int ret;
+
+ ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_STOP_ACTIVITY, size + sizeof(*msg), 0, &t);
+ if (ret)
+ return ret;
+
+ msg = t->tx.buf;
+ msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
+ msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
+ msg->param_id = cpu_to_le32(param_id);
+
+ memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
+
+ ret = ph->xops->do_xfer(ph, t);
+ ph->xops->xfer_put(ph, t);
+
+ return ret;
+}
+
+static struct qcom_scmi_vendor_ops qcom_proto_ops = {
+ .set_param = qcom_scmi_set_param,
+ .get_param = qcom_scmi_get_param,
+ .start_activity = qcom_scmi_start_activity,
+ .stop_activity = qcom_scmi_stop_activity,
+};
+
+static int qcom_scmi_vendor_protocol_init(const struct scmi_protocol_handle *ph)
+{
+ u32 version;
+
+ ph->xops->version_get(ph, &version);
+
+ dev_dbg(ph->dev, "SCMI QCOM Vendor Version %d.%d\n",
+ PROTOCOL_REV_MAJOR(version), PROTOCOL_REV_MINOR(version));
+
+ return 0;
+}
+
+static const struct scmi_protocol qcom_scmi_vendor = {
+ .id = QCOM_SCMI_VENDOR_PROTOCOL,
+ .owner = THIS_MODULE,
+ .instance_init = &qcom_scmi_vendor_protocol_init,
+ .ops = &qcom_proto_ops,
+ .vendor_id = "Qualcomm",
+ .impl_ver = 0x20000,
+};
+module_scmi_protocol(qcom_scmi_vendor);
+
+MODULE_DESCRIPTION("QTI SCMI vendor protocol");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/qcom_scmi_vendor.h b/include/linux/qcom_scmi_vendor.h
new file mode 100644
index 000000000000..60f85fedee80
--- /dev/null
+++ b/include/linux/qcom_scmi_vendor.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * QTI SCMI vendor protocol's header
+ *
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _QCOM_SCMI_VENDOR_H
+#define _QCOM_SCMI_VENDOR_H
+
+#include <linux/bitfield.h>
+#include <linux/device.h>
+#include <linux/types.h>
+
+#define QCOM_SCMI_VENDOR_PROTOCOL 0x80
+
+struct scmi_protocol_handle;
+
+/**
+ * struct qcom_scmi_vendor_ops - represents the various operations provided
+ * by QCOM SCMI Vendor Protocol
+ *
+ * @set_param: set parameter specified by param_id and algo_str pair.
+ * @get_param: retrieve parameter specified by param_id and algo_str pair.
+ * @start_activity: initiate a specific activity defined by algo_str.
+ * @stop_activity: halt previously initiated activity defined by algo_str.
+ */
+struct qcom_scmi_vendor_ops {
+ int (*set_param)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t size);
+ int (*get_param)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t tx_size, size_t rx_size);
+ int (*start_activity)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t size);
+ int (*stop_activity)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
+ u32 param_id, size_t size);
+};
+
+#endif /* _QCOM_SCMI_VENDOR_H */
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-02 19:14 [RFC V3 0/4] arm_scmi: vendors: ARM SCMI Qualcomm Vendor Protocol Sibi Sankar
2024-07-02 19:14 ` [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM " Sibi Sankar
2024-07-02 19:14 ` [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0 Sibi Sankar
@ 2024-07-02 19:14 ` Sibi Sankar
2024-07-03 8:44 ` Shivnandan Kumar
` (2 more replies)
2024-07-02 19:14 ` [RFC V3 4/4] arm64: dts: qcom: x1e80100: Enable LLCC/DDR/DDR_QOS dvfs Sibi Sankar
3 siblings, 3 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-02 19:14 UTC (permalink / raw)
To: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, quic_sibis, conor+dt, Amir Vajid
Introduce a client driver that uses the memlat algorithm string hosted
on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
Co-developed-by: Amir Vajid <avajid@quicinc.com>
Signed-off-by: Amir Vajid <avajid@quicinc.com>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
---
V2:
* Make driver changes to the accommodate bindings changes. [Rob]
* Replace explicit of_node_put with _free.
drivers/soc/qcom/Kconfig | 12 +
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/qcom_scmi_client.c | 583 ++++++++++++++++++++++++++++
3 files changed, 596 insertions(+)
create mode 100644 drivers/soc/qcom/qcom_scmi_client.c
diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 432c85bd8ad4..b253504bd386 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -294,4 +294,16 @@ config QCOM_PBS
This module provides the APIs to the client drivers that wants to send the
PBS trigger event to the PBS RAM.
+config QCOM_SCMI_CLIENT
+ tristate "Qualcomm Technologies Inc. SCMI client driver"
+ depends on ARM_SCMI_PROTOCOL_VENDOR_QCOM || COMPILE_TEST
+ default n
+ help
+ This driver uses the memlat algorithm string hosted on QCOM SCMI
+ Vendor Protocol to detect memory latency workloads and control
+ frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
+
+ This driver defines/documents the parameter IDs used while configuring
+ the memory buses.
+
endmenu
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index d3560f861085..8a2e832d1d5d 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_QCOM_APR) += apr.o
obj-$(CONFIG_QCOM_LLCC) += llcc-qcom.o
obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) += kryo-l2-accessors.o
obj-$(CONFIG_QCOM_ICC_BWMON) += icc-bwmon.o
+obj-$(CONFIG_QCOM_SCMI_CLIENT) += qcom_scmi_client.o
qcom_ice-objs += ice.o
obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
obj-$(CONFIG_QCOM_PBS) += qcom-pbs.o
diff --git a/drivers/soc/qcom/qcom_scmi_client.c b/drivers/soc/qcom/qcom_scmi_client.c
new file mode 100644
index 000000000000..8369b415c0ab
--- /dev/null
+++ b/drivers/soc/qcom/qcom_scmi_client.c
@@ -0,0 +1,583 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/cpu.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/qcom_scmi_vendor.h>
+#include <linux/scmi_protocol.h>
+#include <linux/units.h>
+#include <dt-bindings/soc/qcom,scmi-vendor.h>
+
+#define MEMLAT_ALGO_STR 0x4D454D4C4154 /* MEMLAT */
+#define INVALID_IDX 0xff
+#define MAX_MEMORY_TYPES 3
+#define MAX_MONITOR_CNT 4
+#define MAX_NAME_LEN 20
+#define MAX_MAP_ENTRIES 7
+#define CPUCP_DEFAULT_SAMPLING_PERIOD_MS 4
+#define CPUCP_DEFAULT_FREQ_METHOD 1
+
+/**
+ * scmi_memlat_protocol_cmd - parameter_ids supported by the "MEMLAT" algo_str hosted
+ * by the Qualcomm SCMI Vendor Protocol on the SCMI controller.
+ *
+ * MEMLAT (Memory Latency) monitors the counters to detect memory latency bound workloads
+ * and scales the frequency/levels of the memory buses accordingly.
+ *
+ * @MEMLAT_SET_MEM_GROUP: initializes the frequency/level scaling functions for the memory bus.
+ * @MEMLAT_SET_MONITOR: configures the monitor to work on a specific memory bus.
+ * @MEMLAT_SET_COMMON_EV_MAP: set up common counters used to monitor the cpu frequency.
+ * @MEMLAT_SET_GRP_EV_MAP: set up any specific counters used to monitor the memory bus.
+ * @MEMLAT_IPM_CEIL: set the IPM (Instruction Per Misses) ceiling per monitor.
+ * @MEMLAT_SAMPLE_MS: set the sampling period for all the monitors.
+ * MEMLAT_MON_FREQ_MAP: setup the cpufreq to memfreq map.
+ * MEMLAT_SET_MIN_FREQ: set the max frequency of the memory bus.
+ * MEMLAT_SET_MAX_FREQ: set the min frequency of the memory bus.
+ * MEMLAT_START_TIMER: start all the monitors with the requested sampling period.
+ * MEMLAT_START_TIMER: stop all the running monitors.
+ * MEMLAT_SET_EFFECTIVE_FREQ_METHOD: set the method used to determine cpu frequency.
+ */
+enum scmi_memlat_protocol_cmd {
+ MEMLAT_SET_MEM_GROUP = 16,
+ MEMLAT_SET_MONITOR,
+ MEMLAT_SET_COMMON_EV_MAP,
+ MEMLAT_SET_GRP_EV_MAP,
+ MEMLAT_IPM_CEIL = 23,
+ MEMLAT_SAMPLE_MS = 31,
+ MEMLAT_MON_FREQ_MAP,
+ MEMLAT_SET_MIN_FREQ,
+ MEMLAT_SET_MAX_FREQ,
+ MEMLAT_START_TIMER = 36,
+ MEMLAT_STOP_TIMER,
+ MEMLAT_SET_EFFECTIVE_FREQ_METHOD = 39,
+};
+
+struct map_table {
+ u16 v1;
+ u16 v2;
+};
+
+struct map_param_msg {
+ u32 hw_type;
+ u32 mon_idx;
+ u32 nr_rows;
+ struct map_table tbl[MAX_MAP_ENTRIES];
+} __packed;
+
+struct node_msg {
+ u32 cpumask;
+ u32 hw_type;
+ u32 mon_type;
+ u32 mon_idx;
+ char mon_name[MAX_NAME_LEN];
+};
+
+struct scalar_param_msg {
+ u32 hw_type;
+ u32 mon_idx;
+ u32 val;
+};
+
+enum common_ev_idx {
+ INST_IDX,
+ CYC_IDX,
+ CONST_CYC_IDX,
+ FE_STALL_IDX,
+ BE_STALL_IDX,
+ NUM_COMMON_EVS
+};
+
+enum grp_ev_idx {
+ MISS_IDX,
+ WB_IDX,
+ ACC_IDX,
+ NUM_GRP_EVS
+};
+
+#define EV_CPU_CYCLES 0
+#define EV_INST_RETIRED 2
+#define EV_L2_D_RFILL 5
+
+struct ev_map_msg {
+ u32 num_evs;
+ u32 hw_type;
+ u32 cid[NUM_COMMON_EVS];
+};
+
+struct cpufreq_memfreq_map {
+ unsigned int cpufreq_mhz;
+ unsigned int memfreq_khz;
+};
+
+struct scmi_monitor_info {
+ struct cpufreq_memfreq_map *freq_map;
+ char mon_name[MAX_NAME_LEN];
+ u32 mon_idx;
+ u32 mon_type;
+ u32 ipm_ceil;
+ u32 mask;
+ u32 freq_map_len;
+};
+
+struct scmi_memory_info {
+ struct scmi_monitor_info *monitor[MAX_MONITOR_CNT];
+ u32 hw_type;
+ int monitor_cnt;
+ u32 min_freq;
+ u32 max_freq;
+};
+
+struct scmi_memlat_info {
+ struct scmi_protocol_handle *ph;
+ const struct qcom_scmi_vendor_ops *ops;
+ struct scmi_memory_info *memory[MAX_MEMORY_TYPES];
+ u32 cluster_info[NR_CPUS];
+ int memory_cnt;
+};
+
+static int populate_cluster_info(u32 *cluster_info)
+{
+ char name[MAX_NAME_LEN];
+ int i = 0;
+
+ struct device_node *cn __free(device_node) = of_find_node_by_path("/cpus");
+ if (!cn)
+ return -ENODEV;
+
+ struct device_node *map __free(device_node) = of_get_child_by_name(cn, "cpu-map");
+ if (!map)
+ return -ENODEV;
+
+ do {
+ snprintf(name, sizeof(name), "cluster%d", i);
+ struct device_node *c __free(device_node) = of_get_child_by_name(map, name);
+ if (!c)
+ break;
+
+ *(cluster_info + i) = of_get_child_count(c);
+ i++;
+ } while (1);
+
+ return 0;
+}
+
+static int populate_physical_mask(struct device_node *np, u32 *mask, u32 *cluster_info)
+{
+ struct device_node *dev_phandle;
+ int cpu, i = 0, physical_id;
+
+ do {
+ dev_phandle = of_parse_phandle(np, "cpus", i++);
+ cpu = of_cpu_node_to_id(dev_phandle);
+ if (cpu != -ENODEV) {
+ physical_id = topology_core_id(cpu);
+ for (int j = 0; j < topology_cluster_id(cpu); j++)
+ physical_id += *(cluster_info + j);
+ *mask |= BIT(physical_id);
+ }
+ } while (dev_phandle);
+
+ return 0;
+}
+
+static struct cpufreq_memfreq_map *init_cpufreq_memfreq_map(struct device *dev,
+ struct scmi_memory_info *memory,
+ struct device_node *of_node,
+ u32 *cnt)
+{
+ struct device_node *tbl_np, *opp_np;
+ struct cpufreq_memfreq_map *tbl;
+ int ret, i = 0;
+ u32 level, len;
+ u64 rate;
+
+ tbl_np = of_parse_phandle(of_node, "operating-points-v2", 0);
+ if (!tbl_np)
+ return ERR_PTR(-ENODEV);
+
+ len = min(of_get_available_child_count(tbl_np), MAX_MAP_ENTRIES);
+ if (len == 0)
+ return ERR_PTR(-ENODEV);
+
+ tbl = devm_kzalloc(dev, (len + 1) * sizeof(struct cpufreq_memfreq_map),
+ GFP_KERNEL);
+ if (!tbl)
+ return ERR_PTR(-ENOMEM);
+
+ for_each_available_child_of_node(tbl_np, opp_np) {
+ ret = of_property_read_u64_index(opp_np, "opp-hz", 0, &rate);
+ if (ret < 0)
+ return ERR_PTR(ret);
+
+ tbl[i].cpufreq_mhz = rate / HZ_PER_MHZ;
+
+ if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
+ ret = of_property_read_u64_index(opp_np, "opp-hz", 1, &rate);
+ if (ret < 0)
+ return ERR_PTR(ret);
+
+ tbl[i].memfreq_khz = rate / HZ_PER_KHZ;
+ } else {
+ ret = of_property_read_u32(opp_np, "opp-level", &level);
+ if (ret < 0)
+ return ERR_PTR(ret);
+
+ tbl[i].memfreq_khz = level;
+ }
+
+ dev_dbg(dev, "Entry%d CPU:%u, Mem:%u\n", i, tbl[i].cpufreq_mhz, tbl[i].memfreq_khz);
+ i++;
+ }
+ *cnt = len;
+ tbl[i].cpufreq_mhz = 0;
+
+ return tbl;
+}
+
+static int process_scmi_memlat_of_node(struct scmi_device *sdev, struct scmi_memlat_info *info)
+{
+ struct scmi_monitor_info *monitor;
+ struct scmi_memory_info *memory;
+ char name[MAX_NAME_LEN];
+ u64 memfreq[2];
+ int ret;
+
+ ret = populate_cluster_info(info->cluster_info);
+ if (ret < 0) {
+ dev_err_probe(&sdev->dev, ret, "failed to populate cluster info\n");
+ goto err;
+ }
+
+ of_node_get(sdev->handle->dev->of_node);
+ do {
+ snprintf(name, sizeof(name), "memory-%d", info->memory_cnt);
+ struct device_node *memory_np __free(device_node) =
+ of_find_node_by_name(sdev->handle->dev->of_node, name);
+
+ if (!memory_np)
+ break;
+
+ memory = devm_kzalloc(&sdev->dev, sizeof(*memory), GFP_KERNEL);
+ if (!memory) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ ret = of_property_read_u32(memory_np, "qcom,memory-type", &memory->hw_type);
+ if (ret) {
+ dev_err_probe(&sdev->dev, ret, "failed to read memory type\n");
+ goto err;
+ }
+
+ ret = of_property_read_u64_array(memory_np, "freq-table-hz", memfreq, 2);
+ if (ret && (ret != -EINVAL)) {
+ dev_err_probe(&sdev->dev, ret, "failed to read min/max freq\n");
+ goto err;
+ }
+
+ if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
+ memory->min_freq = memfreq[0] / HZ_PER_KHZ;
+ memory->max_freq = memfreq[1] / HZ_PER_KHZ;
+ } else {
+ memory->min_freq = memfreq[0];
+ memory->max_freq = memfreq[1];
+ }
+ info->memory[info->memory_cnt++] = memory;
+
+ do {
+ snprintf(name, sizeof(name), "monitor-%d", memory->monitor_cnt);
+ struct device_node *monitor_np __free(device_node) =
+ of_get_child_by_name(memory_np, name);
+
+ if (!monitor_np)
+ break;
+
+ monitor = devm_kzalloc(&sdev->dev, sizeof(*monitor), GFP_KERNEL);
+ if (!monitor) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ monitor->mon_type = of_property_read_bool(monitor_np, "qcom,compute-type");
+ if (!monitor->mon_type) {
+ ret = of_property_read_u32(monitor_np, "qcom,ipm-ceil",
+ &monitor->ipm_ceil);
+ if (ret) {
+ dev_err_probe(&sdev->dev, ret,
+ "failed to read IPM ceiling\n");
+ goto err;
+ }
+ }
+
+ /*
+ * Variants of the SoC having reduced number of cpus operate
+ * with the same number of logical cpus but the physical
+ * cpu disabled will differ between parts. Calculate the
+ * physical cpu number using cluster information instead.
+ */
+ ret = populate_physical_mask(monitor_np, &monitor->mask,
+ info->cluster_info);
+ if (ret < 0) {
+ dev_err_probe(&sdev->dev, ret, "failed to populate cpu mask\n");
+ goto err;
+ }
+
+ monitor->freq_map = init_cpufreq_memfreq_map(&sdev->dev, memory, monitor_np,
+ &monitor->freq_map_len);
+ if (IS_ERR(monitor->freq_map)) {
+ dev_err_probe(&sdev->dev, PTR_ERR(monitor->freq_map),
+ "failed to populate cpufreq-memfreq map\n");
+ goto err;
+ }
+
+ strscpy(monitor->mon_name, name, sizeof(monitor->mon_name));
+ monitor->mon_idx = memory->monitor_cnt;
+
+ memory->monitor[memory->monitor_cnt++] = monitor;
+ } while (1);
+
+ if (!memory->monitor_cnt) {
+ ret = -EINVAL;
+ dev_err_probe(&sdev->dev, ret, "failed to find monitor nodes\n");
+ goto err;
+ }
+ } while (1);
+
+ if (!info->memory_cnt) {
+ ret = -EINVAL;
+ dev_err_probe(&sdev->dev, ret, "failed to find memory nodes\n");
+ }
+
+err:
+ of_node_put(sdev->handle->dev->of_node);
+
+ return ret;
+}
+
+static int configure_cpucp_common_events(struct scmi_memlat_info *info)
+{
+ const struct qcom_scmi_vendor_ops *ops = info->ops;
+ u8 ev_map[NUM_COMMON_EVS];
+ struct ev_map_msg msg;
+ int ret;
+
+ memset(ev_map, 0xFF, NUM_COMMON_EVS);
+
+ msg.num_evs = NUM_COMMON_EVS;
+ msg.hw_type = INVALID_IDX;
+ msg.cid[INST_IDX] = EV_INST_RETIRED;
+ msg.cid[CYC_IDX] = EV_CPU_CYCLES;
+ msg.cid[CONST_CYC_IDX] = INVALID_IDX;
+ msg.cid[FE_STALL_IDX] = INVALID_IDX;
+ msg.cid[BE_STALL_IDX] = INVALID_IDX;
+
+ ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_COMMON_EV_MAP,
+ sizeof(msg));
+ return ret;
+}
+
+static int configure_cpucp_grp(struct device *dev, struct scmi_memlat_info *info, int memory_index)
+{
+ const struct qcom_scmi_vendor_ops *ops = info->ops;
+ struct scmi_memory_info *memory = info->memory[memory_index];
+ struct ev_map_msg ev_msg;
+ u8 ev_map[NUM_GRP_EVS];
+ struct node_msg msg;
+ int ret;
+
+ msg.cpumask = 0;
+ msg.hw_type = memory->hw_type;
+ msg.mon_type = 0;
+ msg.mon_idx = 0;
+ ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_MEM_GROUP, sizeof(msg));
+ if (ret < 0) {
+ dev_err_probe(dev, ret, "failed to configure mem type %d\n", memory->hw_type);
+ return ret;
+ }
+
+ memset(ev_map, 0xFF, NUM_GRP_EVS);
+ ev_msg.num_evs = NUM_GRP_EVS;
+ ev_msg.hw_type = memory->hw_type;
+ ev_msg.cid[MISS_IDX] = EV_L2_D_RFILL;
+ ev_msg.cid[WB_IDX] = INVALID_IDX;
+ ev_msg.cid[ACC_IDX] = INVALID_IDX;
+ ret = ops->set_param(info->ph, &ev_msg, MEMLAT_ALGO_STR, MEMLAT_SET_GRP_EV_MAP,
+ sizeof(ev_msg));
+ if (ret < 0) {
+ dev_err_probe(dev, ret,
+ "failed to configure event map for mem type %d\n", memory->hw_type);
+ return ret;
+ }
+
+ return ret;
+}
+
+static int configure_cpucp_mon(struct device *dev, struct scmi_memlat_info *info,
+ int memory_index, int monitor_index)
+{
+ const struct qcom_scmi_vendor_ops *ops = info->ops;
+ struct scmi_memory_info *memory = info->memory[memory_index];
+ struct scmi_monitor_info *monitor = memory->monitor[monitor_index];
+ struct scalar_param_msg scalar_msg;
+ struct map_param_msg map_msg;
+ struct node_msg msg;
+ int ret;
+ int i;
+
+ msg.cpumask = monitor->mask;
+ msg.hw_type = memory->hw_type;
+ msg.mon_type = monitor->mon_type;
+ msg.mon_idx = monitor->mon_idx;
+ strscpy(msg.mon_name, monitor->mon_name, sizeof(msg.mon_name));
+ ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_MONITOR, sizeof(msg));
+ if (ret < 0) {
+ dev_err_probe(dev, ret, "failed to configure monitor %s\n", monitor->mon_name);
+ return ret;
+ }
+
+ scalar_msg.hw_type = memory->hw_type;
+ scalar_msg.mon_idx = monitor->mon_idx;
+ scalar_msg.val = monitor->ipm_ceil;
+ ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_IPM_CEIL,
+ sizeof(scalar_msg));
+ if (ret < 0) {
+ dev_err_probe(dev, ret, "failed to set ipm ceil for %s\n", monitor->mon_name);
+ return ret;
+ }
+
+ map_msg.hw_type = memory->hw_type;
+ map_msg.mon_idx = monitor->mon_idx;
+ map_msg.nr_rows = monitor->freq_map_len;
+ for (i = 0; i < monitor->freq_map_len; i++) {
+ map_msg.tbl[i].v1 = monitor->freq_map[i].cpufreq_mhz;
+ map_msg.tbl[i].v2 = monitor->freq_map[i].memfreq_khz;
+ }
+ ret = ops->set_param(info->ph, &map_msg, MEMLAT_ALGO_STR, MEMLAT_MON_FREQ_MAP,
+ sizeof(map_msg));
+ if (ret < 0) {
+ dev_err_probe(dev, ret, "failed to configure freq_map for %s\n", monitor->mon_name);
+ return ret;
+ }
+
+ scalar_msg.hw_type = memory->hw_type;
+ scalar_msg.mon_idx = monitor->mon_idx;
+ scalar_msg.val = memory->min_freq;
+ ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_SET_MIN_FREQ,
+ sizeof(scalar_msg));
+ if (ret < 0) {
+ dev_err_probe(dev, ret, "failed to set min_freq for %s\n", monitor->mon_name);
+ return ret;
+ }
+
+ scalar_msg.hw_type = memory->hw_type;
+ scalar_msg.mon_idx = monitor->mon_idx;
+ scalar_msg.val = memory->max_freq;
+ ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_SET_MAX_FREQ,
+ sizeof(scalar_msg));
+ if (ret < 0)
+ dev_err_probe(dev, ret, "failed to set max_freq for %s\n", monitor->mon_name);
+
+ return ret;
+}
+
+static int cpucp_memlat_init(struct scmi_device *sdev)
+{
+ const struct scmi_handle *handle = sdev->handle;
+ const struct qcom_scmi_vendor_ops *ops;
+ struct scmi_protocol_handle *ph;
+ struct scmi_memlat_info *info;
+ u32 cpucp_freq_method = CPUCP_DEFAULT_FREQ_METHOD;
+ u32 cpucp_sample_ms = CPUCP_DEFAULT_SAMPLING_PERIOD_MS;
+ int ret, i, j;
+
+ if (!handle)
+ return -ENODEV;
+
+ ops = handle->devm_protocol_get(sdev, QCOM_SCMI_VENDOR_PROTOCOL, &ph);
+ if (IS_ERR(ops))
+ return PTR_ERR(ops);
+
+ info = devm_kzalloc(&sdev->dev, sizeof(*info), GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ ret = process_scmi_memlat_of_node(sdev, info);
+ if (ret)
+ return ret;
+
+ info->ph = ph;
+ info->ops = ops;
+
+ /* Configure common events ids */
+ ret = configure_cpucp_common_events(info);
+ if (ret < 0) {
+ dev_err_probe(&sdev->dev, ret, "failed to configure common events\n");
+ return ret;
+ }
+
+ for (i = 0; i < info->memory_cnt; i++) {
+ /* Configure per group parameters */
+ ret = configure_cpucp_grp(&sdev->dev, info, i);
+ if (ret < 0)
+ return ret;
+
+ for (j = 0; j < info->memory[i]->monitor_cnt; j++) {
+ /* Configure per monitor parameters */
+ ret = configure_cpucp_mon(&sdev->dev, info, i, j);
+ if (ret < 0)
+ return ret;
+ }
+ }
+
+ /* Set loop sampling time */
+ ret = ops->set_param(ph, &cpucp_sample_ms, MEMLAT_ALGO_STR, MEMLAT_SAMPLE_MS,
+ sizeof(cpucp_sample_ms));
+ if (ret < 0) {
+ dev_err_probe(&sdev->dev, ret, "failed to set sample_ms\n");
+ return ret;
+ }
+
+ /* Set the effective cpu frequency calculation method */
+ ret = ops->set_param(ph, &cpucp_freq_method, MEMLAT_ALGO_STR,
+ MEMLAT_SET_EFFECTIVE_FREQ_METHOD, sizeof(cpucp_freq_method));
+ if (ret < 0) {
+ dev_err_probe(&sdev->dev, ret, "failed to set effective frequency calc method\n");
+ return ret;
+ }
+
+ /* Start sampling and voting timer */
+ ret = ops->start_activity(ph, NULL, MEMLAT_ALGO_STR, MEMLAT_START_TIMER, 0);
+ if (ret < 0)
+ dev_err_probe(&sdev->dev, ret, "failed to start memory group timer\n");
+
+ return ret;
+}
+
+static int scmi_client_probe(struct scmi_device *sdev)
+{
+ return cpucp_memlat_init(sdev);
+}
+
+static const struct scmi_device_id scmi_id_table[] = {
+ { .protocol_id = QCOM_SCMI_VENDOR_PROTOCOL, .name = "qcom_scmi_vendor_protocol" },
+ { },
+};
+MODULE_DEVICE_TABLE(scmi, scmi_id_table);
+
+static struct scmi_driver qcom_scmi_client_drv = {
+ .name = "qcom-scmi-driver",
+ .probe = scmi_client_probe,
+ .id_table = scmi_id_table,
+};
+module_scmi_driver(qcom_scmi_client_drv);
+
+MODULE_DESCRIPTION("QTI SCMI client driver");
+MODULE_LICENSE("GPL");
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [RFC V3 4/4] arm64: dts: qcom: x1e80100: Enable LLCC/DDR/DDR_QOS dvfs
2024-07-02 19:14 [RFC V3 0/4] arm_scmi: vendors: ARM SCMI Qualcomm Vendor Protocol Sibi Sankar
` (2 preceding siblings ...)
2024-07-02 19:14 ` [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor Sibi Sankar
@ 2024-07-02 19:14 ` Sibi Sankar
3 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-02 19:14 UTC (permalink / raw)
To: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, quic_sibis, conor+dt
Enable LLCC/DDR/DDR_QOS dvfs through the ARM SCMI QCOM vendor protocol.
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
---
V2:
* Drop container dvfs memlat container node. [Rob]
* Replace qcom,cpulist with the standard "cpus" property. [Rob]
* Minor style fixes in dts.
arch/arm64/boot/dts/qcom/x1e80100.dtsi | 138 +++++++++++++++++++++++++
1 file changed, 138 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
index 985b9a33285e..b22e74f64481 100644
--- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
+++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
@@ -17,6 +17,7 @@
#include <dt-bindings/power/qcom-rpmpd.h>
#include <dt-bindings/soc/qcom,gpr.h>
#include <dt-bindings/soc/qcom,rpmh-rsc.h>
+#include <dt-bindings/soc/qcom,scmi-vendor.h>
#include <dt-bindings/sound/qcom,q6dsp-lpass-ports.h>
/ {
@@ -323,6 +324,143 @@ scmi_dvfs: protocol@13 {
reg = <0x13>;
#power-domain-cells = <1>;
};
+
+ scmi_vendor: protocol@80 {
+ reg = <0x80>;
+
+ memory-0 {
+ qcom,memory-type = <QCOM_MEM_TYPE_DDR>;
+ freq-table-hz = /bits/ 64 <200000000 4224000000>;
+
+ monitor-0 {
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6
+ &CPU7 &CPU8 &CPU9 &CPU10 &CPU11>;
+ qcom,ipm-ceil = <20000000>;
+ operating-points-v2 = <&memory0_monitor0_opp_table>;
+
+ memory0_monitor0_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-999000000 {
+ opp-hz = /bits/ 64 <999000000 547000000>;
+ };
+
+ opp-1440000000 {
+ opp-hz = /bits/ 64 <1440000000 768000000>;
+ };
+
+ opp-1671000000 {
+ opp-hz = /bits/ 64 <1671000000 1555000000>;
+ };
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000 2092000000>;
+ };
+
+ opp-2516000000 {
+ opp-hz = /bits/ 64 <2516000000 3187000000>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000 4224000000>;
+ };
+ };
+ };
+
+ monitor-1 {
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6
+ &CPU7 &CPU8 &CPU9 &CPU10 &CPU11>;
+ operating-points-v2 = <&memory0_monitor1_opp_table>;
+ qcom,compute-type;
+
+ memory0_monitor1_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-1440000000 {
+ opp-hz = /bits/ 64 <1440000000 200000000>;
+ };
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000 768000000>;
+ };
+
+ opp-2516000000 {
+ opp-hz = /bits/ 64 <2516000000 1555000000>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000 4224000000>;
+ };
+ };
+ };
+ };
+
+ memory-1 {
+ qcom,memory-type = <QCOM_MEM_TYPE_LLCC>;
+ freq-table-hz = /bits/ 64 <300000000 1067000000>;
+
+ monitor-0 {
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6
+ &CPU7 &CPU8 &CPU9 &CPU10 &CPU11>;
+ qcom,ipm-ceil = <20000000>;
+ operating-points-v2 = <&memory1_monitor0_opp_table>;
+
+ memory1_monitor0_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-999000000 {
+ opp-hz = /bits/ 64 <999000000 300000000>;
+ };
+
+ opp-1440000000 {
+ opp-hz = /bits/ 64 <1440000000 466000000>;
+ };
+
+ opp-1671000000 {
+ opp-hz = /bits/ 64 <1671000000 600000000>;
+ };
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000 806000000>;
+ };
+
+ opp-2516000000 {
+ opp-hz = /bits/ 64 <2516000000 933000000>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000 1066000000>;
+ };
+ };
+ };
+ };
+
+ memory-2 {
+ qcom,memory-type = <QCOM_MEM_TYPE_DDR_QOS>;
+ freq-table-hz = /bits/ 64 <QCOM_DDR_LEVEL_AUTO QCOM_DDR_LEVEL_PERF>;
+
+ monitor-0 {
+ qcom,ipm-ceil = <20000000>;
+ cpus = <&CPU0 &CPU1 &CPU2 &CPU3 &CPU4 &CPU5 &CPU6
+ &CPU7 &CPU8 &CPU9 &CPU10 &CPU11>;
+ operating-points-v2 = <&memory2_monitor0_opp_table>;
+
+ memory2_monitor0_opp_table: opp-table {
+ compatible = "operating-points-v2";
+
+ opp-2189000000 {
+ opp-hz = /bits/ 64 <2189000000>;
+ opp-level = <QCOM_DDR_LEVEL_AUTO>;
+ };
+
+ opp-3860000000 {
+ opp-hz = /bits/ 64 <3860000000>;
+ opp-level = <QCOM_DDR_LEVEL_PERF>;
+ };
+ };
+ };
+ };
+ };
};
};
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-02 19:14 ` [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor Sibi Sankar
@ 2024-07-03 8:44 ` Shivnandan Kumar
2024-07-03 11:34 ` Sibi Sankar
2024-07-09 10:51 ` Konrad Dybcio
2024-07-10 11:36 ` Cristian Marussi
2 siblings, 1 reply; 17+ messages in thread
From: Shivnandan Kumar @ 2024-07-03 8:44 UTC (permalink / raw)
To: Sibi Sankar, sudeep.holla, cristian.marussi, andersson,
konrad.dybcio, robh+dt, krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, conor+dt, Amir Vajid
On 7/3/2024 12:44 AM, Sibi Sankar wrote:
> Introduce a client driver that uses the memlat algorithm string hosted
> on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
> control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>
> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Co-developed-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
>
> V2:
> * Make driver changes to the accommodate bindings changes. [Rob]
> * Replace explicit of_node_put with _free.
>
> drivers/soc/qcom/Kconfig | 12 +
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/qcom_scmi_client.c | 583 ++++++++++++++++++++++++++++
> 3 files changed, 596 insertions(+)
> create mode 100644 drivers/soc/qcom/qcom_scmi_client.c
>
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index 432c85bd8ad4..b253504bd386 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -294,4 +294,16 @@ config QCOM_PBS
> This module provides the APIs to the client drivers that wants to send the
> PBS trigger event to the PBS RAM.
>
> +config QCOM_SCMI_CLIENT
> + tristate "Qualcomm Technologies Inc. SCMI client driver"
> + depends on ARM_SCMI_PROTOCOL_VENDOR_QCOM || COMPILE_TEST
> + default n
> + help
> + This driver uses the memlat algorithm string hosted on QCOM SCMI
> + Vendor Protocol to detect memory latency workloads and control
> + frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
> +
> + This driver defines/documents the parameter IDs used while configuring
> + the memory buses.
> +
> endmenu
> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
> index d3560f861085..8a2e832d1d5d 100644
> --- a/drivers/soc/qcom/Makefile
> +++ b/drivers/soc/qcom/Makefile
> @@ -35,6 +35,7 @@ obj-$(CONFIG_QCOM_APR) += apr.o
> obj-$(CONFIG_QCOM_LLCC) += llcc-qcom.o
> obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) += kryo-l2-accessors.o
> obj-$(CONFIG_QCOM_ICC_BWMON) += icc-bwmon.o
> +obj-$(CONFIG_QCOM_SCMI_CLIENT) += qcom_scmi_client.o
> qcom_ice-objs += ice.o
> obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
> obj-$(CONFIG_QCOM_PBS) += qcom-pbs.o
> diff --git a/drivers/soc/qcom/qcom_scmi_client.c b/drivers/soc/qcom/qcom_scmi_client.c
> new file mode 100644
> index 000000000000..8369b415c0ab
> --- /dev/null
> +++ b/drivers/soc/qcom/qcom_scmi_client.c
> @@ -0,0 +1,583 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/cpu.h>
> +#include <linux/err.h>
> +#include <linux/errno.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/qcom_scmi_vendor.h>
> +#include <linux/scmi_protocol.h>
> +#include <linux/units.h>
> +#include <dt-bindings/soc/qcom,scmi-vendor.h>
> +
> +#define MEMLAT_ALGO_STR 0x4D454D4C4154 /* MEMLAT */
> +#define INVALID_IDX 0xff
> +#define MAX_MEMORY_TYPES 3
> +#define MAX_MONITOR_CNT 4
> +#define MAX_NAME_LEN 20
> +#define MAX_MAP_ENTRIES 7
> +#define CPUCP_DEFAULT_SAMPLING_PERIOD_MS 4
> +#define CPUCP_DEFAULT_FREQ_METHOD 1
> +
> +/**
> + * scmi_memlat_protocol_cmd - parameter_ids supported by the "MEMLAT" algo_str hosted
> + * by the Qualcomm SCMI Vendor Protocol on the SCMI controller.
> + *
> + * MEMLAT (Memory Latency) monitors the counters to detect memory latency bound workloads
> + * and scales the frequency/levels of the memory buses accordingly.
> + *
> + * @MEMLAT_SET_MEM_GROUP: initializes the frequency/level scaling functions for the memory bus.
> + * @MEMLAT_SET_MONITOR: configures the monitor to work on a specific memory bus.
> + * @MEMLAT_SET_COMMON_EV_MAP: set up common counters used to monitor the cpu frequency.
> + * @MEMLAT_SET_GRP_EV_MAP: set up any specific counters used to monitor the memory bus.
> + * @MEMLAT_IPM_CEIL: set the IPM (Instruction Per Misses) ceiling per monitor.
> + * @MEMLAT_SAMPLE_MS: set the sampling period for all the monitors.
> + * MEMLAT_MON_FREQ_MAP: setup the cpufreq to memfreq map.
> + * MEMLAT_SET_MIN_FREQ: set the max frequency of the memory bus.
> + * MEMLAT_SET_MAX_FREQ: set the min frequency of the memory bus.
> + * MEMLAT_START_TIMER: start all the monitors with the requested sampling period.
> + * MEMLAT_START_TIMER: stop all the running monitors.
Typo above, it should be MEMLAT_STOP_TIMER
> + * MEMLAT_SET_EFFECTIVE_FREQ_METHOD: set the method used to determine cpu frequency.
> + */
> +enum scmi_memlat_protocol_cmd {
> + MEMLAT_SET_MEM_GROUP = 16,
> + MEMLAT_SET_MONITOR,
> + MEMLAT_SET_COMMON_EV_MAP,
> + MEMLAT_SET_GRP_EV_MAP,
> + MEMLAT_IPM_CEIL = 23,
> + MEMLAT_SAMPLE_MS = 31,
> + MEMLAT_MON_FREQ_MAP,
> + MEMLAT_SET_MIN_FREQ,
> + MEMLAT_SET_MAX_FREQ,
> + MEMLAT_START_TIMER = 36,
> + MEMLAT_STOP_TIMER,
> + MEMLAT_SET_EFFECTIVE_FREQ_METHOD = 39,
> +};
> +
> +struct map_table {
> + u16 v1;
> + u16 v2;
> +};
> +
> +struct map_param_msg {
> + u32 hw_type;
> + u32 mon_idx;
> + u32 nr_rows;
> + struct map_table tbl[MAX_MAP_ENTRIES];
> +} __packed;
> +
> +struct node_msg {
> + u32 cpumask;
> + u32 hw_type;
> + u32 mon_type;
> + u32 mon_idx;
> + char mon_name[MAX_NAME_LEN];
> +};
> +
> +struct scalar_param_msg {
> + u32 hw_type;
> + u32 mon_idx;
> + u32 val;
> +};
> +
> +enum common_ev_idx {
> + INST_IDX,
> + CYC_IDX,
> + CONST_CYC_IDX,
> + FE_STALL_IDX,
> + BE_STALL_IDX,
> + NUM_COMMON_EVS
> +};
> +
> +enum grp_ev_idx {
> + MISS_IDX,
> + WB_IDX,
> + ACC_IDX,
> + NUM_GRP_EVS
> +};
> +
> +#define EV_CPU_CYCLES 0
> +#define EV_INST_RETIRED 2
> +#define EV_L2_D_RFILL 5
> +
> +struct ev_map_msg {
> + u32 num_evs;
> + u32 hw_type;
> + u32 cid[NUM_COMMON_EVS];
> +};
> +
> +struct cpufreq_memfreq_map {
> + unsigned int cpufreq_mhz;
> + unsigned int memfreq_khz;
> +};
> +
> +struct scmi_monitor_info {
> + struct cpufreq_memfreq_map *freq_map;
> + char mon_name[MAX_NAME_LEN];
> + u32 mon_idx;
> + u32 mon_type;
> + u32 ipm_ceil;
> + u32 mask;
> + u32 freq_map_len;
> +};
> +
> +struct scmi_memory_info {
> + struct scmi_monitor_info *monitor[MAX_MONITOR_CNT];
> + u32 hw_type;
> + int monitor_cnt;
> + u32 min_freq;
> + u32 max_freq;
> +};
> +
> +struct scmi_memlat_info {
> + struct scmi_protocol_handle *ph;
> + const struct qcom_scmi_vendor_ops *ops;
> + struct scmi_memory_info *memory[MAX_MEMORY_TYPES];
> + u32 cluster_info[NR_CPUS];
> + int memory_cnt;
> +};
> +
> +static int populate_cluster_info(u32 *cluster_info)
> +{
> + char name[MAX_NAME_LEN];
> + int i = 0;
> +
> + struct device_node *cn __free(device_node) = of_find_node_by_path("/cpus");
> + if (!cn)
> + return -ENODEV;
> +
> + struct device_node *map __free(device_node) = of_get_child_by_name(cn, "cpu-map");
> + if (!map)
> + return -ENODEV;
> +
> + do {
> + snprintf(name, sizeof(name), "cluster%d", i);
> + struct device_node *c __free(device_node) = of_get_child_by_name(map, name);
> + if (!c)
> + break;
> +
> + *(cluster_info + i) = of_get_child_count(c);
> + i++;
> + } while (1);
> +
> + return 0;
> +}
> +
> +static int populate_physical_mask(struct device_node *np, u32 *mask, u32 *cluster_info)
> +{
> + struct device_node *dev_phandle;
> + int cpu, i = 0, physical_id;
> +
> + do {
> + dev_phandle = of_parse_phandle(np, "cpus", i++);
> + cpu = of_cpu_node_to_id(dev_phandle);
> + if (cpu != -ENODEV) {
> + physical_id = topology_core_id(cpu);
> + for (int j = 0; j < topology_cluster_id(cpu); j++)
> + physical_id += *(cluster_info + j);
> + *mask |= BIT(physical_id);
> + }
> + } while (dev_phandle);
> +
> + return 0;
> +}
> +
> +static struct cpufreq_memfreq_map *init_cpufreq_memfreq_map(struct device *dev,
> + struct scmi_memory_info *memory,
> + struct device_node *of_node,
> + u32 *cnt)
> +{
> + struct device_node *tbl_np, *opp_np;
> + struct cpufreq_memfreq_map *tbl;
> + int ret, i = 0;
> + u32 level, len;
> + u64 rate;
> +
> + tbl_np = of_parse_phandle(of_node, "operating-points-v2", 0);
> + if (!tbl_np)
> + return ERR_PTR(-ENODEV);
> +
> + len = min(of_get_available_child_count(tbl_np), MAX_MAP_ENTRIES);
> + if (len == 0)
> + return ERR_PTR(-ENODEV);
> +
> + tbl = devm_kzalloc(dev, (len + 1) * sizeof(struct cpufreq_memfreq_map),
> + GFP_KERNEL);
> + if (!tbl)
> + return ERR_PTR(-ENOMEM);
> +
> + for_each_available_child_of_node(tbl_np, opp_np) {
> + ret = of_property_read_u64_index(opp_np, "opp-hz", 0, &rate);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].cpufreq_mhz = rate / HZ_PER_MHZ;
> +
> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
> + ret = of_property_read_u64_index(opp_np, "opp-hz", 1, &rate);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].memfreq_khz = rate / HZ_PER_KHZ;
> + } else {
> + ret = of_property_read_u32(opp_np, "opp-level", &level);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].memfreq_khz = level;
> + }
> +
> + dev_dbg(dev, "Entry%d CPU:%u, Mem:%u\n", i, tbl[i].cpufreq_mhz, tbl[i].memfreq_khz);
> + i++;
> + }
> + *cnt = len;
> + tbl[i].cpufreq_mhz = 0;
This is not needed as for allocating memory, devm_kzalloc is used.
> +
> + return tbl;
> +}
> +
> +static int process_scmi_memlat_of_node(struct scmi_device *sdev, struct scmi_memlat_info *info)
> +{
> + struct scmi_monitor_info *monitor;
> + struct scmi_memory_info *memory;
> + char name[MAX_NAME_LEN];
> + u64 memfreq[2];
> + int ret;
> +
> + ret = populate_cluster_info(info->cluster_info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to populate cluster info\n");
> + goto err;
> + }
> +
> + of_node_get(sdev->handle->dev->of_node);
> + do {
> + snprintf(name, sizeof(name), "memory-%d", info->memory_cnt);
> + struct device_node *memory_np __free(device_node) =
> + of_find_node_by_name(sdev->handle->dev->of_node, name);
> +
> + if (!memory_np)
> + break;
> +
> + memory = devm_kzalloc(&sdev->dev, sizeof(*memory), GFP_KERNEL);
> + if (!memory) {
> + ret = -ENOMEM;
> + goto err;
> + }
> +
> + ret = of_property_read_u32(memory_np, "qcom,memory-type", &memory->hw_type);
> + if (ret) {
> + dev_err_probe(&sdev->dev, ret, "failed to read memory type\n");
> + goto err;
> + }
> +
> + ret = of_property_read_u64_array(memory_np, "freq-table-hz", memfreq, 2);
> + if (ret && (ret != -EINVAL)) {
> + dev_err_probe(&sdev->dev, ret, "failed to read min/max freq\n");
> + goto err;
> + }
> +
> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
> + memory->min_freq = memfreq[0] / HZ_PER_KHZ;
> + memory->max_freq = memfreq[1] / HZ_PER_KHZ;
> + } else {
> + memory->min_freq = memfreq[0];
> + memory->max_freq = memfreq[1];
> + }
> + info->memory[info->memory_cnt++] = memory;
> +
> + do {
> + snprintf(name, sizeof(name), "monitor-%d", memory->monitor_cnt);
> + struct device_node *monitor_np __free(device_node) =
> + of_get_child_by_name(memory_np, name);
> +
> + if (!monitor_np)
> + break;
> +
> + monitor = devm_kzalloc(&sdev->dev, sizeof(*monitor), GFP_KERNEL);
> + if (!monitor) {
> + ret = -ENOMEM;
> + goto err;
> + }
> +
> + monitor->mon_type = of_property_read_bool(monitor_np, "qcom,compute-type");
> + if (!monitor->mon_type) {
> + ret = of_property_read_u32(monitor_np, "qcom,ipm-ceil",
> + &monitor->ipm_ceil);
> + if (ret) {
> + dev_err_probe(&sdev->dev, ret,
> + "failed to read IPM ceiling\n");
> + goto err;
> + }
> + }
> +
> + /*
> + * Variants of the SoC having reduced number of cpus operate
> + * with the same number of logical cpus but the physical
> + * cpu disabled will differ between parts. Calculate the
> + * physical cpu number using cluster information instead.
> + */
> + ret = populate_physical_mask(monitor_np, &monitor->mask,
> + info->cluster_info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to populate cpu mask\n");
> + goto err;
> + }
> +
Above error check is not needed as populate_physical_mask always return 0.
> + monitor->freq_map = init_cpufreq_memfreq_map(&sdev->dev, memory, monitor_np,
> + &monitor->freq_map_len);
> + if (IS_ERR(monitor->freq_map)) {
> + dev_err_probe(&sdev->dev, PTR_ERR(monitor->freq_map),
> + "failed to populate cpufreq-memfreq map\n");
> + goto err;
> + }
> +
> + strscpy(monitor->mon_name, name, sizeof(monitor->mon_name));
> + monitor->mon_idx = memory->monitor_cnt;
> +
> + memory->monitor[memory->monitor_cnt++] = monitor;
If from dt, more than 4 monitor are passed then this may lead to
overflow, adding a check and giving warning in case number of monitor
exceeds MAX_MONITOR_CNT will be better idea.
> + } while (1);
> +
> + if (!memory->monitor_cnt) {
> + ret = -EINVAL;
> + dev_err_probe(&sdev->dev, ret, "failed to find monitor nodes\n");
> + goto err;
> + }
> + } while (1);
> +
> + if (!info->memory_cnt) {
> + ret = -EINVAL;
> + dev_err_probe(&sdev->dev, ret, "failed to find memory nodes\n");
> + }
> +
> +err:
> + of_node_put(sdev->handle->dev->of_node);
> +
> + return ret;
> +}
> +
> +static int configure_cpucp_common_events(struct scmi_memlat_info *info)
> +{
> + const struct qcom_scmi_vendor_ops *ops = info->ops;
> + u8 ev_map[NUM_COMMON_EVS];
> + struct ev_map_msg msg;
> + int ret;
> +
> + memset(ev_map, 0xFF, NUM_COMMON_EVS);
> +
> + msg.num_evs = NUM_COMMON_EVS;
> + msg.hw_type = INVALID_IDX;
> + msg.cid[INST_IDX] = EV_INST_RETIRED;
> + msg.cid[CYC_IDX] = EV_CPU_CYCLES;
> + msg.cid[CONST_CYC_IDX] = INVALID_IDX;
> + msg.cid[FE_STALL_IDX] = INVALID_IDX;
> + msg.cid[BE_STALL_IDX] = INVALID_IDX;
> +
> + ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_COMMON_EV_MAP,
> + sizeof(msg));
> + return ret;
> +}
> +
> +static int configure_cpucp_grp(struct device *dev, struct scmi_memlat_info *info, int memory_index)
> +{
> + const struct qcom_scmi_vendor_ops *ops = info->ops;
> + struct scmi_memory_info *memory = info->memory[memory_index];
> + struct ev_map_msg ev_msg;
> + u8 ev_map[NUM_GRP_EVS];
> + struct node_msg msg;
> + int ret;
> +
> + msg.cpumask = 0;
> + msg.hw_type = memory->hw_type;
> + msg.mon_type = 0;
> + msg.mon_idx = 0;
> + ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_MEM_GROUP, sizeof(msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to configure mem type %d\n", memory->hw_type);
> + return ret;
> + }
> +
> + memset(ev_map, 0xFF, NUM_GRP_EVS);
> + ev_msg.num_evs = NUM_GRP_EVS;
> + ev_msg.hw_type = memory->hw_type;
> + ev_msg.cid[MISS_IDX] = EV_L2_D_RFILL;
> + ev_msg.cid[WB_IDX] = INVALID_IDX;
> + ev_msg.cid[ACC_IDX] = INVALID_IDX;
> + ret = ops->set_param(info->ph, &ev_msg, MEMLAT_ALGO_STR, MEMLAT_SET_GRP_EV_MAP,
> + sizeof(ev_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret,
> + "failed to configure event map for mem type %d\n", memory->hw_type);
> + return ret;
> + }
> +
> + return ret;
> +}
> +
> +static int configure_cpucp_mon(struct device *dev, struct scmi_memlat_info *info,
> + int memory_index, int monitor_index)
> +{
> + const struct qcom_scmi_vendor_ops *ops = info->ops;
> + struct scmi_memory_info *memory = info->memory[memory_index];
> + struct scmi_monitor_info *monitor = memory->monitor[monitor_index];
> + struct scalar_param_msg scalar_msg;
> + struct map_param_msg map_msg;
> + struct node_msg msg;
> + int ret;
> + int i;
> +
> + msg.cpumask = monitor->mask;
> + msg.hw_type = memory->hw_type;
> + msg.mon_type = monitor->mon_type;
> + msg.mon_idx = monitor->mon_idx;
> + strscpy(msg.mon_name, monitor->mon_name, sizeof(msg.mon_name));
> + ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_MONITOR, sizeof(msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to configure monitor %s\n", monitor->mon_name);
> + return ret;
> + }
> +
> + scalar_msg.hw_type = memory->hw_type;
> + scalar_msg.mon_idx = monitor->mon_idx;
> + scalar_msg.val = monitor->ipm_ceil;
> + ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_IPM_CEIL,
> + sizeof(scalar_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to set ipm ceil for %s\n", monitor->mon_name);
> + return ret;
> + }
> +
> + map_msg.hw_type = memory->hw_type;
> + map_msg.mon_idx = monitor->mon_idx;
> + map_msg.nr_rows = monitor->freq_map_len;
> + for (i = 0; i < monitor->freq_map_len; i++) {
> + map_msg.tbl[i].v1 = monitor->freq_map[i].cpufreq_mhz;
> + map_msg.tbl[i].v2 = monitor->freq_map[i].memfreq_khz;
> + }
> + ret = ops->set_param(info->ph, &map_msg, MEMLAT_ALGO_STR, MEMLAT_MON_FREQ_MAP,
> + sizeof(map_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to configure freq_map for %s\n", monitor->mon_name);
> + return ret;
> + }
> +
> + scalar_msg.hw_type = memory->hw_type;
> + scalar_msg.mon_idx = monitor->mon_idx;
> + scalar_msg.val = memory->min_freq;
> + ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_SET_MIN_FREQ,
> + sizeof(scalar_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to set min_freq for %s\n", monitor->mon_name);
> + return ret;
> + }
> +
> + scalar_msg.hw_type = memory->hw_type;
> + scalar_msg.mon_idx = monitor->mon_idx;
> + scalar_msg.val = memory->max_freq;
> + ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_SET_MAX_FREQ,
> + sizeof(scalar_msg));
> + if (ret < 0)
> + dev_err_probe(dev, ret, "failed to set max_freq for %s\n", monitor->mon_name);
> +
> + return ret;
> +}
> +
> +static int cpucp_memlat_init(struct scmi_device *sdev)
> +{
> + const struct scmi_handle *handle = sdev->handle;
> + const struct qcom_scmi_vendor_ops *ops;
> + struct scmi_protocol_handle *ph;
> + struct scmi_memlat_info *info;
> + u32 cpucp_freq_method = CPUCP_DEFAULT_FREQ_METHOD;
> + u32 cpucp_sample_ms = CPUCP_DEFAULT_SAMPLING_PERIOD_MS;
> + int ret, i, j;
> +
> + if (!handle)
> + return -ENODEV;
> +
> + ops = handle->devm_protocol_get(sdev, QCOM_SCMI_VENDOR_PROTOCOL, &ph);
> + if (IS_ERR(ops))
> + return PTR_ERR(ops);
> +
> + info = devm_kzalloc(&sdev->dev, sizeof(*info), GFP_KERNEL);
> + if (!info)
> + return -ENOMEM;
> +
> + ret = process_scmi_memlat_of_node(sdev, info);
> + if (ret)
> + return ret;
> +
> + info->ph = ph;
> + info->ops = ops;
> +
> + /* Configure common events ids */
> + ret = configure_cpucp_common_events(info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to configure common events\n");
> + return ret;
> + }
> +
> + for (i = 0; i < info->memory_cnt; i++) {
> + /* Configure per group parameters */
> + ret = configure_cpucp_grp(&sdev->dev, info, i);
> + if (ret < 0)
> + return ret;
> +
> + for (j = 0; j < info->memory[i]->monitor_cnt; j++) {
> + /* Configure per monitor parameters */
> + ret = configure_cpucp_mon(&sdev->dev, info, i, j);
> + if (ret < 0)
> + return ret;
> + }
> + }
> +
> + /* Set loop sampling time */
> + ret = ops->set_param(ph, &cpucp_sample_ms, MEMLAT_ALGO_STR, MEMLAT_SAMPLE_MS,
> + sizeof(cpucp_sample_ms));
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to set sample_ms\n");
> + return ret;
> + }
> +
> + /* Set the effective cpu frequency calculation method */
> + ret = ops->set_param(ph, &cpucp_freq_method, MEMLAT_ALGO_STR,
> + MEMLAT_SET_EFFECTIVE_FREQ_METHOD, sizeof(cpucp_freq_method));
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to set effective frequency calc method\n");
> + return ret;
> + }
> +
> + /* Start sampling and voting timer */
> + ret = ops->start_activity(ph, NULL, MEMLAT_ALGO_STR, MEMLAT_START_TIMER, 0);
> + if (ret < 0)
> + dev_err_probe(&sdev->dev, ret, "failed to start memory group timer\n");
> +
> + return ret;
> +}
> +
> +static int scmi_client_probe(struct scmi_device *sdev)
> +{
> + return cpucp_memlat_init(sdev);
> +}
> +
> +static const struct scmi_device_id scmi_id_table[] = {
> + { .protocol_id = QCOM_SCMI_VENDOR_PROTOCOL, .name = "qcom_scmi_vendor_protocol" },
> + { },
> +};
> +MODULE_DEVICE_TABLE(scmi, scmi_id_table);
> +
> +static struct scmi_driver qcom_scmi_client_drv = {
> + .name = "qcom-scmi-driver",
> + .probe = scmi_client_probe,
> + .id_table = scmi_id_table,
> +};
> +module_scmi_driver(qcom_scmi_client_drv);
> +
> +MODULE_DESCRIPTION("QTI SCMI client driver");
> +MODULE_LICENSE("GPL");
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-03 8:44 ` Shivnandan Kumar
@ 2024-07-03 11:34 ` Sibi Sankar
0 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-03 11:34 UTC (permalink / raw)
To: Shivnandan Kumar, sudeep.holla, cristian.marussi, andersson,
konrad.dybcio, robh+dt, krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, conor+dt, Amir Vajid
On 7/3/24 14:14, Shivnandan Kumar wrote:
>
>
> On 7/3/2024 12:44 AM, Sibi Sankar wrote:
>> Introduce a client driver that uses the memlat algorithm string hosted
>> on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
>> control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>>
Hey Shiv,
Thanks for taking time to review the series :)
>> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Co-developed-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
>>
>> V2:
>> * Make driver changes to the accommodate bindings changes. [Rob]
>> * Replace explicit of_node_put with _free.
>>
>> drivers/soc/qcom/Kconfig | 12 +
>> drivers/soc/qcom/Makefile | 1 +
>> drivers/soc/qcom/qcom_scmi_client.c | 583 ++++++++++++++++++++++++++++
>> 3 files changed, 596 insertions(+)
>> create mode 100644 drivers/soc/qcom/qcom_scmi_client.c
>>
>> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
>> index 432c85bd8ad4..b253504bd386 100644
>> --- a/drivers/soc/qcom/Kconfig
>> +++ b/drivers/soc/qcom/Kconfig
>> @@ -294,4 +294,16 @@ config QCOM_PBS
>> This module provides the APIs to the client drivers that wants
>> to send the
>> PBS trigger event to the PBS RAM.
>> +config QCOM_SCMI_CLIENT
>> + tristate "Qualcomm Technologies Inc. SCMI client driver"
>> + depends on ARM_SCMI_PROTOCOL_VENDOR_QCOM || COMPILE_TEST
>> + default n
>> + help
>> + This driver uses the memlat algorithm string hosted on QCOM SCMI
>> + Vendor Protocol to detect memory latency workloads and control
>> + frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>> +
>> + This driver defines/documents the parameter IDs used while
>> configuring
>> + the memory buses.
>> +
>> endmenu
>> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
>> index d3560f861085..8a2e832d1d5d 100644
>> --- a/drivers/soc/qcom/Makefile
>> +++ b/drivers/soc/qcom/Makefile
>> @@ -35,6 +35,7 @@ obj-$(CONFIG_QCOM_APR) += apr.o
>> obj-$(CONFIG_QCOM_LLCC) += llcc-qcom.o
>> obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) += kryo-l2-accessors.o
>> obj-$(CONFIG_QCOM_ICC_BWMON) += icc-bwmon.o
>> +obj-$(CONFIG_QCOM_SCMI_CLIENT) += qcom_scmi_client.o
>> qcom_ice-objs += ice.o
>> obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
>> obj-$(CONFIG_QCOM_PBS) += qcom-pbs.o
>> diff --git a/drivers/soc/qcom/qcom_scmi_client.c
>> b/drivers/soc/qcom/qcom_scmi_client.c
>> new file mode 100644
>> index 000000000000..8369b415c0ab
>> --- /dev/null
>> +++ b/drivers/soc/qcom/qcom_scmi_client.c
>> @@ -0,0 +1,583 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights
>> reserved.
>> + */
>> +
>> +#include <linux/cpu.h>
>> +#include <linux/err.h>
>> +#include <linux/errno.h>
>> +#include <linux/init.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/qcom_scmi_vendor.h>
>> +#include <linux/scmi_protocol.h>
>> +#include <linux/units.h>
>> +#include <dt-bindings/soc/qcom,scmi-vendor.h>
>> +
>> +#define MEMLAT_ALGO_STR 0x4D454D4C4154 /* MEMLAT */
>> +#define INVALID_IDX 0xff
>> +#define MAX_MEMORY_TYPES 3
>> +#define MAX_MONITOR_CNT 4
>> +#define MAX_NAME_LEN 20
>> +#define MAX_MAP_ENTRIES 7
>> +#define CPUCP_DEFAULT_SAMPLING_PERIOD_MS 4
>> +#define CPUCP_DEFAULT_FREQ_METHOD 1
>> +
>> +/**
>> + * scmi_memlat_protocol_cmd - parameter_ids supported by the "MEMLAT"
>> algo_str hosted
>> + * by the Qualcomm SCMI Vendor Protocol on
>> the SCMI controller.
>> + *
>> + * MEMLAT (Memory Latency) monitors the counters to detect memory
>> latency bound workloads
>> + * and scales the frequency/levels of the memory buses accordingly.
>> + *
>> + * @MEMLAT_SET_MEM_GROUP: initializes the frequency/level scaling
>> functions for the memory bus.
>> + * @MEMLAT_SET_MONITOR: configures the monitor to work on a specific
>> memory bus.
>> + * @MEMLAT_SET_COMMON_EV_MAP: set up common counters used to monitor
>> the cpu frequency.
>> + * @MEMLAT_SET_GRP_EV_MAP: set up any specific counters used to
>> monitor the memory bus.
>> + * @MEMLAT_IPM_CEIL: set the IPM (Instruction Per Misses) ceiling per
>> monitor.
>> + * @MEMLAT_SAMPLE_MS: set the sampling period for all the monitors.
>> + * MEMLAT_MON_FREQ_MAP: setup the cpufreq to memfreq map.
>> + * MEMLAT_SET_MIN_FREQ: set the max frequency of the memory bus.
>> + * MEMLAT_SET_MAX_FREQ: set the min frequency of the memory bus.
>> + * MEMLAT_START_TIMER: start all the monitors with the requested
>> sampling period.
>> + * MEMLAT_START_TIMER: stop all the running monitors.
>
> Typo above, it should be MEMLAT_STOP_TIMER
ack, thanks for catching this.
>
>> + * MEMLAT_SET_EFFECTIVE_FREQ_METHOD: set the method used to determine
>> cpu frequency.
>> + */
>> +enum scmi_memlat_protocol_cmd {
>> + MEMLAT_SET_MEM_GROUP = 16,
>> + MEMLAT_SET_MONITOR,
>> + MEMLAT_SET_COMMON_EV_MAP,
>> + MEMLAT_SET_GRP_EV_MAP,
>> + MEMLAT_IPM_CEIL = 23,
>> + MEMLAT_SAMPLE_MS = 31,
>> + MEMLAT_MON_FREQ_MAP,
>> + MEMLAT_SET_MIN_FREQ,
>> + MEMLAT_SET_MAX_FREQ,
>> + MEMLAT_START_TIMER = 36,
>> + MEMLAT_STOP_TIMER,
>> + MEMLAT_SET_EFFECTIVE_FREQ_METHOD = 39,
>> +};
>> +
>> +struct map_table {
[...]
>> +
>> +static int populate_physical_mask(struct device_node *np, u32 *mask,
>> u32 *cluster_info)
>> +{
>> + struct device_node *dev_phandle;
>> + int cpu, i = 0, physical_id;
>> +
>> + do {
>> + dev_phandle = of_parse_phandle(np, "cpus", i++);
>> + cpu = of_cpu_node_to_id(dev_phandle);
>> + if (cpu != -ENODEV) {
>> + physical_id = topology_core_id(cpu);
>> + for (int j = 0; j < topology_cluster_id(cpu); j++)
>> + physical_id += *(cluster_info + j);
>> + *mask |= BIT(physical_id);
>> + }
>> + } while (dev_phandle);
>> +
>> + return 0;
>> +}
>> +
>> +static struct cpufreq_memfreq_map *init_cpufreq_memfreq_map(struct
>> device *dev,
>> + struct scmi_memory_info *memory,
>> + struct device_node *of_node,
>> + u32 *cnt)
>> +{
>> + struct device_node *tbl_np, *opp_np;
>> + struct cpufreq_memfreq_map *tbl;
>> + int ret, i = 0;
>> + u32 level, len;
>> + u64 rate;
>> +
>> + tbl_np = of_parse_phandle(of_node, "operating-points-v2", 0);
>> + if (!tbl_np)
>> + return ERR_PTR(-ENODEV);
>> +
>> + len = min(of_get_available_child_count(tbl_np), MAX_MAP_ENTRIES);
>> + if (len == 0)
>> + return ERR_PTR(-ENODEV);
>> +
>> + tbl = devm_kzalloc(dev, (len + 1) * sizeof(struct
>> cpufreq_memfreq_map),
>> + GFP_KERNEL);
>> + if (!tbl)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + for_each_available_child_of_node(tbl_np, opp_np) {
>> + ret = of_property_read_u64_index(opp_np, "opp-hz", 0, &rate);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + tbl[i].cpufreq_mhz = rate / HZ_PER_MHZ;
>> +
>> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
>> + ret = of_property_read_u64_index(opp_np, "opp-hz", 1,
>> &rate);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + tbl[i].memfreq_khz = rate / HZ_PER_KHZ;
>> + } else {
>> + ret = of_property_read_u32(opp_np, "opp-level", &level);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + tbl[i].memfreq_khz = level;
>> + }
>> +
>> + dev_dbg(dev, "Entry%d CPU:%u, Mem:%u\n", i,
>> tbl[i].cpufreq_mhz, tbl[i].memfreq_khz);
>> + i++;
>> + }
>> + *cnt = len;
>> + tbl[i].cpufreq_mhz = 0;
>
> This is not needed as for allocating memory, devm_kzalloc is used.
ack
>
>> +
>> + return tbl;
>> +}
>> +
>> +static int process_scmi_memlat_of_node(struct scmi_device *sdev,
>> struct scmi_memlat_info *info)
>> +{
>> + struct scmi_monitor_info *monitor;
>> + struct scmi_memory_info *memory;
>> + char name[MAX_NAME_LEN];
>> + u64 memfreq[2];
>> + int ret;
>> +
>> + ret = populate_cluster_info(info->cluster_info);
>> + if (ret < 0) {
>> + dev_err_probe(&sdev->dev, ret, "failed to populate cluster
>> info\n");
>> + goto err;
>> + }
>> +
>> + of_node_get(sdev->handle->dev->of_node);
>> + do {
>> + snprintf(name, sizeof(name), "memory-%d", info->memory_cnt);
>> + struct device_node *memory_np __free(device_node) =
>> + of_find_node_by_name(sdev->handle->dev->of_node, name);
>> +
>> + if (!memory_np)
>> + break;
>> +
>> + memory = devm_kzalloc(&sdev->dev, sizeof(*memory), GFP_KERNEL);
>> + if (!memory) {
>> + ret = -ENOMEM;
>> + goto err;
>> + }
>> +
>> + ret = of_property_read_u32(memory_np, "qcom,memory-type",
>> &memory->hw_type);
>> + if (ret) {
>> + dev_err_probe(&sdev->dev, ret, "failed to read memory
>> type\n");
>> + goto err;
>> + }
>> +
>> + ret = of_property_read_u64_array(memory_np, "freq-table-hz",
>> memfreq, 2);
>> + if (ret && (ret != -EINVAL)) {
>> + dev_err_probe(&sdev->dev, ret, "failed to read min/max
>> freq\n");
>> + goto err;
>> + }
>> +
>> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
>> + memory->min_freq = memfreq[0] / HZ_PER_KHZ;
>> + memory->max_freq = memfreq[1] / HZ_PER_KHZ;
>> + } else {
>> + memory->min_freq = memfreq[0];
>> + memory->max_freq = memfreq[1];
>> + }
>> + info->memory[info->memory_cnt++] = memory;
>> +
>> + do {
>> + snprintf(name, sizeof(name), "monitor-%d",
>> memory->monitor_cnt);
>> + struct device_node *monitor_np __free(device_node) =
>> + of_get_child_by_name(memory_np, name);
>> +
>> + if (!monitor_np)
>> + break;
>> +
>> + monitor = devm_kzalloc(&sdev->dev, sizeof(*monitor),
>> GFP_KERNEL);
>> + if (!monitor) {
>> + ret = -ENOMEM;
>> + goto err;
>> + }
>> +
>> + monitor->mon_type = of_property_read_bool(monitor_np,
>> "qcom,compute-type");
>> + if (!monitor->mon_type) {
>> + ret = of_property_read_u32(monitor_np, "qcom,ipm-ceil",
>> + &monitor->ipm_ceil);
>> + if (ret) {
>> + dev_err_probe(&sdev->dev, ret,
>> + "failed to read IPM ceiling\n");
>> + goto err;
>> + }
>> + }
>> +
>> + /*
>> + * Variants of the SoC having reduced number of cpus operate
>> + * with the same number of logical cpus but the physical
>> + * cpu disabled will differ between parts. Calculate the
>> + * physical cpu number using cluster information instead.
>> + */
>> + ret = populate_physical_mask(monitor_np, &monitor->mask,
>> + info->cluster_info);
>> + if (ret < 0) {
>> + dev_err_probe(&sdev->dev, ret, "failed to populate
>> cpu mask\n");
>> + goto err;
>> + }
>> +
>
> Above error check is not needed as populate_physical_mask always return 0.
ack, will fix this in the next re-spin.
>
>
>> + monitor->freq_map = init_cpufreq_memfreq_map(&sdev->dev,
>> memory, monitor_np,
>> + &monitor->freq_map_len);
>> + if (IS_ERR(monitor->freq_map)) {
>> + dev_err_probe(&sdev->dev, PTR_ERR(monitor->freq_map),
>> + "failed to populate cpufreq-memfreq map\n");
>> + goto err;
>> + }
>> +
>> + strscpy(monitor->mon_name, name, sizeof(monitor->mon_name));
>> + monitor->mon_idx = memory->monitor_cnt;
>> +
>> + memory->monitor[memory->monitor_cnt++] = monitor;
>
> If from dt, more than 4 monitor are passed then this may lead to
> overflow, adding a check and giving warning in case number of monitor
> exceeds MAX_MONITOR_CNT will be better idea.
I'll make sure to account for ^^ by either switching to dynamic
allocation or by using MAX_MONITOR_CNT.
>
>
>> + } while (1);
>> +
[...]
>> +MODULE_DESCRIPTION("QTI SCMI client driver");
>> +MODULE_LICENSE("GPL");
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0
2024-07-02 19:14 ` [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0 Sibi Sankar
@ 2024-07-09 10:10 ` Konrad Dybcio
2024-07-11 13:33 ` Sibi Sankar
2024-07-09 17:52 ` Cristian Marussi
1 sibling, 1 reply; 17+ messages in thread
From: Konrad Dybcio @ 2024-07-09 10:10 UTC (permalink / raw)
To: Sibi Sankar, sudeep.holla, cristian.marussi, andersson, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, conor+dt, Amir Vajid
On 2.07.2024 9:14 PM, Sibi Sankar wrote:
> The ARM SCMI QCOM vendor protocol provides a generic way of exposing a
> number of Qualcomm SoC specific features (like memory bus scaling) through
> a mixture of pre-determined algorithm strings and param_id pairs hosted on
> the SCMI controller.
>
[...]
> +/**
> + * qcom_scmi_vendor_protocol_cmd - vendor specific commands supported by Qualcomm SCMI
> + * vendor protocol.
include the word 'enum' or:
warning: cannot understand function prototype: 'enum qcom_scmi_vendor_protocol_cmd '
Konrad
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-02 19:14 ` [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor Sibi Sankar
2024-07-03 8:44 ` Shivnandan Kumar
@ 2024-07-09 10:51 ` Konrad Dybcio
2024-10-07 7:22 ` Sibi Sankar
2024-07-10 11:36 ` Cristian Marussi
2 siblings, 1 reply; 17+ messages in thread
From: Konrad Dybcio @ 2024-07-09 10:51 UTC (permalink / raw)
To: Sibi Sankar, sudeep.holla, cristian.marussi, andersson, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, conor+dt, Amir Vajid
On 2.07.2024 9:14 PM, Sibi Sankar wrote:
> Introduce a client driver that uses the memlat algorithm string hosted
> on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
> control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>
> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Co-developed-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
[...]
> +/**
> + * scmi_memlat_protocol_cmd - parameter_ids supported by the "MEMLAT" algo_str hosted
> + * by the Qualcomm SCMI Vendor Protocol on the SCMI controller.
'enum scmi_mem..'
> +static int populate_cluster_info(u32 *cluster_info)
> +{
> + char name[MAX_NAME_LEN];
> + int i = 0;
> +
> + struct device_node *cn __free(device_node) = of_find_node_by_path("/cpus");
> + if (!cn)
> + return -ENODEV;
> +
> + struct device_node *map __free(device_node) = of_get_child_by_name(cn, "cpu-map");
> + if (!map)
> + return -ENODEV;
> +
> + do {
> + snprintf(name, sizeof(name), "cluster%d", i);
> + struct device_node *c __free(device_node) = of_get_child_by_name(map, name);
> + if (!c)
> + break;
> +
> + *(cluster_info + i) = of_get_child_count(c);
> + i++;
> + } while (1);
of_cpu_device_node_get(0) + of_get_next_cpu_node() +
of_get_cpu_hwid() & MPIDR_EL1.Aff2 [1]
[...]
> +static struct cpufreq_memfreq_map *init_cpufreq_memfreq_map(struct device *dev,
> + struct scmi_memory_info *memory,
> + struct device_node *of_node,
> + u32 *cnt)
> +{
> + struct device_node *tbl_np, *opp_np;
> + struct cpufreq_memfreq_map *tbl;
> + int ret, i = 0;
> + u32 level, len;
> + u64 rate;
> +
> + tbl_np = of_parse_phandle(of_node, "operating-points-v2", 0);
> + if (!tbl_np)
> + return ERR_PTR(-ENODEV);
> +
> + len = min(of_get_available_child_count(tbl_np), MAX_MAP_ENTRIES);
> + if (len == 0)
> + return ERR_PTR(-ENODEV);
> +
> + tbl = devm_kzalloc(dev, (len + 1) * sizeof(struct cpufreq_memfreq_map),
> + GFP_KERNEL);
> + if (!tbl)
> + return ERR_PTR(-ENOMEM);
> +
> + for_each_available_child_of_node(tbl_np, opp_np) {
> + ret = of_property_read_u64_index(opp_np, "opp-hz", 0, &rate);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].cpufreq_mhz = rate / HZ_PER_MHZ;
> +
> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
> + ret = of_property_read_u64_index(opp_np, "opp-hz", 1, &rate);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].memfreq_khz = rate / HZ_PER_KHZ;
> + } else {
> + ret = of_property_read_u32(opp_np, "opp-level", &level);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].memfreq_khz = level;
> + }
> +
> + dev_dbg(dev, "Entry%d CPU:%u, Mem:%u\n", i, tbl[i].cpufreq_mhz, tbl[i].memfreq_khz);
> + i++;
> + }
> + *cnt = len;
> + tbl[i].cpufreq_mhz = 0;
missing of_node_put, or even better __free(device_node)
[...]
> + /*
> + * Variants of the SoC having reduced number of cpus operate
> + * with the same number of logical cpus but the physical
> + * cpu disabled will differ between parts. Calculate the
> + * physical cpu number using cluster information instead.
> + */
> + ret = populate_physical_mask(monitor_np, &monitor->mask,
> + info->cluster_info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to populate cpu mask\n");
> + goto err;
> + }
err.. the same number of logical CPUs? as in, PSCI will happily report that
the inexistent cores have been booted? or some cores start doing some sort
of hyperthreading to make up for the missing ones? this sounds sketchy..
Konrad
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM Vendor Protocol
2024-07-02 19:14 ` [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM " Sibi Sankar
@ 2024-07-09 16:32 ` Cristian Marussi
2024-07-11 13:31 ` Sibi Sankar
0 siblings, 1 reply; 17+ messages in thread
From: Cristian Marussi @ 2024-07-09 16:32 UTC (permalink / raw)
To: Sibi Sankar
Cc: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt, linux-kernel, linux-arm-msm, devicetree,
linux-arm-kernel, quic_rgottimu, quic_kshivnan, conor+dt
On Wed, Jul 03, 2024 at 12:44:37AM +0530, Sibi Sankar wrote:
> Document the various memory buses that can be monitored and scaled by the
> memory latency governor hosted by the ARM SCMI QCOM Vendor protocol v1.0.
>
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
Hi Sibi,
this series got a bit neglected...my bad...a few comments down below.
>
> Adding a reg property in scmi-memlat.yaml seems incorrect/superfluous
> but without it I see the following errors:
>
> Err Logs:
> protocol@80: 'reg' does not match any of the regexes: '^memory-[0-9]$', 'pinctrl-[0-9]+'
> protocol@80: Unevaluated properties are not allowed ('memory-0', 'memory-1', 'memory-2' were unexpected)
>
> v2:
> * Drop container dvfs memlat container node. [Rob]
> * Move scmi-memlat.yaml to protocol level given that a lot of vendors might end up
> using the same protocol number. [Rob]
> * Replace qcom,cpulist with the standard "cpus" property. [Rob]
> * Fix up compute-type/ipm-ceil required. [Rob]
>
...so there has been a lot of work around Vendor protos recently (as you
have seen) and especially around the way we define the DT bindings to have
multiple vendor protocols coexist with the same overlapping numbers.
(the code-level coexistence is already in place as you've seen...)
I think some sort of agreement on HOW to render this in the bindings
side was reached around a series from NXP...not sure if I am missing something
here but this commit from Peng/NXP (if you have not seen it already):
https://lore.kernel.org/linux-arm-kernel/20240621-imx95-bbm-misc-v2-v5-2-b85a6bf778cb@nxp.com/
...it is a good example of how you can define your vendor specific part in
a vendor specific binding files and then just add a single $ref line in
the core binding Documentation/devicetree/bindings/firmware/arm,scmi.yaml
(and that has been successfully reviewd...)
Moreover, in that same series from Peng/NXP you could have a look at
https://lore.kernel.org/linux-arm-kernel/20240621-imx95-bbm-misc-v2-v5-1-b85a6bf778cb@nxp.com/
that adds the Documentation for their Vendor protocols.
Beside the final location in the tree for such docs, which is a detail
we can settle later on our side too, I think that patch is a good example
of the kind of vendor-protos Documentation Sudeep is expecting.
> .../bindings/firmware/arm,scmi.yaml | 15 ++
> .../bindings/soc/qcom/qcom,scmi-memlat.yaml | 242 ++++++++++++++++++
> include/dt-bindings/soc/qcom,scmi-vendor.h | 22 ++
> 3 files changed, 279 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
> create mode 100644 include/dt-bindings/soc/qcom,scmi-vendor.h
>
> diff --git a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
> index 4d823f3b1f0e..a4022682e5ca 100644
> --- a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
> +++ b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
> @@ -284,6 +284,21 @@ properties:
> required:
> - reg
>
> + protocol@80:
> + type: object
> + allOf:
> + - $ref: '#/$defs/protocol-node'
> + - $ref: /schemas/soc/qcom/qcom,scmi-memlat.yaml#
> +
> + unevaluatedProperties: false
> +
> + properties:
> + reg:
> + const: 0x80
> +
> + required:
> + - reg
> +
..here you should be able to just plant your $ref without redefining the
protocol@80
> additionalProperties: false
>
> $defs:
> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
> new file mode 100644
> index 000000000000..915a6bf5697f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
> @@ -0,0 +1,242 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/soc/qcom/qcom,scmi-memlat.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Qualcomm SCMI Memory Bus nodes
> +
> +maintainers:
> + - Sibi Sankar <quic_sibis@quicinc.com>
> +
> +description:
> + This binding describes the various memory buses that can be monitored and scaled
> + by memory latency governor running on the CPU Control Processor (SCMI controller).
> +
...and instead here you will define your protocols, compliant with the
main protocol-node def and any specific vendor sub-properties that you
should need....
...the above example from NXP is probably more clear than any attempt of mine
to explain this :P
Thanks,
Cristian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0
2024-07-02 19:14 ` [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0 Sibi Sankar
2024-07-09 10:10 ` Konrad Dybcio
@ 2024-07-09 17:52 ` Cristian Marussi
2024-10-07 7:29 ` Sibi Sankar
1 sibling, 1 reply; 17+ messages in thread
From: Cristian Marussi @ 2024-07-09 17:52 UTC (permalink / raw)
To: Sibi Sankar
Cc: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt, linux-kernel, linux-arm-msm, devicetree,
linux-arm-kernel, quic_rgottimu, quic_kshivnan, conor+dt,
Amir Vajid
On Wed, Jul 03, 2024 at 12:44:38AM +0530, Sibi Sankar wrote:
> The ARM SCMI QCOM vendor protocol provides a generic way of exposing a
> number of Qualcomm SoC specific features (like memory bus scaling) through
> a mixture of pre-determined algorithm strings and param_id pairs hosted on
> the SCMI controller.
>
Hi Sibi,
> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Co-developed-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
> drivers/firmware/arm_scmi/vendors/Kconfig | 12 ++
> drivers/firmware/arm_scmi/vendors/Makefile | 2 +-
> .../arm_scmi/vendors/qcom_scmi_vendor.c | 184 ++++++++++++++++++
> include/linux/qcom_scmi_vendor.h | 39 ++++
> 4 files changed, 236 insertions(+), 1 deletion(-)
> create mode 100644 drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
> create mode 100644 include/linux/qcom_scmi_vendor.h
>
> diff --git a/drivers/firmware/arm_scmi/vendors/Kconfig b/drivers/firmware/arm_scmi/vendors/Kconfig
> index 7c1ca7a12603..6bff4550fa25 100644
> --- a/drivers/firmware/arm_scmi/vendors/Kconfig
> +++ b/drivers/firmware/arm_scmi/vendors/Kconfig
> @@ -1,4 +1,16 @@
> # SPDX-License-Identifier: GPL-2.0-only
> menu "ARM SCMI Vendor Protocols"
>
> +config ARM_SCMI_PROTOCOL_VENDOR_QCOM
> + tristate "Qualcomm Technologies, Inc. Qcom SCMI vendor Protocol"
> + depends on ARM_SCMI_PROTOCOL || COMPILE_TEST
> + help
> + The SCMI QCOM vendor protocol provides a generic way of exposing a
> + number of Qualcomm SoC specific features (like memory bus scaling)
> + through a mixture of pre-determined algorithm strings and param_id
> + pairs hosted on the SCMI controller.
> +
> + This driver defines/documents the message ID's used for this
> + communication and also exposes the ops used by the clients.
operations
> +
> endmenu
> diff --git a/drivers/firmware/arm_scmi/vendors/Makefile b/drivers/firmware/arm_scmi/vendors/Makefile
> index c6c214158dd8..c1d6a355f579 100644
> --- a/drivers/firmware/arm_scmi/vendors/Makefile
> +++ b/drivers/firmware/arm_scmi/vendors/Makefile
> @@ -1,2 +1,2 @@
> # SPDX-License-Identifier: GPL-2.0-only
> -# obj-$(CONFIG_ARM_SCMI_PROTOCOL_<your_vendor_proto>) += <your_vendor_proto>.o
> +obj-$(CONFIG_ARM_SCMI_PROTOCOL_VENDOR_QCOM) += qcom_scmi_vendor.o
> diff --git a/drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c b/drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
> new file mode 100644
> index 000000000000..e02163381d4b
> --- /dev/null
> +++ b/drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
> @@ -0,0 +1,184 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/qcom_scmi_vendor.h>
> +
> +#include "../common.h"
> +
> +/**
> + * qcom_scmi_vendor_protocol_cmd - vendor specific commands supported by Qualcomm SCMI
> + * vendor protocol.
> + *
> + * This protocol is intended as a generic way of exposing a number of Qualcomm SoC
> + * specific features through a mixture of pre-determined algorithm string and param_id
> + * pairs hosted on the SCMI controller.
> + *
> + * The QCOM SCMI Vendor Protocol has the protocol id as 0x80 and vendor id set to
> + * Qualcomm and the implementation version set to 0x20000. The PROTOCOL_VERSION command
> + * returns version 1.0.
> + *
> + * @QCOM_SCMI_SET_PARAM: message_id: 0x10 is used to set the parameter of a specific algo_str
> + * hosted on QCOM SCMI Vendor Protocol. The tx len depends on the
> + * algo_str used.
> + * @QCOM_SCMI_GET_PARAM: message_id: 0x11 is used to get parameter information of a specific
> + * algo_str hosted on QCOM SCMI Vendor Protocol. The tx and rx len
> + * depends on the algo_str used.
> + * @QCOM_SCMI_START_ACTIVITY: message_id: 0x12 is used to start the activity performed by
> + * the algo_str.
> + * @QCOM_SCMI_STOP_ACTIVITY: message_id: 0x13 is used to stop a pre-existing activity
> + * performed by the algo_str.
> + */
> +enum qcom_scmi_vendor_protocol_cmd {
> + QCOM_SCMI_SET_PARAM = 0x10,
> + QCOM_SCMI_GET_PARAM = 0x11,
> + QCOM_SCMI_START_ACTIVITY = 0x12,
> + QCOM_SCMI_STOP_ACTIVITY = 0x13,
> +};
> +
> +/**
> + * struct qcom_scmi_msg - represents the various parameters to be populated
> + * for using the QCOM SCMI Vendor Protocol
> + *
> + * @ext_id: reserved, must be zero
> + * @algo_low: lower 32 bits of the algo_str
> + * @algo_high: upper 32 bits of the algo_str
> + * @param_id: serves as token message id to the specific algo_str
> + * @buf: serves as the payload to the specified param_id and algo_str pair
> + */
> +struct qcom_scmi_msg {
> + __le32 ext_id;
> + __le32 algo_low;
> + __le32 algo_high;
> + __le32 param_id;
> + __le32 buf[];
> +};
> +
> +static int qcom_scmi_set_param(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t size)
> +{
size is also length of the provided buffer *buf right ?
In that case, like I also mention below, please call it buf_len, or
something like that, and have it following *buf in the param list...
> + struct scmi_xfer *t;
> + struct qcom_scmi_msg *msg;
> + int ret;
> +
> + ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_SET_PARAM, size + sizeof(*msg), 0, &t);
> + if (ret)
> + return ret;
> +
> + msg = t->tx.buf;
> + msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
> + msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
> + msg->param_id = cpu_to_le32(param_id);
> +
> + memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
> +
> + ret = ph->xops->do_xfer(ph, t);
> + ph->xops->xfer_put(ph, t);
> +
> + return ret;
> +}
> +
> +static int qcom_scmi_get_param(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t tx_size, size_t rx_size)
> +{
Similarly...and looking at my past ramblings...this rx_size is the expected RX
size AND also the size of the provided *buf too, right ?
> + struct scmi_xfer *t;
> + struct qcom_scmi_msg *msg;
> + int ret;
> +
> + ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_GET_PARAM, tx_size + sizeof(*msg), rx_size, &t);
> + if (ret)
> + return ret;
> +
> + msg = t->tx.buf;
> + msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
> + msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
> + msg->param_id = cpu_to_le32(param_id);
> + memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
> +
> + ret = ph->xops->do_xfer(ph, t);
> + memcpy(buf, t->rx.buf, t->rx.len);
...so that this memcpy is safe since rx.len is equal to rx_size by construction
(if I read correctly my past review/mublings...)
...in that case maybe, for better clarity you could re-name the rx_size
param as buf_len and have it following *buf in the param list...
...sorry for not having spotted this naming/order niptick earlier ...
> + ph->xops->xfer_put(ph, t);
> +
> + return ret;
> +}
> +
> +static int qcom_scmi_start_activity(const struct scmi_protocol_handle *ph,
> + void *buf, u64 algo_str, u32 param_id, size_t size)
> +{
Same .. rename and reorder.
> + struct scmi_xfer *t;
> + struct qcom_scmi_msg *msg;
> + int ret;
> +
> + ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_START_ACTIVITY, size + sizeof(*msg), 0, &t);
> + if (ret)
> + return ret;
> +
> + msg = t->tx.buf;
> + msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
> + msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
> + msg->param_id = cpu_to_le32(param_id);
> +
> + memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
> +
> + ret = ph->xops->do_xfer(ph, t);
> + ph->xops->xfer_put(ph, t);
> +
> + return ret;
> +}
> +
> +static int qcom_scmi_stop_activity(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t size)
> +{
Same .. rename and reorder.
> + struct scmi_xfer *t;
> + struct qcom_scmi_msg *msg;
> + int ret;
> +
> + ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_STOP_ACTIVITY, size + sizeof(*msg), 0, &t);
> + if (ret)
> + return ret;
> +
> + msg = t->tx.buf;
> + msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
> + msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
> + msg->param_id = cpu_to_le32(param_id);
> +
> + memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
> +
> + ret = ph->xops->do_xfer(ph, t);
> + ph->xops->xfer_put(ph, t);
> +
> + return ret;
> +}
> +
> +static struct qcom_scmi_vendor_ops qcom_proto_ops = {
> + .set_param = qcom_scmi_set_param,
> + .get_param = qcom_scmi_get_param,
> + .start_activity = qcom_scmi_start_activity,
> + .stop_activity = qcom_scmi_stop_activity,
> +};
> +
> +static int qcom_scmi_vendor_protocol_init(const struct scmi_protocol_handle *ph)
> +{
> + u32 version;
> +
> + ph->xops->version_get(ph, &version);
> +
> + dev_dbg(ph->dev, "SCMI QCOM Vendor Version %d.%d\n",
> + PROTOCOL_REV_MAJOR(version), PROTOCOL_REV_MINOR(version));
> +
> + return 0;
> +}
> +
> +static const struct scmi_protocol qcom_scmi_vendor = {
> + .id = QCOM_SCMI_VENDOR_PROTOCOL,
> + .owner = THIS_MODULE,
> + .instance_init = &qcom_scmi_vendor_protocol_init,
> + .ops = &qcom_proto_ops,
> + .vendor_id = "Qualcomm",
> + .impl_ver = 0x20000,
> +};
> +module_scmi_protocol(qcom_scmi_vendor);
> +
> +MODULE_DESCRIPTION("QTI SCMI vendor protocol");
> +MODULE_LICENSE("GPL");
> diff --git a/include/linux/qcom_scmi_vendor.h b/include/linux/qcom_scmi_vendor.h
> new file mode 100644
> index 000000000000..60f85fedee80
> --- /dev/null
> +++ b/include/linux/qcom_scmi_vendor.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * QTI SCMI vendor protocol's header
> + *
> + * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _QCOM_SCMI_VENDOR_H
> +#define _QCOM_SCMI_VENDOR_H
> +
> +#include <linux/bitfield.h>
> +#include <linux/device.h>
> +#include <linux/types.h>
> +
> +#define QCOM_SCMI_VENDOR_PROTOCOL 0x80
> +
> +struct scmi_protocol_handle;
> +
> +/**
> + * struct qcom_scmi_vendor_ops - represents the various operations provided
> + * by QCOM SCMI Vendor Protocol
> + *
> + * @set_param: set parameter specified by param_id and algo_str pair.
> + * @get_param: retrieve parameter specified by param_id and algo_str pair.
> + * @start_activity: initiate a specific activity defined by algo_str.
> + * @stop_activity: halt previously initiated activity defined by algo_str.
> + */
> +struct qcom_scmi_vendor_ops {
> + int (*set_param)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t size);
> + int (*get_param)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t tx_size, size_t rx_size);
> + int (*start_activity)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t size);
> + int (*stop_activity)(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
> + u32 param_id, size_t size);
> +};
> +
> +#endif /* _QCOM_SCMI_VENDOR_H */
...beside the above nitpicks...
LGTM,
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Thanks,
Cristian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-02 19:14 ` [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor Sibi Sankar
2024-07-03 8:44 ` Shivnandan Kumar
2024-07-09 10:51 ` Konrad Dybcio
@ 2024-07-10 11:36 ` Cristian Marussi
2024-10-07 7:33 ` Sibi Sankar
2 siblings, 1 reply; 17+ messages in thread
From: Cristian Marussi @ 2024-07-10 11:36 UTC (permalink / raw)
To: Sibi Sankar
Cc: sudeep.holla, cristian.marussi, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt, linux-kernel, linux-arm-msm, devicetree,
linux-arm-kernel, quic_rgottimu, quic_kshivnan, conor+dt,
Amir Vajid
On Wed, Jul 03, 2024 at 12:44:39AM +0530, Sibi Sankar wrote:
> Introduce a client driver that uses the memlat algorithm string hosted
> on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
> control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>
> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
> Co-developed-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Amir Vajid <avajid@quicinc.com>
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
>
Hi Sibi,
just a few remarks down below...
> V2:
> * Make driver changes to the accommodate bindings changes. [Rob]
> * Replace explicit of_node_put with _free.
>
> drivers/soc/qcom/Kconfig | 12 +
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/qcom_scmi_client.c | 583 ++++++++++++++++++++++++++++
> 3 files changed, 596 insertions(+)
> create mode 100644 drivers/soc/qcom/qcom_scmi_client.c
>
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index 432c85bd8ad4..b253504bd386 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -294,4 +294,16 @@ config QCOM_PBS
> This module provides the APIs to the client drivers that wants to send the
> PBS trigger event to the PBS RAM.
>
> +config QCOM_SCMI_CLIENT
> + tristate "Qualcomm Technologies Inc. SCMI client driver"
> + depends on ARM_SCMI_PROTOCOL_VENDOR_QCOM || COMPILE_TEST
> + default n
I think default is N by..ehm.. default...so it is implicit and not needed to be specified
> + help
> + This driver uses the memlat algorithm string hosted on QCOM SCMI
> + Vendor Protocol to detect memory latency workloads and control
> + frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
> +
> + This driver defines/documents the parameter IDs used while configuring
> + the memory buses.
> +
> endmenu
> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
> index d3560f861085..8a2e832d1d5d 100644
> --- a/drivers/soc/qcom/Makefile
> +++ b/drivers/soc/qcom/Makefile
> @@ -35,6 +35,7 @@ obj-$(CONFIG_QCOM_APR) += apr.o
> obj-$(CONFIG_QCOM_LLCC) += llcc-qcom.o
> obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) += kryo-l2-accessors.o
> obj-$(CONFIG_QCOM_ICC_BWMON) += icc-bwmon.o
> +obj-$(CONFIG_QCOM_SCMI_CLIENT) += qcom_scmi_client.o
> qcom_ice-objs += ice.o
> obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
> obj-$(CONFIG_QCOM_PBS) += qcom-pbs.o
> diff --git a/drivers/soc/qcom/qcom_scmi_client.c b/drivers/soc/qcom/qcom_scmi_client.c
> new file mode 100644
> index 000000000000..8369b415c0ab
> --- /dev/null
> +++ b/drivers/soc/qcom/qcom_scmi_client.c
> @@ -0,0 +1,583 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/cpu.h>
> +#include <linux/err.h>
> +#include <linux/errno.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/qcom_scmi_vendor.h>
> +#include <linux/scmi_protocol.h>
> +#include <linux/units.h>
> +#include <dt-bindings/soc/qcom,scmi-vendor.h>
> +
> +#define MEMLAT_ALGO_STR 0x4D454D4C4154 /* MEMLAT */
> +#define INVALID_IDX 0xff
> +#define MAX_MEMORY_TYPES 3
> +#define MAX_MONITOR_CNT 4
> +#define MAX_NAME_LEN 20
> +#define MAX_MAP_ENTRIES 7
> +#define CPUCP_DEFAULT_SAMPLING_PERIOD_MS 4
> +#define CPUCP_DEFAULT_FREQ_METHOD 1
> +
> +/**
> + * scmi_memlat_protocol_cmd - parameter_ids supported by the "MEMLAT" algo_str hosted
> + * by the Qualcomm SCMI Vendor Protocol on the SCMI controller.
> + *
> + * MEMLAT (Memory Latency) monitors the counters to detect memory latency bound workloads
> + * and scales the frequency/levels of the memory buses accordingly.
> + *
> + * @MEMLAT_SET_MEM_GROUP: initializes the frequency/level scaling functions for the memory bus.
> + * @MEMLAT_SET_MONITOR: configures the monitor to work on a specific memory bus.
> + * @MEMLAT_SET_COMMON_EV_MAP: set up common counters used to monitor the cpu frequency.
> + * @MEMLAT_SET_GRP_EV_MAP: set up any specific counters used to monitor the memory bus.
> + * @MEMLAT_IPM_CEIL: set the IPM (Instruction Per Misses) ceiling per monitor.
> + * @MEMLAT_SAMPLE_MS: set the sampling period for all the monitors.
> + * MEMLAT_MON_FREQ_MAP: setup the cpufreq to memfreq map.
> + * MEMLAT_SET_MIN_FREQ: set the max frequency of the memory bus.
> + * MEMLAT_SET_MAX_FREQ: set the min frequency of the memory bus.
> + * MEMLAT_START_TIMER: start all the monitors with the requested sampling period.
> + * MEMLAT_START_TIMER: stop all the running monitors.
> + * MEMLAT_SET_EFFECTIVE_FREQ_METHOD: set the method used to determine cpu frequency.
> + */
> +enum scmi_memlat_protocol_cmd {
> + MEMLAT_SET_MEM_GROUP = 16,
> + MEMLAT_SET_MONITOR,
> + MEMLAT_SET_COMMON_EV_MAP,
> + MEMLAT_SET_GRP_EV_MAP,
> + MEMLAT_IPM_CEIL = 23,
> + MEMLAT_SAMPLE_MS = 31,
> + MEMLAT_MON_FREQ_MAP,
> + MEMLAT_SET_MIN_FREQ,
> + MEMLAT_SET_MAX_FREQ,
> + MEMLAT_START_TIMER = 36,
> + MEMLAT_STOP_TIMER,
> + MEMLAT_SET_EFFECTIVE_FREQ_METHOD = 39,
> +};
> +
> +struct map_table {
> + u16 v1;
> + u16 v2;
> +};
> +
> +struct map_param_msg {
> + u32 hw_type;
> + u32 mon_idx;
> + u32 nr_rows;
> + struct map_table tbl[MAX_MAP_ENTRIES];
> +} __packed;
> +
> +struct node_msg {
> + u32 cpumask;
> + u32 hw_type;
> + u32 mon_type;
> + u32 mon_idx;
> + char mon_name[MAX_NAME_LEN];
> +};
> +
> +struct scalar_param_msg {
> + u32 hw_type;
> + u32 mon_idx;
> + u32 val;
> +};
...just reiterating that all of these message payload descriptors
completely ignore endianity....but if it is fine for you I wont complain :D
> +
> +enum common_ev_idx {
> + INST_IDX,
> + CYC_IDX,
> + CONST_CYC_IDX,
> + FE_STALL_IDX,
> + BE_STALL_IDX,
> + NUM_COMMON_EVS
> +};
> +
> +enum grp_ev_idx {
> + MISS_IDX,
> + WB_IDX,
> + ACC_IDX,
> + NUM_GRP_EVS
> +};
> +
> +#define EV_CPU_CYCLES 0
> +#define EV_INST_RETIRED 2
> +#define EV_L2_D_RFILL 5
> +
> +struct ev_map_msg {
> + u32 num_evs;
> + u32 hw_type;
> + u32 cid[NUM_COMMON_EVS];
> +};
> +
> +struct cpufreq_memfreq_map {
> + unsigned int cpufreq_mhz;
> + unsigned int memfreq_khz;
> +};
> +
> +struct scmi_monitor_info {
> + struct cpufreq_memfreq_map *freq_map;
> + char mon_name[MAX_NAME_LEN];
> + u32 mon_idx;
> + u32 mon_type;
> + u32 ipm_ceil;
> + u32 mask;
> + u32 freq_map_len;
> +};
> +
> +struct scmi_memory_info {
> + struct scmi_monitor_info *monitor[MAX_MONITOR_CNT];
> + u32 hw_type;
> + int monitor_cnt;
> + u32 min_freq;
> + u32 max_freq;
> +};
> +
> +struct scmi_memlat_info {
> + struct scmi_protocol_handle *ph;
> + const struct qcom_scmi_vendor_ops *ops;
> + struct scmi_memory_info *memory[MAX_MEMORY_TYPES];
> + u32 cluster_info[NR_CPUS];
> + int memory_cnt;
> +};
> +
> +static int populate_cluster_info(u32 *cluster_info)
> +{
> + char name[MAX_NAME_LEN];
> + int i = 0;
> +
> + struct device_node *cn __free(device_node) = of_find_node_by_path("/cpus");
> + if (!cn)
> + return -ENODEV;
> +
> + struct device_node *map __free(device_node) = of_get_child_by_name(cn, "cpu-map");
> + if (!map)
> + return -ENODEV;
> +
> + do {
> + snprintf(name, sizeof(name), "cluster%d", i);
> + struct device_node *c __free(device_node) = of_get_child_by_name(map, name);
> + if (!c)
> + break;
> +
> + *(cluster_info + i) = of_get_child_count(c);
> + i++;
> + } while (1);
> +
> + return 0;
> +}
> +
> +static int populate_physical_mask(struct device_node *np, u32 *mask, u32 *cluster_info)
> +{
> + struct device_node *dev_phandle;
> + int cpu, i = 0, physical_id;
> +
> + do {
> + dev_phandle = of_parse_phandle(np, "cpus", i++);
Shouldn't dev_phandle ...
> + cpu = of_cpu_node_to_id(dev_phandle);
and cpu be of_put somewhere in this loop ? (explicitly or by __free magic)
> + if (cpu != -ENODEV) {
> + physical_id = topology_core_id(cpu);
> + for (int j = 0; j < topology_cluster_id(cpu); j++)
> + physical_id += *(cluster_info + j);
> + *mask |= BIT(physical_id);
> + }
> + } while (dev_phandle);
> +
> + return 0;
> +}
> +
> +static struct cpufreq_memfreq_map *init_cpufreq_memfreq_map(struct device *dev,
> + struct scmi_memory_info *memory,
> + struct device_node *of_node,
> + u32 *cnt)
> +{
> + struct device_node *tbl_np, *opp_np;
> + struct cpufreq_memfreq_map *tbl;
> + int ret, i = 0;
> + u32 level, len;
> + u64 rate;
> +
> + tbl_np = of_parse_phandle(of_node, "operating-points-v2", 0);
> + if (!tbl_np)
> + return ERR_PTR(-ENODEV);
> +
> + len = min(of_get_available_child_count(tbl_np), MAX_MAP_ENTRIES);
> + if (len == 0)
> + return ERR_PTR(-ENODEV);
of_put or __free magic ?
> +
> + tbl = devm_kzalloc(dev, (len + 1) * sizeof(struct cpufreq_memfreq_map),
> + GFP_KERNEL);
> + if (!tbl)
of_put or __free magic ?
> + return ERR_PTR(-ENOMEM);
> +
> + for_each_available_child_of_node(tbl_np, opp_np) {
> + ret = of_property_read_u64_index(opp_np, "opp-hz", 0, &rate);
> + if (ret < 0)
same
> + return ERR_PTR(ret);
> +
> + tbl[i].cpufreq_mhz = rate / HZ_PER_MHZ;
> +
> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
> + ret = of_property_read_u64_index(opp_np, "opp-hz", 1, &rate);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].memfreq_khz = rate / HZ_PER_KHZ;
> + } else {
> + ret = of_property_read_u32(opp_np, "opp-level", &level);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + tbl[i].memfreq_khz = level;
> + }
> +
> + dev_dbg(dev, "Entry%d CPU:%u, Mem:%u\n", i, tbl[i].cpufreq_mhz, tbl[i].memfreq_khz);
> + i++;
> + }
> + *cnt = len;
> + tbl[i].cpufreq_mhz = 0;
> +
> + return tbl;
> +}
> +
> +static int process_scmi_memlat_of_node(struct scmi_device *sdev, struct scmi_memlat_info *info)
> +{
> + struct scmi_monitor_info *monitor;
> + struct scmi_memory_info *memory;
> + char name[MAX_NAME_LEN];
> + u64 memfreq[2];
> + int ret;
> +
> + ret = populate_cluster_info(info->cluster_info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to populate cluster info\n");
> + goto err;
> + }
> +
> + of_node_get(sdev->handle->dev->of_node);
From your DT bindings it seems to me that all of your vendor DT props
are (correctly) nested inside your own protocol@80 node right ?
So why you are getting and walking the entire firmware/scmi top node
with sdev->handle->dev->of_node ?
...it should be enough to get and walk sdev->dev.of_node which is the
device built around your protocol@80 node ...
> + do {
> + snprintf(name, sizeof(name), "memory-%d", info->memory_cnt);
> + struct device_node *memory_np __free(device_node) =
> + of_find_node_by_name(sdev->handle->dev->of_node, name);
same here
of_find_by_name(sdev->dev.of_node);
... am I missing something ?
> +
> + if (!memory_np)
> + break;
> +
> + memory = devm_kzalloc(&sdev->dev, sizeof(*memory), GFP_KERNEL);
> + if (!memory) {
> + ret = -ENOMEM;
> + goto err;
> + }
> +
> + ret = of_property_read_u32(memory_np, "qcom,memory-type", &memory->hw_type);
> + if (ret) {
> + dev_err_probe(&sdev->dev, ret, "failed to read memory type\n");
> + goto err;
> + }
> +
> + ret = of_property_read_u64_array(memory_np, "freq-table-hz", memfreq, 2);
> + if (ret && (ret != -EINVAL)) {
> + dev_err_probe(&sdev->dev, ret, "failed to read min/max freq\n");
> + goto err;
> + }
> +
> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
> + memory->min_freq = memfreq[0] / HZ_PER_KHZ;
> + memory->max_freq = memfreq[1] / HZ_PER_KHZ;
> + } else {
> + memory->min_freq = memfreq[0];
> + memory->max_freq = memfreq[1];
> + }
> + info->memory[info->memory_cnt++] = memory;
> +
> + do {
> + snprintf(name, sizeof(name), "monitor-%d", memory->monitor_cnt);
> + struct device_node *monitor_np __free(device_node) =
> + of_get_child_by_name(memory_np, name);
> +
> + if (!monitor_np)
> + break;
> +
> + monitor = devm_kzalloc(&sdev->dev, sizeof(*monitor), GFP_KERNEL);
> + if (!monitor) {
> + ret = -ENOMEM;
> + goto err;
> + }
> +
> + monitor->mon_type = of_property_read_bool(monitor_np, "qcom,compute-type");
> + if (!monitor->mon_type) {
> + ret = of_property_read_u32(monitor_np, "qcom,ipm-ceil",
> + &monitor->ipm_ceil);
> + if (ret) {
> + dev_err_probe(&sdev->dev, ret,
> + "failed to read IPM ceiling\n");
> + goto err;
> + }
> + }
> +
> + /*
> + * Variants of the SoC having reduced number of cpus operate
> + * with the same number of logical cpus but the physical
> + * cpu disabled will differ between parts. Calculate the
> + * physical cpu number using cluster information instead.
> + */
> + ret = populate_physical_mask(monitor_np, &monitor->mask,
> + info->cluster_info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to populate cpu mask\n");
> + goto err;
> + }
> +
> + monitor->freq_map = init_cpufreq_memfreq_map(&sdev->dev, memory, monitor_np,
> + &monitor->freq_map_len);
> + if (IS_ERR(monitor->freq_map)) {
> + dev_err_probe(&sdev->dev, PTR_ERR(monitor->freq_map),
> + "failed to populate cpufreq-memfreq map\n");
> + goto err;
> + }
> +
> + strscpy(monitor->mon_name, name, sizeof(monitor->mon_name));
> + monitor->mon_idx = memory->monitor_cnt;
> +
> + memory->monitor[memory->monitor_cnt++] = monitor;
> + } while (1);
> +
> + if (!memory->monitor_cnt) {
> + ret = -EINVAL;
> + dev_err_probe(&sdev->dev, ret, "failed to find monitor nodes\n");
> + goto err;
> + }
> + } while (1);
> +
> + if (!info->memory_cnt) {
> + ret = -EINVAL;
> + dev_err_probe(&sdev->dev, ret, "failed to find memory nodes\n");
> + }
> +
> +err:
> + of_node_put(sdev->handle->dev->of_node);
> +
> + return ret;
> +}
> +
> +static int configure_cpucp_common_events(struct scmi_memlat_info *info)
> +{
> + const struct qcom_scmi_vendor_ops *ops = info->ops;
> + u8 ev_map[NUM_COMMON_EVS];
> + struct ev_map_msg msg;
> + int ret;
> +
> + memset(ev_map, 0xFF, NUM_COMMON_EVS);
> +
> + msg.num_evs = NUM_COMMON_EVS;
> + msg.hw_type = INVALID_IDX;
> + msg.cid[INST_IDX] = EV_INST_RETIRED;
> + msg.cid[CYC_IDX] = EV_CPU_CYCLES;
> + msg.cid[CONST_CYC_IDX] = INVALID_IDX;
> + msg.cid[FE_STALL_IDX] = INVALID_IDX;
> + msg.cid[BE_STALL_IDX] = INVALID_IDX;
> +
> + ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_COMMON_EV_MAP,
> + sizeof(msg));
return ops->set_param(...
> + return ret;
> +}
> +
> +static int configure_cpucp_grp(struct device *dev, struct scmi_memlat_info *info, int memory_index)
> +{
> + const struct qcom_scmi_vendor_ops *ops = info->ops;
> + struct scmi_memory_info *memory = info->memory[memory_index];
> + struct ev_map_msg ev_msg;
> + u8 ev_map[NUM_GRP_EVS];
> + struct node_msg msg;
> + int ret;
> +
> + msg.cpumask = 0;
> + msg.hw_type = memory->hw_type;
> + msg.mon_type = 0;
> + msg.mon_idx = 0;
> + ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_MEM_GROUP, sizeof(msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to configure mem type %d\n", memory->hw_type);
> + return ret;
return dev_err_probe(...
> + }
> +
> + memset(ev_map, 0xFF, NUM_GRP_EVS);
> + ev_msg.num_evs = NUM_GRP_EVS;
> + ev_msg.hw_type = memory->hw_type;
> + ev_msg.cid[MISS_IDX] = EV_L2_D_RFILL;
> + ev_msg.cid[WB_IDX] = INVALID_IDX;
> + ev_msg.cid[ACC_IDX] = INVALID_IDX;
> + ret = ops->set_param(info->ph, &ev_msg, MEMLAT_ALGO_STR, MEMLAT_SET_GRP_EV_MAP,
> + sizeof(ev_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret,
> + "failed to configure event map for mem type %d\n", memory->hw_type);
> + return ret;
return dev_err_probe(...
> + }
> +
> + return ret;
> +}
> +
> +static int configure_cpucp_mon(struct device *dev, struct scmi_memlat_info *info,
> + int memory_index, int monitor_index)
> +{
> + const struct qcom_scmi_vendor_ops *ops = info->ops;
> + struct scmi_memory_info *memory = info->memory[memory_index];
> + struct scmi_monitor_info *monitor = memory->monitor[monitor_index];
> + struct scalar_param_msg scalar_msg;
> + struct map_param_msg map_msg;
> + struct node_msg msg;
> + int ret;
> + int i;
> +
> + msg.cpumask = monitor->mask;
> + msg.hw_type = memory->hw_type;
> + msg.mon_type = monitor->mon_type;
> + msg.mon_idx = monitor->mon_idx;
> + strscpy(msg.mon_name, monitor->mon_name, sizeof(msg.mon_name));
> + ret = ops->set_param(info->ph, &msg, MEMLAT_ALGO_STR, MEMLAT_SET_MONITOR, sizeof(msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to configure monitor %s\n", monitor->mon_name);
> + return ret;
return dev_err_probe(...
> + }
> +
> + scalar_msg.hw_type = memory->hw_type;
> + scalar_msg.mon_idx = monitor->mon_idx;
> + scalar_msg.val = monitor->ipm_ceil;
> + ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_IPM_CEIL,
> + sizeof(scalar_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to set ipm ceil for %s\n", monitor->mon_name);
> + return ret;
return dev_err_probe(...
> + }
> +
> + map_msg.hw_type = memory->hw_type;
> + map_msg.mon_idx = monitor->mon_idx;
> + map_msg.nr_rows = monitor->freq_map_len;
> + for (i = 0; i < monitor->freq_map_len; i++) {
> + map_msg.tbl[i].v1 = monitor->freq_map[i].cpufreq_mhz;
> + map_msg.tbl[i].v2 = monitor->freq_map[i].memfreq_khz;
> + }
> + ret = ops->set_param(info->ph, &map_msg, MEMLAT_ALGO_STR, MEMLAT_MON_FREQ_MAP,
> + sizeof(map_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to configure freq_map for %s\n", monitor->mon_name);
> + return ret;
return dev_err_probe(...
> + }
> +
> + scalar_msg.hw_type = memory->hw_type;
> + scalar_msg.mon_idx = monitor->mon_idx;
> + scalar_msg.val = memory->min_freq;
> + ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_SET_MIN_FREQ,
> + sizeof(scalar_msg));
> + if (ret < 0) {
> + dev_err_probe(dev, ret, "failed to set min_freq for %s\n", monitor->mon_name);
> + return ret;
return dev_err_probe(...
> + }
> +
> + scalar_msg.hw_type = memory->hw_type;
> + scalar_msg.mon_idx = monitor->mon_idx;
> + scalar_msg.val = memory->max_freq;
> + ret = ops->set_param(info->ph, &scalar_msg, MEMLAT_ALGO_STR, MEMLAT_SET_MAX_FREQ,
> + sizeof(scalar_msg));
> + if (ret < 0)
> + dev_err_probe(dev, ret, "failed to set max_freq for %s\n", monitor->mon_name);
> +
> + return ret;
> +}
> +
> +static int cpucp_memlat_init(struct scmi_device *sdev)
> +{
> + const struct scmi_handle *handle = sdev->handle;
> + const struct qcom_scmi_vendor_ops *ops;
> + struct scmi_protocol_handle *ph;
> + struct scmi_memlat_info *info;
> + u32 cpucp_freq_method = CPUCP_DEFAULT_FREQ_METHOD;
> + u32 cpucp_sample_ms = CPUCP_DEFAULT_SAMPLING_PERIOD_MS;
> + int ret, i, j;
> +
> + if (!handle)
> + return -ENODEV;
> +
> + ops = handle->devm_protocol_get(sdev, QCOM_SCMI_VENDOR_PROTOCOL, &ph);
> + if (IS_ERR(ops))
> + return PTR_ERR(ops);
> +
> + info = devm_kzalloc(&sdev->dev, sizeof(*info), GFP_KERNEL);
> + if (!info)
> + return -ENOMEM;
> +
> + ret = process_scmi_memlat_of_node(sdev, info);
> + if (ret)
> + return ret;
> +
> + info->ph = ph;
> + info->ops = ops;
> +
> + /* Configure common events ids */
> + ret = configure_cpucp_common_events(info);
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to configure common events\n");
> + return ret;
return dev_err_probe(...
> + }
> +
> + for (i = 0; i < info->memory_cnt; i++) {
> + /* Configure per group parameters */
> + ret = configure_cpucp_grp(&sdev->dev, info, i);
> + if (ret < 0)
> + return ret;
> +
> + for (j = 0; j < info->memory[i]->monitor_cnt; j++) {
> + /* Configure per monitor parameters */
> + ret = configure_cpucp_mon(&sdev->dev, info, i, j);
> + if (ret < 0)
> + return ret;
> + }
> + }
> +
> + /* Set loop sampling time */
> + ret = ops->set_param(ph, &cpucp_sample_ms, MEMLAT_ALGO_STR, MEMLAT_SAMPLE_MS,
> + sizeof(cpucp_sample_ms));
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to set sample_ms\n");
> + return ret;
return dev_err_probe(...
> + }
> +
> + /* Set the effective cpu frequency calculation method */
> + ret = ops->set_param(ph, &cpucp_freq_method, MEMLAT_ALGO_STR,
> + MEMLAT_SET_EFFECTIVE_FREQ_METHOD, sizeof(cpucp_freq_method));
> + if (ret < 0) {
> + dev_err_probe(&sdev->dev, ret, "failed to set effective frequency calc method\n");
> + return ret;
return dev_err_probe(...
> + }
> +
> + /* Start sampling and voting timer */
> + ret = ops->start_activity(ph, NULL, MEMLAT_ALGO_STR, MEMLAT_START_TIMER, 0);
> + if (ret < 0)
> + dev_err_probe(&sdev->dev, ret, "failed to start memory group timer\n");
> +
> + return ret;
> +}
> +
> +static int scmi_client_probe(struct scmi_device *sdev)
> +{
> + return cpucp_memlat_init(sdev);
> +}
> +
> +static const struct scmi_device_id scmi_id_table[] = {
> + { .protocol_id = QCOM_SCMI_VENDOR_PROTOCOL, .name = "qcom_scmi_vendor_protocol" },
> + { },
> +};
> +MODULE_DEVICE_TABLE(scmi, scmi_id_table);
> +
> +static struct scmi_driver qcom_scmi_client_drv = {
> + .name = "qcom-scmi-driver",
> + .probe = scmi_client_probe,
> + .id_table = scmi_id_table,
> +};
> +module_scmi_driver(qcom_scmi_client_drv);
> +
> +MODULE_DESCRIPTION("QTI SCMI client driver");
> +MODULE_LICENSE("GPL");
Thanks,
Cristian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM Vendor Protocol
2024-07-09 16:32 ` Cristian Marussi
@ 2024-07-11 13:31 ` Sibi Sankar
0 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-11 13:31 UTC (permalink / raw)
To: Cristian Marussi
Cc: sudeep.holla, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt, linux-kernel, linux-arm-msm, devicetree,
linux-arm-kernel, quic_rgottimu, quic_kshivnan, conor+dt
On 7/9/24 22:02, Cristian Marussi wrote:
> On Wed, Jul 03, 2024 at 12:44:37AM +0530, Sibi Sankar wrote:
>> Document the various memory buses that can be monitored and scaled by the
>> memory latency governor hosted by the ARM SCMI QCOM Vendor protocol v1.0.
>>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Hey Christian,
Thanks for taking time to review the series :)
>> ---
>
> Hi Sibi,
>
> this series got a bit neglected...my bad...a few comments down below.
>
>>
>> Adding a reg property in scmi-memlat.yaml seems incorrect/superfluous
>> but without it I see the following errors:
>>
>> Err Logs:
>> protocol@80: 'reg' does not match any of the regexes: '^memory-[0-9]$', 'pinctrl-[0-9]+'
>> protocol@80: Unevaluated properties are not allowed ('memory-0', 'memory-1', 'memory-2' were unexpected)
>>
>> v2:
>> * Drop container dvfs memlat container node. [Rob]
>> * Move scmi-memlat.yaml to protocol level given that a lot of vendors might end up
>> using the same protocol number. [Rob]
>> * Replace qcom,cpulist with the standard "cpus" property. [Rob]
>> * Fix up compute-type/ipm-ceil required. [Rob]
>>
>
> ...so there has been a lot of work around Vendor protos recently (as you
> have seen) and especially around the way we define the DT bindings to have
> multiple vendor protocols coexist with the same overlapping numbers.
> (the code-level coexistence is already in place as you've seen...)
>
> I think some sort of agreement on HOW to render this in the bindings
> side was reached around a series from NXP...not sure if I am missing something
> here but this commit from Peng/NXP (if you have not seen it already):
> https://lore.kernel.org/linux-arm-kernel/20240621-imx95-bbm-misc-v2-v5-2-b85a6bf778cb@nxp.com/
>
> ...it is a good example of how you can define your vendor specific part in
> a vendor specific binding files and then just add a single $ref line in
> the core binding Documentation/devicetree/bindings/firmware/arm,scmi.yaml
> (and that has been successfully reviewd...)
>
> Moreover, in that same series from Peng/NXP you could have a look at
> https://lore.kernel.org/linux-arm-kernel/20240621-imx95-bbm-misc-v2-v5-1-b85a6bf778cb@nxp.com/
>
> that adds the Documentation for their Vendor protocols.
> Beside the final location in the tree for such docs, which is a detail
> we can settle later on our side too, I think that patch is a good example
> of the kind of vendor-protos Documentation Sudeep is expecting.
>
>
>> .../bindings/firmware/arm,scmi.yaml | 15 ++
>> .../bindings/soc/qcom/qcom,scmi-memlat.yaml | 242 ++++++++++++++++++
>> include/dt-bindings/soc/qcom,scmi-vendor.h | 22 ++
>> 3 files changed, 279 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
>> create mode 100644 include/dt-bindings/soc/qcom,scmi-vendor.h
>>
>> diff --git a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
>> index 4d823f3b1f0e..a4022682e5ca 100644
>> --- a/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
>> +++ b/Documentation/devicetree/bindings/firmware/arm,scmi.yaml
>> @@ -284,6 +284,21 @@ properties:
>> required:
>> - reg
>>
>> + protocol@80:
>> + type: object
>> + allOf:
>> + - $ref: '#/$defs/protocol-node'
>> + - $ref: /schemas/soc/qcom/qcom,scmi-memlat.yaml#
>> +
>> + unevaluatedProperties: false
>> +
>> + properties:
>> + reg:
>> + const: 0x80
>> +
>> + required:
>> + - reg
>> +
>
> ..here you should be able to just plant your $ref without redefining the
> protocol@80
>
>> additionalProperties: false
>>
>> $defs:
>> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
>> new file mode 100644
>> index 000000000000..915a6bf5697f
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,scmi-memlat.yaml
>> @@ -0,0 +1,242 @@
>> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/soc/qcom/qcom,scmi-memlat.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Qualcomm SCMI Memory Bus nodes
>> +
>> +maintainers:
>> + - Sibi Sankar <quic_sibis@quicinc.com>
>> +
>> +description:
>> + This binding describes the various memory buses that can be monitored and scaled
>> + by memory latency governor running on the CPU Control Processor (SCMI controller).
>> +
>
> ...and instead here you will define your protocols, compliant with the
> main protocol-node def and any specific vendor sub-properties that you
> should need....
>
> ...the above example from NXP is probably more clear than any attempt of mine
> to explain this :P
ack, will adhere to the same in the next re-spin.
>
> Thanks,
> Cristian
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0
2024-07-09 10:10 ` Konrad Dybcio
@ 2024-07-11 13:33 ` Sibi Sankar
0 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-07-11 13:33 UTC (permalink / raw)
To: Konrad Dybcio, sudeep.holla, cristian.marussi, andersson, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, conor+dt, Amir Vajid
On 7/9/24 15:40, Konrad Dybcio wrote:
> On 2.07.2024 9:14 PM, Sibi Sankar wrote:
>> The ARM SCMI QCOM vendor protocol provides a generic way of exposing a
>> number of Qualcomm SoC specific features (like memory bus scaling) through
>> a mixture of pre-determined algorithm strings and param_id pairs hosted on
>> the SCMI controller.
>>
>
> [...]
Hey Konrad,
Thanks for taking time to review the series.
>
>> +/**
>> + * qcom_scmi_vendor_protocol_cmd - vendor specific commands supported by Qualcomm SCMI
>> + * vendor protocol.
>
> include the word 'enum' or:
>
> warning: cannot understand function prototype: 'enum qcom_scmi_vendor_protocol_cmd '
will fix it in the next re-spin.
-Sibi
>
> Konrad
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-09 10:51 ` Konrad Dybcio
@ 2024-10-07 7:22 ` Sibi Sankar
0 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-10-07 7:22 UTC (permalink / raw)
To: Konrad Dybcio, sudeep.holla, cristian.marussi, andersson, robh+dt,
krzysztof.kozlowski+dt
Cc: linux-kernel, linux-arm-msm, devicetree, linux-arm-kernel,
quic_rgottimu, quic_kshivnan, conor+dt, Amir Vajid
On 7/9/24 16:21, Konrad Dybcio wrote:
> On 2.07.2024 9:14 PM, Sibi Sankar wrote:
>> Introduce a client driver that uses the memlat algorithm string hosted
>> on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
>> control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>>
>> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Co-developed-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
>
> [...]
>
>> +/**
>> + * scmi_memlat_protocol_cmd - parameter_ids supported by the "MEMLAT" algo_str hosted
>> + * by the Qualcomm SCMI Vendor Protocol on the SCMI controller.
>
> 'enum scmi_mem..'
>
>> +static int populate_cluster_info(u32 *cluster_info)
>> +{
>> + char name[MAX_NAME_LEN];
>> + int i = 0;
>> +
>> + struct device_node *cn __free(device_node) = of_find_node_by_path("/cpus");
>> + if (!cn)
>> + return -ENODEV;
>> +
>> + struct device_node *map __free(device_node) = of_get_child_by_name(cn, "cpu-map");
>> + if (!map)
>> + return -ENODEV;
>> +
>> + do {
>> + snprintf(name, sizeof(name), "cluster%d", i);
>> + struct device_node *c __free(device_node) = of_get_child_by_name(map, name);
>> + if (!c)
>> + break;
>> +
>> + *(cluster_info + i) = of_get_child_count(c);
>> + i++;
>> + } while (1);
>
> of_cpu_device_node_get(0) + of_get_next_cpu_node() +
> of_get_cpu_hwid() & MPIDR_EL1.Aff2 [1]
>
> [...]
>
>> +static struct cpufreq_memfreq_map *init_cpufreq_memfreq_map(struct device *dev,
>> + struct scmi_memory_info *memory,
>> + struct device_node *of_node,
>> + u32 *cnt)
>> +{
>> + struct device_node *tbl_np, *opp_np;
>> + struct cpufreq_memfreq_map *tbl;
>> + int ret, i = 0;
>> + u32 level, len;
>> + u64 rate;
>> +
>> + tbl_np = of_parse_phandle(of_node, "operating-points-v2", 0);
>> + if (!tbl_np)
>> + return ERR_PTR(-ENODEV);
>> +
>> + len = min(of_get_available_child_count(tbl_np), MAX_MAP_ENTRIES);
>> + if (len == 0)
>> + return ERR_PTR(-ENODEV);
>> +
>> + tbl = devm_kzalloc(dev, (len + 1) * sizeof(struct cpufreq_memfreq_map),
>> + GFP_KERNEL);
>> + if (!tbl)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + for_each_available_child_of_node(tbl_np, opp_np) {
>> + ret = of_property_read_u64_index(opp_np, "opp-hz", 0, &rate);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + tbl[i].cpufreq_mhz = rate / HZ_PER_MHZ;
>> +
>> + if (memory->hw_type != QCOM_MEM_TYPE_DDR_QOS) {
>> + ret = of_property_read_u64_index(opp_np, "opp-hz", 1, &rate);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + tbl[i].memfreq_khz = rate / HZ_PER_KHZ;
>> + } else {
>> + ret = of_property_read_u32(opp_np, "opp-level", &level);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + tbl[i].memfreq_khz = level;
>> + }
>> +
>> + dev_dbg(dev, "Entry%d CPU:%u, Mem:%u\n", i, tbl[i].cpufreq_mhz, tbl[i].memfreq_khz);
>> + i++;
>> + }
>> + *cnt = len;
>> + tbl[i].cpufreq_mhz = 0;
>
> missing of_node_put, or even better __free(device_node)
>
> [...]
>
>> + /*
>> + * Variants of the SoC having reduced number of cpus operate
>> + * with the same number of logical cpus but the physical
>> + * cpu disabled will differ between parts. Calculate the
>> + * physical cpu number using cluster information instead.
>> + */
>> + ret = populate_physical_mask(monitor_np, &monitor->mask,
>> + info->cluster_info);
>> + if (ret < 0) {
>> + dev_err_probe(&sdev->dev, ret, "failed to populate cpu mask\n");
>> + goto err;
>> + }
>
> err.. the same number of logical CPUs? as in, PSCI will happily report that
> the inexistent cores have been booted? or some cores start doing some sort
> of hyperthreading to make up for the missing ones? this sounds sketchy..
Hey Konrad,
The problem we are describing here is about a scenario where the
bootloaders disable faulty cpus after reading fuse values by fixing
up the cpu node in the device tree by marking them as "failed". In
this case the logical cpu number of the cpus that booted up never
align with the physical numbers. Hence relying on any logical
number for this calculation doesn't work. Will address all your
other comments in v4.
-Sibi
>
> Konrad
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0
2024-07-09 17:52 ` Cristian Marussi
@ 2024-10-07 7:29 ` Sibi Sankar
0 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-10-07 7:29 UTC (permalink / raw)
To: Cristian Marussi
Cc: sudeep.holla, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt, linux-kernel, linux-arm-msm, devicetree,
linux-arm-kernel, quic_rgottimu, quic_kshivnan, conor+dt,
Amir Vajid
On 7/9/24 23:22, Cristian Marussi wrote:
> On Wed, Jul 03, 2024 at 12:44:38AM +0530, Sibi Sankar wrote:
>> The ARM SCMI QCOM vendor protocol provides a generic way of exposing a
>> number of Qualcomm SoC specific features (like memory bus scaling) through
>> a mixture of pre-determined algorithm strings and param_id pairs hosted on
>> the SCMI controller.
>>
>
> Hi Sibi,
>
>> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Co-developed-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
>> drivers/firmware/arm_scmi/vendors/Kconfig | 12 ++
>> drivers/firmware/arm_scmi/vendors/Makefile | 2 +-
>> .../arm_scmi/vendors/qcom_scmi_vendor.c | 184 ++++++++++++++++++
>> include/linux/qcom_scmi_vendor.h | 39 ++++
>> 4 files changed, 236 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/firmware/arm_scmi/vendors/qcom_scmi_vendor.c
>> create mode 100644 include/linux/qcom_scmi_vendor.h
>>
>> diff --git a/drivers/firmware/arm_scmi/vendors/Kconfig b/drivers/firmware/arm_scmi/vendors/Kconfig
>> index 7c1ca7a12603..6bff4550fa25 100644
>> --- a/drivers/firmware/arm_scmi/vendors/Kconfig
>> +++ b/drivers/firmware/arm_scmi/vendors/Kconfig
>> @@ -1,4 +1,16 @@
>> # SPDX-License-Identifier: GPL-2.0-only
>> menu "ARM SCMI Vendor Protocols"
>>
>> +config ARM_SCMI_PROTOCOL_VENDOR_QCOM
>> + tristate "Qualcomm Technologies, Inc. Qcom SCMI vendor Protocol"
>> + depends on ARM_SCMI_PROTOCOL || COMPILE_TEST
>> + help
>> + The SCMI QCOM vendor protocol provides a generic way of exposing a
>> + number of Qualcomm SoC specific features (like memory bus scaling)
>> + through a mixture of pre-determined algorithm strings and param_id
>> + pairs hosted on the SCMI controller.
>> +
>> + This driver defines/documents the message ID's used for this
>> + communication and also exposes the ops used by the clients.
>
> operations
>
>> +
>> endmenu
>> diff --git a/drivers/firmware/arm_scmi/vendors/Makefile b/drivers/firmware/arm_scmi/vendors/Makefile
>> index c6c214158dd8..c1d6a355f579 100644
>> --- a/drivers/firmware/arm_scmi/vendors/Makefile
[...]
>> +++ b/drivers/firmware/arm_scmi/vendors/Makefile
>> + if (ret)
>> + return ret;
>> +
>> + msg = t->tx.buf;
>> + msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
>> + msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
>> + msg->param_id = cpu_to_le32(param_id);
>> +
>> + memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
>> +
>> + ret = ph->xops->do_xfer(ph, t);
>> + ph->xops->xfer_put(ph, t);
>> +
>> + return ret;
>> +}
>> +
>> +static int qcom_scmi_get_param(const struct scmi_protocol_handle *ph, void *buf, u64 algo_str,
>> + u32 param_id, size_t tx_size, size_t rx_size)
>> +{
>
> Similarly...and looking at my past ramblings...this rx_size is the expected RX
> size AND also the size of the provided *buf too, right ?
>
>> + struct scmi_xfer *t;
>> + struct qcom_scmi_msg *msg;
>> + int ret;
>> +
>> + ret = ph->xops->xfer_get_init(ph, QCOM_SCMI_GET_PARAM, tx_size + sizeof(*msg), rx_size, &t);
>> + if (ret)
>> + return ret;
>> +
>> + msg = t->tx.buf;
>> + msg->algo_low = cpu_to_le32(lower_32_bits(algo_str));
>> + msg->algo_high = cpu_to_le32(upper_32_bits(algo_str));
>> + msg->param_id = cpu_to_le32(param_id);
>> + memcpy(msg->buf, buf, t->tx.len - sizeof(*msg));
>> +
>> + ret = ph->xops->do_xfer(ph, t);
>> + memcpy(buf, t->rx.buf, t->rx.len);
>
> ...so that this memcpy is safe since rx.len is equal to rx_size by construction
> (if I read correctly my past review/mublings...)
>
> ...in that case maybe, for better clarity you could re-name the rx_size
> param as buf_len and have it following *buf in the param list...
>
>
> ...sorry for not having spotted this naming/order niptick earlier ...
the only caveat being rx_size can be lower than the tx_size
and we dont't want to copy more that what we expect. Addressed
all your other comments from the series in V4. Sry it took a
while due to its dependency with scmi perf changes. Thanks
again.
-Sibi
[...]
>
> Thanks,
> Cristian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
2024-07-10 11:36 ` Cristian Marussi
@ 2024-10-07 7:33 ` Sibi Sankar
0 siblings, 0 replies; 17+ messages in thread
From: Sibi Sankar @ 2024-10-07 7:33 UTC (permalink / raw)
To: Cristian Marussi
Cc: sudeep.holla, andersson, konrad.dybcio, robh+dt,
krzysztof.kozlowski+dt, linux-kernel, linux-arm-msm, devicetree,
linux-arm-kernel, quic_rgottimu, quic_kshivnan, conor+dt,
Amir Vajid
On 7/10/24 17:06, Cristian Marussi wrote:
> On Wed, Jul 03, 2024 at 12:44:39AM +0530, Sibi Sankar wrote:
>> Introduce a client driver that uses the memlat algorithm string hosted
>> on ARM SCMI QCOM Vendor Protocol to detect memory latency workloads and
>> control frequency/level of the various memory buses (DDR/LLCC/DDR_QOS).
>>
>> Co-developed-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>> Co-developed-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Signed-off-by: Ramakrishna Gottimukkula <quic_rgottimu@quicinc.com>
>> Co-developed-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Amir Vajid <avajid@quicinc.com>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
>>
>
> Hi Sibi,
>
> just a few remarks down below...
Thanks, addressed them all in v4.
-Sibi
>
>> V2:
>> * Make driver changes to the accommodate bindings changes. [Rob]
>> * Replace explicit of_node_put with _free.
[...]
>
> Thanks,
> Cristian
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-10-07 7:34 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-02 19:14 [RFC V3 0/4] arm_scmi: vendors: ARM SCMI Qualcomm Vendor Protocol Sibi Sankar
2024-07-02 19:14 ` [RFC V3 1/4] dt-bindings: firmware: Document bindings for ARM SCMI QCOM " Sibi Sankar
2024-07-09 16:32 ` Cristian Marussi
2024-07-11 13:31 ` Sibi Sankar
2024-07-02 19:14 ` [RFC V3 2/4] firmware: arm_scmi: vendors: Add ARM SCMI QCOM vendor protocol v1.0 Sibi Sankar
2024-07-09 10:10 ` Konrad Dybcio
2024-07-11 13:33 ` Sibi Sankar
2024-07-09 17:52 ` Cristian Marussi
2024-10-07 7:29 ` Sibi Sankar
2024-07-02 19:14 ` [RFC V3 3/4] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor Sibi Sankar
2024-07-03 8:44 ` Shivnandan Kumar
2024-07-03 11:34 ` Sibi Sankar
2024-07-09 10:51 ` Konrad Dybcio
2024-10-07 7:22 ` Sibi Sankar
2024-07-10 11:36 ` Cristian Marussi
2024-10-07 7:33 ` Sibi Sankar
2024-07-02 19:14 ` [RFC V3 4/4] arm64: dts: qcom: x1e80100: Enable LLCC/DDR/DDR_QOS dvfs Sibi Sankar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).