public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU
@ 2024-11-13 15:48 Neil Armstrong
  2024-11-13 15:48 ` [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth Neil Armstrong
                   ` (7 more replies)
  0 siblings, 8 replies; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

The Adreno GMU Management Unit (GMU) can also vote for DDR Bandwidth
along the Frequency and Power Domain level, but by default we leave the
OPP core scale the interconnect ddr path.

While scaling the interconnect path was sufficient, newer GPUs
like the A750 requires specific vote parameters and bandwidth to
achieve full functionnality.

In order to get the vote values to be used by the GPU Management
Unit (GMU), we need to parse all the possible OPP Bandwidths and
create a vote value to be send to the appropriate Bus Control
Modules (BCMs) declared in the GPU info struct.
The added dev_pm_opp_get_bandwidth() is used in this case.

The vote array will then be used to dynamically generate the GMU
bw_table sent during the GMU power-up.

Those entries will then be used by passing the appropriate
bandwidth level when voting for a GPU frequency.

This will make sure all resources are equally voted for a
same OPP, whatever decision is done by the GMU, it will
ensure all resources votes are synchronized.

Tested on SM8650 and SM8550 platforms.

Any feedback is welcome.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
Neil Armstrong (8):
      opp: core: implement dev_pm_opp_get_bandwidth
      drm/msm: adreno: add GMU_BW_VOTE quirk
      drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
      drm/msm: adreno: dynamically generate GMU bw table
      drm/msm: adreno: find bandwidth index of OPP and set it along freq index
      drm/msm: adreno: enable GMU bandwidth for A740 and A750
      arm64: qcom: dts: sm8550: add interconnect and opp-peak-kBps for GPU
      arm64: qcom: dts: sm8650: add interconnect and opp-peak-kBps for GPU

 arch/arm64/boot/dts/qcom/sm8550.dtsi      |  11 ++
 arch/arm64/boot/dts/qcom/sm8650.dtsi      |  14 +++
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c |  26 ++++-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c     | 180 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h     |  14 ++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   1 +
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c     |  54 ++++++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   1 +
 drivers/opp/core.c                        |  25 +++++
 include/linux/pm_opp.h                    |   7 ++
 10 files changed, 314 insertions(+), 19 deletions(-)
---
base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
change-id: 20241113-topic-sm8x50-gpu-bw-vote-f5e022fe7a47

Best regards,
-- 
Neil Armstrong <neil.armstrong@linaro.org>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-14  4:10   ` Viresh Kumar
  2024-11-13 15:48 ` [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk Neil Armstrong
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

Add and implement the dev_pm_opp_get_bandwidth() to retrieve
the OPP's bandwidth in the same was as the dev_pm_opp_get_voltage()
helper.

Retrieving bandwidth is required in the case of the Adreno GPU
where the GPU Management Unit can handle the Bandwidth scaling.

The helper can get the peak or everage bandwidth for any of
the interconnect path.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/opp/core.c     | 25 +++++++++++++++++++++++++
 include/linux/pm_opp.h |  7 +++++++
 2 files changed, 32 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 494f8860220d97fc690ebab5ed3b7f5f04f22d73..19fb82033de26b74e9604c33b9781689df2fe80a 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -106,6 +106,31 @@ static bool assert_single_clk(struct opp_table *opp_table)
 	return !WARN_ON(opp_table->clk_count > 1);
 }
 
+/**
+ * dev_pm_opp_get_bandwidth() - Gets the peak bandwidth corresponding to an opp
+ * @opp:	opp for which voltage has to be returned for
+ * @peak:	select peak or average bandwidth
+ * @index:	bandwidth index
+ *
+ * Return: peak bandwidth in kBps, else return 0
+ */
+unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index)
+{
+	if (IS_ERR_OR_NULL(opp)) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return 0;
+	}
+
+	if (index > opp->opp_table->path_count)
+		return 0;
+
+	if (!opp->bandwidth)
+		return 0;
+
+	return peak ? opp->bandwidth[index].peak : opp->bandwidth[index].avg;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_get_bandwidth);
+
 /**
  * dev_pm_opp_get_voltage() - Gets the voltage corresponding to an opp
  * @opp:	opp for which voltage has to be returned for
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 6424692c30b71fca471a1b7d63e018605dd9324b..526b707a8d61204227222f8c28394dc3a85c4c9a 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -106,6 +106,8 @@ struct dev_pm_opp_data {
 struct opp_table *dev_pm_opp_get_opp_table(struct device *dev);
 void dev_pm_opp_put_opp_table(struct opp_table *opp_table);
 
+unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index);
+
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp);
 
 int dev_pm_opp_get_supplies(struct dev_pm_opp *opp, struct dev_pm_opp_supply *supplies);
@@ -209,6 +211,11 @@ static inline struct opp_table *dev_pm_opp_get_opp_table_indexed(struct device *
 
 static inline void dev_pm_opp_put_opp_table(struct opp_table *opp_table) {}
 
+static inline unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index)
+{
+	return 0;
+}
+
 static inline unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
 	return 0;

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
  2024-11-13 15:48 ` [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-15  7:07   ` Dmitry Baryshkov
  2024-11-13 15:48 ` [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU Neil Armstrong
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
along the Frequency and Power Domain level, but by default we leave the
OPP core vote for the interconnect ddr path.

While scaling via the interconnect path was sufficient, newer GPUs
like the A750 requires specific vote paremeters and bandwidth to
achieve full functionality.

Add a new Quirk enabling DDR Bandwidth vote via GMU.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index e71f420f8b3a8e6cfc52dd1c4d5a63ef3704a07f..20b6b7f49473d42751cd4fb4fc82849be42cb807 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -57,6 +57,7 @@ enum adreno_family {
 #define ADRENO_QUIRK_HAS_HW_APRIV		BIT(3)
 #define ADRENO_QUIRK_HAS_CACHED_COHERENT	BIT(4)
 #define ADRENO_QUIRK_PREEMPTION			BIT(5)
+#define ADRENO_QUIRK_GMU_BW_VOTE		BIT(6)
 
 /* Helper for formating the chip_id in the way that userspace tools like
  * crashdec expect.

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
  2024-11-13 15:48 ` [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth Neil Armstrong
  2024-11-13 15:48 ` [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-15  7:20   ` Dmitry Baryshkov
  2024-11-13 15:48 ` [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table Neil Armstrong
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

The Adreno GMU Management Unit (GMU) can also scale DDR Bandwidth along
the Frequency and Power Domain level, but by default we leave the
OPP core scale the interconnect ddr path.

In order to get the vote values to be used by the GPU Management
Unit (GMU), we need to parse all the possible OPP Bandwidths and
create a vote value to be send to the appropriate Bus Control
Modules (BCMs) declared in the GPU info struct.

The vote array will be used to dynamically generate the GMU bw_table
sent during the GMU power-up.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 163 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  12 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
 3 files changed, 176 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 14db7376c712d19446b38152e480bd5a1e0a5198..504a7c5d5a9df4c787951f2ae3a69d566d205ad5 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -9,6 +9,7 @@
 #include <linux/pm_domain.h>
 #include <linux/pm_opp.h>
 #include <soc/qcom/cmd-db.h>
+#include <soc/qcom/tcs.h>
 #include <drm/drm_gem.h>
 
 #include "a6xx_gpu.h"
@@ -1287,6 +1288,119 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu)
 	return 0;
 }
 
+struct a6xx_bcm_data {
+	u32 buswidth;
+	unsigned int unit;
+	unsigned int width;
+	unsigned int vcd;
+	bool fixed;
+	unsigned int perfmode;
+	unsigned int perfmode_bw;
+};
+
+struct bcm_db {
+	__le32 unit;
+	__le16 width;
+	u8 vcd;
+	u8 reserved;
+};
+
+static int a6xx_gmu_rpmh_get_bcm_data(const struct a6xx_bcm *bcm,
+				      struct a6xx_bcm_data *bcm_data)
+{
+	const struct bcm_db *data;
+	size_t count;
+
+	data = cmd_db_read_aux_data(bcm->name, &count);
+	if (IS_ERR(data))
+		return PTR_ERR(data);
+
+	if (!count)
+		return -EINVAL;
+
+	bcm_data->unit = le32_to_cpu(data->unit);
+	bcm_data->width = le16_to_cpu(data->width);
+	bcm_data->vcd = data->vcd;
+	bcm_data->fixed = bcm->fixed;
+	bcm_data->perfmode = bcm->perfmode;
+	bcm_data->perfmode_bw = bcm->perfmode_bw;
+	bcm_data->buswidth = bcm->buswidth;
+
+	return 0;
+}
+
+static void a6xx_gmu_rpmh_calc_bw_vote(struct a6xx_bcm_data *bcms,
+				       int count, u32 bw, u32 *data)
+{
+	int i;
+
+	for (i = 0; i < count; i++) {
+		bool valid = true;
+		bool commit = false;
+		u64 peak, y;
+
+		if (i == count - 1 || bcms[i].vcd != bcms[i + 1].vcd)
+			commit = true;
+
+		if (bcms[i].fixed) {
+			if (!bw)
+				data[i] = BCM_TCS_CMD(commit, false, 0x0, 0x0);
+			else
+				data[i] = BCM_TCS_CMD(commit, true, 0x0,
+					bw >= bcms[i].perfmode_bw ?
+						bcms[i].perfmode : 0x0);
+			continue;
+		}
+
+		/* Multiple the bandwidth by the width of the connection */
+		peak = (u64)bw * bcms[i].width;
+		do_div(peak, bcms[i].buswidth);
+
+		/* Input bandwidth value is in KBps */
+		y = peak * 1000ULL;
+		do_div(y, bcms[i].unit);
+
+		/*
+		 * If a bandwidth value was specified but the calculation ends
+		 * rounding down to zero, set a minimum level
+		 */
+		if (bw && y == 0)
+			y = 1;
+
+		y = min_t(u64, y, BCM_TCS_CMD_VOTE_MASK);
+		if (!y)
+			valid = false;
+
+		data[i] = BCM_TCS_CMD(commit, valid, y, y);
+	}
+}
+
+static int a6xx_gmu_rpmh_bw_votes_init(const struct a6xx_info *info, struct a6xx_gmu *gmu)
+{
+	struct a6xx_bcm_data bcms[3];
+	unsigned int bcm_count = 0;
+	int ret, index;
+
+	/* Retrieve BCM data from cmd-db and merge with a6xx_info bcm table */
+	for (index = 0; index < 3; index++) {
+		if (!info->bcm[index].name)
+			continue;
+
+		ret = a6xx_gmu_rpmh_get_bcm_data(&info->bcm[index], &bcms[index]);
+		if (ret)
+			return ret;
+
+		++bcm_count;
+	}
+
+	/* Generate BCM votes values for each bandwidth & bcm */
+	for (index = 0; index < gmu->nr_gpu_bws; index++)
+		a6xx_gmu_rpmh_calc_bw_vote(bcms, bcm_count, gmu->gpu_bw_table[index],
+					   gmu->gpu_bw_votes[index]);
+
+	return 0;
+}
+
 /* Return the 'arc-level' for the given frequency */
 static unsigned int a6xx_gmu_get_arc_level(struct device *dev,
 					   unsigned long freq)
@@ -1390,12 +1504,15 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device *dev, u32 *votes,
  * The GMU votes with the RPMh for itself and on behalf of the GPU but we need
  * to construct the list of votes on the CPU and send it over. Query the RPMh
  * voltage levels and build the votes
+ * The GMU can also vote for DDR interconnects, use the OPP bandwidth entries
+ * and BCM parameters to build the votes.
  */
 
 static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
 {
 	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
 	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+	const struct a6xx_info *info = adreno_gpu->info->a6xx;
 	struct msm_gpu *gpu = &adreno_gpu->base;
 	int ret;
 
@@ -1407,6 +1524,10 @@ static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
 	ret |= a6xx_gmu_rpmh_arc_votes_init(gmu->dev, gmu->cx_arc_votes,
 		gmu->gmu_freqs, gmu->nr_gmu_freqs, "cx.lvl");
 
+	/* Build the interconnect votes */
+	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
+		ret |= a6xx_gmu_rpmh_bw_votes_init(info, gmu);
+
 	return ret;
 }
 
@@ -1442,6 +1563,38 @@ static int a6xx_gmu_build_freq_table(struct device *dev, unsigned long *freqs,
 	return index;
 }
 
+static int a6xx_gmu_build_bw_table(struct device *dev, unsigned long *bandwidths,
+		u32 size)
+{
+	int count = dev_pm_opp_get_opp_count(dev);
+	struct dev_pm_opp *opp;
+	int i, index = 0;
+	unsigned int bandwidth = 1;
+
+	/*
+	 * The OPP table doesn't contain the "off" bandwidth level so we need to
+	 * add 1 to the table size to account for it
+	 */
+
+	if (WARN(count + 1 > size,
+		"The GMU bandwidth table is being truncated\n"))
+		count = size - 1;
+
+	/* Set the "off" bandwidth */
+	bandwidths[index++] = 0;
+
+	for (i = 0; i < count; i++) {
+		opp = dev_pm_opp_find_bw_ceil(dev, &bandwidth, 0);
+		if (IS_ERR(opp))
+			break;
+
+		dev_pm_opp_put(opp);
+		bandwidths[index++] = bandwidth++;
+	}
+
+	return index;
+}
+
 static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
 {
 	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
@@ -1472,6 +1625,16 @@ static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
 
 	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
 
+	/*
+	 * The GMU also handles GPU Interconnect Votes so build a list
+	 * of DDR bandwidths from the GPU OPP table
+	 */
+	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
+		gmu->nr_gpu_bws = a6xx_gmu_build_bw_table(&gpu->pdev->dev,
+			gmu->gpu_bw_table, ARRAY_SIZE(gmu->gpu_bw_table));
+
+	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
+
 	/* Build the list of RPMh votes that we'll send to the GMU */
 	return a6xx_gmu_rpmh_votes_init(gmu);
 }
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index b4a79f88ccf45cfe651c86d2a9da39541c5772b3..95c632d8987a517f067c48c61c6c06b9a4f61fc0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -19,6 +19,14 @@ struct a6xx_gmu_bo {
 	u64 iova;
 };
 
+struct a6xx_bcm {
+	char *name;
+	unsigned int buswidth;
+	bool fixed;
+	unsigned int perfmode;
+	unsigned int perfmode_bw;
+};
+
 /*
  * These define the different GMU wake up options - these define how both the
  * CPU and the GMU bring up the hardware
@@ -82,6 +90,10 @@ struct a6xx_gmu {
 	unsigned long gpu_freqs[16];
 	u32 gx_arc_votes[16];
 
+	int nr_gpu_bws;
+	unsigned long gpu_bw_table[16];
+	u32 gpu_bw_votes[16][3];
+
 	int nr_gmu_freqs;
 	unsigned long gmu_freqs[4];
 	u32 cx_arc_votes[4];
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 4aceffb6aae89c781facc2a6e4a82b20b341b6cb..d779d700120cbd974ee87a67214739b1d85156e2 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -44,6 +44,7 @@ struct a6xx_info {
 	u32 gmu_chipid;
 	u32 gmu_cgc_mode;
 	u32 prim_fifo_threshold;
+	const struct a6xx_bcm bcm[3];
 };
 
 struct a6xx_gpu {

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
                   ` (2 preceding siblings ...)
  2024-11-13 15:48 ` [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-15  7:24   ` Dmitry Baryshkov
  2024-11-13 15:48 ` [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index Neil Armstrong
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

The Adreno GPU Management Unit (GMU) can also scale the ddr
bandwidth along the frequency and power domain level, but for
now we statically fill the bw_table with values from the
downstream driver.

Only the first entry is used, which is a disable vote, so we
currently rely on scaling via the linux interconnect paths.

Let's dynamically generate the bw_table with the vote values
previously calculated from the OPPs.

Those entried will then be used by the GMU when passing the
appropriate bandwidth level when voting for a gpu frequency.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 48 +++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index cb8844ed46b29c4569d05eb7a24f7b27e173190f..9a89ba95843e7805d78f0e5ddbe328677b6431dd 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -596,22 +596,48 @@ static void a730_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
 	msg->cnoc_cmds_data[1][0] = 0x60000001;
 }
 
-static void a740_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
+static void a740_generate_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
+				   struct a6xx_hfi_msg_bw_table *msg)
 {
-	msg->bw_level_num = 1;
+	const struct a6xx_info *info = adreno_gpu->info->a6xx;
+	unsigned int i, j;
 
-	msg->ddr_cmds_num = 3;
 	msg->ddr_wait_bitmask = 0x7;
 
-	msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
-	msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
-	msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
+	for (i = 0; i < 3; i++) {
+		if (!info->bcm[i].name)
+			break;
+		msg->ddr_cmds_addrs[i] = cmd_db_read_addr(info->bcm[i].name);
+	}
+	msg->ddr_cmds_num = i;
 
-	msg->ddr_cmds_data[0][0] = 0x40000000;
-	msg->ddr_cmds_data[0][1] = 0x40000000;
-	msg->ddr_cmds_data[0][2] = 0x40000000;
+	for (i = 0; i < gmu->nr_gpu_bws; ++i)
+		for (j = 0; j < msg->ddr_cmds_num; j++)
+			msg->ddr_cmds_data[i][j] = gmu->gpu_bw_votes[i][j];
+	msg->bw_level_num = gmu->nr_gpu_bws;
+}
+
+static void a740_build_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
+				struct a6xx_hfi_msg_bw_table *msg)
+{
+	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
+		a740_generate_bw_table(adreno_gpu, gmu, msg);
+	} else {
+		msg->bw_level_num = 1;
 
-	/* TODO: add a proper dvfs table */
+		msg->ddr_cmds_num = 3;
+		msg->ddr_wait_bitmask = 0x7;
+
+		msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
+		msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
+		msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
+
+		msg->ddr_cmds_data[0][0] = 0x40000000;
+		msg->ddr_cmds_data[0][1] = 0x40000000;
+		msg->ddr_cmds_data[0][2] = 0x40000000;
+
+		/* TODO: add a proper dvfs table */
+	}
 
 	msg->cnoc_cmds_num = 1;
 	msg->cnoc_wait_bitmask = 0x1;
@@ -691,7 +717,7 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
 	else if (adreno_is_a730(adreno_gpu))
 		a730_build_bw_table(msg);
 	else if (adreno_is_a740_family(adreno_gpu))
-		a740_build_bw_table(msg);
+		a740_build_bw_table(adreno_gpu, gmu, msg);
 	else
 		a6xx_build_bw_table(msg);
 

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
                   ` (3 preceding siblings ...)
  2024-11-13 15:48 ` [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-15  7:28   ` Dmitry Baryshkov
  2024-11-13 15:48 ` [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750 Neil Armstrong
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

The Adreno GMU Management Unit (GMU) can also scale the DDR Bandwidth
along the Frequency and Power Domain level, until now we left the OPP
core scale the OPP bandwidth via the interconnect path.

In order to enable bandwidth voting via the GPU Management
Unit (GMU), when an opp is set by devfreq we also look for
the corresponding bandwidth index in the previously generated
bw_table and pass this value along the frequency index to the GMU.

Since we now vote for all resources via the GMU, setting the OPP
is no more needed, so we can completely skip calling
dev_pm_opp_set_opp() in this situation.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 17 +++++++++++++++--
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 +-
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c |  6 +++---
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 504a7c5d5a9df4c787951f2ae3a69d566d205ad5..1131c3521ebbb0d053aceb162052ed01e197726a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -113,6 +113,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
 	u32 perf_index;
+	u32 bw_index = 0;
 	unsigned long gpu_freq;
 	int ret = 0;
 
@@ -125,6 +126,16 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
 		if (gpu_freq == gmu->gpu_freqs[perf_index])
 			break;
 
+	/* If enabled, find the corresponding DDR bandwidth index */
+	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
+		unsigned int bw = dev_pm_opp_get_bandwidth(opp, true, 0);
+
+		for (bw_index = 0; bw_index < gmu->nr_gpu_bws - 1; bw_index++) {
+			if (bw == gmu->gpu_bw_table[bw_index])
+				break;
+		}
+	}
+
 	gmu->current_perf_index = perf_index;
 	gmu->freq = gmu->gpu_freqs[perf_index];
 
@@ -140,8 +151,10 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
 		return;
 
 	if (!gmu->legacy) {
-		a6xx_hfi_set_freq(gmu, perf_index);
-		dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
+		a6xx_hfi_set_freq(gmu, perf_index, bw_index);
+		/* With Bandwidth voting, we now vote for all resources, so skip OPP set */
+		if (bw_index)
+			dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
 		return;
 	}
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 95c632d8987a517f067c48c61c6c06b9a4f61fc0..9b4f2b1a0c48a133cd5c48713bc321c74eaffce9 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -205,7 +205,7 @@ void a6xx_hfi_init(struct a6xx_gmu *gmu);
 int a6xx_hfi_start(struct a6xx_gmu *gmu, int boot_state);
 void a6xx_hfi_stop(struct a6xx_gmu *gmu);
 int a6xx_hfi_send_prep_slumber(struct a6xx_gmu *gmu);
-int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index);
+int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int perf_index, int bw_index);
 
 bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu);
 bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index 9a89ba95843e7805d78f0e5ddbe328677b6431dd..e2325c15677f1a1194a811e6ecbb5931bdfb1ad9 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -752,13 +752,13 @@ static int a6xx_hfi_send_core_fw_start(struct a6xx_gmu *gmu)
 		sizeof(msg), NULL, 0);
 }
 
-int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index)
+int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int freq_index, int bw_index)
 {
 	struct a6xx_hfi_gx_bw_perf_vote_cmd msg = { 0 };
 
 	msg.ack_type = 1; /* blocking */
-	msg.freq = index;
-	msg.bw = 0; /* TODO: bus scaling */
+	msg.freq = freq_index;
+	msg.bw = bw_index;
 
 	return a6xx_hfi_send_msg(gmu, HFI_H2F_MSG_GX_BW_PERF_VOTE, &msg,
 		sizeof(msg), NULL, 0);

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
                   ` (4 preceding siblings ...)
  2024-11-13 15:48 ` [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-15  7:33   ` Dmitry Baryshkov
  2024-11-13 15:48 ` [PATCH RFC 7/8] arm64: qcom: dts: sm8550: add interconnect and opp-peak-kBps for GPU Neil Armstrong
  2024-11-13 15:48 ` [PATCH RFC 8/8] arm64: qcom: dts: sm8650: " Neil Armstrong
  7 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

Now all the DDR bandwidth voting via the GPU Management Unit (GMU)
is in place, let's declare the Bus Control Modules (BCMs) and
it's parameters in the GPU info struct and add the GMU_BW_VOTE
quirk to enable it.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 0c560e84ad5a53bb4e8a49ba4e153ce9cf33f7ae..014a24256b832d8e03fe06a6516b5348a5c0474a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -1379,7 +1379,8 @@ static const struct adreno_info a7xx_gpus[] = {
 		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
 		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
 			  ADRENO_QUIRK_HAS_HW_APRIV |
-			  ADRENO_QUIRK_PREEMPTION,
+			  ADRENO_QUIRK_PREEMPTION |
+			  ADRENO_QUIRK_GMU_BW_VOTE,
 		.init = a6xx_gpu_init,
 		.zapfw = "a740_zap.mdt",
 		.a6xx = &(const struct a6xx_info) {
@@ -1388,6 +1389,16 @@ static const struct adreno_info a7xx_gpus[] = {
 			.pwrup_reglist = &a7xx_pwrup_reglist,
 			.gmu_chipid = 0x7020100,
 			.gmu_cgc_mode = 0x00020202,
+			.bcm = {
+				[0] = { .name = "SH0", .buswidth = 16 },
+				[1] = { .name = "MC0", .buswidth = 4 },
+				[2] = {
+					.name = "ACV",
+					.fixed = true,
+					.perfmode = BIT(3),
+					.perfmode_bw = 16500000,
+				},
+			},
 		},
 		.address_space_size = SZ_16G,
 		.preempt_record_size = 4192 * SZ_1K,
@@ -1424,7 +1435,8 @@ static const struct adreno_info a7xx_gpus[] = {
 		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
 		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
 			  ADRENO_QUIRK_HAS_HW_APRIV |
-			  ADRENO_QUIRK_PREEMPTION,
+			  ADRENO_QUIRK_PREEMPTION |
+			  ADRENO_QUIRK_GMU_BW_VOTE,
 		.init = a6xx_gpu_init,
 		.zapfw = "gen70900_zap.mbn",
 		.a6xx = &(const struct a6xx_info) {
@@ -1432,6 +1444,16 @@ static const struct adreno_info a7xx_gpus[] = {
 			.pwrup_reglist = &a7xx_pwrup_reglist,
 			.gmu_chipid = 0x7090100,
 			.gmu_cgc_mode = 0x00020202,
+			.bcm = {
+				[0] = { .name = "SH0", .buswidth = 16 },
+				[1] = { .name = "MC0", .buswidth = 4 },
+				[2] = {
+					.name = "ACV",
+					.fixed = true,
+					.perfmode = BIT(2),
+					.perfmode_bw = 10687500,
+				},
+			},
 		},
 		.address_space_size = SZ_16G,
 		.preempt_record_size = 3572 * SZ_1K,

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 7/8] arm64: qcom: dts: sm8550: add interconnect and opp-peak-kBps for GPU
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
                   ` (5 preceding siblings ...)
  2024-11-13 15:48 ` [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750 Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  2024-11-13 15:48 ` [PATCH RFC 8/8] arm64: qcom: dts: sm8650: " Neil Armstrong
  7 siblings, 0 replies; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

Each GPU OPP requires a specific peak DDR bandwidth, let's add
those to each OPP and also the related interconnect path.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 arch/arm64/boot/dts/qcom/sm8550.dtsi | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8550.dtsi b/arch/arm64/boot/dts/qcom/sm8550.dtsi
index 9dc0ee3eb98f8711e01934e47331b99e3bb73682..808dce3a624197d38222f53fffa280e63088c1c1 100644
--- a/arch/arm64/boot/dts/qcom/sm8550.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8550.dtsi
@@ -2113,6 +2113,9 @@ gpu: gpu@3d00000 {
 			qcom,gmu = <&gmu>;
 			#cooling-cells = <2>;
 
+			interconnects = <&gem_noc MASTER_GFX3D 0 &mc_virt SLAVE_EBI1 0>;
+			interconnect-names = "gfx-mem";
+
 			status = "disabled";
 
 			zap-shader {
@@ -2126,41 +2129,49 @@ gpu_opp_table: opp-table {
 				opp-680000000 {
 					opp-hz = /bits/ 64 <680000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS_L1>;
+					opp-peak-kBps = <16500000>;
 				};
 
 				opp-615000000 {
 					opp-hz = /bits/ 64 <615000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS_L0>;
+					opp-peak-kBps = <16500000>;
 				};
 
 				opp-550000000 {
 					opp-hz = /bits/ 64 <550000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS>;
+					opp-peak-kBps = <12449218>;
 				};
 
 				opp-475000000 {
 					opp-hz = /bits/ 64 <475000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_L1>;
+					opp-peak-kBps = <8171875>;
 				};
 
 				opp-401000000 {
 					opp-hz = /bits/ 64 <401000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS>;
+					opp-peak-kBps = <6671875>;
 				};
 
 				opp-348000000 {
 					opp-hz = /bits/ 64 <348000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_D0>;
+					opp-peak-kBps = <6074218>;
 				};
 
 				opp-295000000 {
 					opp-hz = /bits/ 64 <295000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_D1>;
+					opp-peak-kBps = <6074218>;
 				};
 
 				opp-220000000 {
 					opp-hz = /bits/ 64 <220000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_D2>;
+					opp-peak-kBps = <6074218>;
 				};
 			};
 		};

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH RFC 8/8] arm64: qcom: dts: sm8650: add interconnect and opp-peak-kBps for GPU
  2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
                   ` (6 preceding siblings ...)
  2024-11-13 15:48 ` [PATCH RFC 7/8] arm64: qcom: dts: sm8550: add interconnect and opp-peak-kBps for GPU Neil Armstrong
@ 2024-11-13 15:48 ` Neil Armstrong
  7 siblings, 0 replies; 29+ messages in thread
From: Neil Armstrong @ 2024-11-13 15:48 UTC (permalink / raw)
  To: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley
  Cc: Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree, Neil Armstrong

Each GPU OPP requires a specific peak DDR bandwidth, let's add
those to each OPP and also the related interconnect path.

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
 arch/arm64/boot/dts/qcom/sm8650.dtsi | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
index 01ac3769ffa62ffb83c5c51878e2823e1982eb67..331c5140c16bf013190d6da136c0920009d2646b 100644
--- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
@@ -2636,6 +2636,9 @@ gpu: gpu@3d00000 {
 			qcom,gmu = <&gmu>;
 			#cooling-cells = <2>;
 
+			interconnects = <&gem_noc MASTER_GFX3D 0 &mc_virt SLAVE_EBI1 0>;
+			interconnect-names = "gfx-mem";
+
 			status = "disabled";
 
 			zap-shader {
@@ -2649,56 +2652,67 @@ gpu_opp_table: opp-table {
 				opp-231000000 {
 					opp-hz = /bits/ 64 <231000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_D2>;
+					opp-peak-kBps = <2136718>;
 				};
 
 				opp-310000000 {
 					opp-hz = /bits/ 64 <310000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_D1>;
+					opp-peak-kBps = <6074218>;
 				};
 
 				opp-366000000 {
 					opp-hz = /bits/ 64 <366000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_D0>;
+					opp-peak-kBps = <6074218>;
 				};
 
 				opp-422000000 {
 					opp-hz = /bits/ 64 <422000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS>;
+					opp-peak-kBps = <8171875>;
 				};
 
 				opp-500000000 {
 					opp-hz = /bits/ 64 <500000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_LOW_SVS_L1>;
+					opp-peak-kBps = <8171875>;
 				};
 
 				opp-578000000 {
 					opp-hz = /bits/ 64 <578000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS>;
+					opp-peak-kBps = <12449218>;
 				};
 
 				opp-629000000 {
 					opp-hz = /bits/ 64 <629000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS_L0>;
+					opp-peak-kBps = <12449218>;
 				};
 
 				opp-680000000 {
 					opp-hz = /bits/ 64 <680000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS_L1>;
+					opp-peak-kBps = <16500000>;
 				};
 
 				opp-720000000 {
 					opp-hz = /bits/ 64 <720000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_SVS_L2>;
+					opp-peak-kBps = <16500000>;
 				};
 
 				opp-770000000 {
 					opp-hz = /bits/ 64 <770000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_NOM>;
+					opp-peak-kBps = <16500000>;
 				};
 
 				opp-834000000 {
 					opp-hz = /bits/ 64 <834000000>;
 					opp-level = <RPMH_REGULATOR_LEVEL_NOM_L1>;
+					opp-peak-kBps = <16500000>;
 				};
 			};
 		};

-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth
  2024-11-13 15:48 ` [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth Neil Armstrong
@ 2024-11-14  4:10   ` Viresh Kumar
  2024-11-14  9:23     ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2024-11-14  4:10 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Connor Abbott, linux-pm, linux-kernel,
	linux-arm-msm, dri-devel, freedreno, devicetree

On 13-11-24, 16:48, Neil Armstrong wrote:
> Add and implement the dev_pm_opp_get_bandwidth() to retrieve
> the OPP's bandwidth in the same was as the dev_pm_opp_get_voltage()

                                  way

> helper.
> 
> Retrieving bandwidth is required in the case of the Adreno GPU
> where the GPU Management Unit can handle the Bandwidth scaling.
> 
> The helper can get the peak or everage bandwidth for any of

                                 average

> the interconnect path.
> 
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/opp/core.c     | 25 +++++++++++++++++++++++++
>  include/linux/pm_opp.h |  7 +++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 494f8860220d97fc690ebab5ed3b7f5f04f22d73..19fb82033de26b74e9604c33b9781689df2fe80a 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -106,6 +106,31 @@ static bool assert_single_clk(struct opp_table *opp_table)
>  	return !WARN_ON(opp_table->clk_count > 1);
>  }
>  
> +/**
> + * dev_pm_opp_get_bandwidth() - Gets the peak bandwidth corresponding to an opp

s/peak bandwidth/bandwidth/

> + * @opp:	opp for which voltage has to be returned for
> + * @peak:	select peak or average bandwidth
> + * @index:	bandwidth index
> + *
> + * Return: peak bandwidth in kBps, else return 0

s/peak bandwidth/bandwidth/

> + */
> +unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index)
> +{
> +	if (IS_ERR_OR_NULL(opp)) {
> +		pr_err("%s: Invalid parameters\n", __func__);
> +		return 0;
> +	}
> +
> +	if (index > opp->opp_table->path_count)
> +		return 0;
> +
> +	if (!opp->bandwidth)
> +		return 0;
> +
> +	return peak ? opp->bandwidth[index].peak : opp->bandwidth[index].avg;
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_bandwidth);

All other bandwidth APIs are named as _bw, maybe do same here too ?

-- 
viresh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth
  2024-11-14  4:10   ` Viresh Kumar
@ 2024-11-14  9:23     ` Neil Armstrong
  0 siblings, 0 replies; 29+ messages in thread
From: Neil Armstrong @ 2024-11-14  9:23 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten, David Airlie,
	Simona Vetter, Bjorn Andersson, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Connor Abbott, linux-pm, linux-kernel,
	linux-arm-msm, dri-devel, freedreno, devicetree

Hi,

On 14/11/2024 05:10, Viresh Kumar wrote:
> On 13-11-24, 16:48, Neil Armstrong wrote:
>> Add and implement the dev_pm_opp_get_bandwidth() to retrieve
>> the OPP's bandwidth in the same was as the dev_pm_opp_get_voltage()
> 
>                                    way
> 
>> helper.
>>
>> Retrieving bandwidth is required in the case of the Adreno GPU
>> where the GPU Management Unit can handle the Bandwidth scaling.
>>
>> The helper can get the peak or everage bandwidth for any of
> 
>                                   average

Aww, good catch, thanks

> 
>> the interconnect path.
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/opp/core.c     | 25 +++++++++++++++++++++++++
>>   include/linux/pm_opp.h |  7 +++++++
>>   2 files changed, 32 insertions(+)
>>
>> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
>> index 494f8860220d97fc690ebab5ed3b7f5f04f22d73..19fb82033de26b74e9604c33b9781689df2fe80a 100644
>> --- a/drivers/opp/core.c
>> +++ b/drivers/opp/core.c
>> @@ -106,6 +106,31 @@ static bool assert_single_clk(struct opp_table *opp_table)
>>   	return !WARN_ON(opp_table->clk_count > 1);
>>   }
>>   
>> +/**
>> + * dev_pm_opp_get_bandwidth() - Gets the peak bandwidth corresponding to an opp
> 
> s/peak bandwidth/bandwidth/

Ack

> 
>> + * @opp:	opp for which voltage has to be returned for
>> + * @peak:	select peak or average bandwidth
>> + * @index:	bandwidth index
>> + *
>> + * Return: peak bandwidth in kBps, else return 0
> 
> s/peak bandwidth/bandwidth/

Ack

> 
>> + */
>> +unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index)
>> +{
>> +	if (IS_ERR_OR_NULL(opp)) {
>> +		pr_err("%s: Invalid parameters\n", __func__);
>> +		return 0;
>> +	}
>> +
>> +	if (index > opp->opp_table->path_count)
>> +		return 0;
>> +
>> +	if (!opp->bandwidth)
>> +		return 0;
>> +
>> +	return peak ? opp->bandwidth[index].peak : opp->bandwidth[index].avg;
>> +}
>> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_bandwidth);
> 
> All other bandwidth APIs are named as _bw, maybe do same here too ?
> 

Sure, I wasn't sure about that, will switch to _bw.

Neil


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk
  2024-11-13 15:48 ` [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk Neil Armstrong
@ 2024-11-15  7:07   ` Dmitry Baryshkov
  2024-11-15  9:21     ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15  7:07 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
> along the Frequency and Power Domain level, but by default we leave the
> OPP core vote for the interconnect ddr path.
> 
> While scaling via the interconnect path was sufficient, newer GPUs
> like the A750 requires specific vote paremeters and bandwidth to
> achieve full functionality.
> 
> Add a new Quirk enabling DDR Bandwidth vote via GMU.

Please describe, why this is defined as a quirk rather than a proper
platform-level property. From my experience with 6xx and 7xx, all the
platforms need to send some kind of BW data to the GMU.

> 
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index e71f420f8b3a8e6cfc52dd1c4d5a63ef3704a07f..20b6b7f49473d42751cd4fb4fc82849be42cb807 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -57,6 +57,7 @@ enum adreno_family {
>  #define ADRENO_QUIRK_HAS_HW_APRIV		BIT(3)
>  #define ADRENO_QUIRK_HAS_CACHED_COHERENT	BIT(4)
>  #define ADRENO_QUIRK_PREEMPTION			BIT(5)
> +#define ADRENO_QUIRK_GMU_BW_VOTE		BIT(6)
>  
>  /* Helper for formating the chip_id in the way that userspace tools like
>   * crashdec expect.
> 
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
  2024-11-13 15:48 ` [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU Neil Armstrong
@ 2024-11-15  7:20   ` Dmitry Baryshkov
  2024-11-15  9:09     ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15  7:20 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Wed, Nov 13, 2024 at 04:48:29PM +0100, Neil Armstrong wrote:
> The Adreno GMU Management Unit (GMU) can also scale DDR Bandwidth along
> the Frequency and Power Domain level, but by default we leave the
> OPP core scale the interconnect ddr path.
> 
> In order to get the vote values to be used by the GPU Management
> Unit (GMU), we need to parse all the possible OPP Bandwidths and
> create a vote value to be send to the appropriate Bus Control
> Modules (BCMs) declared in the GPU info struct.
> 
> The vote array will be used to dynamically generate the GMU bw_table
> sent during the GMU power-up.
> 
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 163 ++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  12 +++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
>  3 files changed, 176 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 14db7376c712d19446b38152e480bd5a1e0a5198..504a7c5d5a9df4c787951f2ae3a69d566d205ad5 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -9,6 +9,7 @@
>  #include <linux/pm_domain.h>
>  #include <linux/pm_opp.h>
>  #include <soc/qcom/cmd-db.h>
> +#include <soc/qcom/tcs.h>
>  #include <drm/drm_gem.h>
>  
>  #include "a6xx_gpu.h"
> @@ -1287,6 +1288,119 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu)
>  	return 0;
>  }
>  
> +struct a6xx_bcm_data {
> +	u32 buswidth;
> +	unsigned int unit;
> +	unsigned int width;

In bits?

> +	unsigned int vcd;

What is this?

> +	bool fixed;

What does it mean?

> +	unsigned int perfmode;
> +	unsigned int perfmode_bw;
> +};
> +
> +struct bcm_db {
> +	__le32 unit;
> +	__le16 width;
> +	u8 vcd;
> +	u8 reserved;
> +};
> +
> +static int a6xx_gmu_rpmh_get_bcm_data(const struct a6xx_bcm *bcm,
> +				      struct a6xx_bcm_data *bcm_data)

Is there a reason to copy CMD DB and BCM data to the interim
representation instead of using those directly?

> +{
> +	const struct bcm_db *data;
> +	size_t count;
> +
> +	data = cmd_db_read_aux_data(bcm->name, &count);
> +	if (IS_ERR(data))
> +		return PTR_ERR(data);
> +
> +	if (!count)
> +		return -EINVAL;
> +
> +	bcm_data->unit = le32_to_cpu(data->unit);
> +	bcm_data->width = le16_to_cpu(data->width);
> +	bcm_data->vcd = data->vcd;
> +	bcm_data->fixed = bcm->fixed;
> +	bcm_data->perfmode = bcm->perfmode;
> +	bcm_data->perfmode_bw = bcm->perfmode_bw;
> +	bcm_data->buswidth = bcm->buswidth;
> +
> +	return 0;
> +}
> +
> +static void a6xx_gmu_rpmh_calc_bw_vote(struct a6xx_bcm_data *bcms,
> +				       int count, u32 bw, u32 *data)
> +{
> +	int i;
> +
> +	for (i = 0; i < count; i++) {
> +		bool valid = true;
> +		bool commit = false;
> +		u64 peak, y;
> +
> +		if (i == count - 1 || bcms[i].vcd != bcms[i + 1].vcd)
> +			commit = true;
> +
> +		if (bcms[i].fixed) {
> +			if (!bw)
> +				data[i] = BCM_TCS_CMD(commit, false, 0x0, 0x0);
> +			else
> +				data[i] = BCM_TCS_CMD(commit, true, 0x0,
> +					bw >= bcms[i].perfmode_bw ?
> +						bcms[i].perfmode : 0x0);
> +			continue;
> +		}
> +
> +		/* Multiple the bandwidth by the width of the connection */

... and divide by the bus width. However it's not clear why you are
multiplying bandwidth (bits or bytes per second) with the width
(probably also bits?). Or is it not a width but the number of paths
between units?

> +		peak = (u64)bw * bcms[i].width;
> +		do_div(peak, bcms[i].buswidth);
> +
> +		/* Input bandwidth value is in KBps */

Input or OPP / Interconnect?

> +		y = peak * 1000ULL;
> +		do_div(y, bcms[i].unit);
> +
> +		/*
> +		 * If a bandwidth value was specified but the calculation ends
> +		 * rounding down to zero, set a minimum level
> +		 */
> +		if (bw && y == 0)
> +			y = 1;

Is it a real usecase or just a safety net? If the bandwidth ends up
being very low, maybe we should warn the users about it?

> +
> +		y = min_t(u64, y, BCM_TCS_CMD_VOTE_MASK);
> +		if (!y)
> +			valid = false;

This can probably be coupled with the previous condition.

> +
> +		data[i] = BCM_TCS_CMD(commit, valid, y, y);
> +	}
> +}
> +
> +static int a6xx_gmu_rpmh_bw_votes_init(const struct a6xx_info *info, struct a6xx_gmu *gmu)
> +{
> +	struct a6xx_bcm_data bcms[3];
> +	unsigned int bcm_count = 0;
> +	int ret, index;
> +
> +	/* Retrieve BCM data from cmd-db and merge with a6xx_info bcm table */
> +	for (index = 0; index < 3; index++) {

Magic number 3.

> +		if (!info->bcm[index].name)
> +			continue;
> +
> +		ret = a6xx_gmu_rpmh_get_bcm_data(&info->bcm[index], &bcms[index]);
> +		if (ret)
> +			return ret;
> +
> +		++bcm_count;
> +	}
> +
> +	/* Generate BCM votes values for each bandwidth & bcm */
> +	for (index = 0; index < gmu->nr_gpu_bws; index++)
> +		a6xx_gmu_rpmh_calc_bw_vote(bcms, bcm_count, gmu->gpu_bw_table[index],
> +					   gmu->gpu_bw_votes[index]);
> +
> +	return 0;
> +}
> +
>  /* Return the 'arc-level' for the given frequency */
>  static unsigned int a6xx_gmu_get_arc_level(struct device *dev,
>  					   unsigned long freq)
> @@ -1390,12 +1504,15 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device *dev, u32 *votes,
>   * The GMU votes with the RPMh for itself and on behalf of the GPU but we need
>   * to construct the list of votes on the CPU and send it over. Query the RPMh
>   * voltage levels and build the votes
> + * The GMU can also vote for DDR interconnects, use the OPP bandwidth entries
> + * and BCM parameters to build the votes.
>   */
>  
>  static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
>  {
>  	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
>  	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
>  	struct msm_gpu *gpu = &adreno_gpu->base;
>  	int ret;
>  
> @@ -1407,6 +1524,10 @@ static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
>  	ret |= a6xx_gmu_rpmh_arc_votes_init(gmu->dev, gmu->cx_arc_votes,
>  		gmu->gmu_freqs, gmu->nr_gmu_freqs, "cx.lvl");
>  
> +	/* Build the interconnect votes */
> +	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
> +		ret |= a6xx_gmu_rpmh_bw_votes_init(info, gmu);
> +
>  	return ret;
>  }
>  
> @@ -1442,6 +1563,38 @@ static int a6xx_gmu_build_freq_table(struct device *dev, unsigned long *freqs,
>  	return index;
>  }
>  
> +static int a6xx_gmu_build_bw_table(struct device *dev, unsigned long *bandwidths,
> +		u32 size)
> +{
> +	int count = dev_pm_opp_get_opp_count(dev);
> +	struct dev_pm_opp *opp;
> +	int i, index = 0;
> +	unsigned int bandwidth = 1;
> +
> +	/*
> +	 * The OPP table doesn't contain the "off" bandwidth level so we need to
> +	 * add 1 to the table size to account for it
> +	 */
> +
> +	if (WARN(count + 1 > size,
> +		"The GMU bandwidth table is being truncated\n"))
> +		count = size - 1;
> +
> +	/* Set the "off" bandwidth */
> +	bandwidths[index++] = 0;
> +
> +	for (i = 0; i < count; i++) {
> +		opp = dev_pm_opp_find_bw_ceil(dev, &bandwidth, 0);
> +		if (IS_ERR(opp))
> +			break;
> +
> +		dev_pm_opp_put(opp);
> +		bandwidths[index++] = bandwidth++;
> +	}
> +
> +	return index;
> +}
> +
>  static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
>  {
>  	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
> @@ -1472,6 +1625,16 @@ static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
>  
>  	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
>  
> +	/*
> +	 * The GMU also handles GPU Interconnect Votes so build a list
> +	 * of DDR bandwidths from the GPU OPP table
> +	 */
> +	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
> +		gmu->nr_gpu_bws = a6xx_gmu_build_bw_table(&gpu->pdev->dev,
> +			gmu->gpu_bw_table, ARRAY_SIZE(gmu->gpu_bw_table));
> +
> +	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
> +
>  	/* Build the list of RPMh votes that we'll send to the GMU */
>  	return a6xx_gmu_rpmh_votes_init(gmu);
>  }
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> index b4a79f88ccf45cfe651c86d2a9da39541c5772b3..95c632d8987a517f067c48c61c6c06b9a4f61fc0 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> @@ -19,6 +19,14 @@ struct a6xx_gmu_bo {
>  	u64 iova;
>  };
>  
> +struct a6xx_bcm {
> +	char *name;
> +	unsigned int buswidth;
> +	bool fixed;
> +	unsigned int perfmode;
> +	unsigned int perfmode_bw;
> +};
> +
>  /*
>   * These define the different GMU wake up options - these define how both the
>   * CPU and the GMU bring up the hardware
> @@ -82,6 +90,10 @@ struct a6xx_gmu {
>  	unsigned long gpu_freqs[16];
>  	u32 gx_arc_votes[16];
>  
> +	int nr_gpu_bws;
> +	unsigned long gpu_bw_table[16];
> +	u32 gpu_bw_votes[16][3];

Is it is the same magic 16 as we have few lines above or is this 16 a
different magic 16? And also 3 is a pure dark secret.

> +
>  	int nr_gmu_freqs;
>  	unsigned long gmu_freqs[4];
>  	u32 cx_arc_votes[4];
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> index 4aceffb6aae89c781facc2a6e4a82b20b341b6cb..d779d700120cbd974ee87a67214739b1d85156e2 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> @@ -44,6 +44,7 @@ struct a6xx_info {
>  	u32 gmu_chipid;
>  	u32 gmu_cgc_mode;
>  	u32 prim_fifo_threshold;
> +	const struct a6xx_bcm bcm[3];
>  };
>  
>  struct a6xx_gpu {
> 
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table
  2024-11-13 15:48 ` [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table Neil Armstrong
@ 2024-11-15  7:24   ` Dmitry Baryshkov
  2024-11-15  9:11     ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15  7:24 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Wed, Nov 13, 2024 at 04:48:30PM +0100, Neil Armstrong wrote:
> The Adreno GPU Management Unit (GMU) can also scale the ddr
> bandwidth along the frequency and power domain level, but for
> now we statically fill the bw_table with values from the
> downstream driver.
> 
> Only the first entry is used, which is a disable vote, so we
> currently rely on scaling via the linux interconnect paths.
> 
> Let's dynamically generate the bw_table with the vote values
> previously calculated from the OPPs.

Nice to see this being worked upon. I hope the code can is generic
enough so that we can use it from other adreno_foo_build_bw_table()
functions.

> 
> Those entried will then be used by the GMU when passing the
> appropriate bandwidth level when voting for a gpu frequency.
> 
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 48 +++++++++++++++++++++++++++--------
>  1 file changed, 37 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> index cb8844ed46b29c4569d05eb7a24f7b27e173190f..9a89ba95843e7805d78f0e5ddbe328677b6431dd 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> @@ -596,22 +596,48 @@ static void a730_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
>  	msg->cnoc_cmds_data[1][0] = 0x60000001;
>  }
>  
> -static void a740_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> +static void a740_generate_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
> +				   struct a6xx_hfi_msg_bw_table *msg)
>  {
> -	msg->bw_level_num = 1;
> +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
> +	unsigned int i, j;
>  
> -	msg->ddr_cmds_num = 3;
>  	msg->ddr_wait_bitmask = 0x7;
>  
> -	msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
> -	msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
> -	msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
> +	for (i = 0; i < 3; i++) {
> +		if (!info->bcm[i].name)
> +			break;
> +		msg->ddr_cmds_addrs[i] = cmd_db_read_addr(info->bcm[i].name);
> +	}
> +	msg->ddr_cmds_num = i;
>  
> -	msg->ddr_cmds_data[0][0] = 0x40000000;
> -	msg->ddr_cmds_data[0][1] = 0x40000000;
> -	msg->ddr_cmds_data[0][2] = 0x40000000;
> +	for (i = 0; i < gmu->nr_gpu_bws; ++i)
> +		for (j = 0; j < msg->ddr_cmds_num; j++)
> +			msg->ddr_cmds_data[i][j] = gmu->gpu_bw_votes[i][j];
> +	msg->bw_level_num = gmu->nr_gpu_bws;
> +}
> +
> +static void a740_build_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
> +				struct a6xx_hfi_msg_bw_table *msg)
> +{
> +	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
> +		a740_generate_bw_table(adreno_gpu, gmu, msg);
> +	} else {

Why do we need a fallback code here?

> +		msg->bw_level_num = 1;
>  
> -	/* TODO: add a proper dvfs table */
> +		msg->ddr_cmds_num = 3;
> +		msg->ddr_wait_bitmask = 0x7;
> +
> +		msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
> +		msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
> +		msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
> +
> +		msg->ddr_cmds_data[0][0] = 0x40000000;
> +		msg->ddr_cmds_data[0][1] = 0x40000000;
> +		msg->ddr_cmds_data[0][2] = 0x40000000;
> +
> +		/* TODO: add a proper dvfs table */

I think TODO is unapplicable anymore.

> +	}
>  
>  	msg->cnoc_cmds_num = 1;
>  	msg->cnoc_wait_bitmask = 0x1;
> @@ -691,7 +717,7 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
>  	else if (adreno_is_a730(adreno_gpu))
>  		a730_build_bw_table(msg);
>  	else if (adreno_is_a740_family(adreno_gpu))
> -		a740_build_bw_table(msg);
> +		a740_build_bw_table(adreno_gpu, gmu, msg);
>  	else
>  		a6xx_build_bw_table(msg);
>  
> 
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index
  2024-11-13 15:48 ` [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index Neil Armstrong
@ 2024-11-15  7:28   ` Dmitry Baryshkov
  2024-11-15  9:15     ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15  7:28 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Wed, Nov 13, 2024 at 04:48:31PM +0100, Neil Armstrong wrote:
> The Adreno GMU Management Unit (GMU) can also scale the DDR Bandwidth
> along the Frequency and Power Domain level, until now we left the OPP
> core scale the OPP bandwidth via the interconnect path.
> 
> In order to enable bandwidth voting via the GPU Management
> Unit (GMU), when an opp is set by devfreq we also look for
> the corresponding bandwidth index in the previously generated
> bw_table and pass this value along the frequency index to the GMU.
> 
> Since we now vote for all resources via the GMU, setting the OPP
> is no more needed, so we can completely skip calling
> dev_pm_opp_set_opp() in this situation.
> 
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 17 +++++++++++++++--
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 +-
>  drivers/gpu/drm/msm/adreno/a6xx_hfi.c |  6 +++---
>  3 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 504a7c5d5a9df4c787951f2ae3a69d566d205ad5..1131c3521ebbb0d053aceb162052ed01e197726a 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -113,6 +113,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>  	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
>  	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>  	u32 perf_index;
> +	u32 bw_index = 0;
>  	unsigned long gpu_freq;
>  	int ret = 0;
>  
> @@ -125,6 +126,16 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>  		if (gpu_freq == gmu->gpu_freqs[perf_index])
>  			break;
>  
> +	/* If enabled, find the corresponding DDR bandwidth index */
> +	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
> +		unsigned int bw = dev_pm_opp_get_bandwidth(opp, true, 0);
> +
> +		for (bw_index = 0; bw_index < gmu->nr_gpu_bws - 1; bw_index++) {
> +			if (bw == gmu->gpu_bw_table[bw_index])
> +				break;
> +		}
> +	}
> +
>  	gmu->current_perf_index = perf_index;
>  	gmu->freq = gmu->gpu_freqs[perf_index];
>  
> @@ -140,8 +151,10 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>  		return;
>  
>  	if (!gmu->legacy) {
> -		a6xx_hfi_set_freq(gmu, perf_index);
> -		dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
> +		a6xx_hfi_set_freq(gmu, perf_index, bw_index);
> +		/* With Bandwidth voting, we now vote for all resources, so skip OPP set */
> +		if (bw_index)

if (!bw_index) ???

Also should there be a 0 vote too in case we are shutting down /
suspending?

> +			dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
>  		return;
>  	}
>  
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> index 95c632d8987a517f067c48c61c6c06b9a4f61fc0..9b4f2b1a0c48a133cd5c48713bc321c74eaffce9 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> @@ -205,7 +205,7 @@ void a6xx_hfi_init(struct a6xx_gmu *gmu);
>  int a6xx_hfi_start(struct a6xx_gmu *gmu, int boot_state);
>  void a6xx_hfi_stop(struct a6xx_gmu *gmu);
>  int a6xx_hfi_send_prep_slumber(struct a6xx_gmu *gmu);
> -int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index);
> +int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int perf_index, int bw_index);
>  
>  bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu);
>  bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu);
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> index 9a89ba95843e7805d78f0e5ddbe328677b6431dd..e2325c15677f1a1194a811e6ecbb5931bdfb1ad9 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> @@ -752,13 +752,13 @@ static int a6xx_hfi_send_core_fw_start(struct a6xx_gmu *gmu)
>  		sizeof(msg), NULL, 0);
>  }
>  
> -int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index)
> +int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int freq_index, int bw_index)
>  {
>  	struct a6xx_hfi_gx_bw_perf_vote_cmd msg = { 0 };
>  
>  	msg.ack_type = 1; /* blocking */
> -	msg.freq = index;
> -	msg.bw = 0; /* TODO: bus scaling */
> +	msg.freq = freq_index;
> +	msg.bw = bw_index;
>  
>  	return a6xx_hfi_send_msg(gmu, HFI_H2F_MSG_GX_BW_PERF_VOTE, &msg,
>  		sizeof(msg), NULL, 0);
> 
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750
  2024-11-13 15:48 ` [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750 Neil Armstrong
@ 2024-11-15  7:33   ` Dmitry Baryshkov
  2024-11-15  9:20     ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15  7:33 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Wed, Nov 13, 2024 at 04:48:32PM +0100, Neil Armstrong wrote:
> Now all the DDR bandwidth voting via the GPU Management Unit (GMU)
> is in place, let's declare the Bus Control Modules (BCMs) and

s/let's //g

> it's parameters in the GPU info struct and add the GMU_BW_VOTE
> quirk to enable it.

Can we define a function that checks for info.bcm[0].name isntead of
adding a quirk?

> 
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> index 0c560e84ad5a53bb4e8a49ba4e153ce9cf33f7ae..014a24256b832d8e03fe06a6516b5348a5c0474a 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> @@ -1379,7 +1379,8 @@ static const struct adreno_info a7xx_gpus[] = {
>  		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
>  		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>  			  ADRENO_QUIRK_HAS_HW_APRIV |
> -			  ADRENO_QUIRK_PREEMPTION,
> +			  ADRENO_QUIRK_PREEMPTION |
> +			  ADRENO_QUIRK_GMU_BW_VOTE,
>  		.init = a6xx_gpu_init,
>  		.zapfw = "a740_zap.mdt",
>  		.a6xx = &(const struct a6xx_info) {
> @@ -1388,6 +1389,16 @@ static const struct adreno_info a7xx_gpus[] = {
>  			.pwrup_reglist = &a7xx_pwrup_reglist,
>  			.gmu_chipid = 0x7020100,
>  			.gmu_cgc_mode = 0x00020202,
> +			.bcm = {
> +				[0] = { .name = "SH0", .buswidth = 16 },
> +				[1] = { .name = "MC0", .buswidth = 4 },
> +				[2] = {
> +					.name = "ACV",
> +					.fixed = true,
> +					.perfmode = BIT(3),
> +					.perfmode_bw = 16500000,

Is it a platform property or GPU / GMU property? Can expect that there
might be several SoCs having the same GPU, but different perfmode_bw
entry?

> +				},
> +			},
>  		},
>  		.address_space_size = SZ_16G,
>  		.preempt_record_size = 4192 * SZ_1K,
> @@ -1424,7 +1435,8 @@ static const struct adreno_info a7xx_gpus[] = {
>  		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
>  		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>  			  ADRENO_QUIRK_HAS_HW_APRIV |
> -			  ADRENO_QUIRK_PREEMPTION,
> +			  ADRENO_QUIRK_PREEMPTION |
> +			  ADRENO_QUIRK_GMU_BW_VOTE,
>  		.init = a6xx_gpu_init,
>  		.zapfw = "gen70900_zap.mbn",
>  		.a6xx = &(const struct a6xx_info) {
> @@ -1432,6 +1444,16 @@ static const struct adreno_info a7xx_gpus[] = {
>  			.pwrup_reglist = &a7xx_pwrup_reglist,
>  			.gmu_chipid = 0x7090100,
>  			.gmu_cgc_mode = 0x00020202,
> +			.bcm = {
> +				[0] = { .name = "SH0", .buswidth = 16 },
> +				[1] = { .name = "MC0", .buswidth = 4 },
> +				[2] = {
> +					.name = "ACV",
> +					.fixed = true,
> +					.perfmode = BIT(2),
> +					.perfmode_bw = 10687500,
> +				},
> +			},
>  		},
>  		.address_space_size = SZ_16G,
>  		.preempt_record_size = 3572 * SZ_1K,
> 
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
  2024-11-15  7:20   ` Dmitry Baryshkov
@ 2024-11-15  9:09     ` Neil Armstrong
  2024-11-15 14:34       ` Dmitry Baryshkov
  0 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-15  9:09 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On 15/11/2024 08:20, Dmitry Baryshkov wrote:
> On Wed, Nov 13, 2024 at 04:48:29PM +0100, Neil Armstrong wrote:
>> The Adreno GMU Management Unit (GMU) can also scale DDR Bandwidth along
>> the Frequency and Power Domain level, but by default we leave the
>> OPP core scale the interconnect ddr path.
>>
>> In order to get the vote values to be used by the GPU Management
>> Unit (GMU), we need to parse all the possible OPP Bandwidths and
>> create a vote value to be send to the appropriate Bus Control
>> Modules (BCMs) declared in the GPU info struct.
>>
>> The vote array will be used to dynamically generate the GMU bw_table
>> sent during the GMU power-up.
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 163 ++++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  12 +++
>>   drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
>>   3 files changed, 176 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> index 14db7376c712d19446b38152e480bd5a1e0a5198..504a7c5d5a9df4c787951f2ae3a69d566d205ad5 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> @@ -9,6 +9,7 @@
>>   #include <linux/pm_domain.h>
>>   #include <linux/pm_opp.h>
>>   #include <soc/qcom/cmd-db.h>
>> +#include <soc/qcom/tcs.h>
>>   #include <drm/drm_gem.h>
>>   
>>   #include "a6xx_gpu.h"
>> @@ -1287,6 +1288,119 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu)
>>   	return 0;
>>   }
>>   
>> +struct a6xx_bcm_data {
>> +	u32 buswidth;
>> +	unsigned int unit;
>> +	unsigned int width;
> 
> In bits?
> 
>> +	unsigned int vcd;
> 
> What is this?

I'll also copy the icc-rpmh.h doc associated with those fields

> 
>> +	bool fixed;
> 
> What does it mean?

I took it from downstream, but it's the same as qcom_icc_bcm enable_mask instead here the mask depends on the platform and OPP, this is why I specified it in perfmode.

> 
>> +	unsigned int perfmode;
>> +	unsigned int perfmode_bw;
>> +};
>> +
>> +struct bcm_db {
>> +	__le32 unit;
>> +	__le16 width;
>> +	u8 vcd;
>> +	u8 reserved;
>> +};
>> +
>> +static int a6xx_gmu_rpmh_get_bcm_data(const struct a6xx_bcm *bcm,
>> +				      struct a6xx_bcm_data *bcm_data)
> 
> Is there a reason to copy CMD DB and BCM data to the interim
> representation instead of using those directly?

I guess I can keep bcm_db & a6xx_bcm as-is and do the _to_cpu() in-place.

> 
>> +{
>> +	const struct bcm_db *data;
>> +	size_t count;
>> +
>> +	data = cmd_db_read_aux_data(bcm->name, &count);
>> +	if (IS_ERR(data))
>> +		return PTR_ERR(data);
>> +
>> +	if (!count)
>> +		return -EINVAL;
>> +
>> +	bcm_data->unit = le32_to_cpu(data->unit);
>> +	bcm_data->width = le16_to_cpu(data->width);
>> +	bcm_data->vcd = data->vcd;
>> +	bcm_data->fixed = bcm->fixed;
>> +	bcm_data->perfmode = bcm->perfmode;
>> +	bcm_data->perfmode_bw = bcm->perfmode_bw;
>> +	bcm_data->buswidth = bcm->buswidth;
>> +
>> +	return 0;
>> +}
>> +
>> +static void a6xx_gmu_rpmh_calc_bw_vote(struct a6xx_bcm_data *bcms,
>> +				       int count, u32 bw, u32 *data)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < count; i++) {
>> +		bool valid = true;
>> +		bool commit = false;
>> +		u64 peak, y;
>> +
>> +		if (i == count - 1 || bcms[i].vcd != bcms[i + 1].vcd)
>> +			commit = true;
>> +
>> +		if (bcms[i].fixed) {
>> +			if (!bw)
>> +				data[i] = BCM_TCS_CMD(commit, false, 0x0, 0x0);
>> +			else
>> +				data[i] = BCM_TCS_CMD(commit, true, 0x0,
>> +					bw >= bcms[i].perfmode_bw ?
>> +						bcms[i].perfmode : 0x0);
>> +			continue;
>> +		}
>> +
>> +		/* Multiple the bandwidth by the width of the connection */
> 
> ... and divide by the bus width. However it's not clear why you are
> multiplying bandwidth (bits or bytes per second) with the width
> (probably also bits?). Or is it not a width but the number of paths
> between units?

So this is basically the same as in bcm_agregate:
https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/interconnect/qcom/bcm-voter.c#L91

Just done slightly differently since we don't aggregate stuff but we want
to set the bandwidth directly here from the GMU.

> 
>> +		peak = (u64)bw * bcms[i].width;
>> +		do_div(peak, bcms[i].buswidth);
>> +
>> +		/* Input bandwidth value is in KBps */
> 
> Input or OPP / Interconnect?

I don't see the point, it's the input of the function which directly comes from OPP which is in KBps

> 
>> +		y = peak * 1000ULL;
>> +		do_div(y, bcms[i].unit);
>> +
>> +		/*
>> +		 * If a bandwidth value was specified but the calculation ends
>> +		 * rounding down to zero, set a minimum level
>> +		 */
>> +		if (bw && y == 0)
>> +			y = 1;
> 
> Is it a real usecase or just a safety net? If the bandwidth ends up
> being very low, maybe we should warn the users about it?

Probably a safety net, perhaps we could warn instead

> 
>> +
>> +		y = min_t(u64, y, BCM_TCS_CMD_VOTE_MASK);
>> +		if (!y)
>> +			valid = false;
> 
> This can probably be coupled with the previous condition.

Yeah I should probably refactor it and just avoid doing the
calculation if bw == 0.

> 
>> +
>> +		data[i] = BCM_TCS_CMD(commit, valid, y, y);
>> +	}
>> +}
>> +
>> +static int a6xx_gmu_rpmh_bw_votes_init(const struct a6xx_info *info, struct a6xx_gmu *gmu)
>> +{
>> +	struct a6xx_bcm_data bcms[3];
>> +	unsigned int bcm_count = 0;
>> +	int ret, index;
>> +
>> +	/* Retrieve BCM data from cmd-db and merge with a6xx_info bcm table */
>> +	for (index = 0; index < 3; index++) {
> 
> Magic number 3.
> 
>> +		if (!info->bcm[index].name)
>> +			continue;
>> +
>> +		ret = a6xx_gmu_rpmh_get_bcm_data(&info->bcm[index], &bcms[index]);
>> +		if (ret)
>> +			return ret;
>> +
>> +		++bcm_count;
>> +	}
>> +
>> +	/* Generate BCM votes values for each bandwidth & bcm */
>> +	for (index = 0; index < gmu->nr_gpu_bws; index++)
>> +		a6xx_gmu_rpmh_calc_bw_vote(bcms, bcm_count, gmu->gpu_bw_table[index],
>> +					   gmu->gpu_bw_votes[index]);
>> +
>> +	return 0;
>> +}
>> +
>>   /* Return the 'arc-level' for the given frequency */
>>   static unsigned int a6xx_gmu_get_arc_level(struct device *dev,
>>   					   unsigned long freq)
>> @@ -1390,12 +1504,15 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device *dev, u32 *votes,
>>    * The GMU votes with the RPMh for itself and on behalf of the GPU but we need
>>    * to construct the list of votes on the CPU and send it over. Query the RPMh
>>    * voltage levels and build the votes
>> + * The GMU can also vote for DDR interconnects, use the OPP bandwidth entries
>> + * and BCM parameters to build the votes.
>>    */
>>   
>>   static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
>>   {
>>   	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
>>   	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>> +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
>>   	struct msm_gpu *gpu = &adreno_gpu->base;
>>   	int ret;
>>   
>> @@ -1407,6 +1524,10 @@ static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
>>   	ret |= a6xx_gmu_rpmh_arc_votes_init(gmu->dev, gmu->cx_arc_votes,
>>   		gmu->gmu_freqs, gmu->nr_gmu_freqs, "cx.lvl");
>>   
>> +	/* Build the interconnect votes */
>> +	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
>> +		ret |= a6xx_gmu_rpmh_bw_votes_init(info, gmu);
>> +
>>   	return ret;
>>   }
>>   
>> @@ -1442,6 +1563,38 @@ static int a6xx_gmu_build_freq_table(struct device *dev, unsigned long *freqs,
>>   	return index;
>>   }
>>   
>> +static int a6xx_gmu_build_bw_table(struct device *dev, unsigned long *bandwidths,
>> +		u32 size)
>> +{
>> +	int count = dev_pm_opp_get_opp_count(dev);
>> +	struct dev_pm_opp *opp;
>> +	int i, index = 0;
>> +	unsigned int bandwidth = 1;
>> +
>> +	/*
>> +	 * The OPP table doesn't contain the "off" bandwidth level so we need to
>> +	 * add 1 to the table size to account for it
>> +	 */
>> +
>> +	if (WARN(count + 1 > size,
>> +		"The GMU bandwidth table is being truncated\n"))
>> +		count = size - 1;
>> +
>> +	/* Set the "off" bandwidth */
>> +	bandwidths[index++] = 0;
>> +
>> +	for (i = 0; i < count; i++) {
>> +		opp = dev_pm_opp_find_bw_ceil(dev, &bandwidth, 0);
>> +		if (IS_ERR(opp))
>> +			break;
>> +
>> +		dev_pm_opp_put(opp);
>> +		bandwidths[index++] = bandwidth++;
>> +	}
>> +
>> +	return index;
>> +}
>> +
>>   static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
>>   {
>>   	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
>> @@ -1472,6 +1625,16 @@ static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
>>   
>>   	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
>>   
>> +	/*
>> +	 * The GMU also handles GPU Interconnect Votes so build a list
>> +	 * of DDR bandwidths from the GPU OPP table
>> +	 */
>> +	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
>> +		gmu->nr_gpu_bws = a6xx_gmu_build_bw_table(&gpu->pdev->dev,
>> +			gmu->gpu_bw_table, ARRAY_SIZE(gmu->gpu_bw_table));
>> +
>> +	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
>> +
>>   	/* Build the list of RPMh votes that we'll send to the GMU */
>>   	return a6xx_gmu_rpmh_votes_init(gmu);
>>   }
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
>> index b4a79f88ccf45cfe651c86d2a9da39541c5772b3..95c632d8987a517f067c48c61c6c06b9a4f61fc0 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
>> @@ -19,6 +19,14 @@ struct a6xx_gmu_bo {
>>   	u64 iova;
>>   };
>>   
>> +struct a6xx_bcm {
>> +	char *name;
>> +	unsigned int buswidth;
>> +	bool fixed;
>> +	unsigned int perfmode;
>> +	unsigned int perfmode_bw;
>> +};
>> +
>>   /*
>>    * These define the different GMU wake up options - these define how both the
>>    * CPU and the GMU bring up the hardware
>> @@ -82,6 +90,10 @@ struct a6xx_gmu {
>>   	unsigned long gpu_freqs[16];
>>   	u32 gx_arc_votes[16];
>>   
>> +	int nr_gpu_bws;
>> +	unsigned long gpu_bw_table[16];
>> +	u32 gpu_bw_votes[16][3];
> 
> Is it is the same magic 16 as we have few lines above or is this 16 a
> different magic 16? And also 3 is a pure dark secret.

It's the same magic 16, since we use the same OPPs, the 3 is the actual number of BCMs we currently use, I wonder sure define should go, including the magic 16.

> 
>> +
>>   	int nr_gmu_freqs;
>>   	unsigned long gmu_freqs[4];
>>   	u32 cx_arc_votes[4];
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>> index 4aceffb6aae89c781facc2a6e4a82b20b341b6cb..d779d700120cbd974ee87a67214739b1d85156e2 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>> @@ -44,6 +44,7 @@ struct a6xx_info {
>>   	u32 gmu_chipid;
>>   	u32 gmu_cgc_mode;
>>   	u32 prim_fifo_threshold;
>> +	const struct a6xx_bcm bcm[3];
>>   };
>>   
>>   struct a6xx_gpu {
>>
>> -- 
>> 2.34.1
>>
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table
  2024-11-15  7:24   ` Dmitry Baryshkov
@ 2024-11-15  9:11     ` Neil Armstrong
  2024-11-15 14:35       ` Dmitry Baryshkov
  0 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-15  9:11 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On 15/11/2024 08:24, Dmitry Baryshkov wrote:
> On Wed, Nov 13, 2024 at 04:48:30PM +0100, Neil Armstrong wrote:
>> The Adreno GPU Management Unit (GMU) can also scale the ddr
>> bandwidth along the frequency and power domain level, but for
>> now we statically fill the bw_table with values from the
>> downstream driver.
>>
>> Only the first entry is used, which is a disable vote, so we
>> currently rely on scaling via the linux interconnect paths.
>>
>> Let's dynamically generate the bw_table with the vote values
>> previously calculated from the OPPs.
> 
> Nice to see this being worked upon. I hope the code can is generic
> enough so that we can use it from other adreno_foo_build_bw_table()
> functions.

I would hope so, but I don't have the HW to properly test it on those
platforms.

> 
>>
>> Those entried will then be used by the GMU when passing the
>> appropriate bandwidth level when voting for a gpu frequency.
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 48 +++++++++++++++++++++++++++--------
>>   1 file changed, 37 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
>> index cb8844ed46b29c4569d05eb7a24f7b27e173190f..9a89ba95843e7805d78f0e5ddbe328677b6431dd 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
>> @@ -596,22 +596,48 @@ static void a730_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
>>   	msg->cnoc_cmds_data[1][0] = 0x60000001;
>>   }
>>   
>> -static void a740_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
>> +static void a740_generate_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
>> +				   struct a6xx_hfi_msg_bw_table *msg)
>>   {
>> -	msg->bw_level_num = 1;
>> +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
>> +	unsigned int i, j;
>>   
>> -	msg->ddr_cmds_num = 3;
>>   	msg->ddr_wait_bitmask = 0x7;
>>   
>> -	msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
>> -	msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
>> -	msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
>> +	for (i = 0; i < 3; i++) {
>> +		if (!info->bcm[i].name)
>> +			break;
>> +		msg->ddr_cmds_addrs[i] = cmd_db_read_addr(info->bcm[i].name);
>> +	}
>> +	msg->ddr_cmds_num = i;
>>   
>> -	msg->ddr_cmds_data[0][0] = 0x40000000;
>> -	msg->ddr_cmds_data[0][1] = 0x40000000;
>> -	msg->ddr_cmds_data[0][2] = 0x40000000;
>> +	for (i = 0; i < gmu->nr_gpu_bws; ++i)
>> +		for (j = 0; j < msg->ddr_cmds_num; j++)
>> +			msg->ddr_cmds_data[i][j] = gmu->gpu_bw_votes[i][j];
>> +	msg->bw_level_num = gmu->nr_gpu_bws;
>> +}
>> +
>> +static void a740_build_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
>> +				struct a6xx_hfi_msg_bw_table *msg)
>> +{
>> +	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
>> +		a740_generate_bw_table(adreno_gpu, gmu, msg);
>> +	} else {
> 
> Why do we need a fallback code here?

Because at this particular commit, it would generate an invalid table, I should probably remove the fallback at the end

> 
>> +		msg->bw_level_num = 1;
>>   
>> -	/* TODO: add a proper dvfs table */
>> +		msg->ddr_cmds_num = 3;
>> +		msg->ddr_wait_bitmask = 0x7;
>> +
>> +		msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
>> +		msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
>> +		msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
>> +
>> +		msg->ddr_cmds_data[0][0] = 0x40000000;
>> +		msg->ddr_cmds_data[0][1] = 0x40000000;
>> +		msg->ddr_cmds_data[0][2] = 0x40000000;
>> +
>> +		/* TODO: add a proper dvfs table */
> 
> I think TODO is unapplicable anymore.
> 
>> +	}
>>   
>>   	msg->cnoc_cmds_num = 1;
>>   	msg->cnoc_wait_bitmask = 0x1;
>> @@ -691,7 +717,7 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
>>   	else if (adreno_is_a730(adreno_gpu))
>>   		a730_build_bw_table(msg);
>>   	else if (adreno_is_a740_family(adreno_gpu))
>> -		a740_build_bw_table(msg);
>> +		a740_build_bw_table(adreno_gpu, gmu, msg);
>>   	else
>>   		a6xx_build_bw_table(msg);
>>   
>>
>> -- 
>> 2.34.1
>>
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index
  2024-11-15  7:28   ` Dmitry Baryshkov
@ 2024-11-15  9:15     ` Neil Armstrong
  0 siblings, 0 replies; 29+ messages in thread
From: Neil Armstrong @ 2024-11-15  9:15 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On 15/11/2024 08:28, Dmitry Baryshkov wrote:
> On Wed, Nov 13, 2024 at 04:48:31PM +0100, Neil Armstrong wrote:
>> The Adreno GMU Management Unit (GMU) can also scale the DDR Bandwidth
>> along the Frequency and Power Domain level, until now we left the OPP
>> core scale the OPP bandwidth via the interconnect path.
>>
>> In order to enable bandwidth voting via the GPU Management
>> Unit (GMU), when an opp is set by devfreq we also look for
>> the corresponding bandwidth index in the previously generated
>> bw_table and pass this value along the frequency index to the GMU.
>>
>> Since we now vote for all resources via the GMU, setting the OPP
>> is no more needed, so we can completely skip calling
>> dev_pm_opp_set_opp() in this situation.
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 17 +++++++++++++++--
>>   drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 +-
>>   drivers/gpu/drm/msm/adreno/a6xx_hfi.c |  6 +++---
>>   3 files changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> index 504a7c5d5a9df4c787951f2ae3a69d566d205ad5..1131c3521ebbb0d053aceb162052ed01e197726a 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> @@ -113,6 +113,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>>   	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
>>   	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>>   	u32 perf_index;
>> +	u32 bw_index = 0;
>>   	unsigned long gpu_freq;
>>   	int ret = 0;
>>   
>> @@ -125,6 +126,16 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>>   		if (gpu_freq == gmu->gpu_freqs[perf_index])
>>   			break;
>>   
>> +	/* If enabled, find the corresponding DDR bandwidth index */
>> +	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
>> +		unsigned int bw = dev_pm_opp_get_bandwidth(opp, true, 0);
>> +
>> +		for (bw_index = 0; bw_index < gmu->nr_gpu_bws - 1; bw_index++) {
>> +			if (bw == gmu->gpu_bw_table[bw_index])
>> +				break;
>> +		}
>> +	}
>> +
>>   	gmu->current_perf_index = perf_index;
>>   	gmu->freq = gmu->gpu_freqs[perf_index];
>>   
>> @@ -140,8 +151,10 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
>>   		return;
>>   
>>   	if (!gmu->legacy) {
>> -		a6xx_hfi_set_freq(gmu, perf_index);
>> -		dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
>> +		a6xx_hfi_set_freq(gmu, perf_index, bw_index);
>> +		/* With Bandwidth voting, we now vote for all resources, so skip OPP set */
>> +		if (bw_index)
> 
> if (!bw_index) ???

Good catch, I added it back wrongly when refactoring...

> 
> Also should there be a 0 vote too in case we are shutting down /
> suspending?

It's already handled in a6xx_gmu_stop()

> 
>> +			dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
>>   		return;
>>   	}
>>   
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
>> index 95c632d8987a517f067c48c61c6c06b9a4f61fc0..9b4f2b1a0c48a133cd5c48713bc321c74eaffce9 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
>> @@ -205,7 +205,7 @@ void a6xx_hfi_init(struct a6xx_gmu *gmu);
>>   int a6xx_hfi_start(struct a6xx_gmu *gmu, int boot_state);
>>   void a6xx_hfi_stop(struct a6xx_gmu *gmu);
>>   int a6xx_hfi_send_prep_slumber(struct a6xx_gmu *gmu);
>> -int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index);
>> +int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int perf_index, int bw_index);
>>   
>>   bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu);
>>   bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu);
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
>> index 9a89ba95843e7805d78f0e5ddbe328677b6431dd..e2325c15677f1a1194a811e6ecbb5931bdfb1ad9 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
>> @@ -752,13 +752,13 @@ static int a6xx_hfi_send_core_fw_start(struct a6xx_gmu *gmu)
>>   		sizeof(msg), NULL, 0);
>>   }
>>   
>> -int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index)
>> +int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int freq_index, int bw_index)
>>   {
>>   	struct a6xx_hfi_gx_bw_perf_vote_cmd msg = { 0 };
>>   
>>   	msg.ack_type = 1; /* blocking */
>> -	msg.freq = index;
>> -	msg.bw = 0; /* TODO: bus scaling */
>> +	msg.freq = freq_index;
>> +	msg.bw = bw_index;
>>   
>>   	return a6xx_hfi_send_msg(gmu, HFI_H2F_MSG_GX_BW_PERF_VOTE, &msg,
>>   		sizeof(msg), NULL, 0);
>>
>> -- 
>> 2.34.1
>>
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750
  2024-11-15  7:33   ` Dmitry Baryshkov
@ 2024-11-15  9:20     ` Neil Armstrong
  2024-11-15 14:39       ` Dmitry Baryshkov
  0 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-15  9:20 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On 15/11/2024 08:33, Dmitry Baryshkov wrote:
> On Wed, Nov 13, 2024 at 04:48:32PM +0100, Neil Armstrong wrote:
>> Now all the DDR bandwidth voting via the GPU Management Unit (GMU)
>> is in place, let's declare the Bus Control Modules (BCMs) and
> 
> s/let's //g
> 
>> it's parameters in the GPU info struct and add the GMU_BW_VOTE
>> quirk to enable it.
> 
> Can we define a function that checks for info.bcm[0].name isntead of
> adding a quirk?

Probably, I'll need ideas to how design this better, perhaps a simple
capability bitfield in a6xx_info ?
There's other feature that are lacking, like ACD or BCL which are not supported
on all a6xx/a7xx gpus.

> 
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 26 ++++++++++++++++++++++++--
>>   1 file changed, 24 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>> index 0c560e84ad5a53bb4e8a49ba4e153ce9cf33f7ae..014a24256b832d8e03fe06a6516b5348a5c0474a 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>> @@ -1379,7 +1379,8 @@ static const struct adreno_info a7xx_gpus[] = {
>>   		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
>>   		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>>   			  ADRENO_QUIRK_HAS_HW_APRIV |
>> -			  ADRENO_QUIRK_PREEMPTION,
>> +			  ADRENO_QUIRK_PREEMPTION |
>> +			  ADRENO_QUIRK_GMU_BW_VOTE,
>>   		.init = a6xx_gpu_init,
>>   		.zapfw = "a740_zap.mdt",
>>   		.a6xx = &(const struct a6xx_info) {
>> @@ -1388,6 +1389,16 @@ static const struct adreno_info a7xx_gpus[] = {
>>   			.pwrup_reglist = &a7xx_pwrup_reglist,
>>   			.gmu_chipid = 0x7020100,
>>   			.gmu_cgc_mode = 0x00020202,
>> +			.bcm = {
>> +				[0] = { .name = "SH0", .buswidth = 16 },
>> +				[1] = { .name = "MC0", .buswidth = 4 },
>> +				[2] = {
>> +					.name = "ACV",
>> +					.fixed = true,
>> +					.perfmode = BIT(3),
>> +					.perfmode_bw = 16500000,
> 
> Is it a platform property or GPU / GMU property? Can expect that there
> might be several SoCs having the same GPU, but different perfmode_bw
> entry?

I presume this is SoC specific ? But today the XXX_build_bw_table() are
already SoC specific, so where should this go ?

Downstream specifies this in the adreno-gpulist.h, which is the equivalent
here.

Neil

> 
>> +				},
>> +			},
>>   		},
>>   		.address_space_size = SZ_16G,
>>   		.preempt_record_size = 4192 * SZ_1K,
>> @@ -1424,7 +1435,8 @@ static const struct adreno_info a7xx_gpus[] = {
>>   		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
>>   		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>>   			  ADRENO_QUIRK_HAS_HW_APRIV |
>> -			  ADRENO_QUIRK_PREEMPTION,
>> +			  ADRENO_QUIRK_PREEMPTION |
>> +			  ADRENO_QUIRK_GMU_BW_VOTE,
>>   		.init = a6xx_gpu_init,
>>   		.zapfw = "gen70900_zap.mbn",
>>   		.a6xx = &(const struct a6xx_info) {
>> @@ -1432,6 +1444,16 @@ static const struct adreno_info a7xx_gpus[] = {
>>   			.pwrup_reglist = &a7xx_pwrup_reglist,
>>   			.gmu_chipid = 0x7090100,
>>   			.gmu_cgc_mode = 0x00020202,
>> +			.bcm = {
>> +				[0] = { .name = "SH0", .buswidth = 16 },
>> +				[1] = { .name = "MC0", .buswidth = 4 },
>> +				[2] = {
>> +					.name = "ACV",
>> +					.fixed = true,
>> +					.perfmode = BIT(2),
>> +					.perfmode_bw = 10687500,
>> +				},
>> +			},
>>   		},
>>   		.address_space_size = SZ_16G,
>>   		.preempt_record_size = 3572 * SZ_1K,
>>
>> -- 
>> 2.34.1
>>
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk
  2024-11-15  7:07   ` Dmitry Baryshkov
@ 2024-11-15  9:21     ` Neil Armstrong
  2024-11-15 14:18       ` Dmitry Baryshkov
  0 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-15  9:21 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On 15/11/2024 08:07, Dmitry Baryshkov wrote:
> On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
>> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
>> along the Frequency and Power Domain level, but by default we leave the
>> OPP core vote for the interconnect ddr path.
>>
>> While scaling via the interconnect path was sufficient, newer GPUs
>> like the A750 requires specific vote paremeters and bandwidth to
>> achieve full functionality.
>>
>> Add a new Quirk enabling DDR Bandwidth vote via GMU.
> 
> Please describe, why this is defined as a quirk rather than a proper
> platform-level property. From my experience with 6xx and 7xx, all the
> platforms need to send some kind of BW data to the GMU.

Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?

Perhaps the "quirks" bitfield should be features instead ?

> 
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>   drivers/gpu/drm/msm/adreno/adreno_gpu.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> index e71f420f8b3a8e6cfc52dd1c4d5a63ef3704a07f..20b6b7f49473d42751cd4fb4fc82849be42cb807 100644
>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> @@ -57,6 +57,7 @@ enum adreno_family {
>>   #define ADRENO_QUIRK_HAS_HW_APRIV		BIT(3)
>>   #define ADRENO_QUIRK_HAS_CACHED_COHERENT	BIT(4)
>>   #define ADRENO_QUIRK_PREEMPTION			BIT(5)
>> +#define ADRENO_QUIRK_GMU_BW_VOTE		BIT(6)
>>   
>>   /* Helper for formating the chip_id in the way that userspace tools like
>>    * crashdec expect.
>>
>> -- 
>> 2.34.1
>>
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk
  2024-11-15  9:21     ` Neil Armstrong
@ 2024-11-15 14:18       ` Dmitry Baryshkov
  2024-11-15 15:10         ` Rob Clark
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15 14:18 UTC (permalink / raw)
  To: neil.armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Fri, 15 Nov 2024 at 11:21, Neil Armstrong <neil.armstrong@linaro.org> wrote:
>
> On 15/11/2024 08:07, Dmitry Baryshkov wrote:
> > On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
> >> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
> >> along the Frequency and Power Domain level, but by default we leave the
> >> OPP core vote for the interconnect ddr path.
> >>
> >> While scaling via the interconnect path was sufficient, newer GPUs
> >> like the A750 requires specific vote paremeters and bandwidth to
> >> achieve full functionality.
> >>
> >> Add a new Quirk enabling DDR Bandwidth vote via GMU.
> >
> > Please describe, why this is defined as a quirk rather than a proper
> > platform-level property. From my experience with 6xx and 7xx, all the
> > platforms need to send some kind of BW data to the GMU.
>
> Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?
>
> Perhaps the "quirks" bitfield should be features instead ?

Sounds like that.


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
  2024-11-15  9:09     ` Neil Armstrong
@ 2024-11-15 14:34       ` Dmitry Baryshkov
  0 siblings, 0 replies; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15 14:34 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Fri, Nov 15, 2024 at 10:09:44AM +0100, Neil Armstrong wrote:
> On 15/11/2024 08:20, Dmitry Baryshkov wrote:
> > On Wed, Nov 13, 2024 at 04:48:29PM +0100, Neil Armstrong wrote:
> > > The Adreno GMU Management Unit (GMU) can also scale DDR Bandwidth along
> > > the Frequency and Power Domain level, but by default we leave the
> > > OPP core scale the interconnect ddr path.
> > > 
> > > In order to get the vote values to be used by the GPU Management
> > > Unit (GMU), we need to parse all the possible OPP Bandwidths and
> > > create a vote value to be send to the appropriate Bus Control
> > > Modules (BCMs) declared in the GPU info struct.
> > > 
> > > The vote array will be used to dynamically generate the GMU bw_table
> > > sent during the GMU power-up.
> > > 
> > > Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> > > ---
> > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 163 ++++++++++++++++++++++++++++++++++
> > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  12 +++
> > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
> > >   3 files changed, 176 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > index 14db7376c712d19446b38152e480bd5a1e0a5198..504a7c5d5a9df4c787951f2ae3a69d566d205ad5 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > @@ -9,6 +9,7 @@
> > >   #include <linux/pm_domain.h>
> > >   #include <linux/pm_opp.h>
> > >   #include <soc/qcom/cmd-db.h>
> > > +#include <soc/qcom/tcs.h>
> > >   #include <drm/drm_gem.h>
> > >   #include "a6xx_gpu.h"
> > > @@ -1287,6 +1288,119 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu)
> > >   	return 0;
> > >   }
> > > +struct a6xx_bcm_data {
> > > +	u32 buswidth;
> > > +	unsigned int unit;
> > > +	unsigned int width;
> > 
> > In bits?
> > 
> > > +	unsigned int vcd;
> > 
> > What is this?
> 
> I'll also copy the icc-rpmh.h doc associated with those fields

Yes, please please provide some kerneldoc for the srtuct.

> 
> > 
> > > +	bool fixed;
> > 
> > What does it mean?
> 
> I took it from downstream, but it's the same as qcom_icc_bcm enable_mask instead here the mask depends on the platform and OPP, this is why I specified it in perfmode.
> 
> > 
> > > +	unsigned int perfmode;
> > > +	unsigned int perfmode_bw;
> > > +};
> > > +
> > > +struct bcm_db {
> > > +	__le32 unit;
> > > +	__le16 width;
> > > +	u8 vcd;
> > > +	u8 reserved;
> > > +};
> > > +
> > > +static int a6xx_gmu_rpmh_get_bcm_data(const struct a6xx_bcm *bcm,
> > > +				      struct a6xx_bcm_data *bcm_data)
> > 
> > Is there a reason to copy CMD DB and BCM data to the interim
> > representation instead of using those directly?
> 
> I guess I can keep bcm_db & a6xx_bcm as-is and do the _to_cpu() in-place.

I think that makes sense.

> 
> > 
> > > +{
> > > +	const struct bcm_db *data;
> > > +	size_t count;
> > > +
> > > +	data = cmd_db_read_aux_data(bcm->name, &count);
> > > +	if (IS_ERR(data))
> > > +		return PTR_ERR(data);
> > > +
> > > +	if (!count)
> > > +		return -EINVAL;
> > > +
> > > +	bcm_data->unit = le32_to_cpu(data->unit);
> > > +	bcm_data->width = le16_to_cpu(data->width);
> > > +	bcm_data->vcd = data->vcd;
> > > +	bcm_data->fixed = bcm->fixed;
> > > +	bcm_data->perfmode = bcm->perfmode;
> > > +	bcm_data->perfmode_bw = bcm->perfmode_bw;
> > > +	bcm_data->buswidth = bcm->buswidth;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void a6xx_gmu_rpmh_calc_bw_vote(struct a6xx_bcm_data *bcms,
> > > +				       int count, u32 bw, u32 *data)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < count; i++) {
> > > +		bool valid = true;
> > > +		bool commit = false;
> > > +		u64 peak, y;
> > > +
> > > +		if (i == count - 1 || bcms[i].vcd != bcms[i + 1].vcd)
> > > +			commit = true;
> > > +
> > > +		if (bcms[i].fixed) {
> > > +			if (!bw)
> > > +				data[i] = BCM_TCS_CMD(commit, false, 0x0, 0x0);
> > > +			else
> > > +				data[i] = BCM_TCS_CMD(commit, true, 0x0,
> > > +					bw >= bcms[i].perfmode_bw ?
> > > +						bcms[i].perfmode : 0x0);
> > > +			continue;
> > > +		}
> > > +
> > > +		/* Multiple the bandwidth by the width of the connection */
> > 
> > ... and divide by the bus width. However it's not clear why you are
> > multiplying bandwidth (bits or bytes per second) with the width
> > (probably also bits?). Or is it not a width but the number of paths
> > between units?
> 
> So this is basically the same as in bcm_agregate:
> https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/interconnect/qcom/bcm-voter.c#L91
> 
> Just done slightly differently since we don't aggregate stuff but we want
> to set the bandwidth directly here from the GMU.

I see. And width comes from the CMD DB too.

> 
> > 
> > > +		peak = (u64)bw * bcms[i].width;
> > > +		do_div(peak, bcms[i].buswidth);
> > > +
> > > +		/* Input bandwidth value is in KBps */
> > 
> > Input or OPP / Interconnect?
> 
> I don't see the point, it's the input of the function which directly comes from OPP which is in KBps

I meant is it about the calculated 'peak' value? Also it might be worth
adding something mult_frac_ull, using do_div() instead of usual
division.


> > > +		y = peak * 1000ULL;
> > > +		do_div(y, bcms[i].unit);
> > > +
> > > +		/*
> > > +		 * If a bandwidth value was specified but the calculation ends
> > > +		 * rounding down to zero, set a minimum level
> > > +		 */
> > > +		if (bw && y == 0)
> > > +			y = 1;
> > 
> > Is it a real usecase or just a safety net? If the bandwidth ends up
> > being very low, maybe we should warn the users about it?
> 
> Probably a safety net, perhaps we could warn instead
> 
> > 
> > > +
> > > +		y = min_t(u64, y, BCM_TCS_CMD_VOTE_MASK);
> > > +		if (!y)
> > > +			valid = false;
> > 
> > This can probably be coupled with the previous condition.
> 
> Yeah I should probably refactor it and just avoid doing the
> calculation if bw == 0.
> 
> > 
> > > +
> > > +		data[i] = BCM_TCS_CMD(commit, valid, y, y);
> > > +	}
> > > +}
> > > +
> > > +static int a6xx_gmu_rpmh_bw_votes_init(const struct a6xx_info *info, struct a6xx_gmu *gmu)
> > > +{
> > > +	struct a6xx_bcm_data bcms[3];
> > > +	unsigned int bcm_count = 0;
> > > +	int ret, index;
> > > +
> > > +	/* Retrieve BCM data from cmd-db and merge with a6xx_info bcm table */
> > > +	for (index = 0; index < 3; index++) {
> > 
> > Magic number 3.
> > 
> > > +		if (!info->bcm[index].name)
> > > +			continue;
> > > +
> > > +		ret = a6xx_gmu_rpmh_get_bcm_data(&info->bcm[index], &bcms[index]);
> > > +		if (ret)
> > > +			return ret;
> > > +
> > > +		++bcm_count;
> > > +	}
> > > +
> > > +	/* Generate BCM votes values for each bandwidth & bcm */
> > > +	for (index = 0; index < gmu->nr_gpu_bws; index++)
> > > +		a6xx_gmu_rpmh_calc_bw_vote(bcms, bcm_count, gmu->gpu_bw_table[index],
> > > +					   gmu->gpu_bw_votes[index]);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >   /* Return the 'arc-level' for the given frequency */
> > >   static unsigned int a6xx_gmu_get_arc_level(struct device *dev,
> > >   					   unsigned long freq)
> > > @@ -1390,12 +1504,15 @@ static int a6xx_gmu_rpmh_arc_votes_init(struct device *dev, u32 *votes,
> > >    * The GMU votes with the RPMh for itself and on behalf of the GPU but we need
> > >    * to construct the list of votes on the CPU and send it over. Query the RPMh
> > >    * voltage levels and build the votes
> > > + * The GMU can also vote for DDR interconnects, use the OPP bandwidth entries
> > > + * and BCM parameters to build the votes.
> > >    */
> > >   static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
> > >   {
> > >   	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
> > >   	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
> > >   	struct msm_gpu *gpu = &adreno_gpu->base;
> > >   	int ret;
> > > @@ -1407,6 +1524,10 @@ static int a6xx_gmu_rpmh_votes_init(struct a6xx_gmu *gmu)
> > >   	ret |= a6xx_gmu_rpmh_arc_votes_init(gmu->dev, gmu->cx_arc_votes,
> > >   		gmu->gmu_freqs, gmu->nr_gmu_freqs, "cx.lvl");
> > > +	/* Build the interconnect votes */
> > > +	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
> > > +		ret |= a6xx_gmu_rpmh_bw_votes_init(info, gmu);
> > > +
> > >   	return ret;
> > >   }
> > > @@ -1442,6 +1563,38 @@ static int a6xx_gmu_build_freq_table(struct device *dev, unsigned long *freqs,
> > >   	return index;
> > >   }
> > > +static int a6xx_gmu_build_bw_table(struct device *dev, unsigned long *bandwidths,
> > > +		u32 size)
> > > +{
> > > +	int count = dev_pm_opp_get_opp_count(dev);
> > > +	struct dev_pm_opp *opp;
> > > +	int i, index = 0;
> > > +	unsigned int bandwidth = 1;
> > > +
> > > +	/*
> > > +	 * The OPP table doesn't contain the "off" bandwidth level so we need to
> > > +	 * add 1 to the table size to account for it
> > > +	 */
> > > +
> > > +	if (WARN(count + 1 > size,
> > > +		"The GMU bandwidth table is being truncated\n"))
> > > +		count = size - 1;
> > > +
> > > +	/* Set the "off" bandwidth */
> > > +	bandwidths[index++] = 0;
> > > +
> > > +	for (i = 0; i < count; i++) {
> > > +		opp = dev_pm_opp_find_bw_ceil(dev, &bandwidth, 0);
> > > +		if (IS_ERR(opp))
> > > +			break;
> > > +
> > > +		dev_pm_opp_put(opp);
> > > +		bandwidths[index++] = bandwidth++;
> > > +	}
> > > +
> > > +	return index;
> > > +}
> > > +
> > >   static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
> > >   {
> > >   	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
> > > @@ -1472,6 +1625,16 @@ static int a6xx_gmu_pwrlevels_probe(struct a6xx_gmu *gmu)
> > >   	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
> > > +	/*
> > > +	 * The GMU also handles GPU Interconnect Votes so build a list
> > > +	 * of DDR bandwidths from the GPU OPP table
> > > +	 */
> > > +	if (adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE)
> > > +		gmu->nr_gpu_bws = a6xx_gmu_build_bw_table(&gpu->pdev->dev,
> > > +			gmu->gpu_bw_table, ARRAY_SIZE(gmu->gpu_bw_table));
> > > +
> > > +	gmu->current_perf_index = gmu->nr_gpu_freqs - 1;
> > > +
> > >   	/* Build the list of RPMh votes that we'll send to the GMU */
> > >   	return a6xx_gmu_rpmh_votes_init(gmu);
> > >   }
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> > > index b4a79f88ccf45cfe651c86d2a9da39541c5772b3..95c632d8987a517f067c48c61c6c06b9a4f61fc0 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
> > > @@ -19,6 +19,14 @@ struct a6xx_gmu_bo {
> > >   	u64 iova;
> > >   };
> > > +struct a6xx_bcm {
> > > +	char *name;
> > > +	unsigned int buswidth;
> > > +	bool fixed;
> > > +	unsigned int perfmode;
> > > +	unsigned int perfmode_bw;
> > > +};
> > > +
> > >   /*
> > >    * These define the different GMU wake up options - these define how both the
> > >    * CPU and the GMU bring up the hardware
> > > @@ -82,6 +90,10 @@ struct a6xx_gmu {
> > >   	unsigned long gpu_freqs[16];
> > >   	u32 gx_arc_votes[16];
> > > +	int nr_gpu_bws;
> > > +	unsigned long gpu_bw_table[16];
> > > +	u32 gpu_bw_votes[16][3];
> > 
> > Is it is the same magic 16 as we have few lines above or is this 16 a
> > different magic 16? And also 3 is a pure dark secret.
> 
> It's the same magic 16, since we use the same OPPs, the 3 is the actual number of BCMs we currently use, I wonder sure define should go, including the magic 16.

I think those defines can go to a6xx_gmu.h.
Also if the 16 is the same, should we define something like

  struct a6xx_gmu_freq_something {
  };

...

   struct a6xx_gmu {
       struct a6xx_gmu_freq_something bw_data[16];
   };

Seeing repetitive field size always makes me think about such a change.

> 
> > 
> > > +
> > >   	int nr_gmu_freqs;
> > >   	unsigned long gmu_freqs[4];
> > >   	u32 cx_arc_votes[4];
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> > > index 4aceffb6aae89c781facc2a6e4a82b20b341b6cb..d779d700120cbd974ee87a67214739b1d85156e2 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> > > @@ -44,6 +44,7 @@ struct a6xx_info {
> > >   	u32 gmu_chipid;
> > >   	u32 gmu_cgc_mode;
> > >   	u32 prim_fifo_threshold;
> > > +	const struct a6xx_bcm bcm[3];
> > >   };
> > >   struct a6xx_gpu {
> > > 
> > > -- 
> > > 2.34.1
> > > 
> > 
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table
  2024-11-15  9:11     ` Neil Armstrong
@ 2024-11-15 14:35       ` Dmitry Baryshkov
  0 siblings, 0 replies; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15 14:35 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Fri, Nov 15, 2024 at 10:11:09AM +0100, Neil Armstrong wrote:
> On 15/11/2024 08:24, Dmitry Baryshkov wrote:
> > On Wed, Nov 13, 2024 at 04:48:30PM +0100, Neil Armstrong wrote:
> > > The Adreno GPU Management Unit (GMU) can also scale the ddr
> > > bandwidth along the frequency and power domain level, but for
> > > now we statically fill the bw_table with values from the
> > > downstream driver.
> > > 
> > > Only the first entry is used, which is a disable vote, so we
> > > currently rely on scaling via the linux interconnect paths.
> > > 
> > > Let's dynamically generate the bw_table with the vote values
> > > previously calculated from the OPPs.
> > 
> > Nice to see this being worked upon. I hope the code can is generic
> > enough so that we can use it from other adreno_foo_build_bw_table()
> > functions.
> 
> I would hope so, but I don't have the HW to properly test it on those
> platforms.

Welcome to the club^W Lab.

> > > Those entried will then be used by the GMU when passing the
> > > appropriate bandwidth level when voting for a gpu frequency.
> > > 
> > > Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> > > ---
> > >   drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 48 +++++++++++++++++++++++++++--------
> > >   1 file changed, 37 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> > > index cb8844ed46b29c4569d05eb7a24f7b27e173190f..9a89ba95843e7805d78f0e5ddbe328677b6431dd 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> > > @@ -596,22 +596,48 @@ static void a730_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> > >   	msg->cnoc_cmds_data[1][0] = 0x60000001;
> > >   }
> > > -static void a740_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> > > +static void a740_generate_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
> > > +				   struct a6xx_hfi_msg_bw_table *msg)
> > >   {
> > > -	msg->bw_level_num = 1;
> > > +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
> > > +	unsigned int i, j;
> > > -	msg->ddr_cmds_num = 3;
> > >   	msg->ddr_wait_bitmask = 0x7;
> > > -	msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
> > > -	msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
> > > -	msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
> > > +	for (i = 0; i < 3; i++) {
> > > +		if (!info->bcm[i].name)
> > > +			break;
> > > +		msg->ddr_cmds_addrs[i] = cmd_db_read_addr(info->bcm[i].name);
> > > +	}
> > > +	msg->ddr_cmds_num = i;
> > > -	msg->ddr_cmds_data[0][0] = 0x40000000;
> > > -	msg->ddr_cmds_data[0][1] = 0x40000000;
> > > -	msg->ddr_cmds_data[0][2] = 0x40000000;
> > > +	for (i = 0; i < gmu->nr_gpu_bws; ++i)
> > > +		for (j = 0; j < msg->ddr_cmds_num; j++)
> > > +			msg->ddr_cmds_data[i][j] = gmu->gpu_bw_votes[i][j];
> > > +	msg->bw_level_num = gmu->nr_gpu_bws;
> > > +}
> > > +
> > > +static void a740_build_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
> > > +				struct a6xx_hfi_msg_bw_table *msg)
> > > +{
> > > +	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
> > > +		a740_generate_bw_table(adreno_gpu, gmu, msg);
> > > +	} else {
> > 
> > Why do we need a fallback code here?
> 
> Because at this particular commit, it would generate an invalid table, I should probably remove the fallback at the end

Or move this to a generic code that generates a table if there is no bw
data (like there is none for older platforms with the current DTs).

> 
> > 
> > > +		msg->bw_level_num = 1;
> > > -	/* TODO: add a proper dvfs table */
> > > +		msg->ddr_cmds_num = 3;
> > > +		msg->ddr_wait_bitmask = 0x7;
> > > +
> > > +		msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
> > > +		msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
> > > +		msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
> > > +
> > > +		msg->ddr_cmds_data[0][0] = 0x40000000;
> > > +		msg->ddr_cmds_data[0][1] = 0x40000000;
> > > +		msg->ddr_cmds_data[0][2] = 0x40000000;
> > > +
> > > +		/* TODO: add a proper dvfs table */
> > 
> > I think TODO is unapplicable anymore.
> > 
> > > +	}
> > >   	msg->cnoc_cmds_num = 1;
> > >   	msg->cnoc_wait_bitmask = 0x1;
> > > @@ -691,7 +717,7 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
> > >   	else if (adreno_is_a730(adreno_gpu))
> > >   		a730_build_bw_table(msg);
> > >   	else if (adreno_is_a740_family(adreno_gpu))
> > > -		a740_build_bw_table(msg);
> > > +		a740_build_bw_table(adreno_gpu, gmu, msg);
> > >   	else
> > >   		a6xx_build_bw_table(msg);
> > > 
> > > -- 
> > > 2.34.1
> > > 
> > 
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750
  2024-11-15  9:20     ` Neil Armstrong
@ 2024-11-15 14:39       ` Dmitry Baryshkov
  2024-11-18 13:42         ` Neil Armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-15 14:39 UTC (permalink / raw)
  To: Neil Armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Fri, Nov 15, 2024 at 10:20:01AM +0100, Neil Armstrong wrote:
> On 15/11/2024 08:33, Dmitry Baryshkov wrote:
> > On Wed, Nov 13, 2024 at 04:48:32PM +0100, Neil Armstrong wrote:
> > > Now all the DDR bandwidth voting via the GPU Management Unit (GMU)
> > > is in place, let's declare the Bus Control Modules (BCMs) and
> > 
> > s/let's //g
> > 
> > > it's parameters in the GPU info struct and add the GMU_BW_VOTE
> > > quirk to enable it.
> > 
> > Can we define a function that checks for info.bcm[0].name isntead of
> > adding a quirk?
> 
> Probably, I'll need ideas to how design this better, perhaps a simple
> capability bitfield in a6xx_info ?

I'm not sure if I follow the question. I think it's better to check for
the presens of the data rather than having a separate 'cap' bit in
addition to that data.

> There's other feature that are lacking, like ACD or BCL which are not supported
> on all a6xx/a7xx gpus.

Akhil is currently working on ACD, as you have seen from the patches.

> 
> > 
> > > 
> > > Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> > > ---
> > >   drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 26 ++++++++++++++++++++++++--
> > >   1 file changed, 24 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> > > index 0c560e84ad5a53bb4e8a49ba4e153ce9cf33f7ae..014a24256b832d8e03fe06a6516b5348a5c0474a 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> > > @@ -1379,7 +1379,8 @@ static const struct adreno_info a7xx_gpus[] = {
> > >   		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
> > >   		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> > >   			  ADRENO_QUIRK_HAS_HW_APRIV |
> > > -			  ADRENO_QUIRK_PREEMPTION,
> > > +			  ADRENO_QUIRK_PREEMPTION |
> > > +			  ADRENO_QUIRK_GMU_BW_VOTE,
> > >   		.init = a6xx_gpu_init,
> > >   		.zapfw = "a740_zap.mdt",
> > >   		.a6xx = &(const struct a6xx_info) {
> > > @@ -1388,6 +1389,16 @@ static const struct adreno_info a7xx_gpus[] = {
> > >   			.pwrup_reglist = &a7xx_pwrup_reglist,
> > >   			.gmu_chipid = 0x7020100,
> > >   			.gmu_cgc_mode = 0x00020202,
> > > +			.bcm = {
> > > +				[0] = { .name = "SH0", .buswidth = 16 },
> > > +				[1] = { .name = "MC0", .buswidth = 4 },
> > > +				[2] = {
> > > +					.name = "ACV",
> > > +					.fixed = true,
> > > +					.perfmode = BIT(3),
> > > +					.perfmode_bw = 16500000,
> > 
> > Is it a platform property or GPU / GMU property? Can expect that there
> > might be several SoCs having the same GPU, but different perfmode_bw
> > entry?
> 
> I presume this is SoC specific ? But today the XXX_build_bw_table() are
> already SoC specific, so where should this go ?

XXX_build_bw_table() are GPU-specific. There are cases of several SoCs
sharing the same GPU on them.

> Downstream specifies this in the adreno-gpulist.h, which is the equivalent
> here.

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk
  2024-11-15 14:18       ` Dmitry Baryshkov
@ 2024-11-15 15:10         ` Rob Clark
  2024-11-15 15:28           ` neil.armstrong
  0 siblings, 1 reply; 29+ messages in thread
From: Rob Clark @ 2024-11-15 15:10 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: neil.armstrong, Akhil P Oommen, Viresh Kumar, Nishanth Menon,
	Stephen Boyd, Rafael J. Wysocki, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Fri, Nov 15, 2024 at 6:18 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Fri, 15 Nov 2024 at 11:21, Neil Armstrong <neil.armstrong@linaro.org> wrote:
> >
> > On 15/11/2024 08:07, Dmitry Baryshkov wrote:
> > > On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
> > >> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
> > >> along the Frequency and Power Domain level, but by default we leave the
> > >> OPP core vote for the interconnect ddr path.
> > >>
> > >> While scaling via the interconnect path was sufficient, newer GPUs
> > >> like the A750 requires specific vote paremeters and bandwidth to
> > >> achieve full functionality.
> > >>
> > >> Add a new Quirk enabling DDR Bandwidth vote via GMU.
> > >
> > > Please describe, why this is defined as a quirk rather than a proper
> > > platform-level property. From my experience with 6xx and 7xx, all the
> > > platforms need to send some kind of BW data to the GMU.
> >
> > Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?
> >
> > Perhaps the "quirks" bitfield should be features instead ?
>
> Sounds like that.

But LMLOADKILL_DISABLE and TWO_PASS_USE_WFI are quirks.. so it is kind
of a mix of quirks and features.  So meh

BR,
-R

>
> --
> With best wishes
> Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk
  2024-11-15 15:10         ` Rob Clark
@ 2024-11-15 15:28           ` neil.armstrong
  0 siblings, 0 replies; 29+ messages in thread
From: neil.armstrong @ 2024-11-15 15:28 UTC (permalink / raw)
  To: Rob Clark, Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Sean Paul, Konrad Dybcio, Abhinav Kumar,
	Marijn Suijten, David Airlie, Simona Vetter, Bjorn Andersson,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Connor Abbott,
	linux-pm, linux-kernel, linux-arm-msm, dri-devel, freedreno,
	devicetree

On 15/11/2024 16:10, Rob Clark wrote:
> On Fri, Nov 15, 2024 at 6:18 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>>
>> On Fri, 15 Nov 2024 at 11:21, Neil Armstrong <neil.armstrong@linaro.org> wrote:
>>>
>>> On 15/11/2024 08:07, Dmitry Baryshkov wrote:
>>>> On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
>>>>> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
>>>>> along the Frequency and Power Domain level, but by default we leave the
>>>>> OPP core vote for the interconnect ddr path.
>>>>>
>>>>> While scaling via the interconnect path was sufficient, newer GPUs
>>>>> like the A750 requires specific vote paremeters and bandwidth to
>>>>> achieve full functionality.
>>>>>
>>>>> Add a new Quirk enabling DDR Bandwidth vote via GMU.
>>>>
>>>> Please describe, why this is defined as a quirk rather than a proper
>>>> platform-level property. From my experience with 6xx and 7xx, all the
>>>> platforms need to send some kind of BW data to the GMU.
>>>
>>> Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?
>>>
>>> Perhaps the "quirks" bitfield should be features instead ?
>>
>> Sounds like that.
> 
> But LMLOADKILL_DISABLE and TWO_PASS_USE_WFI are quirks.. so it is kind
> of a mix of quirks and features.  So meh

Well I can do a split and move the features into a clean .features bitfield, would it be ok ?

Neil

> 
> BR,
> -R
> 
>>
>> --
>> With best wishes
>> Dmitry


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750
  2024-11-15 14:39       ` Dmitry Baryshkov
@ 2024-11-18 13:42         ` Neil Armstrong
  2024-11-18 14:39           ` Dmitry Baryshkov
  0 siblings, 1 reply; 29+ messages in thread
From: Neil Armstrong @ 2024-11-18 13:42 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On 15/11/2024 15:39, Dmitry Baryshkov wrote:
> On Fri, Nov 15, 2024 at 10:20:01AM +0100, Neil Armstrong wrote:
>> On 15/11/2024 08:33, Dmitry Baryshkov wrote:
>>> On Wed, Nov 13, 2024 at 04:48:32PM +0100, Neil Armstrong wrote:
>>>> Now all the DDR bandwidth voting via the GPU Management Unit (GMU)
>>>> is in place, let's declare the Bus Control Modules (BCMs) and
>>>
>>> s/let's //g
>>>
>>>> it's parameters in the GPU info struct and add the GMU_BW_VOTE
>>>> quirk to enable it.
>>>
>>> Can we define a function that checks for info.bcm[0].name isntead of
>>> adding a quirk?
>>
>> Probably, I'll need ideas to how design this better, perhaps a simple
>> capability bitfield in a6xx_info ?
> 
> I'm not sure if I follow the question. I think it's better to check for
> the presens of the data rather than having a separate 'cap' bit in
> addition to that data.

I don't fully agree here, I just follow the other features (CACHED_COHERENT/APRIV/...)
nothing fancy.
I'll introduce a features bitfield, so we don't mix them with quirks

> 
>> There's other feature that are lacking, like ACD or BCL which are not supported
>> on all a6xx/a7xx gpus.
> 
> Akhil is currently working on ACD, as you have seen from the patches.

Yep I've tested and reviewed the patches

> 
>>
>>>
>>>>
>>>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>>>> ---
>>>>    drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 26 ++++++++++++++++++++++++--
>>>>    1 file changed, 24 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>>>> index 0c560e84ad5a53bb4e8a49ba4e153ce9cf33f7ae..014a24256b832d8e03fe06a6516b5348a5c0474a 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>>>> @@ -1379,7 +1379,8 @@ static const struct adreno_info a7xx_gpus[] = {
>>>>    		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
>>>>    		.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>>>>    			  ADRENO_QUIRK_HAS_HW_APRIV |
>>>> -			  ADRENO_QUIRK_PREEMPTION,
>>>> +			  ADRENO_QUIRK_PREEMPTION |
>>>> +			  ADRENO_QUIRK_GMU_BW_VOTE,
>>>>    		.init = a6xx_gpu_init,
>>>>    		.zapfw = "a740_zap.mdt",
>>>>    		.a6xx = &(const struct a6xx_info) {
>>>> @@ -1388,6 +1389,16 @@ static const struct adreno_info a7xx_gpus[] = {
>>>>    			.pwrup_reglist = &a7xx_pwrup_reglist,
>>>>    			.gmu_chipid = 0x7020100,
>>>>    			.gmu_cgc_mode = 0x00020202,
>>>> +			.bcm = {
>>>> +				[0] = { .name = "SH0", .buswidth = 16 },
>>>> +				[1] = { .name = "MC0", .buswidth = 4 },
>>>> +				[2] = {
>>>> +					.name = "ACV",
>>>> +					.fixed = true,
>>>> +					.perfmode = BIT(3),
>>>> +					.perfmode_bw = 16500000,
>>>
>>> Is it a platform property or GPU / GMU property? Can expect that there
>>> might be several SoCs having the same GPU, but different perfmode_bw
>>> entry?
>>
>> I presume this is SoC specific ? But today the XXX_build_bw_table() are
>> already SoC specific, so where should this go ?
> 
> XXX_build_bw_table() are GPU-specific. There are cases of several SoCs
> sharing the same GPU on them.

So it's gpu-specific

> 
>> Downstream specifies this in the adreno-gpulist.h, which is the equivalent
>> here.
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750
  2024-11-18 13:42         ` Neil Armstrong
@ 2024-11-18 14:39           ` Dmitry Baryshkov
  0 siblings, 0 replies; 29+ messages in thread
From: Dmitry Baryshkov @ 2024-11-18 14:39 UTC (permalink / raw)
  To: neil.armstrong
  Cc: Akhil P Oommen, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Rafael J. Wysocki, Rob Clark, Sean Paul, Konrad Dybcio,
	Abhinav Kumar, Marijn Suijten, David Airlie, Simona Vetter,
	Bjorn Andersson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Connor Abbott, linux-pm, linux-kernel, linux-arm-msm, dri-devel,
	freedreno, devicetree

On Mon, 18 Nov 2024 at 15:43, Neil Armstrong <neil.armstrong@linaro.org> wrote:
>
> On 15/11/2024 15:39, Dmitry Baryshkov wrote:
> > On Fri, Nov 15, 2024 at 10:20:01AM +0100, Neil Armstrong wrote:
> >> On 15/11/2024 08:33, Dmitry Baryshkov wrote:
> >>> On Wed, Nov 13, 2024 at 04:48:32PM +0100, Neil Armstrong wrote:
> >>>> Now all the DDR bandwidth voting via the GPU Management Unit (GMU)
> >>>> is in place, let's declare the Bus Control Modules (BCMs) and
> >>>
> >>> s/let's //g
> >>>
> >>>> it's parameters in the GPU info struct and add the GMU_BW_VOTE
> >>>> quirk to enable it.
> >>>
> >>> Can we define a function that checks for info.bcm[0].name isntead of
> >>> adding a quirk?
> >>
> >> Probably, I'll need ideas to how design this better, perhaps a simple
> >> capability bitfield in a6xx_info ?
> >
> > I'm not sure if I follow the question. I think it's better to check for
> > the presens of the data rather than having a separate 'cap' bit in
> > addition to that data.
>
> I don't fully agree here, I just follow the other features (CACHED_COHERENT/APRIV/...)
> nothing fancy.
> I'll introduce a features bitfield, so we don't mix them with quirks

SGTM

>
> >
> >> There's other feature that are lacking, like ACD or BCL which are not supported
> >> on all a6xx/a7xx gpus.
> >
> > Akhil is currently working on ACD, as you have seen from the patches.
>
> Yep I've tested and reviewed the patches
>
> >
> >>
> >>>
> >>>>
> >>>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> >>>> ---
> >>>>    drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 26 ++++++++++++++++++++++++--
> >>>>    1 file changed, 24 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> >>>> index 0c560e84ad5a53bb4e8a49ba4e153ce9cf33f7ae..014a24256b832d8e03fe06a6516b5348a5c0474a 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> >>>> @@ -1379,7 +1379,8 @@ static const struct adreno_info a7xx_gpus[] = {
> >>>>                    .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> >>>>                    .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> >>>>                              ADRENO_QUIRK_HAS_HW_APRIV |
> >>>> -                    ADRENO_QUIRK_PREEMPTION,
> >>>> +                    ADRENO_QUIRK_PREEMPTION |
> >>>> +                    ADRENO_QUIRK_GMU_BW_VOTE,
> >>>>                    .init = a6xx_gpu_init,
> >>>>                    .zapfw = "a740_zap.mdt",
> >>>>                    .a6xx = &(const struct a6xx_info) {
> >>>> @@ -1388,6 +1389,16 @@ static const struct adreno_info a7xx_gpus[] = {
> >>>>                            .pwrup_reglist = &a7xx_pwrup_reglist,
> >>>>                            .gmu_chipid = 0x7020100,
> >>>>                            .gmu_cgc_mode = 0x00020202,
> >>>> +                  .bcm = {
> >>>> +                          [0] = { .name = "SH0", .buswidth = 16 },
> >>>> +                          [1] = { .name = "MC0", .buswidth = 4 },
> >>>> +                          [2] = {
> >>>> +                                  .name = "ACV",
> >>>> +                                  .fixed = true,
> >>>> +                                  .perfmode = BIT(3),
> >>>> +                                  .perfmode_bw = 16500000,
> >>>
> >>> Is it a platform property or GPU / GMU property? Can expect that there
> >>> might be several SoCs having the same GPU, but different perfmode_bw
> >>> entry?
> >>
> >> I presume this is SoC specific ? But today the XXX_build_bw_table() are
> >> already SoC specific, so where should this go ?
> >
> > XXX_build_bw_table() are GPU-specific. There are cases of several SoCs
> > sharing the same GPU on them.
>
> So it's gpu-specific
>
> >
> >> Downstream specifies this in the adreno-gpulist.h, which is the equivalent
> >> here.
> >
>


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-11-18 14:39 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-13 15:48 [PATCH RFC 0/8] drm/msm: adreno: add support for DDR bandwidth scaling via GMU Neil Armstrong
2024-11-13 15:48 ` [PATCH RFC 1/8] opp: core: implement dev_pm_opp_get_bandwidth Neil Armstrong
2024-11-14  4:10   ` Viresh Kumar
2024-11-14  9:23     ` Neil Armstrong
2024-11-13 15:48 ` [PATCH RFC 2/8] drm/msm: adreno: add GMU_BW_VOTE quirk Neil Armstrong
2024-11-15  7:07   ` Dmitry Baryshkov
2024-11-15  9:21     ` Neil Armstrong
2024-11-15 14:18       ` Dmitry Baryshkov
2024-11-15 15:10         ` Rob Clark
2024-11-15 15:28           ` neil.armstrong
2024-11-13 15:48 ` [PATCH RFC 3/8] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU Neil Armstrong
2024-11-15  7:20   ` Dmitry Baryshkov
2024-11-15  9:09     ` Neil Armstrong
2024-11-15 14:34       ` Dmitry Baryshkov
2024-11-13 15:48 ` [PATCH RFC 4/8] drm/msm: adreno: dynamically generate GMU bw table Neil Armstrong
2024-11-15  7:24   ` Dmitry Baryshkov
2024-11-15  9:11     ` Neil Armstrong
2024-11-15 14:35       ` Dmitry Baryshkov
2024-11-13 15:48 ` [PATCH RFC 5/8] drm/msm: adreno: find bandwidth index of OPP and set it along freq index Neil Armstrong
2024-11-15  7:28   ` Dmitry Baryshkov
2024-11-15  9:15     ` Neil Armstrong
2024-11-13 15:48 ` [PATCH RFC 6/8] drm/msm: adreno: enable GMU bandwidth for A740 and A750 Neil Armstrong
2024-11-15  7:33   ` Dmitry Baryshkov
2024-11-15  9:20     ` Neil Armstrong
2024-11-15 14:39       ` Dmitry Baryshkov
2024-11-18 13:42         ` Neil Armstrong
2024-11-18 14:39           ` Dmitry Baryshkov
2024-11-13 15:48 ` [PATCH RFC 7/8] arm64: qcom: dts: sm8550: add interconnect and opp-peak-kBps for GPU Neil Armstrong
2024-11-13 15:48 ` [PATCH RFC 8/8] arm64: qcom: dts: sm8650: " Neil Armstrong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox