public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/7] GPU workload hints for better performance
@ 2023-08-21  6:47 Arvind Yadav
  2023-08-21  6:47 ` [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload Arvind Yadav
                   ` (6 more replies)
  0 siblings, 7 replies; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav

AMDGPU SoCs support dynamic workload-based power profiles, which can
provide fine-tuned performance for a particular type of workload.
This patch series adds an interface to set/reset these power profiles
based on the submitted job. The driver can dynamically switch
the power profile based on the submitted job, which optimizes power
and performance while that particular workload is active.

v2:
- Splitting workload_profile_set and workload_profile_put
  into two separate patches.
- Addressed review comment.
- Added new suspend function.
- Added a patch to switch the GPU workload mode for KFD.

Arvind Yadav (7):
  drm/amdgpu: Added init/fini functions for workload
  drm/amdgpu: Add new function to set GPU power profile
  drm/amdgpu: Add new function to put GPU power profile
  drm/amdgpu: Add suspend function to clear the GPU power profile.
  drm/amdgpu: Switch on/off GPU workload profile
  drm/amdgpu: switch workload context to/from compute
  Revert "drm/amd/amdgpu: switch on/off vcn power profile mode"

 drivers/gpu/drm/amd/amdgpu/Makefile           |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |   6 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   5 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 220 ++++++++++++++++++
 drivers/gpu/drm/amd/include/amdgpu_workload.h |  61 +++++
 8 files changed, 303 insertions(+), 16 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
 create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h

-- 
2.34.1


^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:06   ` Shashank Sharma
  2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This patch adds initialization/cleanup functions
for 'struct amdgpu_smu_workload'.

v2:
- Splitting big patch into separate patches.
- Added new fini function.

Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/Makefile           |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  4 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 44 +++++++++++++++
 drivers/gpu/drm/amd/include/amdgpu_workload.h | 53 +++++++++++++++++++
 5 files changed, 105 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
 create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h

diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index 415a7fa395c4..6a9e187d61e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -60,7 +60,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
 	amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
 	amdgpu_fw_attestation.o amdgpu_securedisplay.o \
 	amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
-	amdgpu_ring_mux.o
+	amdgpu_ring_mux.o amdgpu_workload.o
 
 amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 02b827785e39..1939fa1af8a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -107,6 +107,7 @@
 #include "amdgpu_fdinfo.h"
 #include "amdgpu_mca.h"
 #include "amdgpu_ras.h"
+#include "amdgpu_workload.h"
 
 #define MAX_GPU_INSTANCE		16
 
@@ -1050,6 +1051,8 @@ struct amdgpu_device {
 
 	bool                            job_hang;
 	bool                            dc_enabled;
+
+	struct amdgpu_smu_workload	smu_workload;
 };
 
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5c7d40873ee2..cd3bf641b630 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2243,6 +2243,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
 	adev->cg_flags &= amdgpu_cg_mask;
 	adev->pg_flags &= amdgpu_pg_mask;
 
+	amdgpu_workload_profile_init(adev);
+
 	return 0;
 }
 
@@ -2890,6 +2892,8 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
 {
 	int i, r;
 
+	amdgpu_workload_profile_fini(adev);
+
 	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
 		amdgpu_virt_release_ras_err_handler_data(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
new file mode 100644
index 000000000000..32166f482f77
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2023 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu.h"
+
+void amdgpu_workload_profile_init(struct amdgpu_device *adev)
+{
+	adev->smu_workload.adev = adev;
+	adev->smu_workload.submit_workload_status = 0;
+	adev->smu_workload.initialized = true;
+
+	mutex_init(&adev->smu_workload.workload_lock);
+}
+
+void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
+{
+	if (!adev->smu_workload.initialized)
+		return;
+
+	adev->smu_workload.submit_workload_status = 0;
+	adev->smu_workload.initialized = false;
+	mutex_destroy(&adev->smu_workload.workload_lock);
+}
diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
new file mode 100644
index 000000000000..5d0f068422d4
--- /dev/null
+++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright 2023 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _AMDGPU_WORKLOAD_H_
+#define _AMDGPU_WORKLOAD_H_
+
+struct amdgpu_smu_workload {
+	struct amdgpu_device	*adev;
+	struct mutex		workload_lock;
+	struct delayed_work	smu_delayed_work;
+	uint32_t		submit_workload_status;
+	bool			initialized;
+	atomic_t		power_profile_ref[PP_SMC_POWER_PROFILE_COUNT];
+};
+
+/* Workload mode names */
+static const char * const amdgpu_workload_mode_name[] = {
+	"Default",
+	"3D",
+	"Powersaving",
+	"Video",
+	"VR",
+	"Compute",
+	"Custom",
+	"Window3D"
+};
+
+void amdgpu_workload_profile_init(struct amdgpu_device *adev);
+
+void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
+
+#endif
-- 
2.34.1



* [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
  2023-08-21  6:47 ` [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:10   ` Shashank Sharma
                     ` (3 more replies)
  2023-08-21  6:47 ` [PATCH v2 3/7] drm/amdgpu: Add new function to put " Arvind Yadav
                   ` (4 subsequent siblings)
  6 siblings, 4 replies; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This patch adds a function which changes the GPU
power profile based on the submitted job. This can optimize
power and performance while the workload is active.

v2:
- Splitting workload_profile_set and workload_profile_put
  into two separate patches.
- Addressed review comment.

Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
 drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
 2 files changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
index 32166f482f77..e661cc5b3d92 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
@@ -24,6 +24,62 @@
 
 #include "amdgpu.h"
 
+static enum PP_SMC_POWER_PROFILE
+ring_to_power_profile(uint32_t ring_type)
+{
+	switch (ring_type) {
+	case AMDGPU_RING_TYPE_GFX:
+		return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
+	case AMDGPU_RING_TYPE_COMPUTE:
+		return PP_SMC_POWER_PROFILE_COMPUTE;
+	case AMDGPU_RING_TYPE_UVD:
+	case AMDGPU_RING_TYPE_VCE:
+	case AMDGPU_RING_TYPE_UVD_ENC:
+	case AMDGPU_RING_TYPE_VCN_DEC:
+	case AMDGPU_RING_TYPE_VCN_ENC:
+	case AMDGPU_RING_TYPE_VCN_JPEG:
+		return PP_SMC_POWER_PROFILE_VIDEO;
+	default:
+		return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
+	}
+}
+
+static int
+amdgpu_power_profile_set(struct amdgpu_device *adev,
+			 enum PP_SMC_POWER_PROFILE profile)
+{
+	int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
+
+	if (!ret) {
+		/* Set the bit for the submitted workload profile */
+		adev->smu_workload.submit_workload_status |= (1 << profile);
+		atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
+	}
+
+	return ret;
+}
+
+void amdgpu_workload_profile_set(struct amdgpu_device *adev,
+				 uint32_t ring_type)
+{
+	struct amdgpu_smu_workload *workload = &adev->smu_workload;
+	enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
+	int ret;
+
+	if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
+		return;
+
+	mutex_lock(&workload->workload_lock);
+
+	ret = amdgpu_power_profile_set(adev, profile);
+	if (ret) {
+		DRM_WARN("Failed to set workload profile to %s, error = %d\n",
+			 amdgpu_workload_mode_name[profile], ret);
+	}
+
+	mutex_unlock(&workload->workload_lock);
+}
+
 void amdgpu_workload_profile_init(struct amdgpu_device *adev)
 {
 	adev->smu_workload.adev = adev;
diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
index 5d0f068422d4..5022f28fc2f9 100644
--- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
+++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
@@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
 	"Window3D"
 };
 
+void amdgpu_workload_profile_set(struct amdgpu_device *adev,
+				 uint32_t ring_type);
+
 void amdgpu_workload_profile_init(struct amdgpu_device *adev);
 
 void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
-- 
2.34.1



* [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
  2023-08-21  6:47 ` [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload Arvind Yadav
  2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:39   ` Shashank Sharma
  2023-08-22  4:51   ` Lazar, Lijo
  2023-08-21  6:47 ` [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the " Arvind Yadav
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This patch adds a function which clears the GPU
power profile after a job is finished.

This is how it works:
- The scheduler sets the GPU power profile based on ring_type.
- The scheduler clears the GPU power profile once the job is finished.
- Here, the *_workload_profile_set function sets the GPU
  power profile and the *_workload_profile_put function
  schedules the smu_delayed_work task after a 100 ms delay. This
  smu_delayed_work task clears the GPU power profile if no
  new job is scheduled within 100 ms. But if a new job
  arrives within 100 ms, the *_workload_profile_set function
  cancels this work and sets the GPU power profile based on
  preferences.

v2:
- Splitting workload_profile_set and workload_profile_put
  into two separate patches.
- Addressed review comment.

Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 +++++++++++++++++++
 drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
 2 files changed, 100 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
index e661cc5b3d92..6367eb88a44d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
@@ -24,6 +24,9 @@
 
 #include "amdgpu.h"
 
+/* 100 millisecond timeout */
+#define SMU_IDLE_TIMEOUT	msecs_to_jiffies(100)
+
 static enum PP_SMC_POWER_PROFILE
 ring_to_power_profile(uint32_t ring_type)
 {
@@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device *adev,
 	return ret;
 }
 
+static int
+amdgpu_power_profile_clear(struct amdgpu_device *adev,
+			   enum PP_SMC_POWER_PROFILE profile)
+{
+	int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
+
+	if (!ret) {
+		/* Clear the bit for the submitted workload profile */
+		adev->smu_workload.submit_workload_status &= ~(1 << profile);
+	}
+
+	return ret;
+}
+
+static void
+amdgpu_power_profile_idle_work_handler(struct work_struct *work)
+{
+
+	struct amdgpu_smu_workload *workload = container_of(work,
+						      struct amdgpu_smu_workload,
+						      smu_delayed_work.work);
+	struct amdgpu_device *adev = workload->adev;
+	bool reschedule = false;
+	int index  = fls(workload->submit_workload_status);
+	int ret;
+
+	mutex_lock(&workload->workload_lock);
+	for (; index > 0; index--) {
+		int val = atomic_read(&workload->power_profile_ref[index]);
+
+		if (val) {
+			reschedule = true;
+		} else {
+			if (workload->submit_workload_status &
+			    (1 << index)) {
+				ret = amdgpu_power_profile_clear(adev, index);
+				if (ret) {
+					DRM_WARN("Failed to clear workload %s, error = %d\n",
+						 amdgpu_workload_mode_name[index], ret);
+					goto exit;
+				}
+			}
+		}
+	}
+	if (reschedule)
+		schedule_delayed_work(&workload->smu_delayed_work,
+				      SMU_IDLE_TIMEOUT);
+exit:
+	mutex_unlock(&workload->workload_lock);
+}
+
+void amdgpu_workload_profile_put(struct amdgpu_device *adev,
+				 uint32_t ring_type)
+{
+	struct amdgpu_smu_workload *workload = &adev->smu_workload;
+	enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
+
+	if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
+		return;
+
+	mutex_lock(&workload->workload_lock);
+
+	if (!atomic_read(&workload->power_profile_ref[profile])) {
+		DRM_WARN("Power profile %s ref. count error\n",
+			 amdgpu_workload_mode_name[profile]);
+	} else {
+		atomic_dec(&workload->power_profile_ref[profile]);
+		schedule_delayed_work(&workload->smu_delayed_work,
+				      SMU_IDLE_TIMEOUT);
+	}
+
+	mutex_unlock(&workload->workload_lock);
+}
+
 void amdgpu_workload_profile_set(struct amdgpu_device *adev,
 				 uint32_t ring_type)
 {
@@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
 		return;
 
 	mutex_lock(&workload->workload_lock);
+	cancel_delayed_work_sync(&workload->smu_delayed_work);
 
 	ret = amdgpu_power_profile_set(adev, profile);
 	if (ret) {
 		DRM_WARN("Failed to set workload profile to %s, error = %d\n",
 			 amdgpu_workload_mode_name[profile], ret);
+		goto exit;
+	}
+
+	/* Clear the already finished jobs of higher power profiles */
+	for (int index = fls(workload->submit_workload_status);
+	     index > profile; index--) {
+		if (!atomic_read(&workload->power_profile_ref[index]) &&
+		    workload->submit_workload_status & (1 << index)) {
+			ret = amdgpu_power_profile_clear(adev, index);
+			if (ret) {
+				DRM_WARN("Failed to clear workload %s, err = %d\n",
+					 amdgpu_workload_mode_name[index], ret);
+				goto exit;
+			}
+		}
 	}
 
+exit:
 	mutex_unlock(&workload->workload_lock);
 }
 
@@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct amdgpu_device *adev)
 	adev->smu_workload.initialized = true;
 
 	mutex_init(&adev->smu_workload.workload_lock);
+	INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
+			  amdgpu_power_profile_idle_work_handler);
 }
 
 void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
@@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
 	if (!adev->smu_workload.initialized)
 		return;
 
+	cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
 	adev->smu_workload.submit_workload_status = 0;
 	adev->smu_workload.initialized = false;
 	mutex_destroy(&adev->smu_workload.workload_lock);
diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
index 5022f28fc2f9..ee1f87257f2d 100644
--- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
+++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
@@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
 	"Window3D"
 };
 
+void amdgpu_workload_profile_put(struct amdgpu_device *adev,
+				 uint32_t ring_type);
+
 void amdgpu_workload_profile_set(struct amdgpu_device *adev,
 				 uint32_t ring_type);
 
-- 
2.34.1



* [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
                   ` (2 preceding siblings ...)
  2023-08-21  6:47 ` [PATCH v2 3/7] drm/amdgpu: Add new function to put " Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:43   ` Shashank Sharma
  2023-08-22  6:31   ` Lazar, Lijo
  2023-08-21  6:47 ` [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile Arvind Yadav
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This patch adds a suspend function that clears the GPU
power profile before the device enters the suspend state.

v2:
- Added the new suspend function based on review comments.

Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 +++++++++++++++++++
 drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
 3 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index cd3bf641b630..3b70e657b439 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
 
 	amdgpu_ras_suspend(adev);
 
+	amdgpu_workload_profile_suspend(adev);
+
 	amdgpu_device_ip_suspend_phase1(adev);
 
 	if (!adev->in_s0ix)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
index 6367eb88a44d..44ca8e986984 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
@@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
 	mutex_unlock(&workload->workload_lock);
 }
 
+void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
+{
+	struct amdgpu_smu_workload *workload = &adev->smu_workload;
+	int ret;
+
+	mutex_lock(&workload->workload_lock);
+	cancel_delayed_work_sync(&workload->smu_delayed_work);
+
+	/* Clear all the set GPU power profiles */
+	for (int index = fls(workload->submit_workload_status);
+	     index > 0; index--) {
+		if (workload->submit_workload_status & (1 << index)) {
+			atomic_set(&workload->power_profile_ref[index], 0);
+			ret = amdgpu_power_profile_clear(adev, index);
+			if (ret)
+				DRM_WARN("Failed to clear power profile %s, err = %d\n",
+					 amdgpu_workload_mode_name[index], ret);
+		}
+	}
+	workload->submit_workload_status = 0;
+	mutex_unlock(&workload->workload_lock);
+}
+
 void amdgpu_workload_profile_init(struct amdgpu_device *adev)
 {
 	adev->smu_workload.adev = adev;
diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
index ee1f87257f2d..0acd8769ec52 100644
--- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
+++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
@@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct amdgpu_device *adev,
 void amdgpu_workload_profile_set(struct amdgpu_device *adev,
 				 uint32_t ring_type);
 
+void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
+
 void amdgpu_workload_profile_init(struct amdgpu_device *adev);
 
 void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
-- 
2.34.1



* [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
                   ` (3 preceding siblings ...)
  2023-08-21  6:47 ` [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the " Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:46   ` Shashank Sharma
  2023-08-21  6:47 ` [PATCH v2 6/7] drm/amdgpu: switch workload context to/from compute Arvind Yadav
  2023-08-21  6:47 ` [PATCH v2 7/7] Revert "drm/amd/amdgpu: switch on/off vcn power profile mode" Arvind Yadav
  6 siblings, 1 reply; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This patch switches the GPU workload profile based
on the submitted job. The workload profile is reset to
default when the job is done.

Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index c3d9d75143f4..c2b0fda6ba26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -176,6 +176,9 @@ void amdgpu_job_free_resources(struct amdgpu_job *job)
 static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
 {
 	struct amdgpu_job *job = to_amdgpu_job(s_job);
+	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
+
+	amdgpu_workload_profile_put(ring->adev, ring->funcs->type);
 
 	drm_sched_job_cleanup(s_job);
 
@@ -295,6 +298,8 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
 			DRM_ERROR("Error scheduling IBs (%d)\n", r);
 	}
 
+	amdgpu_workload_profile_set(adev, ring->funcs->type);
+
 	job->job_run_counter++;
 	amdgpu_job_free_resources(job);
 
-- 
2.34.1



* [PATCH v2 6/7] drm/amdgpu: switch workload context to/from compute
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
                   ` (4 preceding siblings ...)
  2023-08-21  6:47 ` [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:47   ` Shashank Sharma
  2023-08-21  6:47 ` [PATCH v2 7/7] Revert "drm/amd/amdgpu: switch on/off vcn power profile mode" Arvind Yadav
  6 siblings, 1 reply; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This patch switches the GPU workload mode to/from
compute mode while submitting a compute workload.

Cc: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 0385f7f69278..1d6a41f8d24e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -713,9 +713,11 @@ void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
 		pr_debug("GFXOFF is %s\n", idle ? "enabled" : "disabled");
 		amdgpu_gfx_off_ctrl(adev, idle);
 	}
-	amdgpu_dpm_switch_power_profile(adev,
-					PP_SMC_POWER_PROFILE_COMPUTE,
-					!idle);
+
+	if (idle)
+		amdgpu_workload_profile_put(adev, AMDGPU_RING_TYPE_COMPUTE);
+	else
+		amdgpu_workload_profile_set(adev, AMDGPU_RING_TYPE_COMPUTE);
 }
 
 bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)
-- 
2.34.1



* [PATCH v2 7/7] Revert "drm/amd/amdgpu: switch on/off vcn power profile mode"
  2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
                   ` (5 preceding siblings ...)
  2023-08-21  6:47 ` [PATCH v2 6/7] drm/amdgpu: switch workload context to/from compute Arvind Yadav
@ 2023-08-21  6:47 ` Arvind Yadav
  2023-08-21 13:49   ` Shashank Sharma
  6 siblings, 1 reply; 39+ messages in thread
From: Arvind Yadav @ 2023-08-21  6:47 UTC (permalink / raw)
  To: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel, Arvind Yadav, Christian Koenig

This reverts commit 5ce71f59bb9bd3d8a09b96afdbc92975cb6dc303.

Reason for revert: A new amdgpu_workload_profile* API is
added to switch the profile mode on/off. This new API allows
changing the GPU power profile based on a submitted job.

Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 2d94f1b63bd6..70777fcfa626 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -363,7 +363,6 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
 		container_of(work, struct amdgpu_device, vcn.idle_work.work);
 	unsigned int fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
 	unsigned int i, j;
-	int r = 0;
 
 	for (j = 0; j < adev->vcn.num_vcn_inst; ++j) {
 		if (adev->vcn.harvest_config & (1 << j))
@@ -392,10 +391,6 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
 	if (!fences && !atomic_read(&adev->vcn.total_submission_cnt)) {
 		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
 		       AMD_PG_STATE_GATE);
-		r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
-				false);
-		if (r)
-			dev_warn(adev->dev, "(%d) failed to disable video power profile mode\n", r);
 	} else {
 		schedule_delayed_work(&adev->vcn.idle_work, VCN_IDLE_TIMEOUT);
 	}
@@ -404,16 +399,11 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
 void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
 {
 	struct amdgpu_device *adev = ring->adev;
-	int r = 0;
 
 	atomic_inc(&adev->vcn.total_submission_cnt);
 
-	if (!cancel_delayed_work_sync(&adev->vcn.idle_work)) {
-		r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
-				true);
-		if (r)
-			dev_warn(adev->dev, "(%d) failed to switch to video power profile mode\n", r);
-	}
+	if (!cancel_delayed_work_sync(&adev->vcn.idle_work))
+		amdgpu_gfx_off_ctrl(adev, false);
 
 	mutex_lock(&adev->vcn.vcn_pg_lock);
 	amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
-- 
2.34.1



* Re: [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload
  2023-08-21  6:47 ` [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload Arvind Yadav
@ 2023-08-21 13:06   ` Shashank Sharma
  2023-08-21 13:35     ` Yadav, Arvind
  0 siblings, 1 reply; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:06 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel

Hey Arvind,

On 21/08/2023 08:47, Arvind Yadav wrote:
> The 'struct amdgpu_smu_workload' initialization/cleanup
> functions are added by this patch.
>
> v2:
> - Splitting big patch into separate patches.
> - Added new fini function.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/Makefile           |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  3 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  4 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 44 +++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h | 53 +++++++++++++++++++
>   5 files changed, 105 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>   create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 415a7fa395c4..6a9e187d61e1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -60,7 +60,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>   	amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
>   	amdgpu_fw_attestation.o amdgpu_securedisplay.o \
>   	amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
> -	amdgpu_ring_mux.o
> +	amdgpu_ring_mux.o amdgpu_workload.o
>   
>   amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 02b827785e39..1939fa1af8a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -107,6 +107,7 @@
>   #include "amdgpu_fdinfo.h"
>   #include "amdgpu_mca.h"
>   #include "amdgpu_ras.h"
> +#include "amdgpu_workload.h"
>   
>   #define MAX_GPU_INSTANCE		16
>   
> @@ -1050,6 +1051,8 @@ struct amdgpu_device {
>   
>   	bool                            job_hang;
>   	bool                            dc_enabled;
> +
> +	struct amdgpu_smu_workload	smu_workload;
>   };
>   
>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5c7d40873ee2..cd3bf641b630 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2243,6 +2243,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
>   	adev->cg_flags &= amdgpu_cg_mask;
>   	adev->pg_flags &= amdgpu_pg_mask;
>   
> +	amdgpu_workload_profile_init(adev);
> +
>   	return 0;
>   }
>   
> @@ -2890,6 +2892,8 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
>   {
>   	int i, r;
>   
> +	amdgpu_workload_profile_fini(adev);
> +
>   	if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
>   		amdgpu_virt_release_ras_err_handler_data(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> new file mode 100644
> index 000000000000..32166f482f77
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -0,0 +1,44 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright 2023 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#include "amdgpu.h"
> +
> +void amdgpu_workload_profile_init(struct amdgpu_device *adev)
> +{
> +	adev->smu_workload.adev = adev;
> +	adev->smu_workload.submit_workload_status = 0;
> +	adev->smu_workload.initialized = true;
why do we need this variable ?
> +
> +	mutex_init(&adev->smu_workload.workload_lock);
> +}
> +
> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
> +{
> +	if (!adev->smu_workload.initialized)
> +		return;
> +
> +	adev->smu_workload.submit_workload_status = 0;
> +	adev->smu_workload.initialized = false;
> +	mutex_destroy(&adev->smu_workload.workload_lock);
> +}
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> new file mode 100644
> index 000000000000..5d0f068422d4
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -0,0 +1,53 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright 2023 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#ifndef _AMDGPU_WORKLOAD_H_
> +#define _AMDGPU_WORKLOAD_H_
> +
> +struct amdgpu_smu_workload {
> +	struct amdgpu_device	*adev;
> +	struct mutex		workload_lock;
> +	struct delayed_work	smu_delayed_work;

call it power_profile_work instead ? Looks good otherwise.

- Shashank

> +	uint32_t		submit_workload_status;
> +	bool			initialized;
> +	atomic_t		power_profile_ref[PP_SMC_POWER_PROFILE_COUNT];
> +};
> +
> +/* Workload mode names */
> +static const char * const amdgpu_workload_mode_name[] = {
> +	"Default",
> +	"3D",
> +	"Powersaving",
> +	"Video",
> +	"VR",
> +	"Compute",
> +	"Custom",
> +	"Window3D"
> +};
> +
> +void amdgpu_workload_profile_init(struct amdgpu_device *adev);
> +
> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
> +
> +#endif


* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
@ 2023-08-21 13:10   ` Shashank Sharma
  2023-08-21 16:22   ` Alex Deucher
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:10 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel

On 21/08/2023 08:47, Arvind Yadav wrote:
> This patch adds a function which will change the GPU
> power profile based on a submitted job. This can optimize
> power performance while the workload is running.
>
> v2:
> - Splitting workload_profile_set and workload_profile_put
>    into two separate patches.
> - Addressed review comment.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>   2 files changed, 59 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index 32166f482f77..e661cc5b3d92 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -24,6 +24,62 @@
>   
>   #include "amdgpu.h"
>   
> +static enum PP_SMC_POWER_PROFILE
> +ring_to_power_profile(uint32_t ring_type)
> +{
> +	switch (ring_type) {
> +	case AMDGPU_RING_TYPE_GFX:
> +		return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> +	case AMDGPU_RING_TYPE_COMPUTE:
> +		return PP_SMC_POWER_PROFILE_COMPUTE;
> +	case AMDGPU_RING_TYPE_UVD:
> +	case AMDGPU_RING_TYPE_VCE:
> +	case AMDGPU_RING_TYPE_UVD_ENC:
> +	case AMDGPU_RING_TYPE_VCN_DEC:
> +	case AMDGPU_RING_TYPE_VCN_ENC:
> +	case AMDGPU_RING_TYPE_VCN_JPEG:
> +		return PP_SMC_POWER_PROFILE_VIDEO;
> +	default:
> +		return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
> +	}
> +}
> +
> +static int
> +amdgpu_power_profile_set(struct amdgpu_device *adev,
> +			 enum PP_SMC_POWER_PROFILE profile)
> +{
> +	int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
> +
> +	if (!ret) {
> +		/* Set the bit for the submitted workload profile */
> +		adev->smu_workload.submit_workload_status |= (1 << profile);
> +		atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
> +	}
> +
> +	return ret;
> +}
> +
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +				 uint32_t ring_type)
> +{
> +	struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +	enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> +	int ret;
> +
> +	if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> +		return;
> +
> +	mutex_lock(&workload->workload_lock);
> +
> +	ret = amdgpu_power_profile_set(adev, profile);
> +	if (ret) {
> +		DRM_WARN("Failed to set workload profile to %s, error = %d\n",
> +			 amdgpu_workload_mode_name[profile], ret);
> +	}
> +
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>   {
>   	adev->smu_workload.adev = adev;
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index 5d0f068422d4..5022f28fc2f9 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>   	"Window3D"
>   };
>   
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +				 uint32_t ring_type);
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>   
>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

Please feel free to use:

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>



* Re: [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload
  2023-08-21 13:06   ` Shashank Sharma
@ 2023-08-21 13:35     ` Yadav, Arvind
  2023-08-21 13:54       ` Shashank Sharma
  0 siblings, 1 reply; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 13:35 UTC (permalink / raw)
  To: Shashank Sharma, Arvind Yadav, Christian.Koenig,
	alexander.deucher, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: dri-devel, linux-kernel


On 8/21/2023 6:36 PM, Shashank Sharma wrote:
> Hey Arvind,
>
> On 21/08/2023 08:47, Arvind Yadav wrote:
>> The 'struct amdgpu_smu_workload' initialization/cleanup
>> functions are added by this patch.
>>
>> v2:
>> - Splitting big patch into separate patches.
>> - Added new fini function.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/Makefile           |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  3 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  4 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 44 +++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h | 53 +++++++++++++++++++
>>   5 files changed, 105 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>   create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>> index 415a7fa395c4..6a9e187d61e1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>> @@ -60,7 +60,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>>       amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
>>       amdgpu_fw_attestation.o amdgpu_securedisplay.o \
>>       amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
>> -    amdgpu_ring_mux.o
>> +    amdgpu_ring_mux.o amdgpu_workload.o
>>     amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 02b827785e39..1939fa1af8a6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -107,6 +107,7 @@
>>   #include "amdgpu_fdinfo.h"
>>   #include "amdgpu_mca.h"
>>   #include "amdgpu_ras.h"
>> +#include "amdgpu_workload.h"
>>     #define MAX_GPU_INSTANCE        16
>>   @@ -1050,6 +1051,8 @@ struct amdgpu_device {
>>         bool                            job_hang;
>>       bool                            dc_enabled;
>> +
>> +    struct amdgpu_smu_workload    smu_workload;
>>   };
>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device 
>> *ddev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 5c7d40873ee2..cd3bf641b630 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2243,6 +2243,8 @@ static int amdgpu_device_ip_early_init(struct 
>> amdgpu_device *adev)
>>       adev->cg_flags &= amdgpu_cg_mask;
>>       adev->pg_flags &= amdgpu_pg_mask;
>>   +    amdgpu_workload_profile_init(adev);
>> +
>>       return 0;
>>   }
>>   @@ -2890,6 +2892,8 @@ static int amdgpu_device_ip_fini(struct 
>> amdgpu_device *adev)
>>   {
>>       int i, r;
>>   +    amdgpu_workload_profile_fini(adev);
>> +
>>       if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
>>           amdgpu_virt_release_ras_err_handler_data(adev);
>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> new file mode 100644
>> index 000000000000..32166f482f77
>> --- /dev/null
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -0,0 +1,44 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright 2023 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person 
>> obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without 
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, 
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom 
>> the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be 
>> included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO 
>> EVENT SHALL
>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>> DAMAGES OR
>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>> OTHERWISE,
>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>> USE OR
>> + * OTHER DEALINGS IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#include "amdgpu.h"
>> +
>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>> +{
>> +    adev->smu_workload.adev = adev;
>> +    adev->smu_workload.submit_workload_status = 0;
>> +    adev->smu_workload.initialized = true;
> why do we need this variable ?

Hi Shashank,

If an error occurs while the device is booting, amdgpu starts 
unloading everything.
So I am using 'initialized' so that the driver can be unloaded 
safely. This variable identifies whether the workload profile state 
was ever set up.

Below is the error for which the amdgpu driver unloads when it 
cannot load firmware.

[   12.421609] amdgpu 0000:08:00.0: Direct firmware load for 
amdgpu/renoir_ta.bin failed with error -2
[   12.421618] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of 
IP block <psp> failed -19
[   12.428207] [drm] VCN decode is enabled in VM mode
[   12.428212] [drm] VCN encode is enabled in VM mode
[   12.430925] [drm] JPEG decode is enabled in VM mode
[   12.430931] amdgpu 0000:08:00.0: amdgpu: Fatal error during GPU init
[   12.431184] amdgpu 0000:08:00.0: amdgpu: amdgpu: finishing device.
[   12.431296] ------------[ cut here ]------------
[   12.431297] WARNING: CPU: 3 PID: 438 at kernel/workqueue.c:3379 
__flush_work+0x22f/0x240
[   12.431305] Modules linked in: ledtrig_audio snd_hda_codec_hdmi 
snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec 
snd_hda_core amdgpu(OE+) snd_hwdep snd_pcm kvm snd_seq_midi 
snd_seq_midi_event drm_exec amdxcp snd_rawmidi iommu_v2 crct10dif_pclmul 
drm_buddy gpu_sched ghash_clmulni_intel sha512_ssse3 snd_seq 
drm_suballoc_helper aesni_intel drm_ttm_helper binfmt_misc crypto_simd 
snd_seq_device ttm cryptd snd_timer drm_display_helper input_leds rapl 
joydev cec wmi_bmof rc_core snd drm_kms_helper k10temp ccp soundcore 
mac_hid sch_fq_codel msr parport_pc ppdev lp parport ramoops 
reed_solomon drm pstore_blk pstore_zone efi_pstore ip_tables x_tables 
autofs4 hid_generic usbhid hid crc32_pclmul nvme igb ahci i2c_piix4 
xhci_pci i2c_algo_bit nvme_core libahci xhci_pci_renesas dca video wmi
[   12.431360] CPU: 3 PID: 438 Comm: systemd-udevd Tainted: G        W  
OE      6.5.0-rc2-custom #1
[   12.431362] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS 
ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
[   12.431364] RIP: 0010:__flush_work+0x22f/0x240
[   12.431367] Code: 8b 43 30 48 8b 53 40 89 c1 e9 f9 fe ff ff 4c 89 f7 
e8 45 0b db 00 e8 90 f5 08 00 45 31 ff e9 11 ff ff ff 0f 0b e9 0a ff ff 
ff <0f> 0b 45 31 ff e9 00 ff ff ff e8 02 a0 d9 00 66 90 90 90 90 90 90
[   12.431368] RSP: 0018:ffffb0668156f818 EFLAGS: 00010246
[   12.431370] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
0000000000000000
[   12.431371] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 
ffff9cea492c7840
[   12.431372] RBP: ffffb0668156f890 R08: 0000000000000000 R09: 
ffffb0668156f7a0
[   12.431372] R10: 0000000000000001 R11: 0000000000000001 R12: 
ffff9cea492c7840
[   12.431373] R13: 0000000000000001 R14: ffff9cea43839940 R15: 
0000000000000001
[   12.431374] FS:  00007fde83c18880(0000) GS:ffff9cf15e2c0000(0000) 
knlGS:0000000000000000
[   12.431375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.431376] CR2: 00007f2648000010 CR3: 00000001059e2000 CR4: 
0000000000350ee0
[   12.431377] Call Trace:
[   12.431379]  <TASK>
[   12.431384]  ? show_regs+0x68/0x70
[   12.431388]  ? __flush_work+0x22f/0x240
[   12.431389]  ? __warn+0x8f/0x150
[   12.431392]  ? __flush_work+0x22f/0x240
[   12.431394]  ? report_bug+0x1f5/0x200
[   12.431399]  ? handle_bug+0x46/0x80
[   12.431402]  ? exc_invalid_op+0x19/0x70
[   12.431404]  ? asm_exc_invalid_op+0x1b/0x20
[   12.431408]  ? __flush_work+0x22f/0x240
[   12.431410]  ? irq_work_queue+0x10/0x60
[   12.431414]  ? __wake_up_klogd.part.0+0x5a/0x80
[   12.431419]  __cancel_work_timer+0x124/0x1b0
[   12.431421]  ? _printk+0x58/0x80
[   12.431423]  cancel_delayed_work_sync+0x13/0x20
[   12.431427]  amdgpu_workload_profile_fini+0x25/0x40 [amdgpu]
[   12.431854]  amdgpu_device_fini_sw+0x33/0x550 [amdgpu]
[   12.432035]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[   12.432213]  drm_dev_release+0x28/0x50 [drm]
[   12.432256]  devm_drm_dev_init_release+0x38/0x60 [drm]
[   12.432278]  devm_action_release+0x15/0x20
[   12.432283]  release_nodes+0x40/0xc0
[   12.432285]  devres_release_all+0x9e/0xe0
[   12.432286]  device_unbind_cleanup+0x12/0x80
[   12.432289]  really_probe+0x116/0x3e0
[   12.432291]  __driver_probe_device+0x7e/0x170
[   12.432293]  driver_probe_device+0x23/0xa0
[   12.432295]  __driver_attach+0xc5/0x190
[   12.432297]  ? __pfx___driver_attach+0x10/0x10
[   12.432299]  bus_for_each_dev+0x7c/0xd0
[   12.432302]  driver_attach+0x1e/0x30
[   12.432304]  bus_add_driver+0x11c/0x220
[   12.432306]  driver_register+0x64/0x130
[   12.432309]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[   12.432491]  __pci_register_driver+0x68/0x70
[   12.432494]  amdgpu_init+0x63/0xff0 [amdgpu]
[   12.432667]  do_one_initcall+0x48/0x310
[   12.432671]  ? kmalloc_trace+0x2a/0xa0
[   12.432675]  do_init_module+0x6a/0x260
[   12.432677]  load_module+0x1db3/0x2050
[   12.432681]  init_module_from_file+0x9c/0xe0
[   12.432682]  ? init_module_from_file+0x9c/0xe0
[   12.432685]  idempotent_init_module+0x179/0x230
[   12.432687]  __x64_sys_finit_module+0x5d/0xb0
[   12.432689]  do_syscall_64+0x3b/0x90
[   12.432691]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

>> +
>> +    mutex_init(&adev->smu_workload.workload_lock);
>> +}
>> +
>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>> +{
>> +    if (!adev->smu_workload.initialized)
>> +        return;
>> +
>> +    adev->smu_workload.submit_workload_status = 0;
>> +    adev->smu_workload.initialized = false;
>> +    mutex_destroy(&adev->smu_workload.workload_lock);
>> +}
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> new file mode 100644
>> index 000000000000..5d0f068422d4
>> --- /dev/null
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -0,0 +1,53 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright 2023 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person 
>> obtaining a
>> + * copy of this software and associated documentation files (the 
>> "Software"),
>> + * to deal in the Software without restriction, including without 
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, 
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom 
>> the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be 
>> included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO 
>> EVENT SHALL
>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>> DAMAGES OR
>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>> OTHERWISE,
>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>> USE OR
>> + * OTHER DEALINGS IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#ifndef _AMDGPU_WORKLOAD_H_
>> +#define _AMDGPU_WORKLOAD_H_
>> +
>> +struct amdgpu_smu_workload {
>> +    struct amdgpu_device    *adev;
>> +    struct mutex        workload_lock;
>> +    struct delayed_work    smu_delayed_work;
>
> call it power_profile_work instead ? Looks good otherwise.
>
Noted.

Thank you

~Arvind

> - Shashank
>
>> +    uint32_t submit_workload_status;
>> +    bool            initialized;
>> +    atomic_t power_profile_ref[PP_SMC_POWER_PROFILE_COUNT];
>> +};
>> +
>> +/* Workload mode names */
>> +static const char * const amdgpu_workload_mode_name[] = {
>> +    "Default",
>> +    "3D",
>> +    "Powersaving",
>> +    "Video",
>> +    "VR",
>> +    "Compute",
>> +    "Custom",
>> +    "Window3D"
>> +};
>> +
>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>> +
>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>> +
>> +#endif


* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-21  6:47 ` [PATCH v2 3/7] drm/amdgpu: Add new function to put " Arvind Yadav
@ 2023-08-21 13:39   ` Shashank Sharma
  2023-08-21 14:40     ` Yadav, Arvind
  2023-08-22  4:51   ` Lazar, Lijo
  1 sibling, 1 reply; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:39 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel


On 21/08/2023 08:47, Arvind Yadav wrote:
> This patch adds a function which will clear the GPU
> power profile after the job has finished.
>
> This is how it works:
> - The scheduler will set the GPU power profile based on ring_type.
> - The scheduler will clear the GPU power profile once the job has
>    finished.
> - Here, the *_workload_profile_set function will set the GPU
>    power profile and the *_workload_profile_put function will
>    schedule the smu_delayed_work task after a 100 ms delay. This
>    smu_delayed_work task will clear a GPU power profile if no
>    new jobs are scheduled within 100 ms. But if a new job
>    comes within 100 ms, then the *_workload_profile_set function
>    will cancel this work and set the GPU power profile based on
>    preferences.
>
> v2:
> - Splitting workload_profile_set and workload_profile_put
>    into two separate patches.
> - Addressed review comment.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 +++++++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>   2 files changed, 100 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index e661cc5b3d92..6367eb88a44d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -24,6 +24,9 @@
>   
>   #include "amdgpu.h"
>   
> +/* 100 millisecond timeout */
> +#define SMU_IDLE_TIMEOUT	msecs_to_jiffies(100)
> +
>   static enum PP_SMC_POWER_PROFILE
>   ring_to_power_profile(uint32_t ring_type)
>   {
> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device *adev,
>   	return ret;
>   }
>   
> +static int
> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
> +			   enum PP_SMC_POWER_PROFILE profile)
> +{
> +	int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
> +
> +	if (!ret) {
> +		/* Clear the bit for the submitted workload profile */
> +		adev->smu_workload.submit_workload_status &= ~(1 << profile);
> +	}
> +
> +	return ret;
> +}
> +
> +static void
> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
> +{
> +
> +	struct amdgpu_smu_workload *workload = container_of(work,
> +						      struct amdgpu_smu_workload,
> +						      smu_delayed_work.work);
> +	struct amdgpu_device *adev = workload->adev;
> +	bool reschedule = false;
> +	int index  = fls(workload->submit_workload_status);
> +	int ret;
> +
We should check the validity and range of index here before using it 
below.
> +	mutex_lock(&workload->workload_lock);
> +	for (; index > 0; index--) {
> +		int val = atomic_read(&workload->power_profile_ref[index]);
> +
> +		if (val) {
> +			reschedule = true;
> +		} else {
> +			if (workload->submit_workload_status &
> +			    (1 << index)) {
> +				ret = amdgpu_power_profile_clear(adev, index);
> +				if (ret) {
> +					DRM_WARN("Failed to clear workload %s,error = %d\n",
> +						 amdgpu_workload_mode_name[index], ret);
> +					goto exit;
Instead of exiting, we might want to continue the loop here, to check 
whether we can reset another profile in the next attempt.
> +				}
> +			}
> +		}
> +	}
A blank line recommended here.
> +	if (reschedule)
> +		schedule_delayed_work(&workload->smu_delayed_work,
> +				      SMU_IDLE_TIMEOUT);
> +exit:
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
> +				 uint32_t ring_type)
> +{
> +	struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +	enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> +
> +	if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> +		return;
> +
> +	mutex_lock(&workload->workload_lock);
> +
> +	if (!atomic_read(&workload->power_profile_ref[profile])) {
> +		DRM_WARN("Power profile %s ref. count error\n",
> +			 amdgpu_workload_mode_name[profile]);
> +	} else {
> +		atomic_dec(&workload->power_profile_ref[profile]);
> +		schedule_delayed_work(&workload->smu_delayed_work,
> +				      SMU_IDLE_TIMEOUT);
We don't want to schedule this work every time a power profile is put; 
we want to do that only when a power profile's ref count reaches 0. 
So you might want to check the ref_count and schedule the work under 
an if (!ref_count) condition.
> +	}
> +
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   				 uint32_t ring_type)
>   {
> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   		return;
>   
>   	mutex_lock(&workload->workload_lock);
> +	cancel_delayed_work_sync(&workload->smu_delayed_work);
>   
>   	ret = amdgpu_power_profile_set(adev, profile);
>   	if (ret) {
>   		DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>   			 amdgpu_workload_mode_name[profile], ret);
> +		goto exit;
> +	}
> +
> +	/* Clear the already finished jobs of higher power profile*/

We are not clearing the jobs here, but their power profiles.

I would recommend a little rework in the comment like "As we cancelled 
the delayed work, check and clear the pending higher power profiles set 
by previous jobs which are done now"

> +	for (int index = fls(workload->submit_workload_status);
The index can be initialized above, as in the put function's for loop.
> +	     index > profile; index--) {
> +		if (!atomic_read(&workload->power_profile_ref[index]) &&
> +		    workload->submit_workload_status & (1 << index)) {
> +			ret = amdgpu_power_profile_clear(adev, index);
After clearing the power profile, we should also clear the respective 
workload->submit_workload_status bit, right?
> +			if (ret) {
> +				DRM_WARN("Failed to clear workload %s, err = %d\n",
> +					 amdgpu_workload_mode_name[profile], ret);
> +				goto exit;

Same as previous about continuing the loop.

- Shashank

> +			}
> +		}
>   	}
>   
> +exit:
>   	mutex_unlock(&workload->workload_lock);
>   }
>   
> @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>   	adev->smu_workload.initialized = true;
>   
>   	mutex_init(&adev->smu_workload.workload_lock);
> +	INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
> +			  amdgpu_power_profile_idle_work_handler);
>   }
>   
>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>   	if (!adev->smu_workload.initialized)
>   		return;
>   
> +	cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>   	adev->smu_workload.submit_workload_status = 0;
>   	adev->smu_workload.initialized = false;
>   	mutex_destroy(&adev->smu_workload.workload_lock);
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index 5022f28fc2f9..ee1f87257f2d 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>   	"Window3D"
>   };
>   
> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
> +				 uint32_t ring_type);
> +
>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   				 uint32_t ring_type);
>   

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-21  6:47 ` [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the " Arvind Yadav
@ 2023-08-21 13:43   ` Shashank Sharma
  2023-08-21 13:52     ` Yadav, Arvind
  2023-08-22  6:31   ` Lazar, Lijo
  1 sibling, 1 reply; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:43 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel


On 21/08/2023 08:47, Arvind Yadav wrote:
> This patch adds a suspend function that will clear the GPU
> power profile before going into suspend state.
>
> v2:
> - Add the new suspend function based on review comment.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 +++++++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
>   3 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index cd3bf641b630..3b70e657b439 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
>   
>   	amdgpu_ras_suspend(adev);
>   
> +	amdgpu_workload_profile_suspend(adev);
> +
>   	amdgpu_device_ip_suspend_phase1(adev);
>   
>   	if (!adev->in_s0ix)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index 6367eb88a44d..44ca8e986984 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   	mutex_unlock(&workload->workload_lock);
>   }
>   
> +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
> +{
> +	struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +	int ret;
> +
> +	mutex_lock(&workload->workload_lock);
> +	cancel_delayed_work_sync(&workload->smu_delayed_work);
> +
> +	/* Clear all the set GPU power profile*/
> +	for (int index = fls(workload->submit_workload_status);
> +	     index > 0; index--) {
> +		if (workload->submit_workload_status & (1 << index)) {
> +			atomic_set(&workload->power_profile_ref[index], 0);
> +			ret = amdgpu_power_profile_clear(adev, index);

Why do we need the checks here ? Can't we simply call 
power_profile_clear() for all profiles ?

- Shashank

> +			if (ret)
> +				DRM_WARN("Failed to clear power profile %s, err = %d\n",
> +					 amdgpu_workload_mode_name[index], ret);
> +		}
> +	}


> +	workload->submit_workload_status = 0;
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>   {
>   	adev->smu_workload.adev = adev;
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index ee1f87257f2d..0acd8769ec52 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   				 uint32_t ring_type);
>   
> +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>   
>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile
  2023-08-21  6:47 ` [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile Arvind Yadav
@ 2023-08-21 13:46   ` Shashank Sharma
  2023-08-21 13:53     ` Yadav, Arvind
  0 siblings, 1 reply; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:46 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel


On 21/08/2023 08:47, Arvind Yadav wrote:
> This patch is to switch the GPU workload profile based
> on the submitted job. The workload profile is reset to
> default when the job is done.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index c3d9d75143f4..c2b0fda6ba26 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -176,6 +176,9 @@ void amdgpu_job_free_resources(struct amdgpu_job *job)
>   static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
>   {
>   	struct amdgpu_job *job = to_amdgpu_job(s_job);
> +	struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
> +
> +	amdgpu_workload_profile_put(ring->adev, ring->funcs->type);
>   
>   	drm_sched_job_cleanup(s_job);
>   
> @@ -295,6 +298,8 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
>   			DRM_ERROR("Error scheduling IBs (%d)\n", r);
>   	}
>   
> +	amdgpu_workload_profile_set(adev, ring->funcs->type);
> +
>   	job->job_run_counter++;
>   	amdgpu_job_free_resources(job);
>   

Instead of calling it switch on/off in the title, may we call it set/reset 
GPU workload profile ?

With that minor nitpick handled, please feel free to use:

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>

- Shashank


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 6/7] drm/amdgpu: switch workload context to/from compute
  2023-08-21  6:47 ` [PATCH v2 6/7] drm/amdgpu: switch workload context to/from compute Arvind Yadav
@ 2023-08-21 13:47   ` Shashank Sharma
  0 siblings, 0 replies; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:47 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel


On 21/08/2023 08:47, Arvind Yadav wrote:
> This patch switches the GPU workload mode to/from
> compute mode, while submitting compute workload.
>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 0385f7f69278..1d6a41f8d24e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -713,9 +713,11 @@ void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle)
>   		pr_debug("GFXOFF is %s\n", idle ? "enabled" : "disabled");
>   		amdgpu_gfx_off_ctrl(adev, idle);
>   	}
> -	amdgpu_dpm_switch_power_profile(adev,
> -					PP_SMC_POWER_PROFILE_COMPUTE,
> -					!idle);
> +
> +	if (idle)
> +		amdgpu_workload_profile_put(adev, AMDGPU_RING_TYPE_COMPUTE);
> +	else
> +		amdgpu_workload_profile_set(adev, AMDGPU_RING_TYPE_COMPUTE);
>   }
Please feel free to use:

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>

- Shashank

>   
>   bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 7/7] Revert "drm/amd/amdgpu: switch on/off vcn power profile mode"
  2023-08-21  6:47 ` [PATCH v2 7/7] Revert "drm/amd/amdgpu: switch on/off vcn power profile mode" Arvind Yadav
@ 2023-08-21 13:49   ` Shashank Sharma
  0 siblings, 0 replies; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:49 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel

Someone from MM should also confirm this, but:

Acked-by: Shashank Sharma <shashank.sharma@amd.com>


On 21/08/2023 08:47, Arvind Yadav wrote:
> This reverts commit 5ce71f59bb9bd3d8a09b96afdbc92975cb6dc303.
>
> Reason for revert: The new amdgpu_workload_profile* API is added
> to switch the profile mode on/off. These new APIs allow changing
> the GPU power profile based on a submitted job.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 14 ++------------
>   1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> index 2d94f1b63bd6..70777fcfa626 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -363,7 +363,6 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
>   		container_of(work, struct amdgpu_device, vcn.idle_work.work);
>   	unsigned int fences = 0, fence[AMDGPU_MAX_VCN_INSTANCES] = {0};
>   	unsigned int i, j;
> -	int r = 0;
>   
>   	for (j = 0; j < adev->vcn.num_vcn_inst; ++j) {
>   		if (adev->vcn.harvest_config & (1 << j))
> @@ -392,10 +391,6 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
>   	if (!fences && !atomic_read(&adev->vcn.total_submission_cnt)) {
>   		amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
>   		       AMD_PG_STATE_GATE);
> -		r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
> -				false);
> -		if (r)
> -			dev_warn(adev->dev, "(%d) failed to disable video power profile mode\n", r);
>   	} else {
>   		schedule_delayed_work(&adev->vcn.idle_work, VCN_IDLE_TIMEOUT);
>   	}
> @@ -404,16 +399,11 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
>   void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
>   {
>   	struct amdgpu_device *adev = ring->adev;
> -	int r = 0;
>   
>   	atomic_inc(&adev->vcn.total_submission_cnt);
>   
> -	if (!cancel_delayed_work_sync(&adev->vcn.idle_work)) {
> -		r = amdgpu_dpm_switch_power_profile(adev, PP_SMC_POWER_PROFILE_VIDEO,
> -				true);
> -		if (r)
> -			dev_warn(adev->dev, "(%d) failed to switch to video power profile mode\n", r);
> -	}
> +	if (!cancel_delayed_work_sync(&adev->vcn.idle_work))
> +		amdgpu_gfx_off_ctrl(adev, false);
>   
>   	mutex_lock(&adev->vcn.vcn_pg_lock);
>   	amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-21 13:43   ` Shashank Sharma
@ 2023-08-21 13:52     ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 13:52 UTC (permalink / raw)
  To: Shashank Sharma, Arvind Yadav, Christian.Koenig,
	alexander.deucher, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: dri-devel, linux-kernel


On 8/21/2023 7:13 PM, Shashank Sharma wrote:
>
> On 21/08/2023 08:47, Arvind Yadav wrote:
>> This patch adds a suspend function that will clear the GPU
>> power profile before going into suspend state.
>>
>> v2:
>> - Add the new suspend function based on review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
>>   3 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index cd3bf641b630..3b70e657b439 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device 
>> *dev, bool fbcon)
>>         amdgpu_ras_suspend(adev);
>>   +    amdgpu_workload_profile_suspend(adev);
>> +
>>       amdgpu_device_ip_suspend_phase1(adev);
>>         if (!adev->in_s0ix)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index 6367eb88a44d..44ca8e986984 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct 
>> amdgpu_device *adev,
>>       mutex_unlock(&workload->workload_lock);
>>   }
>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
>> +{
>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +    int ret;
>> +
>> +    mutex_lock(&workload->workload_lock);
>> +    cancel_delayed_work_sync(&workload->smu_delayed_work);
>> +
>> +    /* Clear all the set GPU power profile*/
>> +    for (int index = fls(workload->submit_workload_status);
>> +         index > 0; index--) {
>> +        if (workload->submit_workload_status & (1 << index)) {
>> +            atomic_set(&workload->power_profile_ref[index], 0);
>> +            ret = amdgpu_power_profile_clear(adev, index);
>
> Why do we need the checks here ? can't we simply set call 
> power_profile_clear() for all profiles ?

Hi Shashank,

  If we use only one profile, then why clear the others? But I can remove 
the check and clear all the profiles as per your suggestion.

ThankYou,
~Arvind
>
> - Shashank
>
>> +            if (ret)
>> +                DRM_WARN("Failed to clear power profile %s, err = %d\n",
>> +                     amdgpu_workload_mode_name[index], ret);
>> +        }
>> +    }
>
>
>> +    workload->submit_workload_status = 0;
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>   {
>>       adev->smu_workload.adev = adev;
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index ee1f87257f2d..0acd8769ec52 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct 
>> amdgpu_device *adev,
>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>                    uint32_t ring_type);
>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile
  2023-08-21 13:46   ` Shashank Sharma
@ 2023-08-21 13:53     ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 13:53 UTC (permalink / raw)
  To: Shashank Sharma, Arvind Yadav, Christian.Koenig,
	alexander.deucher, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: dri-devel, linux-kernel


On 8/21/2023 7:16 PM, Shashank Sharma wrote:
>
> On 21/08/2023 08:47, Arvind Yadav wrote:
>> This patch is to switch the GPU workload profile based
>> on the submitted job. The workload profile is reset to
>> default when the job is done.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> index c3d9d75143f4..c2b0fda6ba26 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>> @@ -176,6 +176,9 @@ void amdgpu_job_free_resources(struct amdgpu_job 
>> *job)
>>   static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
>>   {
>>       struct amdgpu_job *job = to_amdgpu_job(s_job);
>> +    struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched);
>> +
>> +    amdgpu_workload_profile_put(ring->adev, ring->funcs->type);
>>         drm_sched_job_cleanup(s_job);
>>   @@ -295,6 +298,8 @@ static struct dma_fence *amdgpu_job_run(struct 
>> drm_sched_job *sched_job)
>>               DRM_ERROR("Error scheduling IBs (%d)\n", r);
>>       }
>>   +    amdgpu_workload_profile_set(adev, ring->funcs->type);
>> +
>>       job->job_run_counter++;
>>       amdgpu_job_free_resources(job);
>
> Instead of calling switch on/off in title, may we call it set/reset 
> GPU workload profile ?
>
> With that minor nitpick handled, please feel free to use:
>
Noted.

Thank You
~Arvind
> Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
>
> - Shashank
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload
  2023-08-21 13:35     ` Yadav, Arvind
@ 2023-08-21 13:54       ` Shashank Sharma
  2023-08-21 14:12         ` Yadav, Arvind
  0 siblings, 1 reply; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 13:54 UTC (permalink / raw)
  To: Yadav, Arvind, Arvind Yadav, Christian.Koenig, alexander.deucher,
	Xinhui.Pan, airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel


On 21/08/2023 15:35, Yadav, Arvind wrote:
>
> On 8/21/2023 6:36 PM, Shashank Sharma wrote:
>> Hey Arvind,
>>
>> On 21/08/2023 08:47, Arvind Yadav wrote:
>>> The'struct amdgpu_smu_workload' initialization/cleanup
>>> functions is added by this patch.
>>>
>>> v2:
>>> - Splitting big patch into separate patches.
>>> - Added new fini function.
>>>
>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/Makefile           |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  3 ++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  4 ++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 44 +++++++++++++++
>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h | 53 
>>> +++++++++++++++++++
>>>   5 files changed, 105 insertions(+), 1 deletion(-)
>>>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>   create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>>> index 415a7fa395c4..6a9e187d61e1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>>> @@ -60,7 +60,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>>>       amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
>>>       amdgpu_fw_attestation.o amdgpu_securedisplay.o \
>>>       amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
>>> -    amdgpu_ring_mux.o
>>> +    amdgpu_ring_mux.o amdgpu_workload.o
>>>     amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 02b827785e39..1939fa1af8a6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -107,6 +107,7 @@
>>>   #include "amdgpu_fdinfo.h"
>>>   #include "amdgpu_mca.h"
>>>   #include "amdgpu_ras.h"
>>> +#include "amdgpu_workload.h"
>>>     #define MAX_GPU_INSTANCE        16
>>>   @@ -1050,6 +1051,8 @@ struct amdgpu_device {
>>>         bool                            job_hang;
>>>       bool                            dc_enabled;
>>> +
>>> +    struct amdgpu_smu_workload    smu_workload;
>>>   };
>>>     static inline struct amdgpu_device *drm_to_adev(struct 
>>> drm_device *ddev)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 5c7d40873ee2..cd3bf641b630 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2243,6 +2243,8 @@ static int amdgpu_device_ip_early_init(struct 
>>> amdgpu_device *adev)
>>>       adev->cg_flags &= amdgpu_cg_mask;
>>>       adev->pg_flags &= amdgpu_pg_mask;
>>>   +    amdgpu_workload_profile_init(adev);
>>> +
>>>       return 0;
>>>   }
>>>   @@ -2890,6 +2892,8 @@ static int amdgpu_device_ip_fini(struct 
>>> amdgpu_device *adev)
>>>   {
>>>       int i, r;
>>>   +    amdgpu_workload_profile_fini(adev);
>>> +
>>>       if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
>>>           amdgpu_virt_release_ras_err_handler_data(adev);
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> new file mode 100644
>>> index 000000000000..32166f482f77
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> @@ -0,0 +1,44 @@
>>> +// SPDX-License-Identifier: MIT
>>> +/*
>>> + * Copyright 2023 Advanced Micro Devices, Inc.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person 
>>> obtaining a
>>> + * copy of this software and associated documentation files (the 
>>> "Software"),
>>> + * to deal in the Software without restriction, including without 
>>> limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, 
>>> sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to 
>>> whom the
>>> + * Software is furnished to do so, subject to the following 
>>> conditions:
>>> + *
>>> + * The above copyright notice and this permission notice shall be 
>>> included in
>>> + * all copies or substantial portions of the Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>>> EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>>> MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO 
>>> EVENT SHALL
>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>>> DAMAGES OR
>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>>> OTHERWISE,
>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>>> USE OR
>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>> + *
>>> + */
>>> +
>>> +#include "amdgpu.h"
>>> +
>>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>> +{
>>> +    adev->smu_workload.adev = adev;
>>> +    adev->smu_workload.submit_workload_status = 0;
>>> +    adev->smu_workload.initialized = true;
>> why do we need this variable ?
>
> Hi Shashank,
>
> If any error occurs while the device is booting, then amdgpu will start 
> unloading everything.
> So I am using 'initialized' to unload the driver cleanly. This 
> variable identifies whether the driver is loaded or not.

I am not sure I am getting this right. This variable is only getting 
used in this patch, just being set and reset.

How does this flag help us ? I guess if the AMDGPU driver is getting 
unloaded, we already know that we can't set a power profile.

- Shashank

>
> Below is the error for which the amdgpu driver unloads when it does 
> not get the firmware.
>
> [   12.421609] amdgpu 0000:08:00.0: Direct firmware load for 
> amdgpu/renoir_ta.bin failed with error -2
> [   12.421618] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of 
> IP block <psp> failed -19
> [   12.428207] [drm] VCN decode is enabled in VM mode
> [   12.428212] [drm] VCN encode is enabled in VM mode
> [   12.430925] [drm] JPEG decode is enabled in VM mode
> [   12.430931] amdgpu 0000:08:00.0: amdgpu: Fatal error during GPU init
> [   12.431184] amdgpu 0000:08:00.0: amdgpu: amdgpu: finishing device.
> [   12.431296] ------------[ cut here ]------------
> [   12.431297] WARNING: CPU: 3 PID: 438 at kernel/workqueue.c:3379 
> __flush_work+0x22f/0x240
> [   12.431305] Modules linked in: ledtrig_audio snd_hda_codec_hdmi 
> snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec 
> snd_hda_core amdgpu(OE+) snd_hwdep snd_pcm kvm snd_seq_midi 
> snd_seq_midi_event drm_exec amdxcp snd_rawmidi iommu_v2 
> crct10dif_pclmul drm_buddy gpu_sched ghash_clmulni_intel sha512_ssse3 
> snd_seq drm_suballoc_helper aesni_intel drm_ttm_helper binfmt_misc 
> crypto_simd snd_seq_device ttm cryptd snd_timer drm_display_helper 
> input_leds rapl joydev cec wmi_bmof rc_core snd drm_kms_helper k10temp 
> ccp soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp parport 
> ramoops reed_solomon drm pstore_blk pstore_zone efi_pstore ip_tables 
> x_tables autofs4 hid_generic usbhid hid crc32_pclmul nvme igb ahci 
> i2c_piix4 xhci_pci i2c_algo_bit nvme_core libahci xhci_pci_renesas dca 
> video wmi
> [   12.431360] CPU: 3 PID: 438 Comm: systemd-udevd Tainted: G        
> W  OE      6.5.0-rc2-custom #1
> [   12.431362] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS 
> ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
> [   12.431364] RIP: 0010:__flush_work+0x22f/0x240
> [   12.431367] Code: 8b 43 30 48 8b 53 40 89 c1 e9 f9 fe ff ff 4c 89 
> f7 e8 45 0b db 00 e8 90 f5 08 00 45 31 ff e9 11 ff ff ff 0f 0b e9 0a 
> ff ff ff <0f> 0b 45 31 ff e9 00 ff ff ff e8 02 a0 d9 00 66 90 90 90 90 
> 90 90
> [   12.431368] RSP: 0018:ffffb0668156f818 EFLAGS: 00010246
> [   12.431370] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
> 0000000000000000
> [   12.431371] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 
> ffff9cea492c7840
> [   12.431372] RBP: ffffb0668156f890 R08: 0000000000000000 R09: 
> ffffb0668156f7a0
> [   12.431372] R10: 0000000000000001 R11: 0000000000000001 R12: 
> ffff9cea492c7840
> [   12.431373] R13: 0000000000000001 R14: ffff9cea43839940 R15: 
> 0000000000000001
> [   12.431374] FS:  00007fde83c18880(0000) GS:ffff9cf15e2c0000(0000) 
> knlGS:0000000000000000
> [   12.431375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   12.431376] CR2: 00007f2648000010 CR3: 00000001059e2000 CR4: 
> 0000000000350ee0
> [   12.431377] Call Trace:
> [   12.431379]  <TASK>
> [   12.431384]  ? show_regs+0x68/0x70
> [   12.431388]  ? __flush_work+0x22f/0x240
> [   12.431389]  ? __warn+0x8f/0x150
> [   12.431392]  ? __flush_work+0x22f/0x240
> [   12.431394]  ? report_bug+0x1f5/0x200
> [   12.431399]  ? handle_bug+0x46/0x80
> [   12.431402]  ? exc_invalid_op+0x19/0x70
> [   12.431404]  ? asm_exc_invalid_op+0x1b/0x20
> [   12.431408]  ? __flush_work+0x22f/0x240
> [   12.431410]  ? irq_work_queue+0x10/0x60
> [   12.431414]  ? __wake_up_klogd.part.0+0x5a/0x80
> [   12.431419]  __cancel_work_timer+0x124/0x1b0
> [   12.431421]  ? _printk+0x58/0x80
> [   12.431423]  cancel_delayed_work_sync+0x13/0x20
> [   12.431427]  amdgpu_workload_profile_fini+0x25/0x40 [amdgpu]
> [   12.431854]  amdgpu_device_fini_sw+0x33/0x550 [amdgpu]
> [   12.432035]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
> [   12.432213]  drm_dev_release+0x28/0x50 [drm]
> [   12.432256]  devm_drm_dev_init_release+0x38/0x60 [drm]
> [   12.432278]  devm_action_release+0x15/0x20
> [   12.432283]  release_nodes+0x40/0xc0
> [   12.432285]  devres_release_all+0x9e/0xe0
> [   12.432286]  device_unbind_cleanup+0x12/0x80
> [   12.432289]  really_probe+0x116/0x3e0
> [   12.432291]  __driver_probe_device+0x7e/0x170
> [   12.432293]  driver_probe_device+0x23/0xa0
> [   12.432295]  __driver_attach+0xc5/0x190
> [   12.432297]  ? __pfx___driver_attach+0x10/0x10
> [   12.432299]  bus_for_each_dev+0x7c/0xd0
> [   12.432302]  driver_attach+0x1e/0x30
> [   12.432304]  bus_add_driver+0x11c/0x220
> [   12.432306]  driver_register+0x64/0x130
> [   12.432309]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
> [   12.432491]  __pci_register_driver+0x68/0x70
> [   12.432494]  amdgpu_init+0x63/0xff0 [amdgpu]
> [   12.432667]  do_one_initcall+0x48/0x310
> [   12.432671]  ? kmalloc_trace+0x2a/0xa0
> [   12.432675]  do_init_module+0x6a/0x260
> [   12.432677]  load_module+0x1db3/0x2050
> [   12.432681]  init_module_from_file+0x9c/0xe0
> [   12.432682]  ? init_module_from_file+0x9c/0xe0
> [   12.432685]  idempotent_init_module+0x179/0x230
> [   12.432687]  __x64_sys_finit_module+0x5d/0xb0
> [   12.432689]  do_syscall_64+0x3b/0x90
> [   12.432691]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>
>>> +
>>> +    mutex_init(&adev->smu_workload.workload_lock);
>>> +}
>>> +
>>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>> +{
>>> +    if (!adev->smu_workload.initialized)
>>> +        return;
>>> +
>>> +    adev->smu_workload.submit_workload_status = 0;
>>> +    adev->smu_workload.initialized = false;
>>> +    mutex_destroy(&adev->smu_workload.workload_lock);
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> new file mode 100644
>>> index 000000000000..5d0f068422d4
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> @@ -0,0 +1,53 @@
>>> +/* SPDX-License-Identifier: MIT */
>>> +/*
>>> + * Copyright 2023 Advanced Micro Devices, Inc.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person 
>>> obtaining a
>>> + * copy of this software and associated documentation files (the 
>>> "Software"),
>>> + * to deal in the Software without restriction, including without 
>>> limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, 
>>> sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to 
>>> whom the
>>> + * Software is furnished to do so, subject to the following 
>>> conditions:
>>> + *
>>> + * The above copyright notice and this permission notice shall be 
>>> included in
>>> + * all copies or substantial portions of the Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>>> EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>>> MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO 
>>> EVENT SHALL
>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>>> DAMAGES OR
>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>>> OTHERWISE,
>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>>> USE OR
>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>> + *
>>> + */
>>> +
>>> +#ifndef _AMDGPU_WORKLOAD_H_
>>> +#define _AMDGPU_WORKLOAD_H_
>>> +
>>> +struct amdgpu_smu_workload {
>>> +    struct amdgpu_device    *adev;
>>> +    struct mutex        workload_lock;
>>> +    struct delayed_work    smu_delayed_work;
>>
>> call it power_profile_work instead ? Looks good otherwise.
>>
> Noted.
>
> Thank you
>
> ~Arvind
>
>> - Shashank
>>
>>> +    uint32_t submit_workload_status;
>>> +    bool            initialized;
>>> +    atomic_t power_profile_ref[PP_SMC_POWER_PROFILE_COUNT];
>>> +};
>>> +
>>> +/* Workload mode names */
>>> +static const char * const amdgpu_workload_mode_name[] = {
>>> +    "Default",
>>> +    "3D",
>>> +    "Powersaving",
>>> +    "Video",
>>> +    "VR",
>>> +    "Compute",
>>> +    "Custom",
>>> +    "Window3D"
>>> +};
>>> +
>>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>> +
>>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>>> +
>>> +#endif

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload
  2023-08-21 13:54       ` Shashank Sharma
@ 2023-08-21 14:12         ` Yadav, Arvind
  2023-08-21 14:27           ` Shashank Sharma
  0 siblings, 1 reply; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 14:12 UTC (permalink / raw)
  To: Shashank Sharma, Arvind Yadav, Christian.Koenig,
	alexander.deucher, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: dri-devel, linux-kernel


On 8/21/2023 7:24 PM, Shashank Sharma wrote:
>
> On 21/08/2023 15:35, Yadav, Arvind wrote:
>>
>> On 8/21/2023 6:36 PM, Shashank Sharma wrote:
>>> Hey Arvind,
>>>
>>> On 21/08/2023 08:47, Arvind Yadav wrote:
>>>> The 'struct amdgpu_smu_workload' initialization/cleanup
>>>> functions are added by this patch.
>>>>
>>>> v2:
>>>> - Splitting big patch into separate patches.
>>>> - Added new fini function.
>>>>
>>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/Makefile           |  2 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  3 ++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  4 ++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 44 +++++++++++++++
>>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h | 53 
>>>> +++++++++++++++++++
>>>>   5 files changed, 105 insertions(+), 1 deletion(-)
>>>>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>   create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>>>> index 415a7fa395c4..6a9e187d61e1 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>>>> @@ -60,7 +60,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>>>>       amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
>>>>       amdgpu_fw_attestation.o amdgpu_securedisplay.o \
>>>>       amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
>>>> -    amdgpu_ring_mux.o
>>>> +    amdgpu_ring_mux.o amdgpu_workload.o
>>>>     amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
>>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index 02b827785e39..1939fa1af8a6 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -107,6 +107,7 @@
>>>>   #include "amdgpu_fdinfo.h"
>>>>   #include "amdgpu_mca.h"
>>>>   #include "amdgpu_ras.h"
>>>> +#include "amdgpu_workload.h"
>>>>     #define MAX_GPU_INSTANCE        16
>>>>   @@ -1050,6 +1051,8 @@ struct amdgpu_device {
>>>>         bool                            job_hang;
>>>>       bool                            dc_enabled;
>>>> +
>>>> +    struct amdgpu_smu_workload    smu_workload;
>>>>   };
>>>>     static inline struct amdgpu_device *drm_to_adev(struct 
>>>> drm_device *ddev)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 5c7d40873ee2..cd3bf641b630 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -2243,6 +2243,8 @@ static int amdgpu_device_ip_early_init(struct 
>>>> amdgpu_device *adev)
>>>>       adev->cg_flags &= amdgpu_cg_mask;
>>>>       adev->pg_flags &= amdgpu_pg_mask;
>>>>   +    amdgpu_workload_profile_init(adev);
>>>> +
>>>>       return 0;
>>>>   }
>>>>   @@ -2890,6 +2892,8 @@ static int amdgpu_device_ip_fini(struct 
>>>> amdgpu_device *adev)
>>>>   {
>>>>       int i, r;
>>>>   +    amdgpu_workload_profile_fini(adev);
>>>> +
>>>>       if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
>>>>           amdgpu_virt_release_ras_err_handler_data(adev);
>>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> new file mode 100644
>>>> index 000000000000..32166f482f77
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> @@ -0,0 +1,44 @@
>>>> +// SPDX-License-Identifier: MIT
>>>> +/*
>>>> + * Copyright 2023 Advanced Micro Devices, Inc.
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person 
>>>> obtaining a
>>>> + * copy of this software and associated documentation files (the 
>>>> "Software"),
>>>> + * to deal in the Software without restriction, including without 
>>>> limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute, 
>>>> sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to 
>>>> whom the
>>>> + * Software is furnished to do so, subject to the following 
>>>> conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice shall be 
>>>> included in
>>>> + * all copies or substantial portions of the Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>>>> EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>>>> MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO 
>>>> EVENT SHALL
>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>>>> DAMAGES OR
>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>>>> OTHERWISE,
>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>>>> USE OR
>>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>>> + *
>>>> + */
>>>> +
>>>> +#include "amdgpu.h"
>>>> +
>>>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>>> +{
>>>> +    adev->smu_workload.adev = adev;
>>>> +    adev->smu_workload.submit_workload_status = 0;
>>>> +    adev->smu_workload.initialized = true;
>>> why do we need this variable ?
>>
>> Hi Shashank,
>>
>> If any error occurs while the device is booting, then amdgpu will start
>> unloading everything.
>> So I am using 'initialized' to unload the driver safely. This
>> variable identifies whether this component was initialized or not.
>
> I am not sure if I am getting this right. This variable is only 
> getting used in this patch here, just being set and reset.
>
> How does this flag help us ? I guess if AMDGPU driver is getting 
> unloaded we already know that we can't set power profile.
>
We set "initialized = true" in init and check in
amdgpu_workload_profile_fini() whether 'initialized' is set, because
amdgpu_workload_profile_fini() destroys the mutex, and the same applies
to the delayed work which I have implemented in patch 0003.

In the error case below, amdgpu_workload_profile_init() was never
called because the psp firmware failed to load, but the amdgpu driver
still calls all the unload functions, including amdgpu_workload_profile_fini().

Thank you
~Arvind

> - Shashank
>
>>
>> This is the error below, for which the amdgpu driver unloads when
>> it cannot load the firmware.
>>
>> [   12.421609] amdgpu 0000:08:00.0: Direct firmware load for 
>> amdgpu/renoir_ta.bin failed with error -2
>> [   12.421618] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init 
>> of IP block <psp> failed -19
>> [   12.428207] [drm] VCN decode is enabled in VM mode
>> [   12.428212] [drm] VCN encode is enabled in VM mode
>> [   12.430925] [drm] JPEG decode is enabled in VM mode
>> [   12.430931] amdgpu 0000:08:00.0: amdgpu: Fatal error during GPU init
>> [   12.431184] amdgpu 0000:08:00.0: amdgpu: amdgpu: finishing device.
>> [   12.431296] ------------[ cut here ]------------
>> [   12.431297] WARNING: CPU: 3 PID: 438 at kernel/workqueue.c:3379 
>> __flush_work+0x22f/0x240
>> [   12.431305] Modules linked in: ledtrig_audio snd_hda_codec_hdmi 
>> snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec 
>> snd_hda_core amdgpu(OE+) snd_hwdep snd_pcm kvm snd_seq_midi 
>> snd_seq_midi_event drm_exec amdxcp snd_rawmidi iommu_v2 
>> crct10dif_pclmul drm_buddy gpu_sched ghash_clmulni_intel sha512_ssse3 
>> snd_seq drm_suballoc_helper aesni_intel drm_ttm_helper binfmt_misc 
>> crypto_simd snd_seq_device ttm cryptd snd_timer drm_display_helper 
>> input_leds rapl joydev cec wmi_bmof rc_core snd drm_kms_helper 
>> k10temp ccp soundcore mac_hid sch_fq_codel msr parport_pc ppdev lp 
>> parport ramoops reed_solomon drm pstore_blk pstore_zone efi_pstore 
>> ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul nvme 
>> igb ahci i2c_piix4 xhci_pci i2c_algo_bit nvme_core libahci 
>> xhci_pci_renesas dca video wmi
>> [   12.431360] CPU: 3 PID: 438 Comm: systemd-udevd Tainted: G        
>> W  OE      6.5.0-rc2-custom #1
>> [   12.431362] Hardware name: Gigabyte Technology Co., Ltd. X570 
>> AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
>> [   12.431364] RIP: 0010:__flush_work+0x22f/0x240
>> [   12.431367] Code: 8b 43 30 48 8b 53 40 89 c1 e9 f9 fe ff ff 4c 89 
>> f7 e8 45 0b db 00 e8 90 f5 08 00 45 31 ff e9 11 ff ff ff 0f 0b e9 0a 
>> ff ff ff <0f> 0b 45 31 ff e9 00 ff ff ff e8 02 a0 d9 00 66 90 90 90 
>> 90 90 90
>> [   12.431368] RSP: 0018:ffffb0668156f818 EFLAGS: 00010246
>> [   12.431370] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
>> 0000000000000000
>> [   12.431371] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 
>> ffff9cea492c7840
>> [   12.431372] RBP: ffffb0668156f890 R08: 0000000000000000 R09: 
>> ffffb0668156f7a0
>> [   12.431372] R10: 0000000000000001 R11: 0000000000000001 R12: 
>> ffff9cea492c7840
>> [   12.431373] R13: 0000000000000001 R14: ffff9cea43839940 R15: 
>> 0000000000000001
>> [   12.431374] FS:  00007fde83c18880(0000) GS:ffff9cf15e2c0000(0000) 
>> knlGS:0000000000000000
>> [   12.431375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   12.431376] CR2: 00007f2648000010 CR3: 00000001059e2000 CR4: 
>> 0000000000350ee0
>> [   12.431377] Call Trace:
>> [   12.431379]  <TASK>
>> [   12.431384]  ? show_regs+0x68/0x70
>> [   12.431388]  ? __flush_work+0x22f/0x240
>> [   12.431389]  ? __warn+0x8f/0x150
>> [   12.431392]  ? __flush_work+0x22f/0x240
>> [   12.431394]  ? report_bug+0x1f5/0x200
>> [   12.431399]  ? handle_bug+0x46/0x80
>> [   12.431402]  ? exc_invalid_op+0x19/0x70
>> [   12.431404]  ? asm_exc_invalid_op+0x1b/0x20
>> [   12.431408]  ? __flush_work+0x22f/0x240
>> [   12.431410]  ? irq_work_queue+0x10/0x60
>> [   12.431414]  ? __wake_up_klogd.part.0+0x5a/0x80
>> [   12.431419]  __cancel_work_timer+0x124/0x1b0
>> [   12.431421]  ? _printk+0x58/0x80
>> [   12.431423]  cancel_delayed_work_sync+0x13/0x20
>> [   12.431427]  amdgpu_workload_profile_fini+0x25/0x40 [amdgpu]
>> [   12.431854]  amdgpu_device_fini_sw+0x33/0x550 [amdgpu]
>> [   12.432035]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
>> [   12.432213]  drm_dev_release+0x28/0x50 [drm]
>> [   12.432256]  devm_drm_dev_init_release+0x38/0x60 [drm]
>> [   12.432278]  devm_action_release+0x15/0x20
>> [   12.432283]  release_nodes+0x40/0xc0
>> [   12.432285]  devres_release_all+0x9e/0xe0
>> [   12.432286]  device_unbind_cleanup+0x12/0x80
>> [   12.432289]  really_probe+0x116/0x3e0
>> [   12.432291]  __driver_probe_device+0x7e/0x170
>> [   12.432293]  driver_probe_device+0x23/0xa0
>> [   12.432295]  __driver_attach+0xc5/0x190
>> [   12.432297]  ? __pfx___driver_attach+0x10/0x10
>> [   12.432299]  bus_for_each_dev+0x7c/0xd0
>> [   12.432302]  driver_attach+0x1e/0x30
>> [   12.432304]  bus_add_driver+0x11c/0x220
>> [   12.432306]  driver_register+0x64/0x130
>> [   12.432309]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>> [   12.432491]  __pci_register_driver+0x68/0x70
>> [   12.432494]  amdgpu_init+0x63/0xff0 [amdgpu]
>> [   12.432667]  do_one_initcall+0x48/0x310
>> [   12.432671]  ? kmalloc_trace+0x2a/0xa0
>> [   12.432675]  do_init_module+0x6a/0x260
>> [   12.432677]  load_module+0x1db3/0x2050
>> [   12.432681]  init_module_from_file+0x9c/0xe0
>> [   12.432682]  ? init_module_from_file+0x9c/0xe0
>> [   12.432685]  idempotent_init_module+0x179/0x230
>> [   12.432687]  __x64_sys_finit_module+0x5d/0xb0
>> [   12.432689]  do_syscall_64+0x3b/0x90
>> [   12.432691]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>>
>>>> +
>>>> +    mutex_init(&adev->smu_workload.workload_lock);
>>>> +}
>>>> +
>>>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>>> +{
>>>> +    if (!adev->smu_workload.initialized)
>>>> +        return;
>>>> +
>>>> +    adev->smu_workload.submit_workload_status = 0;
>>>> +    adev->smu_workload.initialized = false;
>>>> +    mutex_destroy(&adev->smu_workload.workload_lock);
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> new file mode 100644
>>>> index 000000000000..5d0f068422d4
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> @@ -0,0 +1,53 @@
>>>> +/* SPDX-License-Identifier: MIT */
>>>> +/*
>>>> + * Copyright 2023 Advanced Micro Devices, Inc.
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person 
>>>> obtaining a
>>>> + * copy of this software and associated documentation files (the 
>>>> "Software"),
>>>> + * to deal in the Software without restriction, including without 
>>>> limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute, 
>>>> sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to 
>>>> whom the
>>>> + * Software is furnished to do so, subject to the following 
>>>> conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice shall be 
>>>> included in
>>>> + * all copies or substantial portions of the Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>>>> EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>>>> MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO 
>>>> EVENT SHALL
>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>>>> DAMAGES OR
>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>>>> OTHERWISE,
>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>>>> USE OR
>>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>>> + *
>>>> + */
>>>> +
>>>> +#ifndef _AMDGPU_WORKLOAD_H_
>>>> +#define _AMDGPU_WORKLOAD_H_
>>>> +
>>>> +struct amdgpu_smu_workload {
>>>> +    struct amdgpu_device    *adev;
>>>> +    struct mutex        workload_lock;
>>>> +    struct delayed_work    smu_delayed_work;
>>>
>>> call it power_profile_work instead ? Looks good otherwise.
>>>
>> Noted.
>>
>> Thank you
>>
>> ~Arvind
>>
>>> - Shashank
>>>
>>>> +    uint32_t submit_workload_status;
>>>> +    bool            initialized;
>>>> +    atomic_t power_profile_ref[PP_SMC_POWER_PROFILE_COUNT];
>>>> +};
>>>> +
>>>> +/* Workload mode names */
>>>> +static const char * const amdgpu_workload_mode_name[] = {
>>>> +    "Default",
>>>> +    "3D",
>>>> +    "Powersaving",
>>>> +    "Video",
>>>> +    "VR",
>>>> +    "Compute",
>>>> +    "Custom",
>>>> +    "Window3D"
>>>> +};
>>>> +
>>>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>>> +
>>>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>>>> +
>>>> +#endif

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload
  2023-08-21 14:12         ` Yadav, Arvind
@ 2023-08-21 14:27           ` Shashank Sharma
  0 siblings, 0 replies; 39+ messages in thread
From: Shashank Sharma @ 2023-08-21 14:27 UTC (permalink / raw)
  To: Yadav, Arvind, Arvind Yadav, Christian.Koenig, alexander.deucher,
	Xinhui.Pan, airlied, daniel, Felix.Kuehling, amd-gfx
  Cc: dri-devel, linux-kernel


On 21/08/2023 16:12, Yadav, Arvind wrote:
>
> On 8/21/2023 7:24 PM, Shashank Sharma wrote:
>>
>> On 21/08/2023 15:35, Yadav, Arvind wrote:
>>>
>>> On 8/21/2023 6:36 PM, Shashank Sharma wrote:
>>>> Hey Arvind,
>>>>
>>>> On 21/08/2023 08:47, Arvind Yadav wrote:
>>>>> The 'struct amdgpu_smu_workload' initialization/cleanup
>>>>> functions are added by this patch.
>>>>>
>>>>> v2:
>>>>> - Splitting big patch into separate patches.
>>>>> - Added new fini function.
>>>>>
>>>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdgpu/Makefile           |  2 +-
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  3 ++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  4 ++
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 44 +++++++++++++++
>>>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h | 53 
>>>>> +++++++++++++++++++
>>>>>   5 files changed, 105 insertions(+), 1 deletion(-)
>>>>>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>>   create mode 100644 drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile 
>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>>>>> index 415a7fa395c4..6a9e187d61e1 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>>>>> @@ -60,7 +60,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>>>>>       amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \
>>>>>       amdgpu_fw_attestation.o amdgpu_securedisplay.o \
>>>>>       amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
>>>>> -    amdgpu_ring_mux.o
>>>>> +    amdgpu_ring_mux.o amdgpu_workload.o
>>>>>     amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
>>>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index 02b827785e39..1939fa1af8a6 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -107,6 +107,7 @@
>>>>>   #include "amdgpu_fdinfo.h"
>>>>>   #include "amdgpu_mca.h"
>>>>>   #include "amdgpu_ras.h"
>>>>> +#include "amdgpu_workload.h"
>>>>>     #define MAX_GPU_INSTANCE        16
>>>>>   @@ -1050,6 +1051,8 @@ struct amdgpu_device {
>>>>>         bool                            job_hang;
>>>>>       bool                            dc_enabled;
>>>>> +
>>>>> +    struct amdgpu_smu_workload    smu_workload;
>>>>>   };
>>>>>     static inline struct amdgpu_device *drm_to_adev(struct 
>>>>> drm_device *ddev)
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 5c7d40873ee2..cd3bf641b630 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -2243,6 +2243,8 @@ static int 
>>>>> amdgpu_device_ip_early_init(struct amdgpu_device *adev)
>>>>>       adev->cg_flags &= amdgpu_cg_mask;
>>>>>       adev->pg_flags &= amdgpu_pg_mask;
>>>>>   +    amdgpu_workload_profile_init(adev);
>>>>> +
>>>>>       return 0;
>>>>>   }
>>>>>   @@ -2890,6 +2892,8 @@ static int amdgpu_device_ip_fini(struct 
>>>>> amdgpu_device *adev)
>>>>>   {
>>>>>       int i, r;
>>>>>   +    amdgpu_workload_profile_fini(adev);
>>>>> +
>>>>>       if (amdgpu_sriov_vf(adev) && adev->virt.ras_init_done)
>>>>>           amdgpu_virt_release_ras_err_handler_data(adev);
>>>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>> new file mode 100644
>>>>> index 000000000000..32166f482f77
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>> @@ -0,0 +1,44 @@
>>>>> +// SPDX-License-Identifier: MIT
>>>>> +/*
>>>>> + * Copyright 2023 Advanced Micro Devices, Inc.
>>>>> + *
>>>>> + * Permission is hereby granted, free of charge, to any person 
>>>>> obtaining a
>>>>> + * copy of this software and associated documentation files (the 
>>>>> "Software"),
>>>>> + * to deal in the Software without restriction, including without 
>>>>> limitation
>>>>> + * the rights to use, copy, modify, merge, publish, distribute, 
>>>>> sublicense,
>>>>> + * and/or sell copies of the Software, and to permit persons to 
>>>>> whom the
>>>>> + * Software is furnished to do so, subject to the following 
>>>>> conditions:
>>>>> + *
>>>>> + * The above copyright notice and this permission notice shall be 
>>>>> included in
>>>>> + * all copies or substantial portions of the Software.
>>>>> + *
>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY 
>>>>> KIND, EXPRESS OR
>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>>>>> MERCHANTABILITY,
>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO 
>>>>> EVENT SHALL
>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>>>>> DAMAGES OR
>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>>>>> OTHERWISE,
>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>>>>> USE OR
>>>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>>>> + *
>>>>> + */
>>>>> +
>>>>> +#include "amdgpu.h"
>>>>> +
>>>>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>>>> +{
>>>>> +    adev->smu_workload.adev = adev;
>>>>> +    adev->smu_workload.submit_workload_status = 0;
>>>>> +    adev->smu_workload.initialized = true;
>>>> why do we need this variable ?
>>>
>>> Hi Shashank,
>>>
>>> If any error occurs while the device is booting, then amdgpu will
>>> start unloading everything.
>>> So I am using 'initialized' to unload the driver safely.
>>> This variable identifies whether this component was initialized or not.
>>
>> I am not sure if I am getting this right. This variable is only 
>> getting used in this patch here, just being set and reset.
>>
>> How does this flag help us ? I guess if AMDGPU driver is getting 
>> unloaded we already know that we can't set power profile.
>>
> We set "initialized = true" in init and check in
> amdgpu_workload_profile_fini() whether 'initialized' is set,
> because
> amdgpu_workload_profile_fini() destroys the mutex, and the same
> applies to the delayed work which I have implemented in patch 0003.
>
> In the error case below, amdgpu_workload_profile_init() was never
> called because the psp firmware failed to load, but the amdgpu driver
> still calls all the unload functions, including amdgpu_workload_profile_fini().

Ah, makes sense.

With the minor comments fixed below, please feel free to use:

Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>

- Shashank

>
> Thank you
> ~Arvind
>
>> - Shashank
>>
>>>
>>> This is the error below, for which the amdgpu driver unloads
>>> when it cannot load the firmware.
>>> [   12.421609] amdgpu 0000:08:00.0: Direct firmware load for 
>>> amdgpu/renoir_ta.bin failed with error -2
>>> [   12.421618] [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init 
>>> of IP block <psp> failed -19
>>> [   12.428207] [drm] VCN decode is enabled in VM mode
>>> [   12.428212] [drm] VCN encode is enabled in VM mode
>>> [   12.430925] [drm] JPEG decode is enabled in VM mode
>>> [   12.430931] amdgpu 0000:08:00.0: amdgpu: Fatal error during GPU init
>>> [   12.431184] amdgpu 0000:08:00.0: amdgpu: amdgpu: finishing device.
>>> [   12.431296] ------------[ cut here ]------------
>>> [   12.431297] WARNING: CPU: 3 PID: 438 at kernel/workqueue.c:3379 
>>> __flush_work+0x22f/0x240
>>> [   12.431305] Modules linked in: ledtrig_audio snd_hda_codec_hdmi 
>>> snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec 
>>> snd_hda_core amdgpu(OE+) snd_hwdep snd_pcm kvm snd_seq_midi 
>>> snd_seq_midi_event drm_exec amdxcp snd_rawmidi iommu_v2 
>>> crct10dif_pclmul drm_buddy gpu_sched ghash_clmulni_intel 
>>> sha512_ssse3 snd_seq drm_suballoc_helper aesni_intel drm_ttm_helper 
>>> binfmt_misc crypto_simd snd_seq_device ttm cryptd snd_timer 
>>> drm_display_helper input_leds rapl joydev cec wmi_bmof rc_core snd 
>>> drm_kms_helper k10temp ccp soundcore mac_hid sch_fq_codel msr 
>>> parport_pc ppdev lp parport ramoops reed_solomon drm pstore_blk 
>>> pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid 
>>> hid crc32_pclmul nvme igb ahci i2c_piix4 xhci_pci i2c_algo_bit 
>>> nvme_core libahci xhci_pci_renesas dca video wmi
>>> [   12.431360] CPU: 3 PID: 438 Comm: systemd-udevd Tainted: G        
>>> W  OE      6.5.0-rc2-custom #1
>>> [   12.431362] Hardware name: Gigabyte Technology Co., Ltd. X570 
>>> AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
>>> [   12.431364] RIP: 0010:__flush_work+0x22f/0x240
>>> [   12.431367] Code: 8b 43 30 48 8b 53 40 89 c1 e9 f9 fe ff ff 4c 89 
>>> f7 e8 45 0b db 00 e8 90 f5 08 00 45 31 ff e9 11 ff ff ff 0f 0b e9 0a 
>>> ff ff ff <0f> 0b 45 31 ff e9 00 ff ff ff e8 02 a0 d9 00 66 90 90 90 
>>> 90 90 90
>>> [   12.431368] RSP: 0018:ffffb0668156f818 EFLAGS: 00010246
>>> [   12.431370] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
>>> 0000000000000000
>>> [   12.431371] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 
>>> ffff9cea492c7840
>>> [   12.431372] RBP: ffffb0668156f890 R08: 0000000000000000 R09: 
>>> ffffb0668156f7a0
>>> [   12.431372] R10: 0000000000000001 R11: 0000000000000001 R12: 
>>> ffff9cea492c7840
>>> [   12.431373] R13: 0000000000000001 R14: ffff9cea43839940 R15: 
>>> 0000000000000001
>>> [   12.431374] FS:  00007fde83c18880(0000) GS:ffff9cf15e2c0000(0000) 
>>> knlGS:0000000000000000
>>> [   12.431375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   12.431376] CR2: 00007f2648000010 CR3: 00000001059e2000 CR4: 
>>> 0000000000350ee0
>>> [   12.431377] Call Trace:
>>> [   12.431379]  <TASK>
>>> [   12.431384]  ? show_regs+0x68/0x70
>>> [   12.431388]  ? __flush_work+0x22f/0x240
>>> [   12.431389]  ? __warn+0x8f/0x150
>>> [   12.431392]  ? __flush_work+0x22f/0x240
>>> [   12.431394]  ? report_bug+0x1f5/0x200
>>> [   12.431399]  ? handle_bug+0x46/0x80
>>> [   12.431402]  ? exc_invalid_op+0x19/0x70
>>> [   12.431404]  ? asm_exc_invalid_op+0x1b/0x20
>>> [   12.431408]  ? __flush_work+0x22f/0x240
>>> [   12.431410]  ? irq_work_queue+0x10/0x60
>>> [   12.431414]  ? __wake_up_klogd.part.0+0x5a/0x80
>>> [   12.431419]  __cancel_work_timer+0x124/0x1b0
>>> [   12.431421]  ? _printk+0x58/0x80
>>> [   12.431423]  cancel_delayed_work_sync+0x13/0x20
>>> [   12.431427]  amdgpu_workload_profile_fini+0x25/0x40 [amdgpu]
>>> [   12.431854]  amdgpu_device_fini_sw+0x33/0x550 [amdgpu]
>>> [   12.432035]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
>>> [   12.432213]  drm_dev_release+0x28/0x50 [drm]
>>> [   12.432256]  devm_drm_dev_init_release+0x38/0x60 [drm]
>>> [   12.432278]  devm_action_release+0x15/0x20
>>> [   12.432283]  release_nodes+0x40/0xc0
>>> [   12.432285]  devres_release_all+0x9e/0xe0
>>> [   12.432286]  device_unbind_cleanup+0x12/0x80
>>> [   12.432289]  really_probe+0x116/0x3e0
>>> [   12.432291]  __driver_probe_device+0x7e/0x170
>>> [   12.432293]  driver_probe_device+0x23/0xa0
>>> [   12.432295]  __driver_attach+0xc5/0x190
>>> [   12.432297]  ? __pfx___driver_attach+0x10/0x10
>>> [   12.432299]  bus_for_each_dev+0x7c/0xd0
>>> [   12.432302]  driver_attach+0x1e/0x30
>>> [   12.432304]  bus_add_driver+0x11c/0x220
>>> [   12.432306]  driver_register+0x64/0x130
>>> [   12.432309]  ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>> [   12.432491]  __pci_register_driver+0x68/0x70
>>> [   12.432494]  amdgpu_init+0x63/0xff0 [amdgpu]
>>> [   12.432667]  do_one_initcall+0x48/0x310
>>> [   12.432671]  ? kmalloc_trace+0x2a/0xa0
>>> [   12.432675]  do_init_module+0x6a/0x260
>>> [   12.432677]  load_module+0x1db3/0x2050
>>> [   12.432681]  init_module_from_file+0x9c/0xe0
>>> [   12.432682]  ? init_module_from_file+0x9c/0xe0
>>> [   12.432685]  idempotent_init_module+0x179/0x230
>>> [   12.432687]  __x64_sys_finit_module+0x5d/0xb0
>>> [   12.432689]  do_syscall_64+0x3b/0x90
>>> [   12.432691]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>>>
>>>>> +
>>>>> +    mutex_init(&adev->smu_workload.workload_lock);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>>>> +{
>>>>> +    if (!adev->smu_workload.initialized)
>>>>> +        return;
>>>>> +
>>>>> +    adev->smu_workload.submit_workload_status = 0;
>>>>> +    adev->smu_workload.initialized = false;
>>>>> +    mutex_destroy(&adev->smu_workload.workload_lock);
>>>>> +}
>>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>> new file mode 100644
>>>>> index 000000000000..5d0f068422d4
>>>>> --- /dev/null
>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>> @@ -0,0 +1,53 @@
>>>>> +/* SPDX-License-Identifier: MIT */
>>>>> +/*
>>>>> + * Copyright 2023 Advanced Micro Devices, Inc.
>>>>> + *
>>>>> + * Permission is hereby granted, free of charge, to any person 
>>>>> obtaining a
>>>>> + * copy of this software and associated documentation files (the 
>>>>> "Software"),
>>>>> + * to deal in the Software without restriction, including without 
>>>>> limitation
>>>>> + * the rights to use, copy, modify, merge, publish, distribute, 
>>>>> sublicense,
>>>>> + * and/or sell copies of the Software, and to permit persons to 
>>>>> whom the
>>>>> + * Software is furnished to do so, subject to the following 
>>>>> conditions:
>>>>> + *
>>>>> + * The above copyright notice and this permission notice shall be 
>>>>> included in
>>>>> + * all copies or substantial portions of the Software.
>>>>> + *
>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY 
>>>>> KIND, EXPRESS OR
>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>>>>> MERCHANTABILITY,
>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO 
>>>>> EVENT SHALL
>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, 
>>>>> DAMAGES OR
>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
>>>>> OTHERWISE,
>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE 
>>>>> USE OR
>>>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>>>> + *
>>>>> + */
>>>>> +
>>>>> +#ifndef _AMDGPU_WORKLOAD_H_
>>>>> +#define _AMDGPU_WORKLOAD_H_
>>>>> +
>>>>> +struct amdgpu_smu_workload {
>>>>> +    struct amdgpu_device    *adev;
>>>>> +    struct mutex        workload_lock;
>>>>> +    struct delayed_work    smu_delayed_work;
>>>>
>>>> call it power_profile_work instead ? Looks good otherwise.
>>>>
>>> Noted.
>>>
>>> Thank you
>>>
>>> ~Arvind
>>>
>>>> - Shashank
>>>>
>>>>> +    uint32_t submit_workload_status;
>>>>> +    bool            initialized;
>>>>> +    atomic_t power_profile_ref[PP_SMC_POWER_PROFILE_COUNT];
>>>>> +};
>>>>> +
>>>>> +/* Workload mode names */
>>>>> +static const char * const amdgpu_workload_mode_name[] = {
>>>>> +    "Default",
>>>>> +    "3D",
>>>>> +    "Powersaving",
>>>>> +    "Video",
>>>>> +    "VR",
>>>>> +    "Compute",
>>>>> +    "Custom",
>>>>> +    "Window3D"
>>>>> +};
>>>>> +
>>>>> +void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>>>> +
>>>>> +void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>>>>> +
>>>>> +#endif

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-21 13:39   ` Shashank Sharma
@ 2023-08-21 14:40     ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 14:40 UTC (permalink / raw)
  To: Shashank Sharma, Arvind Yadav, Christian.Koenig,
	alexander.deucher, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: dri-devel, linux-kernel


On 8/21/2023 7:09 PM, Shashank Sharma wrote:
>
> On 21/08/2023 08:47, Arvind Yadav wrote:
>> This patch adds a function which will clear the GPU
>> power profile after the job is finished.
>>
>> This is how it works:
>> - The scheduler will set the GPU power profile based on ring_type.
>> - The scheduler will clear the GPU power profile once the job is finished.
>> - Here, the *_workload_profile_set function will set the GPU
>>    power profile and the *_workload_profile_put function will
>>    schedule the smu_delayed_work task after a 100 ms delay. This
>>    smu_delayed_work task will clear the GPU power profile if no
>>    new job is scheduled within 100 ms. But if a new job comes
>>    in within 100 ms, the *_workload_profile_set function will
>>    cancel this work and set the GPU power profile based on
>>    preferences.
>>
>> v2:
>> - Splitting workload_profile_set and workload_profile_put
>>    into two separate patches.
>> - Addressed review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>   2 files changed, 100 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index e661cc5b3d92..6367eb88a44d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -24,6 +24,9 @@
>>     #include "amdgpu.h"
>> +/* 100 millisecond timeout */
>> +#define SMU_IDLE_TIMEOUT    msecs_to_jiffies(100)
>> +
>>   static enum PP_SMC_POWER_PROFILE
>>   ring_to_power_profile(uint32_t ring_type)
>>   {
>> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device *adev,
>>       return ret;
>>   }
>>   +static int
>> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
>> +               enum PP_SMC_POWER_PROFILE profile)
>> +{
>> +    int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
>> +
>> +    if (!ret) {
>> +        /* Clear the bit for the submitted workload profile */
>> +        adev->smu_workload.submit_workload_status &= ~(1 << profile);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static void
>> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
>> +{
>> +
>> +    struct amdgpu_smu_workload *workload = container_of(work,
>> +                              struct amdgpu_smu_workload,
>> +                              smu_delayed_work.work);
>> +    struct amdgpu_device *adev = workload->adev;
>> +    bool reschedule = false;
>> +    int index  = fls(workload->submit_workload_status);
>> +    int ret;
>> +
> We should check the validity and range of index here before using 
> it below.
Noted.
>
>> +    mutex_lock(&workload->workload_lock);
>> +    for (; index > 0; index--) {
>> +        int val = atomic_read(&workload->power_profile_ref[index]);
>> +
>> +        if (val) {
>> +            reschedule = true;
>> +        } else {
>> +            if (workload->submit_workload_status &
>> +                (1 << index)) {
>> +                ret = amdgpu_power_profile_clear(adev, index);
>> +                if (ret) {
>> +                    DRM_WARN("Failed to clear workload %s, error = %d\n",
>> +                         amdgpu_workload_mode_name[index], ret);
>> +                    goto exit;
> instead of exiting, we might wanna continue the loop here, just to 
> check if we are able to reset another profile in the next attempt.
Noted.
>> +                }
>> +            }
>> +        }
>> +    }
> A blank line recommended here.
Noted.
>> +    if (reschedule)
>> +        schedule_delayed_work(&workload->smu_delayed_work,
>> +                      SMU_IDLE_TIMEOUT);
>> +exit:
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>> +                 uint32_t ring_type)
>> +{
>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +    enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
>> +
>> +    if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>> +        return;
>> +
>> +    mutex_lock(&workload->workload_lock);
>> +
>> +    if (!atomic_read(&workload->power_profile_ref[profile])) {
>> +        DRM_WARN("Power profile %s ref. count error\n",
>> +             amdgpu_workload_mode_name[profile]);
>> +    } else {
>> +        atomic_dec(&workload->power_profile_ref[profile]);
>> +        schedule_delayed_work(&workload->smu_delayed_work,
>> +                      SMU_IDLE_TIMEOUT);
> We don't want to schedule this work every time a power profile is put, 
> but we want to do that only when a power profile's ref count reaches 
> '0'. So you might want to check the ref_count, and schedule the work 
> under an if (!ref_count) condition.
Noted.
>
>> +    }
>> +
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>                    uint32_t ring_type)
>>   {
>> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>           return;
>>         mutex_lock(&workload->workload_lock);
>> +    cancel_delayed_work_sync(&workload->smu_delayed_work);
>>         ret = amdgpu_power_profile_set(adev, profile);
>>       if (ret) {
>>           DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>>                amdgpu_workload_mode_name[profile], ret);
>> +        goto exit;
>> +    }
>> +
>> +    /* Clear the already finished jobs of higher power profile*/
>
> We are not clearing the jobs here, but their power profiles.
>
> I would recommend a little rework in the comment like "As we cancelled 
> the delayed work, check and clear the pending higher power profiles 
> set by previous jobs which are done now"
>
Noted.
>> +    for (int index = fls(workload->submit_workload_status);
> The index can be initialized above, like the put function for loop.
>> +         index > profile; index--) {
>> +        if (!atomic_read(&workload->power_profile_ref[index]) &&
>> +            workload->submit_workload_status & (1 << index)) {
>> +            ret = amdgpu_power_profile_clear(adev, index);
> After clearing the power profile, we should also clear the respective 
> workload->submit_workload_status bit as well, right ?
We are clearing the submit_workload_status bit in 
amdgpu_power_profile_clear().
>> +            if (ret) {
>> +                DRM_WARN("Failed to clear workload %s, err = %d\n",
>> +                     amdgpu_workload_mode_name[profile], ret);
>> +                goto exit;
>
> Same as previous about continuing the loop.

Noted.

Thank You,
~Arvind

>
> - Shashank
>
>> +            }
>> +        }
>>       }
>>   +exit:
>>       mutex_unlock(&workload->workload_lock);
>>   }
>>   @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>       adev->smu_workload.initialized = true;
>>         mutex_init(&adev->smu_workload.workload_lock);
>> +    INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
>> +              amdgpu_power_profile_idle_work_handler);
>>   }
>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>       if (!adev->smu_workload.initialized)
>>           return;
>> +    cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>>       adev->smu_workload.submit_workload_status = 0;
>>       adev->smu_workload.initialized = false;
>>       mutex_destroy(&adev->smu_workload.workload_lock);
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index 5022f28fc2f9..ee1f87257f2d 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>>       "Window3D"
>>   };
>>   +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>> +                 uint32_t ring_type);
>> +
>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>                    uint32_t ring_type);
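As a reader's aid, the get/put/idle-work scheme reviewed in this message can be modelled in plain C. The following is a userspace sketch under stated assumptions: plain ints stand in for the kernel atomics, a flag stands in for the 100 ms delayed work, and every name is illustrative rather than the real amdgpu API. It also folds in Shashank's suggestion of arming the idle work only when a profile's refcount drops to zero:

```c
/*
 * Userspace sketch of the ref-counted scheme in this patch:
 * profile_get() cancels any pending idle work and raises a profile;
 * profile_put() drops the refcount and, only when it reaches zero,
 * arms the deferred clear (standing in for the 100 ms delayed work).
 * All names and the bitmask "hardware" are illustrative only.
 */
#include <assert.h>
#include <stdint.h>

enum profile {
	PROFILE_DEFAULT,	/* never ref-counted, like BOOTUP_DEFAULT */
	PROFILE_3D,
	PROFILE_VIDEO,
	PROFILE_COMPUTE,
	PROFILE_COUNT
};

static int ref[PROFILE_COUNT];	/* stands in for power_profile_ref[] */
static uint32_t active_mask;	/* stands in for submit_workload_status */
static int idle_work_pending;	/* "delayed work is scheduled" flag */

static void profile_get(enum profile p)
{
	if (p == PROFILE_DEFAULT)
		return;
	idle_work_pending = 0;		/* cancel_delayed_work_sync() */
	if (ref[p]++ == 0)
		active_mask |= 1u << p;	/* switch profile on */
}

static void profile_put(enum profile p)
{
	if (p == PROFILE_DEFAULT || ref[p] == 0)
		return;
	if (--ref[p] == 0)
		idle_work_pending = 1;	/* schedule_delayed_work() */
}

/* The timer "fires": clear every profile whose refcount is zero. */
static void idle_work(void)
{
	if (!idle_work_pending)
		return;
	for (int p = PROFILE_COUNT - 1; p > PROFILE_DEFAULT; p--)
		if (ref[p] == 0)
			active_mask &= ~(1u << p);	/* switch profile off */
	idle_work_pending = 0;
}
```

With two overlapping VIDEO jobs, the first put leaves the profile raised and only the second put followed by the timer clears it, which is the intended behaviour of the series.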

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
  2023-08-21 13:10   ` Shashank Sharma
@ 2023-08-21 16:22   ` Alex Deucher
  2023-08-21 17:53     ` Yadav, Arvind
  2023-08-21 18:06   ` Alex Deucher
  2023-08-22  6:25   ` Lazar, Lijo
  3 siblings, 1 reply; 39+ messages in thread
From: Alex Deucher @ 2023-08-21 16:22 UTC (permalink / raw)
  To: Arvind Yadav
  Cc: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx, linux-kernel, dri-devel

On Mon, Aug 21, 2023 at 2:55 AM Arvind Yadav <Arvind.Yadav@amd.com> wrote:
>
> This patch adds a function which will change the GPU
> power profile based on a submitted job. This can optimize
> the power performance when the workload is on.
>
> v2:
> - Splitting workload_profile_set and workload_profile_put
>   into two separate patches.
> - Addressed review comment.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>  drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>  2 files changed, 59 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index 32166f482f77..e661cc5b3d92 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -24,6 +24,62 @@
>
>  #include "amdgpu.h"
>
> +static enum PP_SMC_POWER_PROFILE
> +ring_to_power_profile(uint32_t ring_type)
> +{
> +       switch (ring_type) {
> +       case AMDGPU_RING_TYPE_GFX:
> +               return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> +       case AMDGPU_RING_TYPE_COMPUTE:
> +               return PP_SMC_POWER_PROFILE_COMPUTE;
> +       case AMDGPU_RING_TYPE_UVD:
> +       case AMDGPU_RING_TYPE_VCE:
> +       case AMDGPU_RING_TYPE_UVD_ENC:
> +       case AMDGPU_RING_TYPE_VCN_DEC:
> +       case AMDGPU_RING_TYPE_VCN_ENC:
> +       case AMDGPU_RING_TYPE_VCN_JPEG:
> +               return PP_SMC_POWER_PROFILE_VIDEO;
> +       default:
> +               return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
> +       }
> +}
> +
> +static int
> +amdgpu_power_profile_set(struct amdgpu_device *adev,
> +                        enum PP_SMC_POWER_PROFILE profile)
> +{
> +       int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
> +
> +       if (!ret) {
> +               /* Set the bit for the submitted workload profile */
> +               adev->smu_workload.submit_workload_status |= (1 << profile);
> +               atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
> +       }
> +
> +       return ret;
> +}
> +
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +                                uint32_t ring_type)
> +{
> +       struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +       enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> +       int ret;
> +
> +       if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> +               return;

Why is this one skipped?  How do we get back to the boot up profile?

Alex

> +
> +       mutex_lock(&workload->workload_lock);
> +
> +       ret = amdgpu_power_profile_set(adev, profile);
> +       if (ret) {
> +               DRM_WARN("Failed to set workload profile to %s, error = %d\n",
> +                        amdgpu_workload_mode_name[profile], ret);
> +       }
> +
> +       mutex_unlock(&workload->workload_lock);
> +}
> +
>  void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>  {
>         adev->smu_workload.adev = adev;
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index 5d0f068422d4..5022f28fc2f9 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>         "Window3D"
>  };
>
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +                                uint32_t ring_type);
> +
>  void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>
>  void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21 16:22   ` Alex Deucher
@ 2023-08-21 17:53     ` Yadav, Arvind
  2023-08-21 18:10       ` Alex Deucher
  0 siblings, 1 reply; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 17:53 UTC (permalink / raw)
  To: Alex Deucher, Arvind Yadav
  Cc: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx, linux-kernel, dri-devel


On 8/21/2023 9:52 PM, Alex Deucher wrote:
> On Mon, Aug 21, 2023 at 2:55 AM Arvind Yadav <Arvind.Yadav@amd.com> wrote:
>> This patch adds a function which will change the GPU
>> power profile based on a submitted job. This can optimize
>> the power performance when the workload is on.
>>
>> v2:
>> - Splitting workload_profile_set and workload_profile_put
>>    into two separate patches.
>> - Addressed review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>   2 files changed, 59 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index 32166f482f77..e661cc5b3d92 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -24,6 +24,62 @@
>>
>>   #include "amdgpu.h"
>>
>> +static enum PP_SMC_POWER_PROFILE
>> +ring_to_power_profile(uint32_t ring_type)
>> +{
>> +       switch (ring_type) {
>> +       case AMDGPU_RING_TYPE_GFX:
>> +               return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
>> +       case AMDGPU_RING_TYPE_COMPUTE:
>> +               return PP_SMC_POWER_PROFILE_COMPUTE;
>> +       case AMDGPU_RING_TYPE_UVD:
>> +       case AMDGPU_RING_TYPE_VCE:
>> +       case AMDGPU_RING_TYPE_UVD_ENC:
>> +       case AMDGPU_RING_TYPE_VCN_DEC:
>> +       case AMDGPU_RING_TYPE_VCN_ENC:
>> +       case AMDGPU_RING_TYPE_VCN_JPEG:
>> +               return PP_SMC_POWER_PROFILE_VIDEO;
>> +       default:
>> +               return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
>> +       }
>> +}
>> +
>> +static int
>> +amdgpu_power_profile_set(struct amdgpu_device *adev,
>> +                        enum PP_SMC_POWER_PROFILE profile)
>> +{
>> +       int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
>> +
>> +       if (!ret) {
>> +               /* Set the bit for the submitted workload profile */
>> +               adev->smu_workload.submit_workload_status |= (1 << profile);
>> +               atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>> +                                uint32_t ring_type)
>> +{
>> +       struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +       enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
>> +       int ret;
>> +
>> +       if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>> +               return;
> Why is this one skipped?  How do we get back to the boot up profile?

Hi Alex,

enum PP_SMC_POWER_PROFILE {
     PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT = 0x0,
     PP_SMC_POWER_PROFILE_FULLSCREEN3D = 0x1,
     PP_SMC_POWER_PROFILE_POWERSAVING  = 0x2,
     PP_SMC_POWER_PROFILE_VIDEO        = 0x3,
     PP_SMC_POWER_PROFILE_VR           = 0x4,
     PP_SMC_POWER_PROFILE_COMPUTE      = 0x5,
     PP_SMC_POWER_PROFILE_CUSTOM       = 0x6,
     PP_SMC_POWER_PROFILE_WINDOW3D     = 0x7,
     PP_SMC_POWER_PROFILE_COUNT,
};

These are all the profiles. We only act on profiles which are > 
PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT.
Now suppose the profile was DEFAULT and we set it to VIDEO; the SMU will 
move the profile to the higher level.
When we reset the VIDEO profile, the SMU will move back to the DEFAULT one.

Our job is to set the profile and reset it after the job is done.
The SMU will take care of moving to the higher profile and, after reset, 
it will move back to DEFAULT.

Thank You,
~Arvind

>
> Alex
>
>> +
>> +       mutex_lock(&workload->workload_lock);
>> +
>> +       ret = amdgpu_power_profile_set(adev, profile);
>> +       if (ret) {
>> +               DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>> +                        amdgpu_workload_mode_name[profile], ret);
>> +       }
>> +
>> +       mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>   {
>>          adev->smu_workload.adev = adev;
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index 5d0f068422d4..5022f28fc2f9 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>>          "Window3D"
>>   };
>>
>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>> +                                uint32_t ring_type);
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>
>>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>> --
>> 2.34.1
>>
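The point Arvind makes in this message — that the driver only enables and disables non-default profiles, while DEFAULT is simply what the SMU falls back to once nothing is enabled — can be sketched with a toy model. switch_profile() and effective_profile() are hypothetical stand-ins, not the real amdgpu_dpm_switch_power_profile() interface:

```c
/*
 * Toy model of the enable/disable semantics described above: the
 * driver never requests DEFAULT explicitly; DEFAULT is what the SMU
 * reports once every non-default profile has been disabled.
 */
#include <assert.h>
#include <stdint.h>

enum {
	PROF_DEFAULT      = 0x0,
	PROF_FULLSCREEN3D = 0x1,
	PROF_VIDEO        = 0x3,
	PROF_COMPUTE      = 0x5,
	PROF_COUNT        = 0x8,
};

static uint32_t enabled_mask;

/* Driver side: only ever enables/disables non-default profiles. */
static void switch_profile(int profile, int enable)
{
	if (enable)
		enabled_mask |= 1u << profile;
	else
		enabled_mask &= ~(1u << profile);
}

/* SMU side: runs the highest enabled profile, else falls back to DEFAULT. */
static int effective_profile(void)
{
	for (int p = PROF_COUNT - 1; p > PROF_DEFAULT; p--)
		if (enabled_mask & (1u << p))
			return p;
	return PROF_DEFAULT;
}
```

This illustrates why the early return for PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT is harmless in this model: disabling the last raised profile is what restores DEFAULT.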

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
  2023-08-21 13:10   ` Shashank Sharma
  2023-08-21 16:22   ` Alex Deucher
@ 2023-08-21 18:06   ` Alex Deucher
  2023-08-21 18:08     ` Yadav, Arvind
  2023-08-22  6:25   ` Lazar, Lijo
  3 siblings, 1 reply; 39+ messages in thread
From: Alex Deucher @ 2023-08-21 18:06 UTC (permalink / raw)
  To: Arvind Yadav
  Cc: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx, linux-kernel, dri-devel

On Mon, Aug 21, 2023 at 2:55 AM Arvind Yadav <Arvind.Yadav@amd.com> wrote:
>
> This patch adds a function which will change the GPU
> power profile based on a submitted job. This can optimize
> the power performance when the workload is on.
>
> v2:
> - Splitting workload_profile_set and workload_profile_put
>   into two separate patches.
> - Addressed review comment.
>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>  drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>  2 files changed, 59 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index 32166f482f77..e661cc5b3d92 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -24,6 +24,62 @@
>
>  #include "amdgpu.h"
>
> +static enum PP_SMC_POWER_PROFILE
> +ring_to_power_profile(uint32_t ring_type)
> +{
> +       switch (ring_type) {
> +       case AMDGPU_RING_TYPE_GFX:
> +               return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> +       case AMDGPU_RING_TYPE_COMPUTE:
> +               return PP_SMC_POWER_PROFILE_COMPUTE;
> +       case AMDGPU_RING_TYPE_UVD:
> +       case AMDGPU_RING_TYPE_VCE:
> +       case AMDGPU_RING_TYPE_UVD_ENC:
> +       case AMDGPU_RING_TYPE_VCN_DEC:
> +       case AMDGPU_RING_TYPE_VCN_ENC:
> +       case AMDGPU_RING_TYPE_VCN_JPEG:
> +               return PP_SMC_POWER_PROFILE_VIDEO;
> +       default:
> +               return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
> +       }
> +}
> +
> +static int
> +amdgpu_power_profile_set(struct amdgpu_device *adev,
> +                        enum PP_SMC_POWER_PROFILE profile)
> +{
> +       int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
> +
> +       if (!ret) {
> +               /* Set the bit for the submitted workload profile */
> +               adev->smu_workload.submit_workload_status |= (1 << profile);
> +               atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
> +       }
> +
> +       return ret;
> +}
> +
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +                                uint32_t ring_type)

Maybe rename this amdgpu_workload_profile_get() to align with put/get
naming semantics?

Alex

> +{
> +       struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +       enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> +       int ret;
> +
> +       if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> +               return;
> +
> +       mutex_lock(&workload->workload_lock);
> +
> +       ret = amdgpu_power_profile_set(adev, profile);
> +       if (ret) {
> +               DRM_WARN("Failed to set workload profile to %s, error = %d\n",
> +                        amdgpu_workload_mode_name[profile], ret);
> +       }
> +
> +       mutex_unlock(&workload->workload_lock);
> +}
> +
>  void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>  {
>         adev->smu_workload.adev = adev;
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index 5d0f068422d4..5022f28fc2f9 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>         "Window3D"
>  };
>
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +                                uint32_t ring_type);
> +
>  void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>
>  void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21 18:06   ` Alex Deucher
@ 2023-08-21 18:08     ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-21 18:08 UTC (permalink / raw)
  To: Alex Deucher, Arvind Yadav
  Cc: Christian.Koenig, alexander.deucher, shashank.sharma, Xinhui.Pan,
	airlied, daniel, Felix.Kuehling, amd-gfx, linux-kernel, dri-devel


On 8/21/2023 11:36 PM, Alex Deucher wrote:
> On Mon, Aug 21, 2023 at 2:55 AM Arvind Yadav <Arvind.Yadav@amd.com> wrote:
>> This patch adds a function which will change the GPU
>> power profile based on a submitted job. This can optimize
>> the power performance when the workload is on.
>>
>> v2:
>> - Splitting workload_profile_set and workload_profile_put
>>    into two separate patches.
>> - Addressed review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>   2 files changed, 59 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index 32166f482f77..e661cc5b3d92 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -24,6 +24,62 @@
>>
>>   #include "amdgpu.h"
>>
>> +static enum PP_SMC_POWER_PROFILE
>> +ring_to_power_profile(uint32_t ring_type)
>> +{
>> +       switch (ring_type) {
>> +       case AMDGPU_RING_TYPE_GFX:
>> +               return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
>> +       case AMDGPU_RING_TYPE_COMPUTE:
>> +               return PP_SMC_POWER_PROFILE_COMPUTE;
>> +       case AMDGPU_RING_TYPE_UVD:
>> +       case AMDGPU_RING_TYPE_VCE:
>> +       case AMDGPU_RING_TYPE_UVD_ENC:
>> +       case AMDGPU_RING_TYPE_VCN_DEC:
>> +       case AMDGPU_RING_TYPE_VCN_ENC:
>> +       case AMDGPU_RING_TYPE_VCN_JPEG:
>> +               return PP_SMC_POWER_PROFILE_VIDEO;
>> +       default:
>> +               return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
>> +       }
>> +}
>> +
>> +static int
>> +amdgpu_power_profile_set(struct amdgpu_device *adev,
>> +                        enum PP_SMC_POWER_PROFILE profile)
>> +{
>> +       int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
>> +
>> +       if (!ret) {
>> +               /* Set the bit for the submitted workload profile */
>> +               adev->smu_workload.submit_workload_status |= (1 << profile);
>> +               atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>> +                                uint32_t ring_type)
> Maybe rename this amdgpu_workload_profile_get() to align with put/get
> naming semantics?
Noted.
>
> Alex
>
>> +{
>> +       struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +       enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
>> +       int ret;
>> +
>> +       if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>> +               return;
>> +
>> +       mutex_lock(&workload->workload_lock);
>> +
>> +       ret = amdgpu_power_profile_set(adev, profile);
>> +       if (ret) {
>> +               DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>> +                        amdgpu_workload_mode_name[profile], ret);
>> +       }
>> +
>> +       mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>   {
>>          adev->smu_workload.adev = adev;
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index 5d0f068422d4..5022f28fc2f9 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>>          "Window3D"
>>   };
>>
>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>> +                                uint32_t ring_type);
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>
>>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>> --
>> 2.34.1
>>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21 17:53     ` Yadav, Arvind
@ 2023-08-21 18:10       ` Alex Deucher
  2023-08-22  6:13         ` Yadav, Arvind
  0 siblings, 1 reply; 39+ messages in thread
From: Alex Deucher @ 2023-08-21 18:10 UTC (permalink / raw)
  To: Yadav, Arvind
  Cc: Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx, linux-kernel, dri-devel

On Mon, Aug 21, 2023 at 1:54 PM Yadav, Arvind <arvyadav@amd.com> wrote:
>
>
> On 8/21/2023 9:52 PM, Alex Deucher wrote:
> > On Mon, Aug 21, 2023 at 2:55 AM Arvind Yadav <Arvind.Yadav@amd.com> wrote:
> >> This patch adds a function which will change the GPU
> >> power profile based on a submitted job. This can optimize
> >> the power performance when the workload is on.
> >>
> >> v2:
> >> - Splitting workload_profile_set and workload_profile_put
> >>    into two separate patches.
> >> - Addressed review comment.
> >>
> >> Cc: Shashank Sharma <shashank.sharma@amd.com>
> >> Cc: Christian Koenig <christian.koenig@amd.com>
> >> Cc: Alex Deucher <alexander.deucher@amd.com>
> >> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
> >>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
> >>   2 files changed, 59 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> >> index 32166f482f77..e661cc5b3d92 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> >> @@ -24,6 +24,62 @@
> >>
> >>   #include "amdgpu.h"
> >>
> >> +static enum PP_SMC_POWER_PROFILE
> >> +ring_to_power_profile(uint32_t ring_type)
> >> +{
> >> +       switch (ring_type) {
> >> +       case AMDGPU_RING_TYPE_GFX:
> >> +               return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> >> +       case AMDGPU_RING_TYPE_COMPUTE:
> >> +               return PP_SMC_POWER_PROFILE_COMPUTE;
> >> +       case AMDGPU_RING_TYPE_UVD:
> >> +       case AMDGPU_RING_TYPE_VCE:
> >> +       case AMDGPU_RING_TYPE_UVD_ENC:
> >> +       case AMDGPU_RING_TYPE_VCN_DEC:
> >> +       case AMDGPU_RING_TYPE_VCN_ENC:
> >> +       case AMDGPU_RING_TYPE_VCN_JPEG:
> >> +               return PP_SMC_POWER_PROFILE_VIDEO;
> >> +       default:
> >> +               return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
> >> +       }
> >> +}
> >> +
> >> +static int
> >> +amdgpu_power_profile_set(struct amdgpu_device *adev,
> >> +                        enum PP_SMC_POWER_PROFILE profile)
> >> +{
> >> +       int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
> >> +
> >> +       if (!ret) {
> >> +               /* Set the bit for the submitted workload profile */
> >> +               adev->smu_workload.submit_workload_status |= (1 << profile);
> >> +               atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
> >> +       }
> >> +
> >> +       return ret;
> >> +}
> >> +
> >> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> >> +                                uint32_t ring_type)
> >> +{
> >> +       struct amdgpu_smu_workload *workload = &adev->smu_workload;
> >> +       enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> >> +       int ret;
> >> +
> >> +       if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> >> +               return;
> > Why is this one skipped?  How do we get back to the boot up profile?
>
> Hi Alex,
>
> enum PP_SMC_POWER_PROFILE {
>      PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT = 0x0,
>      PP_SMC_POWER_PROFILE_FULLSCREEN3D = 0x1,
>      PP_SMC_POWER_PROFILE_POWERSAVING  = 0x2,
>      PP_SMC_POWER_PROFILE_VIDEO        = 0x3,
>      PP_SMC_POWER_PROFILE_VR           = 0x4,
>      PP_SMC_POWER_PROFILE_COMPUTE      = 0x5,
>      PP_SMC_POWER_PROFILE_CUSTOM       = 0x6,
>      PP_SMC_POWER_PROFILE_WINDOW3D     = 0x7,
>      PP_SMC_POWER_PROFILE_COUNT,
> };
>
> These are all the profiles. We only act on profiles which are >
> PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT.
> Now suppose the profile was DEFAULT and we set it to VIDEO; the SMU will
> move the profile to the higher level.
> When we reset the VIDEO profile, the SMU will move back to the DEFAULT one.
>
> Our job is to set the profile and reset it after the job is done.
> The SMU will take care of moving to the higher profile and, after reset,
> it will move back to DEFAULT.

I guess that is the part I'm missing.  How does the call to the SMU to
set the profile back to DEFAULT actually happen?  It seems that both
the put and get functions return early in this case.

Alex




>
> ThankYou,
> ~Arvind
>
> >
> > Alex
> >
> >> +
> >> +       mutex_lock(&workload->workload_lock);
> >> +
> >> +       ret = amdgpu_power_profile_set(adev, profile);
> >> +       if (ret) {
> >> +               DRM_WARN("Failed to set workload profile to %s, error = %d\n",
> >> +                        amdgpu_workload_mode_name[profile], ret);
> >> +       }
> >> +
> >> +       mutex_unlock(&workload->workload_lock);
> >> +}
> >> +
> >>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
> >>   {
> >>          adev->smu_workload.adev = adev;
> >> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> >> index 5d0f068422d4..5022f28fc2f9 100644
> >> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> >> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> >> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
> >>          "Window3D"
> >>   };
> >>
> >> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> >> +                                uint32_t ring_type);
> >> +
> >>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
> >>
> >>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
> >> --
> >> 2.34.1
> >>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-21  6:47 ` [PATCH v2 3/7] drm/amdgpu: Add new function to put " Arvind Yadav
  2023-08-21 13:39   ` Shashank Sharma
@ 2023-08-22  4:51   ` Lazar, Lijo
  2023-08-22 12:11     ` Yadav, Arvind
  1 sibling, 1 reply; 39+ messages in thread
From: Lazar, Lijo @ 2023-08-22  4:51 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel



On 8/21/2023 12:17 PM, Arvind Yadav wrote:
> This patch adds a function which will clear the GPU
> power profile after job finished.
> 
> This is how it works:
> - The scheduler will set the GPU power profile based on ring_type.
> - The scheduler will clear the GPU power profile once the job is finished.
> - Here, the *_workload_profile_set function will set the GPU
>    power profile and the *_workload_profile_put function will
>    schedule the smu_delayed_work task after 100ms delay. This
>    smu_delayed_work task will clear a GPU power profile if any
>    new jobs are not scheduled within 100 ms. But if any new job
>    comes within 100ms then the *_workload_profile_set function
>    will cancel this work and set the GPU power profile based on
>    preferences.
> 
> v2:
> - Splitting workload_profile_set and workload_profile_put
>    into two separate patches.
> - Addressed review comment.
> 
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 +++++++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>   2 files changed, 100 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index e661cc5b3d92..6367eb88a44d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -24,6 +24,9 @@
>   
>   #include "amdgpu.h"
>   
> +/* 100 millisecond timeout */
> +#define SMU_IDLE_TIMEOUT	msecs_to_jiffies(100)
> +
>   static enum PP_SMC_POWER_PROFILE
>   ring_to_power_profile(uint32_t ring_type)
>   {
> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device *adev,
>   	return ret;
>   }
>   
> +static int
> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
> +			   enum PP_SMC_POWER_PROFILE profile)
> +{
> +	int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
> +
> +	if (!ret) {
> +		/* Clear the bit for the submitted workload profile */
> +		adev->smu_workload.submit_workload_status &= ~(1 << profile);
> +	}
> +
> +	return ret;
> +}
> +
> +static void
> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
> +{
> +
> +	struct amdgpu_smu_workload *workload = container_of(work,
> +						      struct amdgpu_smu_workload,
> +						      smu_delayed_work.work);
> +	struct amdgpu_device *adev = workload->adev;
> +	bool reschedule = false;
> +	int index  = fls(workload->submit_workload_status);
> +	int ret;
> +
> +	mutex_lock(&workload->workload_lock);
> +	for (; index > 0; index--) {

Why not use for_each_set_bit?

> +		int val = atomic_read(&workload->power_profile_ref[index]);
> +
> +		if (val) {
> +			reschedule = true;

Why do you need to do reschedule? For each put(), a schedule is called. 
If refcount is not zero, that means some other job has already set the 
profile. It is supposed to call put() and at that time, this job will be 
run to clear it anyway, right?

> +		} else {
> +			if (workload->submit_workload_status &
> +			    (1 << index)) {
> +				ret = amdgpu_power_profile_clear(adev, index);
> +				if (ret) {
> +					DRM_WARN("Failed to clear workload %s,error = %d\n",
> +						 amdgpu_workload_mode_name[index], ret);
> +					goto exit;
> +				}
> +			}
> +		}
> +	}
> +	if (reschedule)
> +		schedule_delayed_work(&workload->smu_delayed_work,
> +				      SMU_IDLE_TIMEOUT);
> +exit:
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
> +				 uint32_t ring_type)
> +{
> +	struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +	enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> +
> +	if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> +		return;
> +
> +	mutex_lock(&workload->workload_lock);
> +
> +	if (!atomic_read(&workload->power_profile_ref[profile])) {
> +		DRM_WARN("Power profile %s ref. count error\n",
> +			 amdgpu_workload_mode_name[profile]);
> +	} else {
> +		atomic_dec(&workload->power_profile_ref[profile]);
> +		schedule_delayed_work(&workload->smu_delayed_work,
> +				      SMU_IDLE_TIMEOUT);
> +	}
> +
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   				 uint32_t ring_type)
>   {
> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   		return;
>   
>   	mutex_lock(&workload->workload_lock);
> +	cancel_delayed_work_sync(&workload->smu_delayed_work);

This is a potential deadlock. You already hold the mutex and then 
waiting for idle work to finish. Idle work could now be at the point 
where it is waiting for the same mutex. Suggest not to call cancel here 
and let the mutex take care of the sequence.

>   
>   	ret = amdgpu_power_profile_set(adev, profile);
>   	if (ret) {
>   		DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>   			 amdgpu_workload_mode_name[profile], ret);
> +		goto exit;
> +	}
> +
> +	/* Clear the already finished jobs of higher power profile*/
> +	for (int index = fls(workload->submit_workload_status);
> +	     index > profile; index--) {
> +		if (!atomic_read(&workload->power_profile_ref[index]) &&
> +		    workload->submit_workload_status & (1 << index)) {
> +			ret = amdgpu_power_profile_clear(adev, index);
> +			if (ret) {
> +				DRM_WARN("Failed to clear workload %s, err = %d\n",
> +					 amdgpu_workload_mode_name[profile], ret);
> +				goto exit;
> +			}
> +		}

If you follow the earlier comment, that will keep this logic only at one 
place - i.e, at idle work handler. Basically just let the idle work 
handle its duty. If some job starts running during the clear call, it's 
just unfortunate timing and let the next set() take the lock and request 
profile again.

Thanks,
Lijo

>   	}
>   
> +exit:
>   	mutex_unlock(&workload->workload_lock);
>   }
>   
> @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>   	adev->smu_workload.initialized = true;
>   
>   	mutex_init(&adev->smu_workload.workload_lock);
> +	INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
> +			  amdgpu_power_profile_idle_work_handler);
>   }
>   
>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>   	if (!adev->smu_workload.initialized)
>   		return;
>   
> +	cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>   	adev->smu_workload.submit_workload_status = 0;
>   	adev->smu_workload.initialized = false;
>   	mutex_destroy(&adev->smu_workload.workload_lock);
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index 5022f28fc2f9..ee1f87257f2d 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>   	"Window3D"
>   };
>   
> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
> +				 uint32_t ring_type);
> +
>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   				 uint32_t ring_type);
>   

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21 18:10       ` Alex Deucher
@ 2023-08-22  6:13         ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-22  6:13 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx, linux-kernel, dri-devel


On 8/21/2023 11:40 PM, Alex Deucher wrote:
> On Mon, Aug 21, 2023 at 1:54 PM Yadav, Arvind <arvyadav@amd.com> wrote:
>>
>> On 8/21/2023 9:52 PM, Alex Deucher wrote:
>>> On Mon, Aug 21, 2023 at 2:55 AM Arvind Yadav <Arvind.Yadav@amd.com> wrote:
>>>> This patch adds a function which will change the GPU
>>>> power profile based on a submitted job. This can optimize
>>>> the power performance when the workload is on.
>>>>
>>>> v2:
>>>> - Splitting workload_profile_set and workload_profile_put
>>>>     into two separate patches.
>>>> - Addressed review comment.
>>>>
>>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>>>>    drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>>>    2 files changed, 59 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> index 32166f482f77..e661cc5b3d92 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> @@ -24,6 +24,62 @@
>>>>
>>>>    #include "amdgpu.h"
>>>>
>>>> +static enum PP_SMC_POWER_PROFILE
>>>> +ring_to_power_profile(uint32_t ring_type)
>>>> +{
>>>> +       switch (ring_type) {
>>>> +       case AMDGPU_RING_TYPE_GFX:
>>>> +               return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
>>>> +       case AMDGPU_RING_TYPE_COMPUTE:
>>>> +               return PP_SMC_POWER_PROFILE_COMPUTE;
>>>> +       case AMDGPU_RING_TYPE_UVD:
>>>> +       case AMDGPU_RING_TYPE_VCE:
>>>> +       case AMDGPU_RING_TYPE_UVD_ENC:
>>>> +       case AMDGPU_RING_TYPE_VCN_DEC:
>>>> +       case AMDGPU_RING_TYPE_VCN_ENC:
>>>> +       case AMDGPU_RING_TYPE_VCN_JPEG:
>>>> +               return PP_SMC_POWER_PROFILE_VIDEO;
>>>> +       default:
>>>> +               return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
>>>> +       }
>>>> +}
>>>> +
>>>> +static int
>>>> +amdgpu_power_profile_set(struct amdgpu_device *adev,
>>>> +                        enum PP_SMC_POWER_PROFILE profile)
>>>> +{
>>>> +       int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
>>>> +
>>>> +       if (!ret) {
>>>> +               /* Set the bit for the submitted workload profile */
>>>> +               adev->smu_workload.submit_workload_status |= (1 << profile);
>>>> +               atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
>>>> +       }
>>>> +
>>>> +       return ret;
>>>> +}
>>>> +
>>>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>> +                                uint32_t ring_type)
>>>> +{
>>>> +       struct amdgpu_smu_workload *workload = &adev->smu_workload;
>>>> +       enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
>>>> +       int ret;
>>>> +
>>>> +       if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>>>> +               return;
>>> Why is this one skipped?  How do we get back to the boot up profile?
>> Hi Alex,
>>
>> enum PP_SMC_POWER_PROFILE {
>>       PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT = 0x0,
>>       PP_SMC_POWER_PROFILE_FULLSCREEN3D = 0x1,
>>       PP_SMC_POWER_PROFILE_POWERSAVING  = 0x2,
>>       PP_SMC_POWER_PROFILE_VIDEO        = 0x3,
>>       PP_SMC_POWER_PROFILE_VR           = 0x4,
>>       PP_SMC_POWER_PROFILE_COMPUTE      = 0x5,
>>       PP_SMC_POWER_PROFILE_CUSTOM       = 0x6,
>>       PP_SMC_POWER_PROFILE_WINDOW3D     = 0x7,
>>       PP_SMC_POWER_PROFILE_COUNT,
>> };
>>
>> These are all the profiles. We only use the ones which are greater than
>> PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT.
>> Now suppose the profile was DEFAULT and we set it to VIDEO, SMU will
>> move the profile to a higher level.
>> When we reset the VIDEO profile then SMU will move back to the DEFAULT one.
>>
>> Our job is to set the profile and reset it after the job is done.
>> SMU will take care to move to a higher profile and after reset, it will
>> move back to DEFAULT.
> I guess that is the part I'm missing.  How does the call to the SMU to
> set the profile back to DEFAULT actually happen?  It seems that both
> the put and get functions return early in this case.
The SMU calculates a workload for the given profile and sets it when we
call the *set and *put functions. When we call the *set function for
VIDEO, the SMU calculates the workload for VIDEO and sets it. When we
then call the *put function for the same profile, the SMU calculates a
workload which will be lower, or DEFAULT (0), and sets that.

Suppose we have called the *set function for the VIDEO profile; the SMU
will calculate the workload = 4 and set it.
Once we call the *put function for the same profile, the SMU will
calculate the workload = 0 and set it.

Please see the SMU code below, where the index will be DEFAULT (0) or
lower for the *put function.

if (!en) { // put function
         smu->workload_mask &= ~(1 << smu->workload_prority[type]);
         index = fls(smu->workload_mask);
         index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0;
         workload = smu->workload_setting[index];
} else { // set function.
         smu->workload_mask |= (1 << smu->workload_prority[type]);
         index = fls(smu->workload_mask);
         index = index <= WORKLOAD_POLICY_MAX ? index - 1 : 0;
         workload = smu->workload_setting[index];
}

In our case, the *set function will set the GPU power profile and the
*put function will schedule the smu_delayed_work task after a 100ms
delay. This smu_delayed_work task will clear the GPU power profile if no
new job is scheduled within 100ms. But if a new job comes within 100ms,
the *set function will cancel this work and set the GPU power profile
based on preferences.

Thank You
~Arvind

>
> Alex
>
>
>
>
>> ThankYou,
>> ~Arvind
>>
>>> Alex
>>>
>>>> +
>>>> +       mutex_lock(&workload->workload_lock);
>>>> +
>>>> +       ret = amdgpu_power_profile_set(adev, profile);
>>>> +       if (ret) {
>>>> +               DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>>>> +                        amdgpu_workload_mode_name[profile], ret);
>>>> +       }
>>>> +
>>>> +       mutex_unlock(&workload->workload_lock);
>>>> +}
>>>> +
>>>>    void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>>>    {
>>>>           adev->smu_workload.adev = adev;
>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> index 5d0f068422d4..5022f28fc2f9 100644
>>>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>>>>           "Window3D"
>>>>    };
>>>>
>>>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>> +                                uint32_t ring_type);
>>>> +
>>>>    void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>>>
>>>>    void amdgpu_workload_profile_fini(struct amdgpu_device *adev);
>>>> --
>>>> 2.34.1
>>>>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
                     ` (2 preceding siblings ...)
  2023-08-21 18:06   ` Alex Deucher
@ 2023-08-22  6:25   ` Lazar, Lijo
  2023-08-22 12:40     ` Yadav, Arvind
  3 siblings, 1 reply; 39+ messages in thread
From: Lazar, Lijo @ 2023-08-22  6:25 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel



On 8/21/2023 12:17 PM, Arvind Yadav wrote:
> This patch adds a function which will change the GPU
> power profile based on a submitted job. This can optimize
> the power performance when the workload is on.
> 
> v2:
> - Splitting workload_profile_set and workload_profile_put
>    into two separate patches.
> - Addressed review comment.
> 
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>   2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index 32166f482f77..e661cc5b3d92 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -24,6 +24,62 @@
>   
>   #include "amdgpu.h"
>   
> +static enum PP_SMC_POWER_PROFILE
> +ring_to_power_profile(uint32_t ring_type)
> +{
> +	switch (ring_type) {
> +	case AMDGPU_RING_TYPE_GFX:
> +		return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
> +	case AMDGPU_RING_TYPE_COMPUTE:
> +		return PP_SMC_POWER_PROFILE_COMPUTE;
> +	case AMDGPU_RING_TYPE_UVD:
> +	case AMDGPU_RING_TYPE_VCE:
> +	case AMDGPU_RING_TYPE_UVD_ENC:
> +	case AMDGPU_RING_TYPE_VCN_DEC:
> +	case AMDGPU_RING_TYPE_VCN_ENC:
> +	case AMDGPU_RING_TYPE_VCN_JPEG:
> +		return PP_SMC_POWER_PROFILE_VIDEO;
> +	default:
> +		return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
> +	}
> +}
> +
> +static int
> +amdgpu_power_profile_set(struct amdgpu_device *adev,
> +			 enum PP_SMC_POWER_PROFILE profile)
> +{
> +	int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
> +

You don't need to interact with FW for every set() call. Only send the 
message if workload_status doesn't have the profile set or refcount is 
zero. Otherwise, only need to increment the refcount.

Thanks,
Lijo

> +	if (!ret) {
> +		/* Set the bit for the submitted workload profile */
> +		adev->smu_workload.submit_workload_status |= (1 << profile);
> +		atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
> +	}
> +
> +	return ret;
> +}
> +
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +				 uint32_t ring_type)
> +{
> +	struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +	enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
> +	int ret;
> +
> +	if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
> +		return;
> +
> +	mutex_lock(&workload->workload_lock);
> +
> +	ret = amdgpu_power_profile_set(adev, profile);
> +	if (ret) {
> +		DRM_WARN("Failed to set workload profile to %s, error = %d\n",
> +			 amdgpu_workload_mode_name[profile], ret);
> +	}
> +
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>   {
>   	adev->smu_workload.adev = adev;
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index 5d0f068422d4..5022f28fc2f9 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -46,6 +46,9 @@ static const char * const amdgpu_workload_mode_name[] = {
>   	"Window3D"
>   };
>   
> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
> +				 uint32_t ring_type);
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>   
>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-21  6:47 ` [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the " Arvind Yadav
  2023-08-21 13:43   ` Shashank Sharma
@ 2023-08-22  6:31   ` Lazar, Lijo
  2023-08-22 12:22     ` Yadav, Arvind
  1 sibling, 1 reply; 39+ messages in thread
From: Lazar, Lijo @ 2023-08-22  6:31 UTC (permalink / raw)
  To: Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel



On 8/21/2023 12:17 PM, Arvind Yadav wrote:
> This patch adds a suspend function that will clear the GPU
> power profile before going into suspend state.
> 
> v2:
> - Add the new suspend function based on review comment.
> 
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 +++++++++++++++++++
>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
>   3 files changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index cd3bf641b630..3b70e657b439 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
>   
>   	amdgpu_ras_suspend(adev);
>   
> +	amdgpu_workload_profile_suspend(adev);
> +
>   	amdgpu_device_ip_suspend_phase1(adev);
>   
>   	if (!adev->in_s0ix)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> index 6367eb88a44d..44ca8e986984 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
> @@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   	mutex_unlock(&workload->workload_lock);
>   }
>   
> +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
> +{
> +	struct amdgpu_smu_workload *workload = &adev->smu_workload;
> +	int ret;
> +
> +	mutex_lock(&workload->workload_lock);
> +	cancel_delayed_work_sync(&workload->smu_delayed_work);

Another deadlock candidate. Between fini() and suspend(), the only 
difference probably could be initialization status. If so, just use a 
helper that is used during fini() and suspend().

Thanks,
Lijo

> +
> +	/* Clear all the set GPU power profile*/
> +	for (int index = fls(workload->submit_workload_status);
> +	     index > 0; index--) {
> +		if (workload->submit_workload_status & (1 << index)) {
> +			atomic_set(&workload->power_profile_ref[index], 0);
> +			ret = amdgpu_power_profile_clear(adev, index);
> +			if (ret)
> +				DRM_WARN("Failed to clear power profile %s, err = %d\n",
> +					 amdgpu_workload_mode_name[index], ret);
> +		}
> +	}
> +	workload->submit_workload_status = 0;
> +	mutex_unlock(&workload->workload_lock);
> +}
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>   {
>   	adev->smu_workload.adev = adev;
> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> index ee1f87257f2d..0acd8769ec52 100644
> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
> @@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>   				 uint32_t ring_type);
>   
> +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
> +
>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>   
>   void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-22  4:51   ` Lazar, Lijo
@ 2023-08-22 12:11     ` Yadav, Arvind
  2023-08-22 12:46       ` Lazar, Lijo
  0 siblings, 1 reply; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-22 12:11 UTC (permalink / raw)
  To: Lazar, Lijo, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel

Hi Lijo,

The *_set function will set the GPU power profile and the *_put function
will schedule the smu_delayed_work task after a 100ms delay. This
smu_delayed_work task will clear the GPU power profile if no new job is
scheduled within 100ms. But if a new job comes within 100ms, the
*_workload_profile_set function will cancel this work and set the GPU
power profile based on preferences.

Please see the cases below.

Case 1 - only jobs of the same profile run. It takes 100ms to clear the
profile once all jobs complete.

                        wl = VIDEO                <100ms>
workload       ________|`````````````````````````````````|____

Jobs (VIDEO)   ________|```|__|```|___|````|___________


Case 2 - two jobs of two different profiles. The job1 profile is set,
but when job2 arrives the profile is moved to the higher one.

                  wl = VIDEO   ->   wl = COMPUTE            <100ms>
workload       ___|````````````````````````````````````````````|____

Jobs (VIDEO)   ___|```|__|```|___|````|___|````|_______

Jobs (COMPUTE) ______________|```|___|````|___|````|_________


Case 3 - two jobs of two different profiles. The job1 (compute) profile
is set, but when job2 arrives the profile is not moved to the lower one.
Only when the compute job completes does it move to the lower profile.

                   wl = COMPUTE     ->      wl = VIDEO  <100ms>
workload       _________|``````````````````````````````````````|____

Jobs (COMPUTE) ____|```|__|```|___|````|___|````|_______

Jobs (VIDEO)   ___________________|```|___|````|___|````|___|````|___

On 8/22/2023 10:21 AM, Lazar, Lijo wrote:
>
>
> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>> This patch adds a function which will clear the GPU
>> power profile after job finished.
>>
>> This is how it works:
>> - The scheduler will set the GPU power profile based on ring_type.
>> - The scheduler will clear the GPU power profile once the job is finished.
>> - Here, the *_workload_profile_set function will set the GPU
>>    power profile and the *_workload_profile_put function will
>>    schedule the smu_delayed_work task after 100ms delay. This
>>    smu_delayed_work task will clear a GPU power profile if any
>>    new jobs are not scheduled within 100 ms. But if any new job
>>    comes within 100ms then the *_workload_profile_set function
>>    will cancel this work and set the GPU power profile based on
>>    preferences.
>>
>> v2:
>> - Splitting workload_profile_set and workload_profile_put
>>    into two separate patches.
>> - Addressed review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>   2 files changed, 100 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index e661cc5b3d92..6367eb88a44d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -24,6 +24,9 @@
>>     #include "amdgpu.h"
>> +/* 100 millisecond timeout */
>> +#define SMU_IDLE_TIMEOUT    msecs_to_jiffies(100)
>> +
>>   static enum PP_SMC_POWER_PROFILE
>>   ring_to_power_profile(uint32_t ring_type)
>>   {
>> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device *adev,
>>       return ret;
>>   }
>>   +static int
>> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
>> +               enum PP_SMC_POWER_PROFILE profile)
>> +{
>> +    int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
>> +
>> +    if (!ret) {
>> +        /* Clear the bit for the submitted workload profile */
>> +        adev->smu_workload.submit_workload_status &= ~(1 << profile);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static void
>> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
>> +{
>> +
>> +    struct amdgpu_smu_workload *workload = container_of(work,
>> +                              struct amdgpu_smu_workload,
>> +                              smu_delayed_work.work);
>> +    struct amdgpu_device *adev = workload->adev;
>> +    bool reschedule = false;
>> +    int index  = fls(workload->submit_workload_status);
>> +    int ret;
>> +
>> +    mutex_lock(&workload->workload_lock);
>> +    for (; index > 0; index--) {
>
> Why not use for_each_set_bit?

We only clear the profiles which we have set. We clear the higher
profile first, then the lower ones.


>
>> +        int val = atomic_read(&workload->power_profile_ref[index]);
>> +
>> +        if (val) {
>> +            reschedule = true;
>
> Why do you need to do reschedule? For each put(), a schedule is 
> called. If refcount is not zero, that means some other job has already 
> set the profile. It is supposed to call put() and at that time, this 
> job will be run to clear it anyway, right?
>
Yes, I have already got a comment for this; I am going to remove it.
Noted.

>> +        } else {
>> +            if (workload->submit_workload_status &
>> +                (1 << index)) {
>> +                ret = amdgpu_power_profile_clear(adev, index);
>> +                if (ret) {
>> +                    DRM_WARN("Failed to clear workload %s,error = %d\n",
>> +                         amdgpu_workload_mode_name[index], ret);
>> +                    goto exit;
>> +                }
>> +            }
>> +        }
>> +    }
>> +    if (reschedule)
>> +        schedule_delayed_work(&workload->smu_delayed_work,
>> +                      SMU_IDLE_TIMEOUT);
>> +exit:
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>> +                 uint32_t ring_type)
>> +{
>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +    enum PP_SMC_POWER_PROFILE profile = ring_to_power_profile(ring_type);
>> +
>> +    if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>> +        return;
>> +
>> +    mutex_lock(&workload->workload_lock);
>> +
>> +    if (!atomic_read(&workload->power_profile_ref[profile])) {
>> +        DRM_WARN("Power profile %s ref. count error\n",
>> +             amdgpu_workload_mode_name[profile]);
>> +    } else {
>> + atomic_dec(&workload->power_profile_ref[profile]);
>> + schedule_delayed_work(&workload->smu_delayed_work,
>> +                      SMU_IDLE_TIMEOUT);
>> +    }
>> +
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>                    uint32_t ring_type)
>>   {
>> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct 
>> amdgpu_device *adev,
>>           return;
>>         mutex_lock(&workload->workload_lock);
>> + cancel_delayed_work_sync(&workload->smu_delayed_work);
>
> This is a potential deadlock. You already hold the mutex and then 
> waiting for idle work to finish. Idle work could now be at the point 
> where it is waiting for the same mutex. Suggest not to call cancel 
> here and let the mutex take care of the sequence.
We cannot cancel the idle work if it is already running, so we have to 
wait until it completes. If the *_put() function arrived before the 
idle work has started, we can cancel it; but if the work thread is 
already running, we should wait.
>
>>         ret = amdgpu_power_profile_set(adev, profile);
>>       if (ret) {
>>           DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>>                amdgpu_workload_mode_name[profile], ret);
>> +        goto exit;
>> +    }
>> +
>> +    /* Clear the already finished jobs of higher power profile*/
>> +    for (int index = fls(workload->submit_workload_status);
>> +         index > profile; index--) {
>> +        if (!atomic_read(&workload->power_profile_ref[index]) &&
>> +            workload->submit_workload_status & (1 << index)) {
>> +            ret = amdgpu_power_profile_clear(adev, index);
>> +            if (ret) {
>> +                DRM_WARN("Failed to clear workload %s, err = %d\n",
>> +                     amdgpu_workload_mode_name[profile], ret);
>> +                goto exit;
>> +            }
>> +        }
>
> If you follow the earlier comment, that will keep this logic only at 
> one place - i.e, at idle work handler. Basically just let the idle 
> work handle its duty. If some job starts running during the clear 
> call, it's just unfortunate timing and let the next set() take the 
> lock and request profile again.

So basically, new jobs for the same or different profiles arrive and 
complete every millisecond. Suppose we are running higher-profile jobs 
and a lower-profile job arrives before they complete; this check helps 
move from the higher profile to the lower one once the higher-profile 
jobs finish. Without this check, we would be stuck on the higher 
profile, and until then the other jobs would not complete either. 
Please refer to the case 3 scenario.


> Thanks,
> Lijo
>
>>       }
>>   +exit:
>>       mutex_unlock(&workload->workload_lock);
>>   }
>>   @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct 
>> amdgpu_device *adev)
>>       adev->smu_workload.initialized = true;
>>         mutex_init(&adev->smu_workload.workload_lock);
>> + INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
>> +              amdgpu_power_profile_idle_work_handler);
>>   }
>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct 
>> amdgpu_device *adev)
>>       if (!adev->smu_workload.initialized)
>>           return;
>>   + cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>>       adev->smu_workload.submit_workload_status = 0;
>>       adev->smu_workload.initialized = false;
>>       mutex_destroy(&adev->smu_workload.workload_lock);
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index 5022f28fc2f9..ee1f87257f2d 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -46,6 +46,9 @@ static const char * const 
>> amdgpu_workload_mode_name[] = {
>>       "Window3D"
>>   };
>>   +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>> +                 uint32_t ring_type);
>> +
>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>                    uint32_t ring_type);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-22  6:31   ` Lazar, Lijo
@ 2023-08-22 12:22     ` Yadav, Arvind
  2023-08-22 12:54       ` Lazar, Lijo
  0 siblings, 1 reply; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-22 12:22 UTC (permalink / raw)
  To: Lazar, Lijo, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel


On 8/22/2023 12:01 PM, Lazar, Lijo wrote:
>
>
> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>> This patch adds a suspend function that will clear the GPU
>> power profile before going into suspend state.
>>
>> v2:
>> - Add the new suspend function based on review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
>>   3 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index cd3bf641b630..3b70e657b439 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device 
>> *dev, bool fbcon)
>>         amdgpu_ras_suspend(adev);
>>   +    amdgpu_workload_profile_suspend(adev);
>> +
>>       amdgpu_device_ip_suspend_phase1(adev);
>>         if (!adev->in_s0ix)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index 6367eb88a44d..44ca8e986984 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct 
>> amdgpu_device *adev,
>>       mutex_unlock(&workload->workload_lock);
>>   }
>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
>> +{
>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +    int ret;
>> +
>> +    mutex_lock(&workload->workload_lock);
>> + cancel_delayed_work_sync(&workload->smu_delayed_work);
>
> Another deadlock candidate. Between fini() and suspend(), the only 
> difference probably could be initialization status. If so, just use a 
> helper that is used during fini() and suspend().
>
Before going into suspend(), we need to cancel the work and clear all 
the profiles, but in fini() we are also destroying the mutex; fini() is 
called when we are unloading everything.

~Arvind

> Thanks,
> Lijo
>
>> +
>> +    /* Clear all the set GPU power profile*/
>> +    for (int index = fls(workload->submit_workload_status);
>> +         index > 0; index--) {
>> +        if (workload->submit_workload_status & (1 << index)) {
>> + atomic_set(&workload->power_profile_ref[index], 0);
>> +            ret = amdgpu_power_profile_clear(adev, index);
>> +            if (ret)
>> +                DRM_WARN("Failed to clear power profile %s, err = 
>> %d\n",
>> +                     amdgpu_workload_mode_name[index], ret);
>> +        }
>> +    }
>> +    workload->submit_workload_status = 0;
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>   {
>>       adev->smu_workload.adev = adev;
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index ee1f87257f2d..0acd8769ec52 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct 
>> amdgpu_device *adev,
>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>                    uint32_t ring_type);
>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile
  2023-08-22  6:25   ` Lazar, Lijo
@ 2023-08-22 12:40     ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-22 12:40 UTC (permalink / raw)
  To: Lazar, Lijo, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel


On 8/22/2023 11:55 AM, Lazar, Lijo wrote:
>
>
> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>> This patch adds a function which will change the GPU
>> power profile based on a submitted job. This can optimize
>> the power performance when the workload is on.
>>
>> v2:
>> - Splitting workload_profile_set and workload_profile_put
>>    into two separate patches.
>> - Addressed review comment.
>>
>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>> Cc: Christian Koenig <christian.koenig@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 56 +++++++++++++++++++
>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>   2 files changed, 59 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> index 32166f482f77..e661cc5b3d92 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>> @@ -24,6 +24,62 @@
>>     #include "amdgpu.h"
>>   +static enum PP_SMC_POWER_PROFILE
>> +ring_to_power_profile(uint32_t ring_type)
>> +{
>> +    switch (ring_type) {
>> +    case AMDGPU_RING_TYPE_GFX:
>> +        return PP_SMC_POWER_PROFILE_FULLSCREEN3D;
>> +    case AMDGPU_RING_TYPE_COMPUTE:
>> +        return PP_SMC_POWER_PROFILE_COMPUTE;
>> +    case AMDGPU_RING_TYPE_UVD:
>> +    case AMDGPU_RING_TYPE_VCE:
>> +    case AMDGPU_RING_TYPE_UVD_ENC:
>> +    case AMDGPU_RING_TYPE_VCN_DEC:
>> +    case AMDGPU_RING_TYPE_VCN_ENC:
>> +    case AMDGPU_RING_TYPE_VCN_JPEG:
>> +        return PP_SMC_POWER_PROFILE_VIDEO;
>> +    default:
>> +        return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
>> +    }
>> +}
>> +
>> +static int
>> +amdgpu_power_profile_set(struct amdgpu_device *adev,
>> +             enum PP_SMC_POWER_PROFILE profile)
>> +{
>> +    int ret = amdgpu_dpm_switch_power_profile(adev, profile, true);
>> +
>
> You don't need to interact with FW for every set() call. Only send the 
> message if workload_status doesn't have the profile set or refcount is 
> zero. Otherwise, only need to increment the refcount.
Noted.
Thank You,
~Arvind
>
> Thanks,
> Lijo
>
>> +    if (!ret) {
>> +        /* Set the bit for the submitted workload profile */
>> +        adev->smu_workload.submit_workload_status |= (1 << profile);
>> + atomic_inc(&adev->smu_workload.power_profile_ref[profile]);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>> +                 uint32_t ring_type)
>> +{
>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>> +    enum PP_SMC_POWER_PROFILE profile = 
>> ring_to_power_profile(ring_type);
>> +    int ret;
>> +
>> +    if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>> +        return;
>> +
>> +    mutex_lock(&workload->workload_lock);
>> +
>> +    ret = amdgpu_power_profile_set(adev, profile);
>> +    if (ret) {
>> +        DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>> +             amdgpu_workload_mode_name[profile], ret);
>> +    }
>> +
>> +    mutex_unlock(&workload->workload_lock);
>> +}
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>   {
>>       adev->smu_workload.adev = adev;
>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> index 5d0f068422d4..5022f28fc2f9 100644
>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>> @@ -46,6 +46,9 @@ static const char * const 
>> amdgpu_workload_mode_name[] = {
>>       "Window3D"
>>   };
>>   +void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>> +                 uint32_t ring_type);
>> +
>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-22 12:11     ` Yadav, Arvind
@ 2023-08-22 12:46       ` Lazar, Lijo
  2023-08-25 11:18         ` Yadav, Arvind
  0 siblings, 1 reply; 39+ messages in thread
From: Lazar, Lijo @ 2023-08-22 12:46 UTC (permalink / raw)
  To: Yadav, Arvind, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel



On 8/22/2023 5:41 PM, Yadav, Arvind wrote:
> Hi Lijo,
> 
> The *_set function will set the GPU power profile, and the *_put function 
> will schedule the smu_delayed_work task after a 100 ms delay. This 
> smu_delayed_work task will clear a GPU power profile if no new jobs are 
> scheduled within 100 ms. But if a new job comes within 100 ms, then the 
> *_workload_profile_set function will cancel this work and set the GPU 
> power profile based on preferences.
> 
> Please see the below case.
> 
> Case 1 - only same-profile jobs run. It will take 100 ms to clear the 
> profile once all jobs complete.
> 
>                                             wl = VIDEO <100ms>
> workload     _________|`````````````````````````````````````|____
> 
> Jobs (VIDEO) ________|```|__|```|___|````|___________
> 
> 
> Case 2 - two jobs with two different profiles. The job1 profile will be 
> set, but when job2 arrives it will be moved to the higher profile.
> 
>                          wl = VIDEO  ->    wl = COMPUTE   <100ms>
> workload       ___|``````````````````````````````````````````````````````````````````|____
> 
> Jobs (VIDEO)   ___|```|__|```|___|````|___|````|_______
> 
> Jobs (COMPUTE) ______________|```|___|````|___|````|_________
> 
> 
> Case 3 - two jobs with two different profiles. The job1 profile will be 
> set, but when job2 arrives it will not be moved to the lower profile. 
> Only when compute job2 completes will it move to the lower profile.
> 
>                              wl = COMPUTE ->          wl = VIDEO  <100ms>
> workload       _________|``````````````````````````````````````````````````````````````````|____
> 
> Jobs (COMPUTE)    ____|```|__|```|___|````|___|````|_______
> 
> Jobs (VIDEO)   ___________________|```|___|````|___|````|___|````|___________
> 

The swsmu layer maintains a workload mask based on priority. So once 
you have set the mask, until you unset it (i.e., when refcount = 0), 
the mask will stay set in the lower layer. The swsmu layer will take 
care of requesting the highest-priority profile from the FW. I don't 
think that needs to be repeated at this level.

At this layer, all you need is to refcount the requests and make the 
request.

When refcount of a profile becomes non-zero (only one-time), place one 
request for that profile. As swsmu layer maintains the workload mask, it 
will take the new profile also into consideration while requesting for 
the one  with the highest priority.

When refcount of a profile becomes zero, place a request to clear it. 
This is controlled by your idle work. As I see, it keeps an additional 
100ms tolerance before placing a clear request. In that way, there is no 
need to cancel that work.

Inside idle work handler -
Loop through the profiles that are set and clear those profiles whose 
refcount is zero.

Thus if a job starts during the 100ms delay, idle work won't see the ref 
count as zero and then it won't place a request to clear out that profile.
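
The scheme described above can be modelled outside the kernel. The sketch below is only an illustration of the proposed refcounting; `profile_set`, `profile_put`, `idle_work`, `fw_set_calls` and `fw_clear_calls` are made-up names standing in for the driver's set/put paths and the amdgpu_dpm_switch_power_profile() calls, and the real code would hold workload_lock around each step.

```c
#include <assert.h>

enum { PROFILE_VIDEO = 1, NPROFILES = 8 };

static int ref[NPROFILES];
static unsigned int status;            /* bitmask of requested profiles */
static int fw_set_calls, fw_clear_calls;

static void profile_set(int p)
{
	if (ref[p]++ == 0) {           /* 0 -> 1: place one FW request */
		status |= 1u << p;
		fw_set_calls++;
	}
}

static void profile_put(int p)
{
	assert(ref[p] > 0);            /* mirrors the refcount warning */
	ref[p]--;                      /* the idle work does the clear */
}

static void idle_work(void)
{
	for (int p = 0; p < NPROFILES; p++) {
		if ((status & (1u << p)) && ref[p] == 0) {
			status &= ~(1u << p);  /* clear request, once */
			fw_clear_calls++;
		}
	}
}
```

With this shape, two overlapping jobs on the same profile produce exactly one FW set request, and the clear request is only placed once the idle work observes the refcount back at zero.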

> On 8/22/2023 10:21 AM, Lazar, Lijo wrote:
>>
>>
>> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>>> This patch adds a function which will clear the GPU
>>> power profile after job finished.
>>>
>>> This is how it works:
>>> - schedular will set the GPU power profile based on ring_type.
>>> - Schedular will clear the GPU Power profile once job finished.
>>> - Here, the *_workload_profile_set function will set the GPU
>>>    power profile and the *_workload_profile_put function will
>>>    schedule the smu_delayed_work task after 100ms delay. This
>>>    smu_delayed_work task will clear a GPU power profile if any
>>>    new jobs are not scheduled within 100 ms. But if any new job
>>>    comes within 100ms then the *_workload_profile_set function
>>>    will cancel this work and set the GPU power profile based on
>>>    preferences.
>>>
>>> v2:
>>> - Splitting workload_profile_set and workload_profile_put
>>>    into two separate patches.
>>> - Addressed review comment.
>>>
>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 +++++++++++++++++++
>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>>   2 files changed, 100 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> index e661cc5b3d92..6367eb88a44d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> @@ -24,6 +24,9 @@
>>>     #include "amdgpu.h"
>>>   +/* 100 millsecond timeout */
>>> +#define SMU_IDLE_TIMEOUT    msecs_to_jiffies(100)
>>> +
>>>   static enum PP_SMC_POWER_PROFILE
>>>   ring_to_power_profile(uint32_t ring_type)
>>>   {
>>> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device *adev,
>>>       return ret;
>>>   }
>>>   +static int
>>> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
>>> +               enum PP_SMC_POWER_PROFILE profile)
>>> +{
>>> +    int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
>>> +
>>> +    if (!ret) {
>>> +        /* Clear the bit for the submitted workload profile */
>>> +        adev->smu_workload.submit_workload_status &= ~(1 << profile);
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +static void
>>> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
>>> +{
>>> +
>>> +    struct amdgpu_smu_workload *workload = container_of(work,
>>> +                              struct amdgpu_smu_workload,
>>> +                              smu_delayed_work.work);
>>> +    struct amdgpu_device *adev = workload->adev;
>>> +    bool reschedule = false;
>>> +    int index  = fls(workload->submit_workload_status);
>>> +    int ret;
>>> +
>>> +    mutex_lock(&workload->workload_lock);
>>> +    for (; index > 0; index--) {
>>
>> Why not use for_each_set_bit?
> 
> We only clear the profiles that we have set. We clear the higher 
> profile first, then the lower ones.
> 

You don't need to take care of this. The swsmu layer will take care of 
the priority; it is not the job of this layer. swsmu is the layer that 
could be altered specifically for each SOC, and it can take care of any 
priority changes accordingly. This layer only needs to refcount the 
requests and place them accordingly.

> 
>>
>>> +        int val = atomic_read(&workload->power_profile_ref[index]);
>>> +
>>> +        if (val) {
>>> +            reschedule = true;
>>
>> Why do you need to do reschedule? For each put(), a schedule is 
>> called. If refcount is not zero, that means some other job has already 
>> set the profile. It is supposed to call put() and at that time, this 
>> job will be run to clear it anyway, right?
>>
> Yes, I have already got a comment about this; I am going to remove it.
> Noted.
> 
>>> +        } else {
>>> +            if (workload->submit_workload_status &
>>> +                (1 << index)) {
>>> +                ret = amdgpu_power_profile_clear(adev, index);
>>> +                if (ret) {
>>> +                    DRM_WARN("Failed to clear workload %s,error = 
>>> %d\n",
>>> +                         amdgpu_workload_mode_name[index], ret);
>>> +                    goto exit;
>>> +                }
>>> +            }
>>> +        }
>>> +    }
>>> +    if (reschedule)
>>> + schedule_delayed_work(&workload->smu_delayed_work,
>>> +                      SMU_IDLE_TIMEOUT);
>>> +exit:
>>> +    mutex_unlock(&workload->workload_lock);
>>> +}
>>> +
>>> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>>> +                 uint32_t ring_type)
>>> +{
>>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>>> +    enum PP_SMC_POWER_PROFILE profile = 
>>> ring_to_power_profile(ring_type);
>>> +
>>> +    if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>>> +        return;
>>> +
>>> +    mutex_lock(&workload->workload_lock);
>>> +
>>> +    if (!atomic_read(&workload->power_profile_ref[profile])) {
>>> +        DRM_WARN("Power profile %s ref. count error\n",
>>> +             amdgpu_workload_mode_name[profile]);
>>> +    } else {
>>> + atomic_dec(&workload->power_profile_ref[profile]);
>>> + schedule_delayed_work(&workload->smu_delayed_work,
>>> +                      SMU_IDLE_TIMEOUT);
>>> +    }
>>> +
>>> +    mutex_unlock(&workload->workload_lock);
>>> +}
>>> +
>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>                    uint32_t ring_type)
>>>   {
>>> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct 
>>> amdgpu_device *adev,
>>>           return;
>>>         mutex_lock(&workload->workload_lock);
>>> + cancel_delayed_work_sync(&workload->smu_delayed_work);
>>
>> This is a potential deadlock. You already hold the mutex and then 
>> waiting for idle work to finish. Idle work could now be at the point 
>> where it is waiting for the same mutex. Suggest not to call cancel 
>> here and let the mutex take care of the sequence.
> We cannot cancel the idle work if it is already running, so we have to 
> wait until it completes. If the *_put() function arrived before the 
> idle work has started, we can cancel it; but if the work thread is 
> already running, we should wait.

No need to wait, because you already have the mutex, so you will 
naturally wait for the mutex lock to be released (if the idle work has 
already grabbed it). If a request comes in while the idle work is 
running, it is only a timing issue.

Also, you have a deadlock here. You acquire the mutex first and then 
wait for the idle work to finish. The idle work function could have 
just started at that point and reached the place where it is about to 
grab the mutex. That is a deadlock: this function is waiting for the 
idle work to finish, and the idle work is waiting to get the mutex.

Nevertheless, this function doesn't even need to take care of such 
fancy things. It only needs to grab the mutex, increment the refcount, 
and place a request if the refcount became non-zero.

Whenever the idle work runs, it will see that the refcount is not zero 
and skip placing a request to clear that profile.
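
The "no cancel needed" argument can be illustrated with a small pthread sketch. This is userspace-only, with made-up names (`job_set`, `idle_worker`): both paths contend on the same lock, and the idle work simply skips the clear when it observes a non-zero refcount, so nothing has to wait on the work while holding the lock.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int refcount;
static unsigned int status;

/* Stands in for the delayed idle work: clear only when truly idle. */
static void *idle_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	if (refcount == 0)
		status = 0;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Stands in for the set() path: just refcount under the same lock. */
static void job_set(void)
{
	pthread_mutex_lock(&lock);
	refcount++;
	status |= 1u;
	pthread_mutex_unlock(&lock);
}
```

If the job raised the refcount before the idle work got the lock, the clear is skipped; if the idle work got there first, the next set() simply requests the profile again. Either ordering is correct, which is why the cancel under the mutex is unnecessary.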

>>
>>>         ret = amdgpu_power_profile_set(adev, profile);
>>>       if (ret) {
>>>           DRM_WARN("Failed to set workload profile to %s, error = %d\n",
>>>                amdgpu_workload_mode_name[profile], ret);
>>> +        goto exit;
>>> +    }
>>> +
>>> +    /* Clear the already finished jobs of higher power profile*/
>>> +    for (int index = fls(workload->submit_workload_status);
>>> +         index > profile; index--) {
>>> +        if (!atomic_read(&workload->power_profile_ref[index]) &&
>>> +            workload->submit_workload_status & (1 << index)) {
>>> +            ret = amdgpu_power_profile_clear(adev, index);
>>> +            if (ret) {
>>> +                DRM_WARN("Failed to clear workload %s, err = %d\n",
>>> +                     amdgpu_workload_mode_name[profile], ret);
>>> +                goto exit;
>>> +            }
>>> +        }
>>
>> If you follow the earlier comment, that will keep this logic only at 
>> one place - i.e, at idle work handler. Basically just let the idle 
>> work handle its duty. If some job starts running during the clear 
>> call, it's just unfortunate timing and let the next set() take the 
>> lock and request profile again.
> 
> So basically, new jobs for the same or different profiles arrive and 
> complete every millisecond. Suppose we are running higher-profile jobs 
> and a lower-profile job arrives before they complete; this check helps 
> move from the higher profile to the lower one once the higher-profile 
> jobs finish. Without this check, we would be stuck on the higher 
> profile, and until then the other jobs would not complete either. 
> Please refer to the case 3 scenario.
> 

As mentioned before, this is not the place to take care of SOC specific 
power profile priorities. We already have swsmu layer doing that job. 
This layer just needs to do a ref count and place requests accordingly.

Thanks,
Lijo

> 
>> Thanks,
>> Lijo
>>
>>>       }
>>>   +exit:
>>>       mutex_unlock(&workload->workload_lock);
>>>   }
>>>   @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct 
>>> amdgpu_device *adev)
>>>       adev->smu_workload.initialized = true;
>>>         mutex_init(&adev->smu_workload.workload_lock);
>>> + INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
>>> +              amdgpu_power_profile_idle_work_handler);
>>>   }
>>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct 
>>> amdgpu_device *adev)
>>>       if (!adev->smu_workload.initialized)
>>>           return;
>>>   + cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>>>       adev->smu_workload.submit_workload_status = 0;
>>>       adev->smu_workload.initialized = false;
>>>       mutex_destroy(&adev->smu_workload.workload_lock);
>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> index 5022f28fc2f9..ee1f87257f2d 100644
>>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> @@ -46,6 +46,9 @@ static const char * const 
>>> amdgpu_workload_mode_name[] = {
>>>       "Window3D"
>>>   };
>>>   +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>>> +                 uint32_t ring_type);
>>> +
>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>                    uint32_t ring_type);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-22 12:22     ` Yadav, Arvind
@ 2023-08-22 12:54       ` Lazar, Lijo
  2023-08-22 12:56         ` Yadav, Arvind
  0 siblings, 1 reply; 39+ messages in thread
From: Lazar, Lijo @ 2023-08-22 12:54 UTC (permalink / raw)
  To: Yadav, Arvind, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel



On 8/22/2023 5:52 PM, Yadav, Arvind wrote:
> 
> On 8/22/2023 12:01 PM, Lazar, Lijo wrote:
>>
>>
>> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>>> This patch adds a suspend function that will clear the GPU
>>> power profile before going into suspend state.
>>>
>>> v2:
>>> - Add the new suspend function based on review comment.
>>>
>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 +++++++++++++++++++
>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
>>>   3 files changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index cd3bf641b630..3b70e657b439 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device 
>>> *dev, bool fbcon)
>>>         amdgpu_ras_suspend(adev);
>>>   +    amdgpu_workload_profile_suspend(adev);
>>> +
>>>       amdgpu_device_ip_suspend_phase1(adev);
>>>         if (!adev->in_s0ix)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> index 6367eb88a44d..44ca8e986984 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>> @@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct 
>>> amdgpu_device *adev,
>>>       mutex_unlock(&workload->workload_lock);
>>>   }
>>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
>>> +{
>>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>>> +    int ret;
>>> +
>>> +    mutex_lock(&workload->workload_lock);
>>> + cancel_delayed_work_sync(&workload->smu_delayed_work);
>>
>> Another deadlock candidate. Between fini() and suspend(), the only 
>> difference probably could be initialization status. If so, just use a 
>> helper that is used during fini() and suspend().
>>
> Before going into suspend(), we need to cancel the work and clear all 
> the profiles, but in fini() we are also destroying the mutex; fini() 
> is called when we are unloading everything.
> 

What I meant is that for both suspend and fini, you need to cancel any 
scheduled work, clear the refcounts, and set the profile back to the 
default profile. Keep this in a helper and reuse it.
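
A sketch of what such a shared helper could look like. The struct and function names below are made up for illustration; the real helper would additionally cancel the delayed work and run under workload_lock, with fini() alone going on to destroy the mutex.

```c
#include <assert.h>

/* Minimal model of a helper shared by suspend() and fini(): drop every
 * refcount and clear every profile bit that is still set. */
struct workload_state {
	unsigned int status;           /* bitmask of active profiles */
	int ref[8];
};

static void workload_profile_clear_all(struct workload_state *w)
{
	for (int p = 7; p > 0; p--) {  /* profile 0 is the bootup default */
		if (w->status & (1u << p)) {
			w->ref[p] = 0;
			w->status &= ~(1u << p);
		}
	}
}
```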

Thanks,
Lijo

> ~Arvind
> 
>> Thanks,
>> Lijo
>>
>>> +
>>> +    /* Clear all the set GPU power profile*/
>>> +    for (int index = fls(workload->submit_workload_status);
>>> +         index > 0; index--) {
>>> +        if (workload->submit_workload_status & (1 << index)) {
>>> + atomic_set(&workload->power_profile_ref[index], 0);
>>> +            ret = amdgpu_power_profile_clear(adev, index);
>>> +            if (ret)
>>> +                DRM_WARN("Failed to clear power profile %s, err = 
>>> %d\n",
>>> +                     amdgpu_workload_mode_name[index], ret);
>>> +        }
>>> +    }
>>> +    workload->submit_workload_status = 0;
>>> +    mutex_unlock(&workload->workload_lock);
>>> +}
>>> +
>>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>>   {
>>>       adev->smu_workload.adev = adev;
>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> index ee1f87257f2d..0acd8769ec52 100644
>>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>> @@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct 
>>> amdgpu_device *adev,
>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>                    uint32_t ring_type);
>>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
>>> +
>>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the GPU power profile.
  2023-08-22 12:54       ` Lazar, Lijo
@ 2023-08-22 12:56         ` Yadav, Arvind
  0 siblings, 0 replies; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-22 12:56 UTC (permalink / raw)
  To: Lazar, Lijo, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel


On 8/22/2023 6:24 PM, Lazar, Lijo wrote:
>
>
> On 8/22/2023 5:52 PM, Yadav, Arvind wrote:
>>
>> On 8/22/2023 12:01 PM, Lazar, Lijo wrote:
>>>
>>>
>>> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>>>> This patch adds a suspend function that will clear the GPU
>>>> power profile before going into suspend state.
>>>>
>>>> v2:
>>>> - Add the new suspend function based on review comment.
>>>>
>>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  2 ++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 23 
>>>> +++++++++++++++++++
>>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  2 ++
>>>>   3 files changed, 27 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index cd3bf641b630..3b70e657b439 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -4212,6 +4212,8 @@ int amdgpu_device_suspend(struct drm_device 
>>>> *dev, bool fbcon)
>>>>         amdgpu_ras_suspend(adev);
>>>>   +    amdgpu_workload_profile_suspend(adev);
>>>> +
>>>>       amdgpu_device_ip_suspend_phase1(adev);
>>>>         if (!adev->in_s0ix)
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> index 6367eb88a44d..44ca8e986984 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> @@ -174,6 +174,29 @@ void amdgpu_workload_profile_set(struct 
>>>> amdgpu_device *adev,
>>>>       mutex_unlock(&workload->workload_lock);
>>>>   }
>>>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev)
>>>> +{
>>>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>>>> +    int ret;
>>>> +
>>>> +    mutex_lock(&workload->workload_lock);
>>>> +    cancel_delayed_work_sync(&workload->smu_delayed_work);
>>>
>>> Another deadlock candidate. Between fini() and suspend(), the only 
>>> difference probably could be initialization status. If so, just use 
>>> a helper that is used during fini() and suspend().
>>>
>> Before going to suspend(), we need to cancel the work and clear all
>> the profiles, but in fini() we are destroying the mutex. Also, fini()
>> will be called when we are unloading everything.
>>
>
> What I meant is for both suspend/fini, you need to cancel any work 
> scheduled, clear refcounts and set the profile back to default 
> profile. Keep this in a helper and reuse.
>
Noted.

Thank you,
~Arvind

> Thanks,
> Lijo
>
>> ~Arvind
>>
>>> Thanks,
>>> Lijo
>>>
>>>> +
>>>> +    /* Clear all the set GPU power profiles */
>>>> +    for (int index = fls(workload->submit_workload_status);
>>>> +         index > 0; index--) {
>>>> +        if (workload->submit_workload_status & (1 << index)) {
>>>> +            atomic_set(&workload->power_profile_ref[index], 0);
>>>> +            ret = amdgpu_power_profile_clear(adev, index);
>>>> +            if (ret)
>>>> +                DRM_WARN("Failed to clear power profile %s, err = %d\n",
>>>> +                     amdgpu_workload_mode_name[index], ret);
>>>> +        }
>>>> +    }
>>>> +    workload->submit_workload_status = 0;
>>>> +    mutex_unlock(&workload->workload_lock);
>>>> +}
>>>> +
>>>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev)
>>>>   {
>>>>       adev->smu_workload.adev = adev;
>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> index ee1f87257f2d..0acd8769ec52 100644
>>>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> @@ -52,6 +52,8 @@ void amdgpu_workload_profile_put(struct 
>>>> amdgpu_device *adev,
>>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>>                    uint32_t ring_type);
>>>>   +void amdgpu_workload_profile_suspend(struct amdgpu_device *adev);
>>>> +
>>>>   void amdgpu_workload_profile_init(struct amdgpu_device *adev);
>>>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-22 12:46       ` Lazar, Lijo
@ 2023-08-25 11:18         ` Yadav, Arvind
  2023-08-25 11:27           ` Lazar, Lijo
  0 siblings, 1 reply; 39+ messages in thread
From: Yadav, Arvind @ 2023-08-25 11:18 UTC (permalink / raw)
  To: Lazar, Lijo, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel


On 8/22/2023 6:16 PM, Lazar, Lijo wrote:
>
>
> On 8/22/2023 5:41 PM, Yadav, Arvind wrote:
>> Hi Lijo,
>>
>> The *_set function will set the GPU power profile, and the *_put
>> function will schedule the smu_delayed_work task after a 100ms delay.
>> This smu_delayed_work task will clear a GPU power profile if no new
>> jobs are scheduled within 100 ms. But if any new job comes within
>> 100ms, then the *_workload_profile_set function will cancel this work
>> and set the GPU power profile based on preference.
>>
>> Please see the below case.
>>
>> Case 1 - only jobs of the same profile run. It will take 100ms to clear
>> the profile once all jobs complete.
>>
>>                                             wl = VIDEO <100ms>
>> workload _________|`````````````````````````````````````|____
>>
>> Jobs (VIDEO) ________|```|__|```|___|````|___________
>>
>>
>> Case 2 - two jobs of two different profiles. The job1 profile will be set,
>> but when job2 arrives, it will be moved to the higher profile.
>>
>>                                   wl = VIDEO  ->    wl = COMPUTE   
>> <100ms>
>> workload 
>> ___|``````````````````````````````````````````````````````````````````|____
>>
>> Jobs (VIDEO) ___|```|__|```|___|````|___|````|_______
>>
>> Jobs (COMPUTE) ______________|```|___|````|___|````|_________
>>
>>
>>
>> Case 3 - two jobs of two different profiles. The job1 profile will be set,
>> but when job2 arrives, the workload will not be moved to the lower profile.
>> Only when compute job2 completes will it move to the lower profile.
>>
>>                                               wl = COMPUTE 
>> ->               wl = VIDEO  <100ms>
>> workload 
>> _________|``````````````````````````````````````````````````````````````````|____ 
>>
>>
>> Jobs (COMPUTE)    ____|```|__|```|___|````|___|````|_______
>>
>> Jobs (VIDEO) 
>> ___________________|```|___|````|___|````|___|````|___________
>>
>
> swsmu layer maintains a workload mask based on priority. So once you 
> have set the mask, until you unset it (i.e when refcount = 0), the 
> mask will be set in the lower layer. swsmu layer will take care of 
> requesting FW the highest priority. I don't think that needs to be 
> repeated at this level.
>
> At this layer, all you need is to refcount the requests and make the 
> request.
>
> When refcount of a profile becomes non-zero (only one-time), place one 
> request for that profile. As swsmu layer maintains the workload mask, 
> it will take the new profile also into consideration while requesting 
> for the one  with the highest priority.
>
> When refcount of a profile becomes zero, place a request to clear it. 
> This is controlled by your idle work. As I see, it keeps an additional 
> 100ms tolerance before placing a clear request. In that way, there is 
> no need to cancel that work.
>
> Inside idle work handler -
> Loop through the profiles that are set and clear those profiles whose 
> refcount is zero.
>
> Thus if a job starts during the 100ms delay, idle work won't see the 
> ref count as zero and then it won't place a request to clear out that 
> profile.
>
Hi Lijo,

Thank you for your comment. We will take it into consideration, but we
would like to retain the same design.

~Arvind.

>> On 8/22/2023 10:21 AM, Lazar, Lijo wrote:
>>>
>>>
>>> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>>>> This patch adds a function which will clear the GPU
>>>> power profile after job finished.
>>>>
>>>> This is how it works:
>>>> - schedular will set the GPU power profile based on ring_type.
>>>> - Schedular will clear the GPU Power profile once job finished.
>>>> - Here, the *_workload_profile_set function will set the GPU
>>>>    power profile and the *_workload_profile_put function will
>>>>    schedule the smu_delayed_work task after 100ms delay. This
>>>>    smu_delayed_work task will clear a GPU power profile if any
>>>>    new jobs are not scheduled within 100 ms. But if any new job
>>>>    comes within 100ms then the *_workload_profile_set function
>>>>    will cancel this work and set the GPU power profile based on
>>>>    preferences.
>>>>
>>>> v2:
>>>> - Splitting workload_profile_set and workload_profile_put
>>>>    into two separate patches.
>>>> - Addressed review comment.
>>>>
>>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 
>>>> +++++++++++++++++++
>>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>>>   2 files changed, 100 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> index e661cc5b3d92..6367eb88a44d 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>> @@ -24,6 +24,9 @@
>>>>     #include "amdgpu.h"
>>>> +/* 100 millisecond timeout */
>>>> +#define SMU_IDLE_TIMEOUT    msecs_to_jiffies(100)
>>>> +
>>>>   static enum PP_SMC_POWER_PROFILE
>>>>   ring_to_power_profile(uint32_t ring_type)
>>>>   {
>>>> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device 
>>>> *adev,
>>>>       return ret;
>>>>   }
>>>>   +static int
>>>> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
>>>> +               enum PP_SMC_POWER_PROFILE profile)
>>>> +{
>>>> +    int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
>>>> +
>>>> +    if (!ret) {
>>>> +        /* Clear the bit for the submitted workload profile */
>>>> +        adev->smu_workload.submit_workload_status &= ~(1 << profile);
>>>> +    }
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static void
>>>> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
>>>> +{
>>>> +
>>>> +    struct amdgpu_smu_workload *workload = container_of(work,
>>>> +                              struct amdgpu_smu_workload,
>>>> +                              smu_delayed_work.work);
>>>> +    struct amdgpu_device *adev = workload->adev;
>>>> +    bool reschedule = false;
>>>> +    int index  = fls(workload->submit_workload_status);
>>>> +    int ret;
>>>> +
>>>> +    mutex_lock(&workload->workload_lock);
>>>> +    for (; index > 0; index--) {
>>>
>>> Why not use for_each_set_bit?
>>
>> We are clearing only the profiles which we have set, starting from the
>> higher profile down to the lower ones.
>>
>
> You don't need to take care of this. The swsmu layer will take care of
> the priority. It is not the job of this layer to handle priority. swsmu
> is the layer that could be altered specific to each SOC, and it can take
> care of any priority changes accordingly. This layer only needs to
> refcount the requests and place them accordingly.
>
>>
>>>
>>>> +        int val = atomic_read(&workload->power_profile_ref[index]);
>>>> +
>>>> +        if (val) {
>>>> +            reschedule = true;
>>>
>>> Why do you need to do reschedule? For each put(), a schedule is 
>>> called. If refcount is not zero, that means some other job has 
>>> already set the profile. It is supposed to call put() and at that 
>>> time, this job will be run to clear it anyway, right?
>>>
>> Yes, I have got the comment for this; I am going to remove it.
>> Noted.
>>
>>>> +        } else {
>>>> +            if (workload->submit_workload_status &
>>>> +                (1 << index)) {
>>>> +                ret = amdgpu_power_profile_clear(adev, index);
>>>> +                if (ret) {
>>>> +                    DRM_WARN("Failed to clear workload %s,error = 
>>>> %d\n",
>>>> +                         amdgpu_workload_mode_name[index], ret);
>>>> +                    goto exit;
>>>> +                }
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +    if (reschedule)
>>>> + schedule_delayed_work(&workload->smu_delayed_work,
>>>> +                      SMU_IDLE_TIMEOUT);
>>>> +exit:
>>>> +    mutex_unlock(&workload->workload_lock);
>>>> +}
>>>> +
>>>> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>>>> +                 uint32_t ring_type)
>>>> +{
>>>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>>>> +    enum PP_SMC_POWER_PROFILE profile = 
>>>> ring_to_power_profile(ring_type);
>>>> +
>>>> +    if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>>>> +        return;
>>>> +
>>>> +    mutex_lock(&workload->workload_lock);
>>>> +
>>>> +    if (!atomic_read(&workload->power_profile_ref[profile])) {
>>>> +        DRM_WARN("Power profile %s ref. count error\n",
>>>> +             amdgpu_workload_mode_name[profile]);
>>>> +    } else {
>>>> +        atomic_dec(&workload->power_profile_ref[profile]);
>>>> +        schedule_delayed_work(&workload->smu_delayed_work,
>>>> +                      SMU_IDLE_TIMEOUT);
>>>> +    }
>>>> +
>>>> +    mutex_unlock(&workload->workload_lock);
>>>> +}
>>>> +
>>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>>                    uint32_t ring_type)
>>>>   {
>>>> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct 
>>>> amdgpu_device *adev,
>>>>           return;
>>>>         mutex_lock(&workload->workload_lock);
>>>> +    cancel_delayed_work_sync(&workload->smu_delayed_work);
>>>
>>> This is a potential deadlock. You already hold the mutex and then 
>>> waiting for idle work to finish. Idle work could now be at the point 
>>> where it is waiting for the same mutex. Suggest not to call cancel 
>>> here and let the mutex take care of the sequence.
>> We cannot cancel if the idle work is already running, so we have to wait
>> until the idle work is complete. If the *put function arrives before the
>> idle work has started, then we can cancel it; but if the work thread is
>> running, we should wait.
>
> No need to wait, because you already have a mutex. So you will be 
> waiting naturally for the mutex lock to be released (if at all idle 
> work already grabbed it). If a request comes in at the time while idle 
> work is running it is only a timing issue.
>
> Also you have a deadlock here. Here you acquired the mutex first and 
> then waiting for the idle work to finish. The idle work function would 
> have just started at that point and reached to the place where it is 
> going to grab mutex. That is a deadlock. This function is waiting for 
> idle work to finish and idle work is waiting to get the mutex.
>
> Nevertheless, this function doesn't even need to take care of such 
> fancy things. It only grabs the mutex and increases the refcount, 
> places a request if refcount became non-zero.
>
> At whatever point, idle work runs, it will see that the refcount is 
> not zero and skips placing a request to clear that profile.
>
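The deadlock being pointed out here can be pictured as two threads each waiting on the other (illustrative sketch only, not actual code):

```c
/* Thread A: amdgpu_workload_profile_set()    Thread B: idle work handler
 *
 * mutex_lock(&workload->workload_lock);      // handler has just started
 * cancel_delayed_work_sync(                  mutex_lock(&workload->workload_lock);
 *     &workload->smu_delayed_work);          //   ...blocks: A holds the lock
 * //   ...blocks: waits for B to finish
 *
 * A waits for B to complete; B waits for A's lock: neither can progress.
 * Dropping the cancel (or issuing it before mutex_lock()) breaks the cycle.
 */
```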
>>>
>>>>         ret = amdgpu_power_profile_set(adev, profile);
>>>>       if (ret) {
>>>>           DRM_WARN("Failed to set workload profile to %s, error = 
>>>> %d\n",
>>>>                amdgpu_workload_mode_name[profile], ret);
>>>> +        goto exit;
>>>> +    }
>>>> +
>>>> +    /* Clear the already finished jobs of higher power profiles */
>>>> +    for (int index = fls(workload->submit_workload_status);
>>>> +         index > profile; index--) {
>>>> +        if (!atomic_read(&workload->power_profile_ref[index]) &&
>>>> +            workload->submit_workload_status & (1 << index)) {
>>>> +            ret = amdgpu_power_profile_clear(adev, index);
>>>> +            if (ret) {
>>>> +                DRM_WARN("Failed to clear workload %s, err = %d\n",
>>>> +                     amdgpu_workload_mode_name[profile], ret);
>>>> +                goto exit;
>>>> +            }
>>>> +        }
>>>
>>> If you follow the earlier comment, that will keep this logic only at 
>>> one place - i.e, at idle work handler. Basically just let the idle 
>>> work handle its duty. If some job starts running during the clear 
>>> call, it's just unfortunate timing and let the next set() take the 
>>> lock and request profile again.
>>
>> So basically, every millisecond new jobs are coming in and completing on
>> the same or a different profile. Suppose we are running higher-profile
>> jobs and, before they complete, a lower-profile job arrives; this check
>> helps move from the higher profile to the lower profile once the
>> higher-profile jobs finish. If we do not check here, it will be stuck on
>> the higher profile, and until then the other jobs will not complete
>> either. Please refer to the case 3 scenario.
>>
>
> As mentioned before, this is not the place to take care of SOC 
> specific power profile priorities. We already have swsmu layer doing 
> that job. This layer just needs to do a ref count and place requests 
> accordingly.
>
> Thanks,
> Lijo
>
>>
>>> Thanks,
>>> Lijo
>>>
>>>>       }
>>>>   +exit:
>>>>       mutex_unlock(&workload->workload_lock);
>>>>   }
>>>>   @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct 
>>>> amdgpu_device *adev)
>>>>       adev->smu_workload.initialized = true;
>>>>       mutex_init(&adev->smu_workload.workload_lock);
>>>> +    INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
>>>> +              amdgpu_power_profile_idle_work_handler);
>>>>   }
>>>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>>> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct 
>>>> amdgpu_device *adev)
>>>>       if (!adev->smu_workload.initialized)
>>>>           return;
>>>> +    cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>>>>       adev->smu_workload.submit_workload_status = 0;
>>>>       adev->smu_workload.initialized = false;
>>>>       mutex_destroy(&adev->smu_workload.workload_lock);
>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> index 5022f28fc2f9..ee1f87257f2d 100644
>>>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>> @@ -46,6 +46,9 @@ static const char * const 
>>>> amdgpu_workload_mode_name[] = {
>>>>       "Window3D"
>>>>   };
>>>>   +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>>>> +                 uint32_t ring_type);
>>>> +
>>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>>                    uint32_t ring_type);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v2 3/7] drm/amdgpu: Add new function to put GPU power profile
  2023-08-25 11:18         ` Yadav, Arvind
@ 2023-08-25 11:27           ` Lazar, Lijo
  0 siblings, 0 replies; 39+ messages in thread
From: Lazar, Lijo @ 2023-08-25 11:27 UTC (permalink / raw)
  To: Yadav, Arvind, Arvind Yadav, Christian.Koenig, alexander.deucher,
	shashank.sharma, Xinhui.Pan, airlied, daniel, Felix.Kuehling,
	amd-gfx
  Cc: linux-kernel, dri-devel



On 8/25/2023 4:48 PM, Yadav, Arvind wrote:
> 
> On 8/22/2023 6:16 PM, Lazar, Lijo wrote:
>>
>>
>> On 8/22/2023 5:41 PM, Yadav, Arvind wrote:
>>> Hi Lijo,
>>>
>>> The *_set function will set the GPU power profile and the *_put 
>>> function will  schedule the
>>> smu_delayed_work task after 100ms delay. This smu_delayed_work task 
>>> will clear a GPU
>>> power profile if any new jobs are not scheduled within 100 ms. But if 
>>> any new job  comes within 100ms
>>> then the *_workload_profile_set function  will cancel this work and 
>>> set the GPU power profile based on
>>> preferences.
>>>
>>> Please see the below case.
>>>
>>> case 1 - only same profile jobs run. It will take 100ms to clear the 
>>> profile once all jobs complete.
>>>
>>>                                             wl = VIDEO <100ms>
>>> workload _________|`````````````````````````````````````|____
>>>
>>> Jobs (VIDEO) ________|```|__|```|___|````|___________
>>>
>>>
>>> Case2 - two jobs of two different profile. job1 profile will be set 
>>> but when job2 will arrive it will be moved
>>>          to higher profile.
>>>
>>>                                   wl = VIDEO  ->    wl = COMPUTE <100ms>
>>> workload 
>>> ___|``````````````````````````````````````````````````````````````````|____ 
>>>
>>>
>>> Jobs (VIDEO) ___|```|__|```|___|````|___|````|_______
>>>
>>> Jobs (COMPUTE) ______________|```|___|````|___|````|_________
>>>
>>>
>>>
>>> Case3 - two jobs of two different profile. job1 profile will be set 
>>> but when job2 will arrive it will not be moved
>>> to lower profile. When compute job2 will complete then only it will 
>>> move to lower profile.
>>>
>>>                                               wl = COMPUTE 
>>> ->               wl = VIDEO  <100ms>
>>> workload 
>>> _________|``````````````````````````````````````````````````````````````````|____ 
>>>
>>>
>>> Jobs (COMPUTE)    ____|```|__|```|___|````|___|````|_______
>>>
>>> Jobs (VIDEO) 
>>> ___________________|```|___|````|___|````|___|````|___________
>>>
>>
>> swsmu layer maintains a workload mask based on priority. So once you 
>> have set the mask, until you unset it (i.e when refcount = 0), the 
>> mask will be set in the lower layer. swsmu layer will take care of 
>> requesting FW the highest priority. I don't think that needs to be 
>> repeated at this level.
>>
>> At this layer, all you need is to refcount the requests and make the 
>> request.
>>
>> When refcount of a profile becomes non-zero (only one-time), place one 
>> request for that profile. As swsmu layer maintains the workload mask, 
>> it will take the new profile also into consideration while requesting 
>> for the one  with the highest priority.
>>
>> When refcount of a profile becomes zero, place a request to clear it. 
>> This is controlled by your idle work. As I see, it keeps an additional 
>> 100ms tolerance before placing a clear request. In that way, there is 
>> no need to cancel that work.
>>
>> Inside idle work handler -
>> Loop through the profiles that are set and clear those profiles whose 
>> refcount is zero.
>>
>> Thus if a job starts during the 100ms delay, idle work won't see the 
>> ref count as zero and then it won't place a request to clear out that 
>> profile.
>>
> Hi Lijo,
> 
> Thank you for your comment. We will take it into consideration, but we
> would like to retain the same design.
> 

All things aside, the entire idea of switching the power profile for
every job submission on a ring looks like an 'abuse' of the power
profile design. The goal of a power profile is to keep a specific
profile for a sustained workload - like gaming mode, cinema mode, etc.
It's not meant for switching the profile with every job submission that
lasts milliseconds or less (though you may argue it takes only the
highest-priority profile). This design keeps interrupting the FW every
now and then on the assumption that the driver is doing better. For any
normal/mixed-use scenario, the FW algorithms could handle it better with
all the activity monitors they have.

If you are going ahead, please also make sure to post the improved 
performance numbers you are getting with this.

Thanks,
Lijo

> ~Arvind.
> 
>>> On 8/22/2023 10:21 AM, Lazar, Lijo wrote:
>>>>
>>>>
>>>> On 8/21/2023 12:17 PM, Arvind Yadav wrote:
>>>>> This patch adds a function which will clear the GPU
>>>>> power profile after job finished.
>>>>>
>>>>> This is how it works:
>>>>> - Scheduler will set the GPU power profile based on ring_type.
>>>>> - Scheduler will clear the GPU power profile once the job has finished.
>>>>> - Here, the *_workload_profile_set function will set the GPU
>>>>>    power profile and the *_workload_profile_put function will
>>>>>    schedule the smu_delayed_work task after 100ms delay. This
>>>>>    smu_delayed_work task will clear a GPU power profile if any
>>>>>    new jobs are not scheduled within 100 ms. But if any new job
>>>>>    comes within 100ms then the *_workload_profile_set function
>>>>>    will cancel this work and set the GPU power profile based on
>>>>>    preferences.
>>>>>
>>>>> v2:
>>>>> - Splitting workload_profile_set and workload_profile_put
>>>>>    into two separate patches.
>>>>> - Addressed review comment.
>>>>>
>>>>> Cc: Shashank Sharma <shashank.sharma@amd.com>
>>>>> Cc: Christian Koenig <christian.koenig@amd.com>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c  | 97 
>>>>> +++++++++++++++++++
>>>>>   drivers/gpu/drm/amd/include/amdgpu_workload.h |  3 +
>>>>>   2 files changed, 100 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>> index e661cc5b3d92..6367eb88a44d 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_workload.c
>>>>> @@ -24,6 +24,9 @@
>>>>>     #include "amdgpu.h"
>>>>> +/* 100 millisecond timeout */
>>>>> +#define SMU_IDLE_TIMEOUT    msecs_to_jiffies(100)
>>>>> +
>>>>>   static enum PP_SMC_POWER_PROFILE
>>>>>   ring_to_power_profile(uint32_t ring_type)
>>>>>   {
>>>>> @@ -59,6 +62,80 @@ amdgpu_power_profile_set(struct amdgpu_device 
>>>>> *adev,
>>>>>       return ret;
>>>>>   }
>>>>>   +static int
>>>>> +amdgpu_power_profile_clear(struct amdgpu_device *adev,
>>>>> +               enum PP_SMC_POWER_PROFILE profile)
>>>>> +{
>>>>> +    int ret = amdgpu_dpm_switch_power_profile(adev, profile, false);
>>>>> +
>>>>> +    if (!ret) {
>>>>> +        /* Clear the bit for the submitted workload profile */
>>>>> +        adev->smu_workload.submit_workload_status &= ~(1 << profile);
>>>>> +    }
>>>>> +
>>>>> +    return ret;
>>>>> +}
>>>>> +
>>>>> +static void
>>>>> +amdgpu_power_profile_idle_work_handler(struct work_struct *work)
>>>>> +{
>>>>> +
>>>>> +    struct amdgpu_smu_workload *workload = container_of(work,
>>>>> +                              struct amdgpu_smu_workload,
>>>>> +                              smu_delayed_work.work);
>>>>> +    struct amdgpu_device *adev = workload->adev;
>>>>> +    bool reschedule = false;
>>>>> +    int index  = fls(workload->submit_workload_status);
>>>>> +    int ret;
>>>>> +
>>>>> +    mutex_lock(&workload->workload_lock);
>>>>> +    for (; index > 0; index--) {
>>>>
>>>> Why not use for_each_set_bit?
>>>
>>> We are clearing only the profiles which we have set, starting from the
>>> higher profile down to the lower ones.
>>>
>>
>> You don't need to take care of this. The swsmu layer will take care of
>> the priority. It is not the job of this layer to handle priority. swsmu
>> is the layer that could be altered specific to each SOC, and it can take
>> care of any priority changes accordingly. This layer only needs to
>> refcount the requests and place them accordingly.
>>
>>>
>>>>
>>>>> +        int val = atomic_read(&workload->power_profile_ref[index]);
>>>>> +
>>>>> +        if (val) {
>>>>> +            reschedule = true;
>>>>
>>>> Why do you need to do reschedule? For each put(), a schedule is 
>>>> called. If refcount is not zero, that means some other job has 
>>>> already set the profile. It is supposed to call put() and at that 
>>>> time, this job will be run to clear it anyway, right?
>>>>
>>> Yes, I have got the comment for this; I am going to remove it.
>>> Noted.
>>>
>>>>> +        } else {
>>>>> +            if (workload->submit_workload_status &
>>>>> +                (1 << index)) {
>>>>> +                ret = amdgpu_power_profile_clear(adev, index);
>>>>> +                if (ret) {
>>>>> +                    DRM_WARN("Failed to clear workload %s,error = 
>>>>> %d\n",
>>>>> +                         amdgpu_workload_mode_name[index], ret);
>>>>> +                    goto exit;
>>>>> +                }
>>>>> +            }
>>>>> +        }
>>>>> +    }
>>>>> +    if (reschedule)
>>>>> + schedule_delayed_work(&workload->smu_delayed_work,
>>>>> +                      SMU_IDLE_TIMEOUT);
>>>>> +exit:
>>>>> +    mutex_unlock(&workload->workload_lock);
>>>>> +}
>>>>> +
>>>>> +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>>>>> +                 uint32_t ring_type)
>>>>> +{
>>>>> +    struct amdgpu_smu_workload *workload = &adev->smu_workload;
>>>>> +    enum PP_SMC_POWER_PROFILE profile = 
>>>>> ring_to_power_profile(ring_type);
>>>>> +
>>>>> +    if (profile == PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT)
>>>>> +        return;
>>>>> +
>>>>> +    mutex_lock(&workload->workload_lock);
>>>>> +
>>>>> +    if (!atomic_read(&workload->power_profile_ref[profile])) {
>>>>> +        DRM_WARN("Power profile %s ref. count error\n",
>>>>> +             amdgpu_workload_mode_name[profile]);
>>>>> +    } else {
>>>>> + atomic_dec(&workload->power_profile_ref[profile]);
>>>>> + schedule_delayed_work(&workload->smu_delayed_work,
>>>>> +                      SMU_IDLE_TIMEOUT);
>>>>> +    }
>>>>> +
>>>>> +    mutex_unlock(&workload->workload_lock);
>>>>> +}
>>>>> +
>>>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>>>                    uint32_t ring_type)
>>>>>   {
>>>>> @@ -70,13 +147,30 @@ void amdgpu_workload_profile_set(struct 
>>>>> amdgpu_device *adev,
>>>>>           return;
>>>>>         mutex_lock(&workload->workload_lock);
>>>>> + cancel_delayed_work_sync(&workload->smu_delayed_work);
>>>>
>>>> This is a potential deadlock. You already hold the mutex and then 
>>>> waiting for idle work to finish. Idle work could now be at the point 
>>>> where it is waiting for the same mutex. Suggest not to call cancel 
>>>> here and let the mutex take care of the sequence.
>>> We cannot cancel if the idle work is already running, so we have to wait
>>> until the idle work is complete. If the *put function arrives before the
>>> idle work has started, then we can cancel it; but if the work thread is
>>> running, we should wait.
>>
>> No need to wait, because you already have a mutex. So you will be 
>> waiting naturally for the mutex lock to be released (if at all idle 
>> work already grabbed it). If a request comes in at the time while idle 
>> work is running it is only a timing issue.
>>
>> Also you have a deadlock here. Here you acquired the mutex first and 
>> then waiting for the idle work to finish. The idle work function would 
>> have just started at that point and reached to the place where it is 
>> going to grab mutex. That is a deadlock. This function is waiting for 
>> idle work to finish and idle work is waiting to get the mutex.
>>
>> Nevertheless, this function doesn't even need to take care of such 
>> fancy things. It only grabs the mutex and increases the refcount, 
>> places a request if refcount became non-zero.
>>
>> At whatever point, idle work runs, it will see that the refcount is 
>> not zero and skips placing a request to clear that profile.
>>
>>>>
>>>>>         ret = amdgpu_power_profile_set(adev, profile);
>>>>>       if (ret) {
>>>>>           DRM_WARN("Failed to set workload profile to %s, error = 
>>>>> %d\n",
>>>>>                amdgpu_workload_mode_name[profile], ret);
>>>>> +        goto exit;
>>>>> +    }
>>>>> +
>>>>> +    /* Clear the already finished jobs of higher power profile*/
>>>>> +    for (int index = fls(workload->submit_workload_status);
>>>>> +         index > profile; index--) {
>>>>> +        if (!atomic_read(&workload->power_profile_ref[index]) &&
>>>>> +            workload->submit_workload_status & (1 << index)) {
>>>>> +            ret = amdgpu_power_profile_clear(adev, index);
>>>>> +            if (ret) {
>>>>> +                DRM_WARN("Failed to clear workload %s, err = %d\n",
>>>>> +                     amdgpu_workload_mode_name[profile], ret);
>>>>> +                goto exit;
>>>>> +            }
>>>>> +        }
>>>>
>>>> If you follow the earlier comment, that will keep this logic only at 
>>>> one place - i.e, at idle work handler. Basically just let the idle 
>>>> work handle its duty. If some job starts running during the clear 
>>>> call, it's just unfortunate timing and let the next set() take the 
>>>> lock and request profile again.
>>>
>>> So basically, new jobs arrive and complete every millisecond, for the 
>>> same or different profiles. Suppose we are running higher-profile 
>>> jobs and, before they complete, a lower-profile job arrives: this 
>>> check lets us drop from the higher profile to the lower one once the 
>>> higher-profile jobs finish. If we do not check here, the GPU will be 
>>> stuck on the higher profile, and until it is cleared the other jobs 
>>> will not complete either. Please refer to the case3 scenario.
>>>
>>
>> As mentioned before, this is not the place to take care of 
>> SOC-specific power profile priorities. We already have the swsmu 
>> layer doing that job. This layer just needs to do the refcounting and 
>> place requests accordingly.
>>
>> Thanks,
>> Lijo
>>
>>>
>>>> Thanks,
>>>> Lijo
>>>>
>>>>>       }
>>>>>   +exit:
>>>>>       mutex_unlock(&workload->workload_lock);
>>>>>   }
>>>>>   @@ -87,6 +181,8 @@ void amdgpu_workload_profile_init(struct 
>>>>> amdgpu_device *adev)
>>>>>       adev->smu_workload.initialized = true;
>>>>> mutex_init(&adev->smu_workload.workload_lock);
>>>>> + INIT_DELAYED_WORK(&adev->smu_workload.smu_delayed_work,
>>>>> +              amdgpu_power_profile_idle_work_handler);
>>>>>   }
>>>>>     void amdgpu_workload_profile_fini(struct amdgpu_device *adev)
>>>>> @@ -94,6 +190,7 @@ void amdgpu_workload_profile_fini(struct 
>>>>> amdgpu_device *adev)
>>>>>       if (!adev->smu_workload.initialized)
>>>>>           return;
>>>>>   + cancel_delayed_work_sync(&adev->smu_workload.smu_delayed_work);
>>>>>       adev->smu_workload.submit_workload_status = 0;
>>>>>       adev->smu_workload.initialized = false;
>>>>> mutex_destroy(&adev->smu_workload.workload_lock);
>>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_workload.h 
>>>>> b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>> index 5022f28fc2f9..ee1f87257f2d 100644
>>>>> --- a/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_workload.h
>>>>> @@ -46,6 +46,9 @@ static const char * const 
>>>>> amdgpu_workload_mode_name[] = {
>>>>>       "Window3D"
>>>>>   };
>>>>>   +void amdgpu_workload_profile_put(struct amdgpu_device *adev,
>>>>> +                 uint32_t ring_type);
>>>>> +
>>>>>   void amdgpu_workload_profile_set(struct amdgpu_device *adev,
>>>>>                    uint32_t ring_type);


end of thread (newest message: 2023-08-25 11:29 UTC)

Thread overview: 39+ messages
2023-08-21  6:47 [PATCH v2 0/7] GPU workload hints for better performance Arvind Yadav
2023-08-21  6:47 ` [PATCH v2 1/7] drm/amdgpu: Added init/fini functions for workload Arvind Yadav
2023-08-21 13:06   ` Shashank Sharma
2023-08-21 13:35     ` Yadav, Arvind
2023-08-21 13:54       ` Shashank Sharma
2023-08-21 14:12         ` Yadav, Arvind
2023-08-21 14:27           ` Shashank Sharma
2023-08-21  6:47 ` [PATCH v2 2/7] drm/amdgpu: Add new function to set GPU power profile Arvind Yadav
2023-08-21 13:10   ` Shashank Sharma
2023-08-21 16:22   ` Alex Deucher
2023-08-21 17:53     ` Yadav, Arvind
2023-08-21 18:10       ` Alex Deucher
2023-08-22  6:13         ` Yadav, Arvind
2023-08-21 18:06   ` Alex Deucher
2023-08-21 18:08     ` Yadav, Arvind
2023-08-22  6:25   ` Lazar, Lijo
2023-08-22 12:40     ` Yadav, Arvind
2023-08-21  6:47 ` [PATCH v2 3/7] drm/amdgpu: Add new function to put " Arvind Yadav
2023-08-21 13:39   ` Shashank Sharma
2023-08-21 14:40     ` Yadav, Arvind
2023-08-22  4:51   ` Lazar, Lijo
2023-08-22 12:11     ` Yadav, Arvind
2023-08-22 12:46       ` Lazar, Lijo
2023-08-25 11:18         ` Yadav, Arvind
2023-08-25 11:27           ` Lazar, Lijo
2023-08-21  6:47 ` [PATCH v2 4/7] drm/amdgpu: Add suspend function to clear the " Arvind Yadav
2023-08-21 13:43   ` Shashank Sharma
2023-08-21 13:52     ` Yadav, Arvind
2023-08-22  6:31   ` Lazar, Lijo
2023-08-22 12:22     ` Yadav, Arvind
2023-08-22 12:54       ` Lazar, Lijo
2023-08-22 12:56         ` Yadav, Arvind
2023-08-21  6:47 ` [PATCH v2 5/7] drm/amdgpu: Switch on/off GPU workload profile Arvind Yadav
2023-08-21 13:46   ` Shashank Sharma
2023-08-21 13:53     ` Yadav, Arvind
2023-08-21  6:47 ` [PATCH v2 6/7] drm/amdgpu: switch workload context to/from compute Arvind Yadav
2023-08-21 13:47   ` Shashank Sharma
2023-08-21  6:47 ` [PATCH v2 7/7] Revert "drm/amd/amdgpu: switch on/off vcn power profile mode" Arvind Yadav
2023-08-21 13:49   ` Shashank Sharma
