* [PATCH 0/3] Job submission optimisation
@ 2026-06-26 8:55 Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw Tvrtko Ursulin
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2026-06-26 8:55 UTC (permalink / raw)
To: amd-gfx
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
Timur Kristóf
I accidentally noticed some inefficiencies on the job submit path which seemed
easy to address. CPU usage of the DRM scheduler submission thread:
                    before  after
UnigineHeaven       2.3%    1.3%
vkgears vsync off   15%     12%
Gains are mostly due to reduced hammering on the delayed worker rescheduling.
The stock profile of the scheduler thread otherwise looks like this:
20.25% [kernel] [k] __mod_timer
2.87% [kernel] [k] enqueue_timer
2.16% [kernel] [k] amdgpu_gfx_profile_ring_end_use
More pairs of eyes would be welcome to check that I did not break anything.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Timur Kristóf <timur.kristof@gmail.com>
Tvrtko Ursulin (3):
drm/amdgpu: Remove unused amdgpu_device_ip_is_hw
drm/amdgpu: Save some cycles on the job submission path
drm/amdgpu: Do not fiddle with the idle workers too much
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 11 +++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c | 21 -------------------
drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h | 2 --
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 9 ++++----
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 12 +++++------
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 +++++------
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 +++++--------
drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 8 ++++++-
10 files changed, 36 insertions(+), 57 deletions(-)
--
2.54.0
* [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw
2026-06-26 8:55 [PATCH 0/3] Job submission optimisation Tvrtko Ursulin
@ 2026-06-26 8:55 ` Tvrtko Ursulin
2026-06-26 16:59 ` Timur Kristóf
2026-06-26 8:55 ` [PATCH 2/3] drm/amdgpu: Save some cycles on the job submission path Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much Tvrtko Ursulin
2 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2026-06-26 8:55 UTC (permalink / raw)
To: amd-gfx
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
Timur Kristóf
This function is unused so let's remove it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Timur Kristóf <timur.kristof@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c | 21 ---------------------
drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h | 2 --
2 files changed, 23 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
index 6aa54156bbc9..62285e973c5c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
@@ -368,27 +368,6 @@ int amdgpu_device_ip_wait_for_idle(struct amdgpu_device *adev,
return 0;
}
-/**
- * amdgpu_device_ip_is_hw - is the hardware IP enabled
- *
- * @adev: amdgpu_device pointer
- * @block_type: Type of hardware IP (SMU, GFX, UVD, etc.)
- *
- * Check if the hardware IP is enable or not.
- * Returns true if it the IP is enable, false if not.
- */
-bool amdgpu_device_ip_is_hw(struct amdgpu_device *adev,
- enum amd_ip_block_type block_type)
-{
- struct amdgpu_ip_block *ip_block;
-
- ip_block = amdgpu_device_ip_get_ip_block(adev, block_type);
- if (ip_block)
- return ip_block->status.hw;
-
- return false;
-}
-
/**
* amdgpu_device_ip_is_valid - is the hardware IP valid
*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
index 1d0df6d93957..11739fbdeaa6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
@@ -146,8 +146,6 @@ void amdgpu_device_ip_get_clockgating_state(struct amdgpu_device *adev,
u64 *flags);
int amdgpu_device_ip_wait_for_idle(struct amdgpu_device *adev,
enum amd_ip_block_type block_type);
-bool amdgpu_device_ip_is_hw(struct amdgpu_device *adev,
- enum amd_ip_block_type block_type);
bool amdgpu_device_ip_is_valid(struct amdgpu_device *adev,
enum amd_ip_block_type block_type);
--
2.54.0
* [PATCH 2/3] drm/amdgpu: Save some cycles on the job submission path
2026-06-26 8:55 [PATCH 0/3] Job submission optimisation Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw Tvrtko Ursulin
@ 2026-06-26 8:55 ` Tvrtko Ursulin
2026-06-26 17:10 ` Timur Kristóf
2026-06-26 8:55 ` [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much Tvrtko Ursulin
2 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2026-06-26 8:55 UTC (permalink / raw)
To: amd-gfx
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
Timur Kristóf
Every job submission on the Steam Deck ends up walking the list of IP
blocks looking for AMD_IP_BLOCK_TYPE_SMC. Half of the calls come through
the chain below; the other half come from amdgpu_gfx_profile_ring_end_use:
amdgpu_gfx_profile_ring_begin_use
amdgpu_dpm_is_overdrive_enabled
is_support_sw_smu
amdgpu_device_ip_is_valid
On a game menu screen at 90Hz refresh rate we end up with ~840 calls per
second which sticks out when the submission worker is profiled with perf:
13.78% [kernel] [k] __lock_text_start
10.86% [kernel] [k] __lookup_object
8.76% [kernel] [k] __mod_timer
4.94% [kernel] [k] queued_spin_lock_slowpath
1.66% [kernel] [k] amdgpu_device_ip_is_valid
1.54% [kernel] [k] preempt_count_add
1.42% [kernel] [k] amdgpu_sync_peek_fence
1.18% [kernel] [k] amdgpu_vmid_grab
1.17% [kernel] [k] amdgpu_ib_schedule
1.14% [kernel] [k] kthread_worker_fn
Let's short-circuit this walk by simply caching the result of
is_support_sw_smu() in the device.
This is a micro-improvement but it is at least conceptually nicer to avoid
repeating the same walk so much.
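
A minimal userspace sketch of the caching idea, under stated assumptions:
struct device, struct ip_block and the walks counter are hypothetical
stand-ins, not the driver's types. The list walk runs once at early init and
the hot path only reads a cached bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
enum ip_block_type { IP_GFX, IP_SMC, IP_VCN };

struct ip_block {
	enum ip_block_type type;
	bool valid;
};

struct device {
	const struct ip_block *blocks;
	size_t num_blocks;
	unsigned long walks;	/* counts list walks, for illustration only */
	bool is_sw_smu;		/* cached once at init */
};

/* Slow path: linear walk over the IP block list. */
static bool device_ip_is_valid(struct device *dev, enum ip_block_type type)
{
	dev->walks++;
	for (size_t i = 0; i < dev->num_blocks; i++)
		if (dev->blocks[i].type == type)
			return dev->blocks[i].valid;
	return false;
}

/* Run once during early init: do the walk and remember the answer. */
static void smu_early_init(struct device *dev)
{
	assert(dev != NULL);
	dev->is_sw_smu = device_ip_is_valid(dev, IP_SMC);
}

/* Hot path: a plain field read, no walk. */
static inline bool is_support_sw_smu(const struct device *dev)
{
	return dev->is_sw_smu;
}
```

The sketch assumes the answer cannot change after init; if the block list
could change later, the cached value would have to be refreshed.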
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Timur Kristóf <timur.kristof@gmail.com>
---
v2:
* Approach changed to cache sw_smu status only.
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 +++++---------
drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 8 +++++++-
4 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7b09410d6d8f..9803967d15f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -851,6 +851,7 @@ struct amdgpu_device {
struct dev_pm_domain vga_pm_domain;
bool have_disp_power_ref;
bool have_atomics_support;
+ bool is_sw_smu;
/* BIOS */
bool is_atom_fw;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1e6b75ecafe4..7f935a5778b0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -74,6 +74,7 @@
#include "amdgpu_ras.h"
#include "amdgpu_ras_mgr.h"
#include "amdgpu_pmu.h"
+#include "amdgpu_smu.h"
#include "amdgpu_fru_eeprom.h"
#include "amdgpu_reset.h"
#include "amdgpu_virt.h"
@@ -2130,6 +2131,8 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
adev->cg_flags &= amdgpu_cg_mask;
adev->pg_flags &= amdgpu_pg_mask;
+ amdgpu_smu_early_init(adev);
+
return 0;
}
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 208a2fba6d40..82c9ae6a5092 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -591,17 +591,13 @@ static int smu_get_power_num_states(void *handle,
return 0;
}
-bool is_support_sw_smu(struct amdgpu_device *adev)
+void amdgpu_smu_early_init(struct amdgpu_device *adev)
{
/* vega20 is 11.0.2, but it's supported via the powerplay code */
- if (adev->asic_type == CHIP_VEGA20)
- return false;
-
- if ((amdgpu_ip_version(adev, MP1_HWIP, 0) >= IP_VERSION(11, 0, 0)) &&
- amdgpu_device_ip_is_valid(adev, AMD_IP_BLOCK_TYPE_SMC))
- return true;
-
- return false;
+ adev->is_sw_smu = adev->asic_type != CHIP_VEGA20 &&
+ (amdgpu_ip_version(adev, MP1_HWIP, 0) >=
+ IP_VERSION(11, 0, 0) &&
+ amdgpu_device_ip_is_valid(adev, AMD_IP_BLOCK_TYPE_SMC));
}
bool is_support_cclk_dpm(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index d76e0b005308..efc52d97058b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -1952,7 +1952,13 @@ int smu_link_reset(struct smu_context *smu);
extern const struct amd_ip_funcs smu_ip_funcs;
-bool is_support_sw_smu(struct amdgpu_device *adev);
+void amdgpu_smu_early_init(struct amdgpu_device *adev);
+
+static inline bool is_support_sw_smu(struct amdgpu_device *adev)
+{
+ return adev->is_sw_smu;
+}
+
bool is_support_cclk_dpm(struct amdgpu_device *adev);
int smu_write_watermarks_table(struct smu_context *smu);
--
2.54.0
* [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much
2026-06-26 8:55 [PATCH 0/3] Job submission optimisation Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 2/3] drm/amdgpu: Save some cycles on the job submission path Tvrtko Ursulin
@ 2026-06-26 8:55 ` Tvrtko Ursulin
2026-06-26 17:15 ` Timur Kristóf
2 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2026-06-26 8:55 UTC (permalink / raw)
To: amd-gfx
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König,
Timur Kristóf
Idle workers only need to be canceled or pushed back if we are potentially
idle. Make both operations conditional on the pre-increment and
post-decrement value of the in-flight job counter.
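
A small userspace model of the counter transitions described above, using C11
atomics in place of the kernel's atomic_fetch_inc() and atomic_dec_and_test().
struct engine and its cancels/schedules counters are hypothetical and only
record when the idle work would be touched. The sketch shows the primitives'
semantics; it says nothing about whether this gating is safe against the
delayed work in the real driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the driver code. */
struct engine {
	atomic_int in_flight;	/* jobs between begin_use and end_use */
	int cancels;		/* times the idle work would be cancelled */
	int schedules;		/* times the idle work would be (re)armed */
};

/* Like the kernel's atomic_fetch_inc(): acts on the value before the add. */
static void engine_begin_use(struct engine *e)
{
	if (atomic_fetch_add(&e->in_flight, 1) == 0)
		e->cancels++;		/* 0 -> 1 transition only */
}

/* Like the kernel's atomic_dec_and_test(): acts when the result is zero. */
static void engine_end_use(struct engine *e)
{
	int prev = atomic_fetch_sub(&e->in_flight, 1);

	assert(prev > 0);		/* unbalanced end_use would be a bug */
	if (prev == 1)
		e->schedules++;		/* 1 -> 0 transition only */
}
```

With three overlapping jobs, the model touches the idle work once on the way
up and once on the way down, instead of three times each.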
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Timur Kristóf <timur.kristof@gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 11 +++++------
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 9 +++++----
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 12 +++++-------
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 +++++-------
4 files changed, 20 insertions(+), 24 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 85372af1216d..623a5339bc47 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -2460,9 +2460,8 @@ void amdgpu_gfx_profile_ring_begin_use(struct amdgpu_ring *ring)
else
profile = PP_SMC_POWER_PROFILE_COMPUTE;
- atomic_inc(&adev->gfx.total_submission_cnt);
-
- cancel_delayed_work_sync(&adev->gfx.idle_work);
+ if (!atomic_fetch_inc(&adev->gfx.total_submission_cnt))
+ cancel_delayed_work_sync(&adev->gfx.idle_work);
/* We can safely return early here because we've cancelled the
* the delayed work so there is no one else to set it to false
@@ -2490,9 +2489,9 @@ void amdgpu_gfx_profile_ring_end_use(struct amdgpu_ring *ring)
if (amdgpu_dpm_is_overdrive_enabled(adev))
return;
- atomic_dec(&ring->adev->gfx.total_submission_cnt);
-
- schedule_delayed_work(&ring->adev->gfx.idle_work, GFX_PROFILE_IDLE_TIMEOUT);
+ if (atomic_dec_and_test(&ring->adev->gfx.total_submission_cnt))
+ schedule_delayed_work(&ring->adev->gfx.idle_work,
+ GFX_PROFILE_IDLE_TIMEOUT);
}
/**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
index 63ee6ba6a931..57935c321515 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
@@ -134,8 +134,8 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring *ring)
{
struct amdgpu_device *adev = ring->adev;
- atomic_inc(&adev->jpeg.total_submission_cnt);
- cancel_delayed_work_sync(&adev->jpeg.idle_work);
+ if (!atomic_fetch_inc(&adev->jpeg.total_submission_cnt))
+ cancel_delayed_work_sync(&adev->jpeg.idle_work);
mutex_lock(&adev->jpeg.jpeg_pg_lock);
amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_JPEG,
@@ -145,8 +145,9 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring *ring)
void amdgpu_jpeg_ring_end_use(struct amdgpu_ring *ring)
{
- atomic_dec(&ring->adev->jpeg.total_submission_cnt);
- schedule_delayed_work(&ring->adev->jpeg.idle_work, JPEG_IDLE_TIMEOUT);
+ if (atomic_dec_and_test(&ring->adev->jpeg.total_submission_cnt))
+ schedule_delayed_work(&ring->adev->jpeg.idle_work,
+ JPEG_IDLE_TIMEOUT);
}
int amdgpu_jpeg_dec_ring_test_ring(struct amdgpu_ring *ring)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index b261aa7c1ba8..8d2abf706dfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -506,9 +506,8 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
struct amdgpu_vcn_inst *vcn_inst = &adev->vcn.inst[ring->me];
- atomic_inc(&vcn_inst->total_submission_cnt);
-
- cancel_delayed_work_sync(&vcn_inst->idle_work);
+ if (!atomic_fetch_inc(&vcn_inst->total_submission_cnt))
+ cancel_delayed_work_sync(&vcn_inst->idle_work);
mutex_lock(&vcn_inst->vcn_pg_lock);
vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_UNGATE);
@@ -550,10 +549,9 @@ void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
!adev->vcn.inst[ring->me].using_unified_queue)
atomic_dec(&ring->adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
- atomic_dec(&ring->adev->vcn.inst[ring->me].total_submission_cnt);
-
- schedule_delayed_work(&ring->adev->vcn.inst[ring->me].idle_work,
- VCN_IDLE_TIMEOUT);
+ if (atomic_dec_and_test(&ring->adev->vcn.inst[ring->me].total_submission_cnt))
+ schedule_delayed_work(&ring->adev->vcn.inst[ring->me].idle_work,
+ VCN_IDLE_TIMEOUT);
}
int amdgpu_vcn_dec_ring_test_ring(struct amdgpu_ring *ring)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index 8b8184fe6764..0d8a3cea63ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -159,9 +159,8 @@ static void vcn_v2_5_ring_begin_use(struct amdgpu_ring *ring)
struct amdgpu_device *adev = ring->adev;
struct amdgpu_vcn_inst *v = &adev->vcn.inst[ring->me];
- atomic_inc(&adev->vcn.inst[0].total_submission_cnt);
-
- cancel_delayed_work_sync(&adev->vcn.inst[0].idle_work);
+ if (!atomic_fetch_inc(&adev->vcn.inst[0].total_submission_cnt))
+ cancel_delayed_work_sync(&adev->vcn.inst[0].idle_work);
/* We can safely return early here because we've cancelled the
* the delayed work so there is no one else to set it to false
@@ -207,10 +206,9 @@ static void vcn_v2_5_ring_end_use(struct amdgpu_ring *ring)
!adev->vcn.inst[ring->me].using_unified_queue)
atomic_dec(&adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
- atomic_dec(&adev->vcn.inst[0].total_submission_cnt);
-
- schedule_delayed_work(&adev->vcn.inst[0].idle_work,
- VCN_IDLE_TIMEOUT);
+ if (atomic_dec_and_test(&adev->vcn.inst[0].total_submission_cnt))
+ schedule_delayed_work(&adev->vcn.inst[0].idle_work,
+ VCN_IDLE_TIMEOUT);
}
/**
--
2.54.0
* Re: [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw
2026-06-26 8:55 ` [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw Tvrtko Ursulin
@ 2026-06-26 16:59 ` Timur Kristóf
2026-06-26 19:06 ` Tvrtko Ursulin
0 siblings, 1 reply; 9+ messages in thread
From: Timur Kristóf @ 2026-06-26 16:59 UTC (permalink / raw)
To: amd-gfx, Tvrtko Ursulin
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König
On Friday, 26 June 2026, 10:55:56 Central European Summer Time, Tvrtko Ursulin
wrote:
> This function is unused so let's remove it.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Timur Kristóf <timur.kristof@gmail.com>
Nice cleanup!
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Are there any more uses left of the amdgpu_ip_block_status.hw field?
As far as I can see the field is set but never used, maybe we could remove it
too. What did this field mean anyway?
Best regards,
Timur
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c | 21 ---------------------
> drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h | 2 --
> 2 files changed, 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c index 6aa54156bbc9..62285e973c5c
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
> @@ -368,27 +368,6 @@ int amdgpu_device_ip_wait_for_idle(struct amdgpu_device
> *adev, return 0;
> }
>
> -/**
> - * amdgpu_device_ip_is_hw - is the hardware IP enabled
> - *
> - * @adev: amdgpu_device pointer
> - * @block_type: Type of hardware IP (SMU, GFX, UVD, etc.)
> - *
> - * Check if the hardware IP is enable or not.
> - * Returns true if it the IP is enable, false if not.
> - */
> -bool amdgpu_device_ip_is_hw(struct amdgpu_device *adev,
> - enum amd_ip_block_type block_type)
> -{
> - struct amdgpu_ip_block *ip_block;
> -
> - ip_block = amdgpu_device_ip_get_ip_block(adev, block_type);
> - if (ip_block)
> - return ip_block->status.hw;
> -
> - return false;
> -}
> -
> /**
> * amdgpu_device_ip_is_valid - is the hardware IP valid
> *
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h index 1d0df6d93957..11739fbdeaa6
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
> @@ -146,8 +146,6 @@ void amdgpu_device_ip_get_clockgating_state(struct
> amdgpu_device *adev, u64 *flags);
> int amdgpu_device_ip_wait_for_idle(struct amdgpu_device *adev,
> enum amd_ip_block_type block_type);
> -bool amdgpu_device_ip_is_hw(struct amdgpu_device *adev,
> - enum amd_ip_block_type block_type);
> bool amdgpu_device_ip_is_valid(struct amdgpu_device *adev,
> enum amd_ip_block_type block_type);
* Re: [PATCH 2/3] drm/amdgpu: Save some cycles on the job submission path
2026-06-26 8:55 ` [PATCH 2/3] drm/amdgpu: Save some cycles on the job submission path Tvrtko Ursulin
@ 2026-06-26 17:10 ` Timur Kristóf
0 siblings, 0 replies; 9+ messages in thread
From: Timur Kristóf @ 2026-06-26 17:10 UTC (permalink / raw)
To: amd-gfx, Tvrtko Ursulin
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König
On Friday, 26 June 2026, 10:55:57 Central European Summer Time, Tvrtko Ursulin
wrote:
> Every job submission on the Steam Deck ends up walking the list of IP
> blocks looking for AMD_IP_BLOCK_TYPE_SMC. Half of the call chain is like
> the below, while the second half is from amdgpu_gfx_profile_ring_end_use:
>
> amdgpu_gfx_profile_ring_begin_use
> amdgpu_dpm_is_overdrive_enabled
> is_support_sw_smu
> amdgpu_device_ip_is_valid
>
> On a game menu screen at 90Hz refresh rate we end up with ~840 calls per
> second which sticks out when the submission worker is profiled with perf:
>
> 13.78% [kernel] [k] __lock_text_start
> 10.86% [kernel] [k] __lookup_object
> 8.76% [kernel] [k] __mod_timer
> 4.94% [kernel] [k] queued_spin_lock_slowpath
> 1.66% [kernel] [k] amdgpu_device_ip_is_valid
> 1.54% [kernel] [k] preempt_count_add
> 1.42% [kernel] [k] amdgpu_sync_peek_fence
> 1.18% [kernel] [k] amdgpu_vmid_grab
> 1.17% [kernel] [k] amdgpu_ib_schedule
> 1.14% [kernel] [k] kthread_worker_fn
>
> Let's short-circuit this walk by simply caching the result of
> is_support_sw_smu() in the device.
>
> This is a micro-improvement but it is at least conceptually nicer to avoid
> repeating the same walk so much.
Hi,
I agree with cleaning up this thing.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
That being said, I think is_support_sw_smu() is horrible and should be removed
altogether, because it goes against how the rest of the power management code
works.
In my opinion, we should instead:
1. Hook up some function pointers and check those instead.
For example in amdgpu_pm_acpi_event_handler() we should just hook up
smu_set_ac_dc() to the notify_ac_dc() function pointer. There are plenty of
other similar cases.
Another example, for amdgpu_dpm_mode1_reset() we should introduce a new
asic_reset_mode_1() pointer in amd_pm_funcs() similar to how it works with
MODE2 reset for consistency.
2. Eliminate redundant functions where the same thing is already done
elsewhere.
For example in amdgpu_dpm_is_mode1_reset_supported() it checks
smu_mode1_reset_is_support() which is redundant because the supported reset
type is available on the ASIC functions already and we can just use that.
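
A rough sketch of the function-pointer approach from point 1, under stated
assumptions: struct pm_funcs, struct device, dpm_notify_ac_dc() and the
fake_smu backend are hypothetical simplifications, not the existing amdgpu
definitions. The generic caller checks whether the hook is present instead of
asking which backend is in use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified callback table; names mirror the discussion only. */
struct pm_funcs {
	int (*notify_ac_dc)(void *handle, bool on_ac);	/* optional hook */
};

struct device {
	const struct pm_funcs *pm_funcs;
	void *pm_handle;
};

/* Generic caller: tests for the hook, not for the backend type. */
static int dpm_notify_ac_dc(struct device *dev, bool on_ac)
{
	if (!dev->pm_funcs || !dev->pm_funcs->notify_ac_dc)
		return -EOPNOTSUPP;
	return dev->pm_funcs->notify_ac_dc(dev->pm_handle, on_ac);
}

/* Example backend, standing in for something like smu_set_ac_dc(). */
struct fake_smu {
	bool on_ac;
	int calls;
};

static int fake_smu_set_ac_dc(void *handle, bool on_ac)
{
	struct fake_smu *smu = handle;

	assert(smu != NULL);
	smu->on_ac = on_ac;
	smu->calls++;
	return 0;
}

static const struct pm_funcs fake_smu_pm_funcs = {
	.notify_ac_dc = fake_smu_set_ac_dc,
};
```

A backend that does not implement the hook simply leaves the pointer NULL,
and the caller reports that the operation is unsupported.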
What do you think?
Thanks & best regards,
Timur
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Timur Kristóf <timur.kristof@gmail.com>
> ---
> v2:
> * Approach changed to cache sw_smu status only.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
> drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 +++++---------
> drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 8 +++++++-
> 4 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 7b09410d6d8f..9803967d15f9
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -851,6 +851,7 @@ struct amdgpu_device {
> struct dev_pm_domain vga_pm_domain;
> bool have_disp_power_ref;
> bool have_atomics_support;
> + bool is_sw_smu;
>
> /* BIOS */
> bool is_atom_fw;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index
> 1e6b75ecafe4..7f935a5778b0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -74,6 +74,7 @@
> #include "amdgpu_ras.h"
> #include "amdgpu_ras_mgr.h"
> #include "amdgpu_pmu.h"
> +#include "amdgpu_smu.h"
> #include "amdgpu_fru_eeprom.h"
> #include "amdgpu_reset.h"
> #include "amdgpu_virt.h"
> @@ -2130,6 +2131,8 @@ static int amdgpu_device_ip_early_init(struct
> amdgpu_device *adev) adev->cg_flags &= amdgpu_cg_mask;
> adev->pg_flags &= amdgpu_pg_mask;
>
> + amdgpu_smu_early_init(adev);
> +
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index
> 208a2fba6d40..82c9ae6a5092 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -591,17 +591,13 @@ static int smu_get_power_num_states(void *handle,
> return 0;
> }
>
> -bool is_support_sw_smu(struct amdgpu_device *adev)
> +void amdgpu_smu_early_init(struct amdgpu_device *adev)
> {
> /* vega20 is 11.0.2, but it's supported via the powerplay code */
> - if (adev->asic_type == CHIP_VEGA20)
> - return false;
> -
> - if ((amdgpu_ip_version(adev, MP1_HWIP, 0) >= IP_VERSION(11, 0, 0)) &&
> - amdgpu_device_ip_is_valid(adev, AMD_IP_BLOCK_TYPE_SMC))
> - return true;
> -
> - return false;
> + adev->is_sw_smu = adev->asic_type != CHIP_VEGA20 &&
> + (amdgpu_ip_version(adev, MP1_HWIP, 0) >=
> + IP_VERSION(11, 0, 0) &&
> + amdgpu_device_ip_is_valid(adev, AMD_IP_BLOCK_TYPE_SMC));
> }
>
> bool is_support_cclk_dpm(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h index
> d76e0b005308..efc52d97058b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> @@ -1952,7 +1952,13 @@ int smu_link_reset(struct smu_context *smu);
>
> extern const struct amd_ip_funcs smu_ip_funcs;
>
> -bool is_support_sw_smu(struct amdgpu_device *adev);
> +void amdgpu_smu_early_init(struct amdgpu_device *adev);
> +
> +static inline bool is_support_sw_smu(struct amdgpu_device *adev)
> +{
> + return adev->is_sw_smu;
> +}
> +
> bool is_support_cclk_dpm(struct amdgpu_device *adev);
> int smu_write_watermarks_table(struct smu_context *smu);
* Re: [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much
2026-06-26 8:55 ` [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much Tvrtko Ursulin
@ 2026-06-26 17:15 ` Timur Kristóf
2026-06-26 18:59 ` Tvrtko Ursulin
0 siblings, 1 reply; 9+ messages in thread
From: Timur Kristóf @ 2026-06-26 17:15 UTC (permalink / raw)
To: amd-gfx, Tvrtko Ursulin
Cc: kernel-dev, Tvrtko Ursulin, Alex Deucher, Christian König
On Friday, 26 June 2026, 10:55:58 Central European Summer Time, Tvrtko Ursulin
wrote:
> Idle workers only need to be canceled or pushed back if we are potentially
> idle. Make both operations conditional on the pre-increment and
> post-decrement value of the in-flight job counter.
>
Nice catch!
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Timur Kristóf <timur.kristof@gmail.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 11 +++++------
> drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 9 +++++----
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 12 +++++-------
> drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 +++++-------
> 4 files changed, 20 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 85372af1216d..623a5339bc47
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -2460,9 +2460,8 @@ void amdgpu_gfx_profile_ring_begin_use(struct
> amdgpu_ring *ring) else
> profile = PP_SMC_POWER_PROFILE_COMPUTE;
>
> - atomic_inc(&adev->gfx.total_submission_cnt);
> -
> - cancel_delayed_work_sync(&adev->gfx.idle_work);
> + if (!atomic_fetch_inc(&adev->gfx.total_submission_cnt))
> + cancel_delayed_work_sync(&adev->gfx.idle_work);
>
> /* We can safely return early here because we've cancelled the
> * the delayed work so there is no one else to set it to false
> @@ -2490,9 +2489,9 @@ void amdgpu_gfx_profile_ring_end_use(struct
> amdgpu_ring *ring) if (amdgpu_dpm_is_overdrive_enabled(adev))
> return;
>
> - atomic_dec(&ring->adev->gfx.total_submission_cnt);
> -
> - schedule_delayed_work(&ring->adev->gfx.idle_work, GFX_PROFILE_IDLE_TIMEOUT);
> + if (atomic_dec_and_test(&ring->adev->gfx.total_submission_cnt))
> + schedule_delayed_work(&ring->adev->gfx.idle_work,
> + GFX_PROFILE_IDLE_TIMEOUT);
> }
>
> /**
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c index 63ee6ba6a931..57935c321515
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
> @@ -134,8 +134,8 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring
> *ring) {
> struct amdgpu_device *adev = ring->adev;
>
> - atomic_inc(&adev->jpeg.total_submission_cnt);
> - cancel_delayed_work_sync(&adev->jpeg.idle_work);
> + if (!atomic_fetch_inc(&adev->jpeg.total_submission_cnt))
> + cancel_delayed_work_sync(&adev->jpeg.idle_work);
>
> mutex_lock(&adev->jpeg.jpeg_pg_lock);
> amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_JPEG,
> @@ -145,8 +145,9 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring
> *ring)
>
> void amdgpu_jpeg_ring_end_use(struct amdgpu_ring *ring)
> {
> - atomic_dec(&ring->adev->jpeg.total_submission_cnt);
> - schedule_delayed_work(&ring->adev->jpeg.idle_work, JPEG_IDLE_TIMEOUT);
> + if (atomic_dec_and_test(&ring->adev->jpeg.total_submission_cnt))
> + schedule_delayed_work(&ring->adev->jpeg.idle_work,
> + JPEG_IDLE_TIMEOUT);
> }
>
> int amdgpu_jpeg_dec_ring_test_ring(struct amdgpu_ring *ring)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c index b261aa7c1ba8..8d2abf706dfd
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
> @@ -506,9 +506,8 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
> struct amdgpu_device *adev = ring->adev;
> struct amdgpu_vcn_inst *vcn_inst = &adev->vcn.inst[ring->me];
>
> - atomic_inc(&vcn_inst->total_submission_cnt);
> -
> - cancel_delayed_work_sync(&vcn_inst->idle_work);
> + if (!atomic_fetch_inc(&vcn_inst->total_submission_cnt))
> + cancel_delayed_work_sync(&vcn_inst->idle_work);
>
> mutex_lock(&vcn_inst->vcn_pg_lock);
> vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_UNGATE);
> @@ -550,10 +549,9 @@ void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
> !adev->vcn.inst[ring->me].using_unified_queue)
> atomic_dec(&ring->adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
>
> - atomic_dec(&ring->adev->vcn.inst[ring->me].total_submission_cnt);
> -
> - schedule_delayed_work(&ring->adev->vcn.inst[ring->me].idle_work,
> - VCN_IDLE_TIMEOUT);
> + if (atomic_dec_and_test(&ring->adev->vcn.inst[ring->me].total_submission_cnt))
> + schedule_delayed_work(&ring->adev->vcn.inst[ring->me].idle_work,
> + VCN_IDLE_TIMEOUT);
> }
>
> int amdgpu_vcn_dec_ring_test_ring(struct amdgpu_ring *ring)
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
> b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c index 8b8184fe6764..0d8a3cea63ee
> 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
> @@ -159,9 +159,8 @@ static void vcn_v2_5_ring_begin_use(struct amdgpu_ring
> *ring) struct amdgpu_device *adev = ring->adev;
> struct amdgpu_vcn_inst *v = &adev->vcn.inst[ring->me];
>
> - atomic_inc(&adev->vcn.inst[0].total_submission_cnt);
> -
> - cancel_delayed_work_sync(&adev->vcn.inst[0].idle_work);
> + if (!atomic_fetch_inc(&adev->vcn.inst[0].total_submission_cnt))
> + cancel_delayed_work_sync(&adev->vcn.inst[0].idle_work);
>
> /* We can safely return early here because we've cancelled the
> * the delayed work so there is no one else to set it to false
> @@ -207,10 +206,9 @@ static void vcn_v2_5_ring_end_use(struct amdgpu_ring
> *ring) !adev->vcn.inst[ring->me].using_unified_queue)
> atomic_dec(&adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
>
> - atomic_dec(&adev->vcn.inst[0].total_submission_cnt);
> -
> - schedule_delayed_work(&adev->vcn.inst[0].idle_work,
> - VCN_IDLE_TIMEOUT);
> + if (atomic_dec_and_test(&adev->vcn.inst[0].total_submission_cnt))
> + schedule_delayed_work(&adev->vcn.inst[0].idle_work,
> + VCN_IDLE_TIMEOUT);
> }
>
> /**
* Re: [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much
2026-06-26 17:15 ` Timur Kristóf
@ 2026-06-26 18:59 ` Tvrtko Ursulin
0 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2026-06-26 18:59 UTC (permalink / raw)
To: Timur Kristóf, amd-gfx
Cc: kernel-dev, Alex Deucher, Christian König
On 26/06/2026 18:15, Timur Kristóf wrote:
> On Friday, 26 June 2026, 10:55:58 Central European Summer Time, Tvrtko Ursulin
> wrote:
>> Idle workers only need to be canceled or pushed back if we are potentially
>> idle. Make both operations conditional on the pre-increment and
>> post-decrement value of the in-flight job counter.
>>
>
> Nice catch!
I now have some second thoughts about this one. I think I have inverted
the logic of what it needs to achieve. I blame the heat wave :) but at
least I am pretty sure there is still a way to make it more efficient. I
will revisit it next week.
Regards,
Tvrtko
>
> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Timur Kristóf <timur.kristof@gmail.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 11 +++++------
>> drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 9 +++++----
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 12 +++++-------
>> drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 12 +++++-------
>> 4 files changed, 20 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index 85372af1216d..623a5339bc47 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -2460,9 +2460,8 @@ void amdgpu_gfx_profile_ring_begin_use(struct amdgpu_ring *ring)
>> else
>> profile = PP_SMC_POWER_PROFILE_COMPUTE;
>>
>> - atomic_inc(&adev->gfx.total_submission_cnt);
>> -
>> - cancel_delayed_work_sync(&adev->gfx.idle_work);
>> + if (!atomic_fetch_inc(&adev->gfx.total_submission_cnt))
>> + cancel_delayed_work_sync(&adev->gfx.idle_work);
>>
>> /* We can safely return early here because we've cancelled the
>> * the delayed work so there is no one else to set it to false
>> @@ -2490,9 +2489,9 @@ void amdgpu_gfx_profile_ring_end_use(struct amdgpu_ring *ring)
>> if (amdgpu_dpm_is_overdrive_enabled(adev))
>> return;
>>
>> - atomic_dec(&ring->adev->gfx.total_submission_cnt);
>> -
>> - schedule_delayed_work(&ring->adev->gfx.idle_work, GFX_PROFILE_IDLE_TIMEOUT);
>> + if (atomic_dec_and_test(&ring->adev->gfx.total_submission_cnt))
>> + schedule_delayed_work(&ring->adev->gfx.idle_work,
>> + GFX_PROFILE_IDLE_TIMEOUT);
>> }
>>
>> /**
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
>> index 63ee6ba6a931..57935c321515 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
>> @@ -134,8 +134,8 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring *ring)
>> {
>> struct amdgpu_device *adev = ring->adev;
>>
>> - atomic_inc(&adev->jpeg.total_submission_cnt);
>> - cancel_delayed_work_sync(&adev->jpeg.idle_work);
>> + if (!atomic_fetch_inc(&adev->jpeg.total_submission_cnt))
>> + cancel_delayed_work_sync(&adev->jpeg.idle_work);
>>
>> mutex_lock(&adev->jpeg.jpeg_pg_lock);
>> amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_JPEG,
>> @@ -145,8 +145,9 @@ void amdgpu_jpeg_ring_begin_use(struct amdgpu_ring *ring)
>>
>> void amdgpu_jpeg_ring_end_use(struct amdgpu_ring *ring)
>> {
>> - atomic_dec(&ring->adev->jpeg.total_submission_cnt);
>> - schedule_delayed_work(&ring->adev->jpeg.idle_work, JPEG_IDLE_TIMEOUT);
>> + if (atomic_dec_and_test(&ring->adev->jpeg.total_submission_cnt))
>> + schedule_delayed_work(&ring->adev->jpeg.idle_work,
>> + JPEG_IDLE_TIMEOUT);
>> }
>>
>> int amdgpu_jpeg_dec_ring_test_ring(struct amdgpu_ring *ring)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> index b261aa7c1ba8..8d2abf706dfd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> @@ -506,9 +506,8 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
>> struct amdgpu_device *adev = ring->adev;
>> struct amdgpu_vcn_inst *vcn_inst = &adev->vcn.inst[ring->me];
>>
>> - atomic_inc(&vcn_inst->total_submission_cnt);
>> -
>> - cancel_delayed_work_sync(&vcn_inst->idle_work);
>> + if (!atomic_fetch_inc(&vcn_inst->total_submission_cnt))
>> + cancel_delayed_work_sync(&vcn_inst->idle_work);
>>
>> mutex_lock(&vcn_inst->vcn_pg_lock);
>> vcn_inst->set_pg_state(vcn_inst, AMD_PG_STATE_UNGATE);
>> @@ -550,10 +549,9 @@ void amdgpu_vcn_ring_end_use(struct amdgpu_ring *ring)
>> !adev->vcn.inst[ring->me].using_unified_queue)
>> atomic_dec(&ring->adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
>>
>> - atomic_dec(&ring->adev->vcn.inst[ring->me].total_submission_cnt);
>> -
>> - schedule_delayed_work(&ring->adev->vcn.inst[ring->me].idle_work,
>> - VCN_IDLE_TIMEOUT);
>> + if (atomic_dec_and_test(&ring->adev->vcn.inst[ring->me].total_submission_cnt))
>> + schedule_delayed_work(&ring->adev->vcn.inst[ring->me].idle_work,
>> + VCN_IDLE_TIMEOUT);
>> }
>>
>> int amdgpu_vcn_dec_ring_test_ring(struct amdgpu_ring *ring)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
>> index 8b8184fe6764..0d8a3cea63ee 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
>> @@ -159,9 +159,8 @@ static void vcn_v2_5_ring_begin_use(struct amdgpu_ring *ring)
>> struct amdgpu_device *adev = ring->adev;
>> struct amdgpu_vcn_inst *v = &adev->vcn.inst[ring->me];
>>
>> - atomic_inc(&adev->vcn.inst[0].total_submission_cnt);
>> -
>> - cancel_delayed_work_sync(&adev->vcn.inst[0].idle_work);
>> + if (!atomic_fetch_inc(&adev->vcn.inst[0].total_submission_cnt))
>> + cancel_delayed_work_sync(&adev->vcn.inst[0].idle_work);
>>
>> /* We can safely return early here because we've cancelled the
>> * the delayed work so there is no one else to set it to false
>> @@ -207,10 +206,9 @@ static void vcn_v2_5_ring_end_use(struct amdgpu_ring *ring)
>> !adev->vcn.inst[ring->me].using_unified_queue)
>> atomic_dec(&adev->vcn.inst[ring->me].dpg_enc_submission_cnt);
>>
>> - atomic_dec(&adev->vcn.inst[0].total_submission_cnt);
>> -
>> - schedule_delayed_work(&adev->vcn.inst[0].idle_work,
>> - VCN_IDLE_TIMEOUT);
>> + if (atomic_dec_and_test(&adev->vcn.inst[0].total_submission_cnt))
>> + schedule_delayed_work(&adev->vcn.inst[0].idle_work,
>> + VCN_IDLE_TIMEOUT);
>> }
>>
>> /**
>
>
>
>
* Re: [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw
2026-06-26 16:59 ` Timur Kristóf
@ 2026-06-26 19:06 ` Tvrtko Ursulin
0 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2026-06-26 19:06 UTC (permalink / raw)
To: Timur Kristóf, amd-gfx
Cc: kernel-dev, Alex Deucher, Christian König
On 26/06/2026 17:59, Timur Kristóf wrote:
> On Friday, 26 June 2026, 10:55:56 CEST, Tvrtko Ursulin wrote:
>> This function is unused so let's remove it.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Timur Kristóf <timur.kristof@gmail.com>
>
> Nice cleanup!
>
> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
>
> Are there any more uses left of the amdgpu_ip_block_status.hw field?
> As far as I can see the field is set but never used, maybe we could remove it
> too. What did this field mean anyway?
It appears to be used during init/fini and suspend/resume, and a little
bit in reset. I am not quite sure what it means. Could it be "hw
initialized" or "hw ready"?
Regards,
Tvrtko
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c | 21 ---------------------
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h | 2 --
>> 2 files changed, 23 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
>> index 6aa54156bbc9..62285e973c5c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.c
>> @@ -368,27 +368,6 @@ int amdgpu_device_ip_wait_for_idle(struct amdgpu_device *adev,
>> return 0;
>> }
>>
>> -/**
>> - * amdgpu_device_ip_is_hw - is the hardware IP enabled
>> - *
>> - * @adev: amdgpu_device pointer
>> - * @block_type: Type of hardware IP (SMU, GFX, UVD, etc.)
>> - *
>> - * Check if the hardware IP is enable or not.
>> - * Returns true if it the IP is enable, false if not.
>> - */
>> -bool amdgpu_device_ip_is_hw(struct amdgpu_device *adev,
>> - enum amd_ip_block_type block_type)
>> -{
>> - struct amdgpu_ip_block *ip_block;
>> -
>> - ip_block = amdgpu_device_ip_get_ip_block(adev, block_type);
>> - if (ip_block)
>> - return ip_block->status.hw;
>> -
>> - return false;
>> -}
>> -
>> /**
>> * amdgpu_device_ip_is_valid - is the hardware IP valid
>> *
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
>> index 1d0df6d93957..11739fbdeaa6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ip.h
>> @@ -146,8 +146,6 @@ void amdgpu_device_ip_get_clockgating_state(struct amdgpu_device *adev,
>> u64 *flags);
>> int amdgpu_device_ip_wait_for_idle(struct amdgpu_device *adev,
>> enum amd_ip_block_type block_type);
>> -bool amdgpu_device_ip_is_hw(struct amdgpu_device *adev,
>> - enum amd_ip_block_type block_type);
>> bool amdgpu_device_ip_is_valid(struct amdgpu_device *adev,
>> enum amd_ip_block_type block_type);
>
>
>
>
Thread overview: 9+ messages
2026-06-26 8:55 [PATCH 0/3] Job submission optimisation Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 1/3] drm/amdgpu: Remove unused amdgpu_device_ip_is_hw Tvrtko Ursulin
2026-06-26 16:59 ` Timur Kristóf
2026-06-26 19:06 ` Tvrtko Ursulin
2026-06-26 8:55 ` [PATCH 2/3] drm/amdgpu: Save some cycles on the job submission path Tvrtko Ursulin
2026-06-26 17:10 ` Timur Kristóf
2026-06-26 8:55 ` [PATCH 3/3] drm/amdgpu: Do not fiddle with the idle workers too much Tvrtko Ursulin
2026-06-26 17:15 ` Timur Kristóf
2026-06-26 18:59 ` Tvrtko Ursulin