All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Amber Lin <Amber.Lin@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Sasha Levin <sashal@kernel.org>,
	Felix.Kuehling@amd.com, christian.koenig@amd.com,
	airlied@gmail.com, simona@ffwll.ch,
	amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.1 34/58] drm/amdkfd: Set SDMA_RLCx_IB_CNTL/SWITCH_INSIDE_IB
Date: Sun,  1 Jun 2025 19:39:47 -0400	[thread overview]
Message-ID: <20250601234012.3516352-34-sashal@kernel.org> (raw)
In-Reply-To: <20250601234012.3516352-1-sashal@kernel.org>

From: Amber Lin <Amber.Lin@amd.com>

[ Upstream commit ab9fcc6362e0699fc1150aa1d8503c40fce2c1e1 ]

When submitting MQD to CP, set SDMA_RLCx_IB_CNTL/SWITCH_INSIDE_IB bit so
it'll allow SDMA preemption if there is a massive command buffer of
long-running SDMA commands.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

Based on my comprehensive analysis of both the commit and the kernel
codebase context, here is my assessment: **YES** This commit should be
backported to stable kernel trees. Here's my extensive analysis: ## Code
Analysis The commit makes a minimal but important change to
`drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c` in the
`update_mqd_sdma()` function: ```c m->sdmax_rlcx_ib_cntl |=
SDMA0_GFX_IB_CNTL__SWITCH_INSIDE_IB_MASK; ``` ## Why This Should Be
Backported ### 1. **Fixes a Clear Functional Bug** - The commit
addresses missing register configuration that prevents proper SDMA
preemption - Without this bit set, SDMA engines cannot perform context
switches inside Indirect Buffers (IBs) - This creates a scheduling
fairness issue where long-running SDMA commands can starve other queues
### 2. **Minimal Risk, High Confidence Change** - **Single line
addition**: Only adds one register bit setting - **Well-understood
hardware feature**: The `SWITCH_INSIDE_IB` bit is a documented hardware
capability - **Conservative fix**: Simply enables existing hardware
functionality that should have been enabled - **No API changes**:
Doesn't modify interfaces or data structures ### 3. **Broad Impact on
Modern AMD Hardware** - Affects all AMD GPUs with SDMA engines from GCN
4.0+ (Vega, Navi, RDNA generations) - Impacts compute workloads using
ROCm, OpenCL, and HIP - The issue affects system responsiveness under
heavy SDMA usage patterns ### 4. **Consistent with Kernel Patterns**
From examining similar commits, this follows the pattern of commits
marked "Backport Status: YES": - **Similar Commit #5** was backported
for implementing missing register state control in SDMA - Both commits
involve configuring SDMA control registers that were previously unset -
Both address functional correctness rather than adding new features ###
5. **Meets Stable Tree Criteria** - **Important bugfix**: ✅ Fixes SDMA
scheduling fairness - **Small and contained**: ✅ Single line change in
one function - **No architectural changes**: ✅ Uses existing register
and bit definitions - **Minimal regression risk**: ✅ Only enables
documented hardware feature - **Critical subsystem**: ✅ GPU scheduling
affects system performance ### 6. **Clear User Impact** The commit
message explicitly states the problem: "massive command buffer of long-
running SDMA commands" can cause starvation. This directly impacts: -
Multi-process compute workloads - Real-time or interactive applications
that need timely SDMA access - System responsiveness during heavy GPU
compute usage ### 7. **Evidence from Codebase Analysis** My kernel tree
analysis revealed: - The `SWITCH_INSIDE_IB` functionality exists across
all SDMA generations - Other SDMA drivers (amdgpu) properly configure
this register - KFD was missing this configuration, creating an
inconsistency - No security implications found - this is purely a
scheduling correctness issue This commit represents exactly the type of
targeted bugfix that stable trees are designed to include: it's small,
safe, fixes a real user-impacting issue, and has minimal risk of
regression.

 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index eaf084acb706f..1a5d872e88ecc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -409,6 +409,10 @@ static void update_mqd_sdma(struct mqd_manager *mm, void *mqd,
 	m->sdma_engine_id = q->sdma_engine_id;
 	m->sdma_queue_id = q->sdma_queue_id;
 	m->sdmax_rlcx_dummy_reg = SDMA_RLC_DUMMY_DEFAULT;
+	/* Allow context switch so we don't cross-process starve with a massive
+	 * command buffer of long-running SDMA commands
+	 */
+	m->sdmax_rlcx_ib_cntl |= SDMA0_GFX_IB_CNTL__SWITCH_INSIDE_IB_MASK;
 
 	q->is_active = QUEUE_IS_ACTIVE(*q);
 }
-- 
2.39.5


  parent reply	other threads:[~2025-06-01 23:41 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-01 23:39 [PATCH AUTOSEL 6.1 01/58] drm/amdgpu/gfx6: fix CSIB handling Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 02/58] drm/dp: add option to disable zero sized address only transactions Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 03/58] sunrpc: update nextcheck time when adding new cache entries Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 04/58] drm/amd/display: DCN32 null data check Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 05/58] drm/bridge: analogix_dp: Add irq flag IRQF_NO_AUTOEN instead of calling disable_irq() Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 06/58] exfat: fix double free in delayed_free Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 07/58] arm64/cpuinfo: only show one cpu's info in c_show() Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 08/58] drm/bridge: anx7625: change the gpiod_set_value API Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 09/58] drm/amdgpu/gfx11: fix CSIB handling Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 10/58] media: i2c: imx334: Enable runtime PM before sub-device registration Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 11/58] drm/msm/hdmi: add runtime PM calls to DDC transfer function Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 12/58] media: uapi: v4l: Fix V4L2_TYPE_IS_OUTPUT condition Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 13/58] drm/amd/display: Add NULL pointer checks in dm_force_atomic_commit() Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 14/58] drm/amd/display: Skip to enable dsc if it has been off Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 15/58] drm/msm/a6xx: Increase HFI response timeout Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 16/58] media: i2c: imx334: Fix runtime PM handling in remove function Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 17/58] drm/amdgpu/gfx10: fix CSIB handling Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 18/58] drm: panel-orientation-quirks: Add ZOTAC Gaming Zone Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 19/58] media: ccs-pll: Better validate VT PLL branch Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 20/58] media: uapi: v4l: Change V4L2_TYPE_IS_CAPTURE condition Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 21/58] drm/amdgpu/gfx7: fix CSIB handling Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 22/58] ext4: ext4: unify EXT4_EX_NOCACHE|NOFAIL flags in ext4_ext_remove_space() Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 23/58] jfs: fix array-index-out-of-bounds read in add_missing_indices Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 24/58] media: ti: cal: Fix wrong goto on error path Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 25/58] media: rkvdec: h264: Use bytesperline and buffer height as virstride Sasha Levin
2025-06-01 23:39   ` Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 26/58] media: rkvdec: Initialize the m2m context before the controls Sasha Levin
2025-06-01 23:39   ` Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 27/58] sunrpc: fix race in cache cleanup causing stale nextcheck time Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 28/58] ext4: prevent stale extent cache entries caused by concurrent get es_cache Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 29/58] drm/amdgpu/gfx8: fix CSIB handling Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 30/58] drm/amdgpu/gfx9: " Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 31/58] jfs: Fix null-ptr-deref in jfs_ioc_trim Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 32/58] drm/amd/display: Correct prefetch calculation Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 33/58] drm/msm/dpu: don't select single flush for active CTL blocks Sasha Levin
2025-06-01 23:39 ` Sasha Levin [this message]
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 35/58] media: tc358743: ignore video while HPD is low Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 36/58] media: platform: exynos4-is: Add hardware sync wait to fimc_is_hw_change_mode() Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 37/58] media: i2c: imx334: update mode_3840x2160_regs array Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 38/58] nios2: force update_mmu_cache on spurious tlb-permission--related pagefaults Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 39/58] media: rcar-vin: Fix stride setting for RAW8 formats Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 40/58] media: qcom: venus: Fix uninitialized variable warning Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 41/58] ACPI: bus: Bail out if acpi_kobj registration fails Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 42/58] pmdomain: ti: Fix STANDBY handling of PER power domain Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 43/58] PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn() Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 44/58] ASoC: amd: yc: Add quirk for Lenovo Yoga Pro 7 14ASP9 Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 45/58] thermal/drivers/qcom/tsens: Update conditions to strictly evaluate for IP v2+ Sasha Levin
2025-06-01 23:39 ` [PATCH AUTOSEL 6.1 46/58] clocksource/drivers/timer-tegra186: Fix watchdog self-pinging Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 47/58] gpio: pxa: Make irq_chip immutable Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 48/58] gpio: grgpio: " Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 49/58] gpio: xgene-sb: " Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 50/58] mmc: Add quirk to disable DDR50 tuning Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 51/58] clocksource: Fix the CPUs' choice in the watchdog per CPU verification Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 52/58] ACPICA: Avoid sequence overread in call to strncmp() Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 53/58] ACPICA: utilities: Fix overflow check in vsnprintf() Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 54/58] ACPI: EC: Add device to acpi_ec_no_wakeup[] qurik list Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 55/58] ALSA: seq: Remove unused snd_seq_queue_client_leave_cells Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 56/58] cpufreq: Force sync policy boost with global boost on sysfs update Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 57/58] power: supply: bq27xxx: Retrieve again when busy Sasha Levin
2025-06-01 23:40 ` [PATCH AUTOSEL 6.1 58/58] ASoC: tas2770: Power cycle amp on ISENSE/VSENSE change Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250601234012.3516352-34-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=Amber.Lin@amd.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=airlied@gmail.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=simona@ffwll.ch \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.