From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46D19426698; Tue, 31 Mar 2026 16:42:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774975360; cv=none; b=J6le8ZNHZTchcCgd7nrlaZGK1xbgSl2jq63XXVKCaK7xBkQSAMjcw9DuG9sDmRCOZ1R9GYGr7MYIPcZCixb5fS0KcnIznFLyLWAWS3D3CBPnJp2cqBDulmIHKwrt0OO8XsTLp3lZWfD9tuQSJLf/ZCdkd7/+5ynlwVGNWGca6q4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774975360; c=relaxed/simple; bh=eLKgfffF/jc1ZVzkl2a4FvvFfRXYJ8/3Y11yCvr8pWQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=UBCiQtYuYYenBEOl5mPO6CmeOMv7JRuoDMLH5Jc+4WFAbA+kPHeHz47QWTgGvIMXBRr30VMDfN+8uUEl9yEkAKT3P1wiSKnDQ5zZCh6/Tj/JPPXKNLZtFniF1BqtPEJZq6pfJIHy9ap3iVlO50EtaMWgiZFt1uFlLiDOD7fu88w= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=cEAvIOtZ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="cEAvIOtZ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CE0ADC19423; Tue, 31 Mar 2026 16:42:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1774975360; bh=eLKgfffF/jc1ZVzkl2a4FvvFfRXYJ8/3Y11yCvr8pWQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cEAvIOtZKnwlGodoOTTnBjpnhU2RKo9MENvktHT9FhDCwiVkYH9uWEeEqPQ7wH/5M CKGqLk5qH0EHC55i4jLAHkvclWw4o4ABHzwE91VRyWC4lbER20+lBiS4XRNpAikr2+ i0iRjLztkzb6oUuPg69MNoO4Bmb3PHHcZUsjh688= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, =?UTF-8?q?Christian=20K=C3=B6nig?= , Alex Deucher , Ruijing Dong Subject: [PATCH 6.19 263/342] drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3) Date: Tue, 31 Mar 2026 18:21:36 +0200 Message-ID: <20260331161808.629336391@linuxfoundation.org> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260331161758.909578033@linuxfoundation.org> References: <20260331161758.909578033@linuxfoundation.org> User-Agent: quilt/0.69 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 6.19-stable review patch. If anyone has any objections, please let me know. ------------------ From: Ruijing Dong commit 2d300ebfc411205fa31ba7741c5821d381912381 upstream. amdgpu_device_get_job_timeout_settings() passes a pointer directly to the global amdgpu_lockup_timeout[] buffer into strsep(). strsep() destructively replaces delimiter characters with '\0' in-place. On multi-GPU systems, this function is called once per device. When a multi-value setting like "0,0,0,-1" is used, the first GPU's call transforms the global buffer into "0\00\00\0-1". The second GPU then sees only "0" (terminated at the first '\0'), parses a single value, hits the single-value fallthrough (index == 1), and applies timeout=0 to all rings — causing immediate false job timeouts. Fix this by copying into a stack-local array before calling strsep(), so the global module parameter buffer remains intact across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH (256) bytes, which is safe for the stack. v2: wrap commit message to 72 columns, add Assisted-by tag. v3: use stack array with strscpy() instead of kstrdup()/kfree() to avoid unnecessary heap allocation (Christian). This patch was developed with assistance from Claude (claude-opus-4-6). Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Ruijing Dong Signed-off-by: Alex Deucher (cherry picked from commit 94d79f51efecb74be1d88dde66bdc8bfcca17935) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4361,7 +4361,8 @@ fail: static int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev) { - char *input = amdgpu_lockup_timeout; + char buf[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH]; + char *input = buf; char *timeout_setting = NULL; int index = 0; long timeout; @@ -4371,9 +4372,17 @@ static int amdgpu_device_get_job_timeout adev->gfx_timeout = adev->compute_timeout = adev->sdma_timeout = adev->video_timeout = msecs_to_jiffies(2000); - if (!strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) + if (!strnlen(amdgpu_lockup_timeout, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) return 0; + /* + * strsep() destructively modifies its input by replacing delimiters + * with '\0'. Use a stack copy so the global module parameter buffer + * remains intact for multi-GPU systems where this function is called + * once per device. + */ + strscpy(buf, amdgpu_lockup_timeout, sizeof(buf)); + while ((timeout_setting = strsep(&input, ",")) && strnlen(timeout_setting, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) { ret = kstrtol(timeout_setting, 0, &timeout);