From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Erico Nunes <nunes.erico@gmail.com>, Qiang Yu <yuq825@gmail.com>,
	Sasha Levin <sashal@kernel.org>,
	maarten.lankhorst@linux.intel.com, mripard@kernel.org,
	tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch,
	dri-devel@lists.freedesktop.org, lima@lists.freedesktop.org
Subject: [PATCH AUTOSEL 6.9 12/23] drm/lima: mask irqs in timeout path before hard reset
Date: Mon, 27 May 2024 11:50:13 -0400
Message-ID: <20240527155123.3863983-12-sashal@kernel.org>
In-Reply-To: <20240527155123.3863983-1-sashal@kernel.org>

From: Erico Nunes <nunes.erico@gmail.com>

[ Upstream commit a421cc7a6a001b70415aa4f66024fa6178885a14 ]

A rendering job might take just long enough to trigger the drm sched
job timeout handler but still complete before the timeout handler
performs the hard reset.
The timeout handler does not expect the job to complete underneath it.
In some very specific cases this may currently result in a refcount
imbalance on lima_pm_idle, with a stack dump such as:

[10136.669170] WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/lima/lima_devfreq.c:205 lima_devfreq_record_idle+0xa0/0xb0
...
[10136.669459] pc : lima_devfreq_record_idle+0xa0/0xb0
...
[10136.669628] Call trace:
[10136.669634]  lima_devfreq_record_idle+0xa0/0xb0
[10136.669646]  lima_sched_pipe_task_done+0x5c/0xb0
[10136.669656]  lima_gp_irq_handler+0xa8/0x120
[10136.669666]  __handle_irq_event_percpu+0x48/0x160
[10136.669679]  handle_irq_event+0x4c/0xc0

We can prevent that race condition entirely by masking the irqs at the
beginning of the timeout handler, at which point we give up on waiting
for that job.
The irqs will be enabled again at the next hard reset, which is already
done as a recovery by the timeout handler.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240405152951.1531555-4-nunes.erico@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/lima/lima_sched.c | 7 +++++++
 1 file changed, 7 insertions(+)
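
Note for readers (not part of the commit): the refcount imbalance
described above can be illustrated with a small userspace model. This
is only a sketch, not driver code. All model_* names are hypothetical.
It assumes that both the irq completion path and the timeout recovery
path record the device as idle once, which is what the call trace and
the commit message suggest.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lima pipe; not the driver's real struct. */
struct model_pipe {
	int busy_count;   /* models the busy/idle accounting */
	bool irq_masked;  /* true once the timeout handler masked irqs */
};

static void model_record_busy(struct model_pipe *p)
{
	assert(p->busy_count >= 0);
	p->busy_count++;
}

static void model_record_idle(struct model_pipe *p)
{
	p->busy_count--;
}

/* Late completion irq: does nothing while irqs are masked. */
static bool model_irq_task_done(struct model_pipe *p)
{
	if (p->irq_masked)
		return false;
	model_record_idle(p);
	return true;
}

/*
 * Models one timed-out job and returns the final busy count.
 * mask_first: mask irqs at the start of the handler (this patch).
 * late_irq:   the job completes while the handler is running.
 */
int model_timedout_job(bool mask_first, bool late_irq)
{
	struct model_pipe p = { .busy_count = 0, .irq_masked = false };

	model_record_busy(&p);          /* job was submitted */

	if (mask_first)
		p.irq_masked = true;    /* give up on waiting for the job */

	if (late_irq)
		model_irq_task_done(&p); /* job finishes mid-handler */

	/* Recovery: the hard reset re-enables irqs, then idle is recorded. */
	p.irq_masked = false;
	model_record_idle(&p);

	return p.busy_count;
}
```

In this model, model_timedout_job(false, true) ends with a negative
busy count because idle is recorded twice for one job, while
model_timedout_job(true, true) ends balanced at zero.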

diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index 66841503a6183..bbf3f8feab944 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -430,6 +430,13 @@ static enum drm_gpu_sched_stat lima_sched_timedout_job(struct drm_sched_job *job
 		return DRM_GPU_SCHED_STAT_NOMINAL;
 	}
 
+	/*
+	 * The task might still finish while this timeout handler runs.
+	 * To prevent a race condition on its completion, mask all irqs
+	 * on the running core until the next hard reset completes.
+	 */
+	pipe->task_mask_irq(pipe);
+
 	if (!pipe->error)
 		DRM_ERROR("%s job timeout\n", lima_ip_name(ip));
 
-- 
2.43.0


