From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from h7.fbrelay.privateemail.com (h7.fbrelay.privateemail.com [162.0.218.230])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6D6FC199FD3
	for <linux-kernel@vger.kernel.org>; Fri, 13 Feb 2026 19:36:41 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=162.0.218.230
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1771011402; cv=none; b=TxVx/MIZFyxJeWWU8uCnDeVnWHHutvDEGO7l9RcmPJSyrThMZzuwzSfXrEvmXMSFzFr2hHzz0X7LtZPcIQCya+/8DCCH2lavgOC5LxN6olusDA3H5xXTTgh5wCa2dxgru6m7cmu04iBhPgdkTx5CMD+Col0087AL8b5q4u7UYto=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1771011402; c=relaxed/simple;
	bh=tS0POA6TsCtaFywPO1jM+s+niSWKGfESgmq4VD7c4lw=;
	h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:
	 Content-Type:Content-Disposition:In-Reply-To; b=PaDcuZzwp4whkxunLt2oGC1tTq+ovAYb1LW6UDpU3f/teLurSwLWOrjSRJaC+ckHqcKstDY/KZdkPisVrVRx7fVViX4jKkTRTZQepa+HIrINI35zbpxDhY2iXUT1rmxddTkEJrkeM3BBGNiU+MydPACF/K43IsFZu3g0V2Ffj2g=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=effective-light.com; spf=pass smtp.mailfrom=effective-light.com; arc=none smtp.client-ip=162.0.218.230
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=effective-light.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=effective-light.com
Received: from MTA-13-4.privateemail.com (mta-13-1.privateemail.com [198.54.122.107])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by h7.fbrelay.privateemail.com (Postfix) with ESMTPSA id 4fCMqV6LFcz2xBb
	for <linux-kernel@vger.kernel.org>; Fri, 13 Feb 2026 14:36:38 -0500 (EST)
Received: from mta-13.privateemail.com (localhost [127.0.0.1])
	by mta-13.privateemail.com (Postfix) with ESMTP id 4fCMqM1DX3z3hhX1;
	Fri, 13 Feb 2026 14:36:31 -0500 (EST)
Received: from hal-station (unknown [23.129.64.148])
	by mta-13.privateemail.com (Postfix) with ESMTPA;
	Fri, 13 Feb 2026 14:35:59 -0500 (EST)
Date: Fri, 13 Feb 2026 14:35:43 -0500
From: Hamza Mahfooz <someguy@effective-light.com>
To: Mario Limonciello <mario.limonciello@amd.com>
Cc: dri-devel@lists.freedesktop.org,
	Michel =?iso-8859-1?Q?D=E4nzer?= <michel.daenzer@mailbox.org>,
	Harry Wentland <harry.wentland@amd.com>,
	Leo Li <sunpeng.li@amd.com>, Rodrigo Siqueira <siqueira@igalia.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Christian =?iso-8859-1?Q?K=F6nig?= <christian.koenig@amd.com>,
	David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
	Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
	Maxime Ripard <mripard@kernel.org>,
	Thomas Zimmermann <tzimmermann@suse.de>,
	Alex Hung <alex.hung@amd.com>, Wayne Lin <Wayne.Lin@amd.com>,
	Aurabindo Pillai <aurabindo.pillai@amd.com>,
	Ivan Lipski <ivan.lipski@amd.com>,
	Timur =?iso-8859-1?Q?Krist=F3f?= <timur.kristof@gmail.com>,
	Dominik Kaszewski <dominik.kaszewski@amd.com>,
	amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] drm: introduce KMS recovery mechanism
Message-ID: <aY99D-yXVydpMdwy@hal-station>
References: <20260212230905.688006-1-someguy@effective-light.com>
 <2e359cd9-0192-44d0-886f-7f93a8b0a4fa@amd.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <2e359cd9-0192-44d0-886f-7f93a8b0a4fa@amd.com>
X-Virus-Scanned: ClamAV using ClamSMTP

On Thu, Feb 12, 2026 at 06:18:17PM -0600, Mario Limonciello wrote:
> Since you were able to (relatively) reliably reproduce a problem in amdgpu,
> how far in your iterative flow did you get?  Did you manage to need the
> vendor specific handling?  And presumably that helped?
> 

Every time I've tested it (with my repro) the full modeset has failed
and it was able to recover with the vendor specific handling. Though
it's worth noting that I strongly suspect a firmware hang in my case[1].

[1] https://lore.kernel.org/r/aYplYyf6Pp20lOAD@hal-station/

> > @@ -1881,13 +1886,43 @@ void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> >   			continue;
> >   		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
> > -		if (ret == 0)
> > -			drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
> > -				crtc->base.id, crtc->name);
> > +		if (!ret) {
> > +			switch (dev->reset_phase) {
> > +			case DRM_KMS_RESET_NONE:
> > +				drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
> > +					crtc->base.id, crtc->name);
> > +				dev->reset_phase = DRM_KMS_RESET_FORCE_MODESET;
> > +				drm_kms_helper_hotplug_event(dev);
> > +				break;
> 
> Since you're iterating multiple CRTCs if you manage to recover from one
> with this call shouldn't you keep iterating the rest?
> 

Most measures that can be implemented at the kernel level (including
forcing a full modeset) can't save the current commit. So, if we keep
iterating, in all likelihood we will just end up waiting an extra 10
seconds per CRTC (assuming they haven't already completed for reasons
unrelated to the forced modeset).
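
To put a rough number on that, here is a small userspace sketch (not
part of the patch) of the worst-case wait. It assumes every stuck CRTC
runs into the full 10 second timeout from the hunk above; the helper
name worst_case_wait_secs() and the keep_iterating flag are made up for
illustration only.

```c
/* Mirrors the 10 * HZ timeout in the quoted hunk, expressed in seconds. */
#define FLIP_DONE_TIMEOUT_SECS 10

/*
 * Hypothetical helper (illustration only, not a DRM API): worst-case
 * seconds spent waiting when `stuck` CRTCs in the commit never signal
 * flip_done.
 *
 * keep_iterating != 0: wait on every stuck CRTC after the first timeout.
 * keep_iterating == 0: stop waiting after the first timeout.
 */
static unsigned int worst_case_wait_secs(unsigned int stuck, int keep_iterating)
{
	unsigned int waited = 0;
	unsigned int i;

	for (i = 0; i < stuck; i++) {
		/* Assumption: each stuck CRTC costs one full timeout. */
		waited += FLIP_DONE_TIMEOUT_SECS;
		if (!keep_iterating)
			break;
	}

	return waited;
}
```

With three stuck CRTCs, worst_case_wait_secs(3, 1) gives 30 while
worst_case_wait_secs(3, 0) gives 10, which is the extra delay I'd like
to avoid given that the current commit can't be saved anyway.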