Date: Fri, 13 Feb 2026 14:35:43 -0500
From: Hamza Mahfooz
To: Mario Limonciello
Cc: dri-devel@lists.freedesktop.org, Michel Dänzer, Harry Wentland,
 Leo Li, Rodrigo Siqueira, Alex Deucher, Christian König, David Airlie,
 Simona Vetter, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 Alex Hung, Wayne Lin, Aurabindo Pillai, Ivan Lipski, Timur Kristóf,
 Dominik Kaszewski, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] drm: introduce KMS recovery mechanism
References: <20260212230905.688006-1-someguy@effective-light.com>
 <2e359cd9-0192-44d0-886f-7f93a8b0a4fa@amd.com>
In-Reply-To: <2e359cd9-0192-44d0-886f-7f93a8b0a4fa@amd.com>

On Thu, Feb 12, 2026 at 06:18:17PM -0600, Mario Limonciello wrote:
> Since you were able to (relatively) reliably reproduce a problem in amdgpu,
> how far in your iterative flow did you get? Did you manage to need the
> vendor specific handling? And presumably that helped?
>

Every time I've tested it (with my repro) the full modeset has failed and
it was able to recover with the vendor specific handling. Though it's
worth noting that I strongly suspect a firmware hang in my case[1].

[1] https://lore.kernel.org/r/aYplYyf6Pp20lOAD@hal-station/

> > @@ -1881,13 +1886,43 @@ void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
> >  			continue;
> >  		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
> > -		if (ret == 0)
> > -			drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
> > -				crtc->base.id, crtc->name);
> > +		if (!ret) {
> > +			switch (dev->reset_phase) {
> > +			case DRM_KMS_RESET_NONE:
> > +				drm_err(dev, "[CRTC:%d:%s] flip_done timed out\n",
> > +					crtc->base.id, crtc->name);
> > +				dev->reset_phase = DRM_KMS_RESET_FORCE_MODESET;
> > +				drm_kms_helper_hotplug_event(dev);
> > +				break;
>
> Since you're iterating multiple CRTCs if you manage to recover from one
> with this call shouldn't you keep iterating the rest?
>

Most measures that can be implemented at the kernel level (including
forcing a full modeset) can't save the current commit. So, in all
likelihood we will just end up waiting an extra 10 seconds per CRTC
(assuming they haven't completed already, unrelated to the forced
modeset).
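
To put rough numbers on the waiting cost, here is a toy userspace model of
the loop. It is not code from the patch: worst_case_wait_secs(), stuck[]
and stop_after_first_timeout are hypothetical names made up for this
sketch, and the only value taken from the hunk is the 10 second timeout
(10 * HZ). It assumes every CRTC that has not completed its flip burns the
full timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Matches the 10 * HZ passed to wait_for_completion_timeout() in the hunk. */
#define FLIP_DONE_TIMEOUT_SECS 10

/*
 * Hypothetical model, not kernel code. stuck[i] is true when CRTC i never
 * signals flip_done. Returns the total number of seconds spent waiting.
 * If stop_after_first_timeout is set, iteration ends at the first timeout;
 * otherwise every remaining stuck CRTC costs another full timeout.
 */
unsigned int worst_case_wait_secs(const bool *stuck, int num_crtcs,
				  bool stop_after_first_timeout)
{
	unsigned int waited = 0;

	assert(num_crtcs >= 0);

	for (int i = 0; i < num_crtcs; i++) {
		if (!stuck[i])
			continue;	/* flip already completed, nothing to wait for */

		waited += FLIP_DONE_TIMEOUT_SECS;
		if (stop_after_first_timeout)
			break;
	}

	return waited;
}
```

With four CRTCs of which three are stuck, continuing to iterate costs 30
seconds in this model, while stopping after the first timeout costs 10.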