From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 14 Jun 2022 09:28:14 -0700
Message-ID: 
<87zgiffbvl.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Zhanjun Dong
In-Reply-To: <20220602172119.96324-1-zhanjun.dong@intel.com>
References: <20220602172119.96324-1-zhanjun.dong@intel.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Subject: Re: [Intel-gfx] [PATCH] drm/i915/guc: Check ctx while waiting for response
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel graphics driver community testing & development
Cc: intel-gfx@lists.freedesktop.org, alan.previn.teres.alexis@intel.com, dri-devel@lists.freedesktop.org
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"

On Thu, 02 Jun 2022 10:21:19 -0700, Zhanjun Dong wrote:
>

Hi Zhanjun,

> We are seeing error message of "No response for request". Some cases happened
> while waiting for response and reset/suspend action was triggered. In this
> case, no response is not an error, active requests will be cancelled.
>
> This patch will handle this condition and change the error message into
> debug message.

IMO the patch title should be changed: which ctx are we checking while waiting
for response? Something like "check for ct enabled while waiting for response"?

> @@ -481,12 +481,14 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
>  #define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS	10
>  #define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS	1000
>  #define done \
> -	(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
> +	(!intel_guc_ct_enabled(ct) || FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
>  	 GUC_HXG_ORIGIN_GUC)
>  	err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS);
>  	if (err)
>  		err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
>  #undef done
> +	if (!intel_guc_ct_enabled(ct))
> +		err = -ECANCELED;

Also, I really don't like intel_guc_ct_enabled() being called in two places.
Is there a possibility that intel_guc_ct_enabled() can return false in the
first place (causing the wait to exit) and then return true in the second
place (so we don't return -ECANCELED)?

Is it possible to change the status of the request to something else from
intel_guc_ct_disable() (or wherever ct->enabled is set to false) rather than
introducing intel_guc_ct_enabled() checks here? Changing the status of the
request when CT goes down would cause the waits to exit here. Then we can
check for that special request status, signifying that CT went down.

Thanks.

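To make the suggestion concrete, here is a rough userspace model of the idea, not i915 code and not tested against the driver. Every name in it (model_ct, model_request, REQ_STATUS_CANCELLED, model_ct_disable(), model_wait_for_response(), the poll hook) is hypothetical and exists only for illustration; a real version would need the proper locking and READ_ONCE()/WRITE_ONCE() pairing. The point it tries to show is that the waiter looks only at the request status, so a disable followed by a quick re-enable cannot hide the cancellation.

```c
/* Userspace model only: none of these names exist in i915. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REQ_STATUS_PENDING   0u          /* no reply yet */
#define REQ_STATUS_REPLIED   1u          /* stands in for "origin == GuC" */
#define REQ_STATUS_CANCELLED UINT32_MAX  /* hypothetical "CT went down" marker */

struct model_request {
	volatile uint32_t status;
};

struct model_ct {
	bool enabled;
	struct model_request *pending;  /* one in-flight request, for brevity */
};

/* Disable path: clear the flag and cancel the in-flight request in one place. */
static void model_ct_disable(struct model_ct *ct)
{
	ct->enabled = false;
	if (ct->pending && ct->pending->status == REQ_STATUS_PENDING)
		ct->pending->status = REQ_STATUS_CANCELLED;
}

/*
 * Wait path: poll only the request status, then classify the final value once.
 * `poll` is a hypothetical hook that lets a caller simulate events that would
 * happen concurrently between two polls.
 */
static int model_wait_for_response(struct model_request *req,
				   unsigned int max_polls,
				   void (*poll)(unsigned int, void *),
				   void *data)
{
	uint32_t status = REQ_STATUS_PENDING;
	unsigned int i;

	assert(req != NULL);

	for (i = 0; i < max_polls; i++) {
		if (poll)
			poll(i, data);
		status = req->status;  /* single snapshot per iteration */
		if (status != REQ_STATUS_PENDING)
			break;
	}

	if (status == REQ_STATUS_CANCELLED)
		return -ECANCELED;
	if (status == REQ_STATUS_PENDING)
		return -ETIMEDOUT;
	return 0;
}

/* Simulated event: CT is disabled on the third poll and re-enabled right away. */
static void disable_then_reenable(unsigned int iteration, void *data)
{
	struct model_ct *ct = data;

	if (iteration == 2) {
		model_ct_disable(ct);
		ct->enabled = true;  /* the request stays marked as cancelled */
	}
}
```

With this shape the wait has a single source of truth (the request status), so there is no window in which a second enabled check can disagree with the first one.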
-- Ashutosh