Date: Tue, 12 May 2026 16:44:26 +0200
From: Raag Jadav
To: "Tauro, Riana"
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	netdev@vger.kernel.org, simona.vetter@ffwll.ch, airlied@gmail.com,
	kuba@kernel.org, lijo.lazar@amd.com, Hawking.Zhang@amd.com,
	davem@davemloft.net, pabeni@redhat.com, edumazet@google.com,
	maarten@lankhorst.se, zachary.mckevitt@oss.qualcomm.com,
	rodrigo.vivi@intel.com, michal.wajdeczko@intel.com,
	matthew.d.roper@intel.com, umesh.nerlige.ramappa@intel.com,
	mallesh.koujalagi@intel.com, soham.purkait@intel.com,
	anoop.c.vijay@intel.com, aravind.iddamsetty@linux.intel.com
Subject: Re: [PATCH v1 09/11] drm/xe/ras: Set error threshold support
References: <20260417211730.837345-1-raag.jadav@intel.com>
	<20260417211730.837345-10-raag.jadav@intel.com>
	<5e90b9aa-9432-43b5-ae40-1fce383bb043@intel.com>
X-Mailing-List: netdev@vger.kernel.org
In-Reply-To: <5e90b9aa-9432-43b5-ae40-1fce383bb043@intel.com>

On Mon, May 11, 2026 at 10:51:38PM +0530, Tauro, Riana wrote:
> On 4/18/2026 2:46 AM, Raag Jadav wrote:
> > System controller allows programming per error threshold value, which
> > it uses to raise error events to the driver. Set it using mailbox
> > command so that it can be programmed by the user.
> >
> > Signed-off-by: Raag Jadav
> > ---
> >  drivers/gpu/drm/xe/xe_ras.c                   | 42 +++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_ras.h                   |  1 +
> >  drivers/gpu/drm/xe/xe_ras_types.h             | 28 +++++++++++++
> >  drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h |  2 +
> >  4 files changed, 73 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_ras.c b/drivers/gpu/drm/xe/xe_ras.c
> > index 3e93f838aa4a..26e063166c5f 100644
> > --- a/drivers/gpu/drm/xe/xe_ras.c
> > +++ b/drivers/gpu/drm/xe/xe_ras.c
> > @@ -163,3 +163,45 @@ int xe_ras_get_threshold(struct xe_device *xe, u32 severity, u32 component, u32
> >  	       comp_to_str(counter.common.component), sev_to_str(counter.common.severity));
> >  	return 0;
> >  }
> > +
> > +int xe_ras_set_threshold(struct xe_device *xe, u32 severity, u32 component, u32 threshold)
> > +{
> > +	struct xe_ras_set_threshold_response response = {};
> > +	struct xe_ras_set_threshold_request request = {};
> > +	struct xe_sysctrl_mailbox_command command = {};
> > +	struct xe_ras_error_class counter = {};
> > +	size_t len;
> > +	int ret;
> > +
> > +	counter.common.severity = drm_to_xe_ras_severities[severity];
> > +	counter.common.component = drm_to_xe_ras_components[component];
> > +	request.counter = counter;
> > +	request.threshold = threshold;
>
> We might need a max check here to avoid unnecessary values from user.

We may want to avoid hardcoding it in driver since it can potentially be
different per product.

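For illustration only, a minimal userspace sketch of what such a check
could look like if the limit came from a per-product descriptor instead
of a hardcoded constant. `struct ras_product_limits`, `max_threshold`
and `ras_validate_threshold()` are hypothetical names that do not exist
in this series, and where the limit would really come from (firmware
query, platform descriptor) is left open here:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-product limits; not part of the xe driver. */
struct ras_product_limits {
	uint32_t max_threshold;	/* 0 means no limit is known for this product */
};

/* Return 0 if the requested threshold is acceptable, -EINVAL otherwise. */
static int ras_validate_threshold(const struct ras_product_limits *limits,
				  uint32_t threshold)
{
	if (limits->max_threshold && threshold > limits->max_threshold)
		return -EINVAL;

	return 0;
}
```

Under these assumptions the check would run before the request is
filled in, so an out-of-range value never reaches the mailbox.
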
> > +	ras_command_prepare(&command, &request, sizeof(request), &response,
> > +			    sizeof(response), XE_SYSCTRL_CMD_SET_THRESHOLD);
>
> Nit: command, request, response seems to be a better format

Sure, I'll likely create a separate sysctrl helper.

> > +	guard(xe_pm_runtime)(xe);
> > +	ret = xe_sysctrl_send_command(&xe->sc, &command, &len);
> > +	if (ret) {
> > +		xe_err(xe, "sysctrl: failed to set threshold %d\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	if (len != sizeof(response)) {
> > +		xe_err(xe, "sysctrl: unexpected set threshold response length %zu (expected %zu)\n",
> > +		       len, sizeof(response));
> > +		return -EIO;
> > +	}
> > +
> > +	if (response.status) {
> > +		xe_err(xe, "sysctrl: set threshold operation failed %#x\n", response.status);
>
> Status should be converted to visible error codes. check [PATCH v5 3/6]
> drm/xe/xe_ras: Add helper to clear error counter - Riana Tauro
>

Coming right up.

Raag

> > +		return -EIO;
> > +	}
> > +
> > +	counter = response.counter;
> > +
> > +	xe_dbg(xe, "[RAS]: Set threshold %u for %s %s\n", response.threshold,
> > +	       comp_to_str(counter.common.component), sev_to_str(counter.common.severity));
> Again not required. Value should be visible to user
> > +	return 0;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_ras.h b/drivers/gpu/drm/xe/xe_ras.h
> > index 982bbe61461e..d1f71b1de723 100644
> > --- a/drivers/gpu/drm/xe/xe_ras.h
> > +++ b/drivers/gpu/drm/xe/xe_ras.h
> > @@ -14,5 +14,6 @@ struct xe_sysctrl_event_response;
> >  void xe_ras_counter_threshold_crossed(struct xe_device *xe,
> >  				      struct xe_sysctrl_event_response *response);
> >  int xe_ras_get_threshold(struct xe_device *xe, u32 severity, u32 component, u32 *threshold);
> > +int xe_ras_set_threshold(struct xe_device *xe, u32 severity, u32 component, u32 threshold);
> >
> >  #endif
> > diff --git a/drivers/gpu/drm/xe/xe_ras_types.h b/drivers/gpu/drm/xe/xe_ras_types.h
> > index d5da93d65cf5..d7e4a02a661d 100644
> > --- a/drivers/gpu/drm/xe/xe_ras_types.h
> > +++ b/drivers/gpu/drm/xe/xe_ras_types.h
> > @@ -92,4 +92,32 @@ struct xe_ras_get_threshold_response {
> >  	u32 reserved[4];
> >  } __packed;
> >
> > +/**
> > + * struct xe_ras_set_threshold_request - Request structure for set threshold
> > + */
> > +struct xe_ras_set_threshold_request {
> > +	/** @counter: Counter to set threshold for */
> > +	struct xe_ras_error_class counter;
> > +	/** @threshold: Threshold value to set */
> > +	u32 threshold;
> > +	/** @reserved: Reserved for future use */
> > +	u32 reserved;
> > +} __packed;
> > +
> > +/**
> > + * struct xe_ras_set_threshold_response - Response structure for set threshold
> > + */
> > +struct xe_ras_set_threshold_response {
> > +	/** @counter: Counter id */
>
> Nit: ID
>
> > +	struct xe_ras_error_class counter;
> > +	/** @threshold_old: Old threshold value */
>
> Nit: prev
>
> Thanks
> Riana
>
> > +	u32 threshold_old;
> > +	/** @threshold: New threshold value */
> > +	u32 threshold;
> > +	/** @status: Set threshold operation status */
> > +	u32 status;
> > +	/** @reserved: Reserved for future use */
> > +	u32 reserved[2];
> > +} __packed;
> > +
> >  #endif
> > diff --git a/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h b/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > index a1b71218deca..b865768e903b 100644
> > --- a/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > +++ b/drivers/gpu/drm/xe/xe_sysctrl_mailbox_types.h
> > @@ -23,10 +23,12 @@ enum xe_sysctrl_group {
> >   * enum xe_sysctrl_gfsp_cmd - Commands supported by GFSP group
> >   *
> >   * @XE_SYSCTRL_CMD_GET_THRESHOLD: Retrieve error threshold
> > + * @XE_SYSCTRL_CMD_SET_THRESHOLD: Set error threshold
> >   * @XE_SYSCTRL_CMD_GET_PENDING_EVENT: Retrieve pending event
> >   */
> >  enum xe_sysctrl_gfsp_cmd {
> >  	XE_SYSCTRL_CMD_GET_THRESHOLD = 0x05,
> > +	XE_SYSCTRL_CMD_SET_THRESHOLD = 0x06,
> >  	XE_SYSCTRL_CMD_GET_PENDING_EVENT = 0x07,
> >  };
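
On the review comment about converting the response status to visible
error codes instead of always returning -EIO, a rough userspace sketch
of the idea follows. The status values and names (`enum ras_status`,
`ras_status_to_errno()`) are made up for illustration; the actual
sysctrl status encoding and the helper from the referenced series are
not shown in this thread:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical status values; the real sysctrl encoding is not shown here. */
enum ras_status {
	RAS_STATUS_OK = 0x0,
	RAS_STATUS_INVALID_PARAM = 0x1,
	RAS_STATUS_UNSUPPORTED = 0x2,
	RAS_STATUS_BUSY = 0x3,
};

/* Map a firmware status word to a negative errno; unknown codes fall back to -EIO. */
static int ras_status_to_errno(uint32_t status)
{
	switch (status) {
	case RAS_STATUS_OK:
		return 0;
	case RAS_STATUS_INVALID_PARAM:
		return -EINVAL;
	case RAS_STATUS_UNSUPPORTED:
		return -EOPNOTSUPP;
	case RAS_STATUS_BUSY:
		return -EBUSY;
	default:
		return -EIO;
	}
}
```

With a mapping like this the caller could return the converted value
directly, keeping -EIO only for statuses it does not recognize.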