Date: Wed, 20 Nov 2024 11:04:28 -0800
Message-ID: <851pz5loar.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Harish Chegondi
Cc: Umesh Nerlige Ramappa, intel-xe@lists.freedesktop.org,
	james.ausmus@intel.com, felix.j.degrood@intel.com,
	jose.souza@intel.com, matias.a.cabral@intel.com,
	joshua.santosh.ranjan@intel.com, shubham.kumar@intel.com
Subject: Re: [PATCH v4 2/5] drm/xe/eustall: Introduce API for EU stall sampling

On Tue, 19 Nov 2024 15:59:12 -0800, Harish Chegondi wrote:
>
> > > +/**
> > > + * enum drm_xe_eu_stall_property_id - EU stall sampling input property ids.
> > > + *
> > > + * These properties are passed to the driver as a chain of
> > > + * @drm_xe_ext_set_property structures with @property set to these
> > > + * properties' enums and @value set to the corresponding values of these
> > > + * properties. @drm_xe_user_extension base.name should be set to
> > > + * @DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY.
> > > + */
> > > +enum drm_xe_eu_stall_property_id {
> > > +#define DRM_XE_EU_STALL_EXTENSION_SET_PROPERTY 0
> > > +	/**
> > > +	 * @DRM_XE_EU_STALL_PROP_SAMPLE_RATE: Sampling rate
> > > +	 * in multiples of 251 cycles. Valid values are 1 to 7.
> > > +	 * If the value is 1, sampling interval is 251 cycles.
> > > +	 * If the value is 7, sampling interval is 7 x 251 cycles.
> > > +	 */
> > > +	DRM_XE_EU_STALL_PROP_SAMPLE_RATE = 1,
> >
> > What is the rate of 251 cycles? If that can be clearly defined, then at
> > first glance, I would think it's better to define this in terms of
> > frequency. The implementation can decide how to translate that to HW
> > configuration.
> >
> Since the duration of a cycle depends on the GPU clock, it can vary from
> GPU to GPU. So, if there is any translation in the driver, it will have
> to be different for each GPU. I think keeping this input as a multiplier
> of cycles may be more future proof for the uAPI. I am trying to get more
> information and feedback from the user space regarding your suggestion.
> If it is feasible, I will implement in v6.

Umesh has a point, but I sort of agree with Harish because this value is
fed directly into a register. But we do need some changes:

1. This 251 value showing up in the uapi doesn't make any sense and needs
   to go.

2. According to Bspec 64036, HW also supports "127 * N" sampling rates (in
   terms of cycles), so we should support those too.

3. Even higher sampling rates (say 10x) are being proposed for the future,
   so these should also be supported.

So my proposal is simple, but let's see if it can be made to work. The uapi
will directly take the sampling rate as a number of cycles (so the value
coming in is what the GPU freq is divided by). E.g. if "3 * 251" is
required, "3 * 251" will come in through the uapi. If the UMD wants
"7 * 127", it will send in "7 * 127". The driver will internally map this
value to the "closest" sampling rate supported by HW.
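
To make the "closest" mapping concrete, here is a rough sketch of what I
have in mind, not actual driver code. The function and struct names are
made up for illustration, and the 1 to 7 multiplier range for the 127 base
is an assumption (the patch only states that range for 251); the real
tables would have to come from Bspec for each platform.

```c
#include <stdint.h>

/*
 * Hypothetical base intervals in GPU cycles. 251 is the value from the
 * patch under review and 127 is the one mentioned for Bspec 64036.
 * Applying the 1..7 multiplier range to both bases is an assumption.
 */
static const uint32_t eu_stall_base_cycles[] = { 127, 251 };
#define EU_STALL_NUM_BASES \
	(sizeof(eu_stall_base_cycles) / sizeof(eu_stall_base_cycles[0]))
#define EU_STALL_MAX_MULT 7

/* Illustrative result type: the HW rate is base * mult cycles. */
struct eu_stall_rate {
	uint32_t base;
	uint32_t mult;
};

/*
 * Map the cycle count requested through the uapi to the closest rate in
 * the assumed tables. On a tie the candidate found first wins, which
 * with this iteration order is the lower base.
 */
static struct eu_stall_rate eu_stall_closest_rate(uint32_t requested)
{
	struct eu_stall_rate best = { eu_stall_base_cycles[0], 1 };
	uint64_t best_diff = UINT64_MAX;
	uint32_t i, m;

	for (i = 0; i < EU_STALL_NUM_BASES; i++) {
		for (m = 1; m <= EU_STALL_MAX_MULT; m++) {
			uint64_t cycles = (uint64_t)eu_stall_base_cycles[i] * m;
			uint64_t diff = cycles > requested ?
					cycles - requested : requested - cycles;

			if (diff < best_diff) {
				best_diff = diff;
				best.base = eu_stall_base_cycles[i];
				best.mult = m;
			}
		}
	}

	return best;
}
```

With these assumed tables, a request of 3 * 251 = 753 comes back as exactly
3 * 251, a request of 7 * 127 = 889 as exactly 7 * 127, and anything above
the largest entry is clamped to 7 * 251.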

I am assuming that UMDs already know what sampling rates are supported by
a particular HW platform, so they can send in the exact value they need.
Otherwise the driver can always map the value sent by userspace. Say the
UMD sends a value of 10: this will be mapped to "1 * 127", which is the
supported sampling rate closest to 10.

This way all sampling rates can be supported. The UMD just says "I want a
sampling rate of GPU_freq divided by 10" and automatically gets whatever
is the closest available. UMDs probably do need to have an idea of what
rates are supported on a particular HW platform. I am assuming they have
this information from Bspec, so if they know the exact value they can send
it in and the driver will be able to set exactly what was specified.

Ashutosh