Date: Tue, 11 Feb 2025 11:23:02 -0800
Message-ID: <85ed042smh.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Harish Chegondi
Subject: Re: [PATCH v9 0/8] Add support for EU stall sampling
List-Id: Intel Xe graphics driver

On Mon, 10 Feb 2025 05:46:41 -0800, Harish Chegondi wrote:
>

Hi Harish,

> The following patch series add support for EU stall sampling,
> a new hardware feature first added in PVC and is being supported
> in XE2 and later architecture GPUs. This feature would enable
> capturing of EU stall data which include the IP address of the
> instruction stalled and various stall reason counts.
>
> Support for this feature is being added into Mesa:
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142
>
> New IGT tests for EU stall sampling are being added:
> https://patchwork.freedesktop.org/series/143030/
>
> This patch series has undergone basic testing with the new IGT tests.

Before posting the next version, could you please update the IGT to sync
with this version.
And then include a "Test-with:" in the next kernel version, so that we
can see some EU stall IGTs running in pre-merge?

No need to fix all IGT issues yet, if you don't have the time now. That can
be done later. Just fix the IGTs enough for the tests to run and hopefully
pass, when we post new versions of the kernel series.

Thanks.
--
Ashutosh

>
> Thank You.
>
> v9: a. Split the big patch in v8 into two patches
>     b. Moved all drop data handling code into one patch
>     c. Several other code improvements as mentioned in the patches
> v8: a. Used div_u64() instead of / to fix 32-bit build issue.
>     b. Changed copyright year in new files to 2025.
>     c. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
>     d. Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2
>
> v7: a. Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
>        to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
>        OA. Renamed the corresponding internal variables.
>     b. Fixed some commit messages based on review feedback.
>     c. Changed sampling_rates from a pointer to flexible array.
>
> v6: a. Changed the uAPI input to accept sampling rate in GPU cycles
>        instead of sampling rate multiplier.
>     b. Fix buffer wrap around over write bug (Matt Olson).
>     c. Include EU stall sampling rates information and per XeCore
>        buffer size in the query information.
>
> v5: Addressed review feedback from v4 including
>     a. Removed DRM_XE_EU_STALL_PROP_POLL_PERIOD from the uAPI (Ashutosh)
>     b. Separated the patches for Xe_HPC and Xe2 (Matt R)
>     c. Moved read() returning -EIO into a separate patch
>     d. Removed spinlocks around set_bit() and clear_bit() (Matt R)
>     e. Renamed several variables, structures and enums (Ashutosh and
>        Matt R)
>     f. Addressed other review feedback.
> v4: Addressed review feedback from v3 including
>     a. Split the patch into multiple patches (Matt R)
>     b. Added a new device query to get EU stall info (Ashutosh)
>     c. Renamed all Dss to xecore (Matt R)
>     d. Removed buffer size and disable at open input properties. (Matt R)
>     e. Removed the "_SHIFT" macros (Matt R)
>     f. Allocate the EU stall buffer only on system memory.
>     g. Changed the work arounds to OOB (Matt R)
>     h. Other review feedback.
> v3: a. Removed data header and changed read() to return -EIO when data
>        is dropped by the HW.
>     b. Added a new DRM_XE_OBSERVATION_IOCTL_INFO to query EU stall data
>        record info
>     c. Added struct drm_xe_eu_stall_data_pvc and struct drm_xe_eu_stall_data_xe2
>        to xe_drm.h. These declarations would help user space to parse the
>        EU stall data
>     d. Addressed other review comments from v2
> v2: Rename xe perf layer as xe observation layer (Ashutosh)
>
> Reviewed-by: Ben Olson
> Acked-by: Felix Degrood
> Signed-off-by: Harish Chegondi
> Signed-off-by: Ashutosh Dixit
>
> Harish Chegondi (8):
>   drm/xe/topology: Add a function to find the index of the last enabled
>     DSS in a mask
>   drm/xe/uapi: Introduce API for EU stall sampling
>   drm/xe/eustall: Add support to init, enable and disable EU stall
>     sampling
>   drm/xe/eustall: Add support to read() and poll() EU stall data
>   drm/xe/eustall: Add support to handle dropped EU stall data
>   drm/xe/eustall: Add EU stall sampling support for Xe2
>   drm/xe/uapi: Add a device query to get EU stall sampling information
>   drm/xe/eustall: Add workaround 22016596838 which applies to PVC.
>
>  drivers/gpu/drm/xe/Makefile                |    1 +
>  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |   29 +
>  drivers/gpu/drm/xe/xe_eu_stall.c           | 1029 ++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_eu_stall.h           |   21 +
>  drivers/gpu/drm/xe/xe_gt.c                 |    6 +
>  drivers/gpu/drm/xe/xe_gt_topology.h        |   13 +
>  drivers/gpu/drm/xe/xe_gt_types.h           |    3 +
>  drivers/gpu/drm/xe/xe_observation.c        |   14 +
>  drivers/gpu/drm/xe/xe_query.c              |   38 +
>  drivers/gpu/drm/xe/xe_trace.h              |   33 +
>  drivers/gpu/drm/xe/xe_wa_oob.rules         |    1 +
>  include/uapi/drm/xe_drm.h                  |   74 ++
>  12 files changed, 1262 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
>  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.c
>  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.h
>
> --
> 2.48.1
>