Date: Tue, 25 Feb 2025 18:34:04 -0800
Message-ID: <858qptjuxv.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Harish Chegondi
Subject: Re: [PATCH v12 0/8] Add support for EU stall sampling
List-Id: Intel Xe graphics driver

On Tue, 25 Feb 2025 17:47:04 -0800, Harish Chegondi wrote:
>
> The following patch series add support for EU stall sampling,
> a new hardware feature first added in PVC and is being supported
> in XE2 and later architecture GPUs. This feature would enable
> capturing of EU stall data which include the IP address of the
> instruction stalled and various stall reason counts.
>
> Support for this feature is being added into Mesa:
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142
>
> New IGT tests for EU stall sampling are being added:
> https://patchwork.freedesktop.org/series/143030/
>
> This patch series has undergone basic testing with the new IGT tests.
>
> Issues that need investigation:
> 1. Blocked reads with small user buffers may be blocked even with EU
>    stall data in the kernel buffer as a previous read has set pollin to
>    false even when kernel buffer has data that could be read but the user
>    buffer is too small to read all the data.

The series is now completely reviewed. Planning to merge this once CI
completes on this version:

Reviewed-by: Ashutosh Dixit

>
> Thank You.
>
> v12 a. Move check for EU stall support to a header file
>     b. Move 'goto exit_drop;' to the next if statement
> v11 a. Lock optimization
>     b. Moved around code as per review feedback
> v10 a. Fixed error rewinding code
>     b. Used cancel_delayed_work_sync() instead of flush_delayed_work()
>     c. Replaced per xecore lock with a lock for all the xecore buffers
>     d. Remove function description for static functions.
>     e. Use extension number while parsing chain of extensions.
>     f. Moved code around as per review feedback
> v9: a. Split the big patch in v8 into two patches
>     b. Moved all drop data handling code into one patch
>     c. Several other code improvements as mentioned in the patches
> v8: a. Used div_u64() instead of / to fix 32-bit build issue.
>     b. Changed copyright year in new files to 2025.
>     c. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
>     d. Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2
>
> v7: a. Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
>        to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
>        OA. Renamed the corresponding internal variables.
>     b. Fixed some commit messages based on review feedback.
>     c. Changed sampling_rates from a pointer to flexible array.
>
> v6: a. Changed the uAPI input to accept sampling rate in GPU cycles
>        instead of sampling rate multiplier.
>     b. Fix buffer wrap around over write bug (Matt Olson).
>     c. Include EU stall sampling rates information and per XeCore buffer size in the query information.
>
> v5: Addressed review feedback from v4 including
>     a. Removed DRM_XE_EU_STALL_PROP_POLL_PERIOD from the uAPI (Ashutosh)
>     b. Separated the patches for Xe_HPC and Xe2 (Matt R)
>     c. Moved read() returning -EIO into a separate patch
>     d. Removed spinlocks around set_bit() and clear_bit() (Matt R)
>     e. Renamed several variables, structures and enums (Ashutosh and
>        Matt R)
>     f. Addressed other review feedback.
> v4: Addressed review feedback from v3 including
>     a. Split the patch into multiple patches (Matt R)
>     b. Added a new device query to get EU stall info (Ashutosh)
>     c. Renamed all Dss to xecore (Matt R)
>     d. Removed buffer size and disable at open input properties. (Matt R)
>     e. Removed the "_SHIFT" macros (Matt R)
>     f. Allocate the EU stall buffer only on system memory.
>     g. Changed the work arounds to OOB (Matt R)
>     h. Other review feedback.
> v3: a. Removed data header and changed read() to return -EIO when data is dropped by the HW.
>     b. Added a new DRM_XE_OBSERVATION_IOCTL_INFO to query EU stall data record info
>     c. Added struct drm_xe_eu_stall_data_pvc and struct drm_xe_eu_stall_data_xe2
>        to xe_drm.h. These declarations would help user space to parse the
>        EU stall data
>     d. Addressed other review comments from v2
> v2: Rename xe perf layer as xe observation layer (Ashutosh)
>
> Test-with: cover.1739901972.git.harish.chegondi@intel.com
>
> Reviewed-by: Ben Olson
> Acked-by: Felix Degrood
> Signed-off-by: Harish Chegondi
> Signed-off-by: Ashutosh Dixit
>
> Harish Chegondi (8):
>   drm/xe/topology: Add a function to find the index of the last enabled
>     DSS in a mask
>   drm/xe/uapi: Introduce API for EU stall sampling
>   drm/xe/eustall: Add support to init, enable and disable EU stall
>     sampling
>   drm/xe/eustall: Add support to read() and poll() EU stall data
>   drm/xe/eustall: Add support to handle dropped EU stall data
>   drm/xe/eustall: Add EU stall sampling support for Xe2
>   drm/xe/uapi: Add a device query to get EU stall sampling information
>   drm/xe/eustall: Add workaround 22016596838 which applies to PVC.
>
>  drivers/gpu/drm/xe/Makefile                |   1 +
>  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |  29 +
>  drivers/gpu/drm/xe/xe_eu_stall.c           | 960 +++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_eu_stall.h           |  24 +
>  drivers/gpu/drm/xe/xe_gt.c                 |   5 +
>  drivers/gpu/drm/xe/xe_gt_topology.h        |  13 +
>  drivers/gpu/drm/xe/xe_gt_types.h           |   3 +
>  drivers/gpu/drm/xe/xe_observation.c        |  14 +
>  drivers/gpu/drm/xe/xe_query.c              |  43 +
>  drivers/gpu/drm/xe/xe_trace.h              |  30 +
>  drivers/gpu/drm/xe/xe_wa_oob.rules         |   1 +
>  include/uapi/drm/xe_drm.h                  |  74 ++
>  12 files changed, 1197 insertions(+)
>  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
>  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.c
>  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.h
>
> --
> 2.48.1
>
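
For anyone following the open issue quoted in the cover letter (blocked reads with small user buffers), here is a toy user-space model of it in Python. It is only a sketch under stated assumptions, not the driver's code: `StallStream`, `RECORD_SIZE`, `hw_write` and both read methods are hypothetical names made up for illustration. The model also assumes that any newly written data marks the stream readable, which ignores whatever report-count threshold the real driver applies. The second read method shows one possible alternative policy, which this series does not necessarily adopt.

```python
from dataclasses import dataclass

RECORD_SIZE = 64  # assumed record size in bytes, chosen only for illustration


@dataclass
class StallStream:
    """Toy stand-in for per-stream state; not the driver's real structure."""

    available: int = 0    # bytes of stall data waiting in the kernel buffer
    pollin: bool = False  # what poll() would report as "readable"

    def hw_write(self, nbytes: int) -> None:
        # Assumption: any new data marks the stream readable.
        self.available += nbytes
        self.pollin = True

    def _copy(self, user_buf_size: int) -> int:
        # Copy out only whole records that fit in the user buffer.
        whole = (min(self.available, user_buf_size) // RECORD_SIZE) * RECORD_SIZE
        self.available -= whole
        return whole

    def read_clearing_pollin(self, user_buf_size: int) -> int:
        # Behaviour described in the open issue: pollin is cleared by the read
        # even if the user buffer was too small to drain the kernel buffer.
        copied = self._copy(user_buf_size)
        self.pollin = False
        return copied

    def read_keeping_pollin(self, user_buf_size: int) -> int:
        # Hypothetical alternative: stay readable while data remains.
        copied = self._copy(user_buf_size)
        self.pollin = self.available > 0
        return copied


def leftover_after_small_read(read_method_name: str) -> tuple:
    """Write four records, read with room for two, report (available, pollin)."""
    stream = StallStream()
    stream.hw_write(4 * RECORD_SIZE)
    getattr(stream, read_method_name)(2 * RECORD_SIZE)
    return stream.available, stream.pollin
```

In this model, `leftover_after_small_read("read_clearing_pollin")` leaves two records in the buffer with pollin cleared, so a following blocking read would wait although data is available. The same call with `"read_keeping_pollin"` leaves pollin set.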