From: "Dixit, Ashutosh"
To: Harish Chegondi
Subject: Re: [PATCH v12 0/8] Add support for EU stall sampling
Date: Wed, 26 Feb 2025 11:43:22 -0800
Message-ID: <857c5cjxut.wl-ashutosh.dixit@intel.com>
In-Reply-To: <858qptjuxv.wl-ashutosh.dixit@intel.com>
References: <858qptjuxv.wl-ashutosh.dixit@intel.com>
List-Id: Intel Xe graphics driver

On Tue, 25 Feb 2025 18:34:04 -0800, Dixit, Ashutosh wrote:
>
> On Tue, 25 Feb 2025 17:47:04 -0800, Harish Chegondi wrote:
> >
> > The following patch series add support for EU stall sampling,
> > a new hardware feature first added in PVC and is being supported
> > in XE2 and later architecture GPUs. This feature would enable
> > capturing of EU stall data which include the IP address of the
> > instruction stalled and various stall reason counts.
> >
> > Support for this feature is being added into Mesa:
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142
> >
> > New IGT tests for EU stall sampling are being added:
> > https://patchwork.freedesktop.org/series/143030/
> >
> > This patch series has undergone basic testing with the new IGT tests.
> >
> > Issues that need investigation:
> > 1. Blocking reads with small user buffers may block even when the
> >    kernel buffer holds EU stall data, because a previous read set
> >    pollin to false while the kernel buffer still had readable data
> >    that did not fit in the user buffer.
>
> The series is now completely reviewed. Planning to merge this once CI
> completes on this version:

Done! EU stall series is now merged upstream.

>
> Reviewed-by: Ashutosh Dixit
>
> >
> > Thank You.
> >
> > v12 a. Move check for EU stall support to a header file
> >     b. Move 'goto exit_drop;' to the next if statement
> > v11 a. Lock optimization
> >     b. Moved around code as per review feedback
> > v10 a. Fixed error rewinding code
> >     b. Used cancel_delayed_work_sync() instead of flush_delayed_work()
> >     c. Replaced per xecore lock with a lock for all the xecore buffers
> >     d. Remove function description for static functions.
> >     e. Use extension number while parsing chain of extensions.
> >     f. Moved code around as per review feedback
> > v9: a. Split the big patch in v8 into two patches
> >     b. Moved all drop data handling code into one patch
> >     c. Several other code improvements as mentioned in the patches
> > v8: a. Used div_u64() instead of / to fix 32-bit build issue.
> >     b. Changed copyright year in new files to 2025.
> >     c. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
> >     d. Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2
> >
> > v7: a. Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
> >        to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
> >        OA. Renamed the corresponding internal variables.
> >     b. Fixed some commit messages based on review feedback.
> >     c. Changed sampling_rates from a pointer to flexible array.
> >
> > v6: a. Changed the uAPI input to accept sampling rate in GPU cycles
> >        instead of sampling rate multiplier.
> >     b. Fix buffer wrap-around overwrite bug (Matt Olson).
> >     c. Include EU stall sampling rates information and per XeCore buffer size in the query information.
> >
> > v5: Addressed review feedback from v4 including
> >     a. Removed DRM_XE_EU_STALL_PROP_POLL_PERIOD from the uAPI (Ashutosh)
> >     b. Separated the patches for Xe_HPC and Xe2 (Matt R)
> >     c. Moved read() returning -EIO into a separate patch
> >     d. Removed spinlocks around set_bit() and clear_bit() (Matt R)
> >     e. Renamed several variables, structures and enums (Ashutosh and
> >        Matt R)
> >     f. Addressed other review feedback.
> > v4: Addressed review feedback from v3 including
> >     a. Split the patch into multiple patches (Matt R)
> >     b. Added a new device query to get EU stall info (Ashutosh)
> >     c. Renamed all Dss to xecore (Matt R)
> >     d. Removed buffer size and disable at open input properties. (Matt R)
> >     e. Removed the "_SHIFT" macros (Matt R)
> >     f. Allocate the EU stall buffer only on system memory.
> >     g. Changed the workarounds to OOB (Matt R)
> >     h. Other review feedback.
> > v3: a. Removed data header and changed read() to return -EIO when data is dropped by the HW.
> >     b. Added a new DRM_XE_OBSERVATION_IOCTL_INFO to query EU stall data record info
> >     c. Added struct drm_xe_eu_stall_data_pvc and struct drm_xe_eu_stall_data_xe2
> >        to xe_drm.h. These declarations would help user space to parse the
> >        EU stall data
> >     d. Addressed other review comments from v2
> > v2: Rename xe perf layer as xe observation layer (Ashutosh)
> >
> > Test-with: cover.1739901972.git.harish.chegondi@intel.com
> >
> > Reviewed-by: Ben Olson
> > Acked-by: Felix Degrood
> > Signed-off-by: Harish Chegondi
> > Signed-off-by: Ashutosh Dixit
> >
> > Harish Chegondi (8):
> >   drm/xe/topology: Add a function to find the index of the last enabled
> >     DSS in a mask
> >   drm/xe/uapi: Introduce API for EU stall sampling
> >   drm/xe/eustall: Add support to init, enable and disable EU stall
> >     sampling
> >   drm/xe/eustall: Add support to read() and poll() EU stall data
> >   drm/xe/eustall: Add support to handle dropped EU stall data
> >   drm/xe/eustall: Add EU stall sampling support for Xe2
> >   drm/xe/uapi: Add a device query to get EU stall sampling information
> >   drm/xe/eustall: Add workaround 22016596838 which applies to PVC.
> >
> >  drivers/gpu/drm/xe/Makefile                |   1 +
> >  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |  29 +
> >  drivers/gpu/drm/xe/xe_eu_stall.c           | 960 +++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_eu_stall.h           |  24 +
> >  drivers/gpu/drm/xe/xe_gt.c                 |   5 +
> >  drivers/gpu/drm/xe/xe_gt_topology.h        |  13 +
> >  drivers/gpu/drm/xe/xe_gt_types.h           |   3 +
> >  drivers/gpu/drm/xe/xe_observation.c        |  14 +
> >  drivers/gpu/drm/xe/xe_query.c              |  43 +
> >  drivers/gpu/drm/xe/xe_trace.h              |  30 +
> >  drivers/gpu/drm/xe/xe_wa_oob.rules         |   1 +
> >  include/uapi/drm/xe_drm.h                  |  74 ++
> >  12 files changed, 1197 insertions(+)
> >  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.c
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.h
> >
> > --
> > 2.48.1
> >
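
For readers unfamiliar with the first patch in the series ("find the index
of the last enabled DSS in a mask"), the idea can be illustrated with a
small userspace sketch. This is not the code from the patch: the function
name last_enabled_index() and the WORD_BITS macro are hypothetical, and the
mask layout (bit i stored in word i / WORD_BITS) is an assumption made for
the example. In the kernel, bitmap helpers such as find_last_bit() would
normally be used for this kind of lookup.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits per mask word; hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the highest set bit among the first nbits bits of
 * mask, or -1 if none of them is set. Bit i is assumed to live in word
 * i / WORD_BITS at position i % WORD_BITS.
 */
static int last_enabled_index(const unsigned long *mask, size_t nbits)
{
	assert(mask != NULL);

	/* Scan downwards from the top bit so the first hit is the last enabled one. */
	for (size_t i = nbits; i-- > 0; ) {
		unsigned long word = mask[i / WORD_BITS];

		if (word & (1UL << (i % WORD_BITS)))
			return (int)i;
	}
	return -1;
}
```

With a mask whose first word is 0x5 (bits 0 and 2 set), the sketch returns
index 2; an all-zero mask yields -1. A per-bit scan is chosen here for
clarity rather than speed.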