intel-xe.lists.freedesktop.org archive mirror
From: Harish Chegondi <harish.chegondi@intel.com>
To: "Olson, Matthew" <matthew.olson@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <ashutosh.dixit@intel.com>,
	<james.ausmus@intel.com>, <felix.j.degrood@intel.com>,
	<matias.a.cabral@intel.com>, <joshua.santosh.ranjan@intel.com>,
	<shubham.kumar@intel.com>, <matthew.d.roper@intel.com>,
	<lucas.demarchi@intel.com>
Subject: Re: [PATCH v8 0/7] Add support for EU stall sampling
Date: Fri, 17 Jan 2025 21:19:00 -0800
Message-ID: <Z4s5xCVP6qx64kfo@intel.com>
In-Reply-To: <Z4l_LkmTYPaXjzZ7@bolson-desk>

On Thu, Jan 16, 2025 at 03:50:38PM -0600, Olson, Matthew wrote:
> On Wed, Jan 15, 2025 at 12:02:06PM -0800, Harish Chegondi wrote:
> > This patch series adds support for EU stall sampling, a new
> > hardware feature first introduced in PVC and supported in Xe2
> > and later architecture GPUs. The feature captures EU stall
> > data, which includes the IP address of the stalled instruction
> > and counts for the various stall reasons.
> > 
> > Support for this feature is being added into Mesa:
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142
> > 
> > New IGT tests for EU stall sampling are being added:
> > https://patchwork.freedesktop.org/series/143030/
> > 
> > This patch series has undergone basic testing with the new IGT tests.
> 
> Our profiler, iaprof, also consumes EU stalls using this patch series and
> generates AI flamegraphs using them. I've been testing mostly with v7 of this
> patch series since it came out, and have had no issues with it. The stalls are
> reasonable (both the reasons and the GPU address that they point to), and we've
> been able to poll them well enough to run our profiler in the background.
> 
> I suspect the following was already discussed for one of the earlier versions of
> this series, but is it possible to have even lower sampling rates than what are
> currently provided? We're already selecting the slowest sampling rate (the last

That is the slowest sampling rate supported by the hardware.

> in the array), but CPU usage is too high for our liking, and we're still getting

During EU stall sampling, a timer thread in the driver polls for new
EU stall data approximately once every 10 milliseconds. I wonder whether
this polling is also contributing to the CPU usage.

> tens of millions of samples per minute.
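
To put the two figures side by side, here is a rough back-of-the-envelope
sketch. The 10 ms poll period is the approximate value mentioned above; the
30 million samples per minute is only an assumed stand-in for "tens of
millions", and poll_stats is a hypothetical helper name, not part of the
driver or of iaprof.

```python
def poll_stats(samples_per_minute: int, poll_period_ms: int = 10) -> tuple[float, float]:
    """Return (timer wakeups per second, average samples per wakeup)."""
    wakeups_per_sec = 1000 / poll_period_ms   # 10 ms period -> 100 wakeups/s
    samples_per_sec = samples_per_minute / 60
    return wakeups_per_sec, samples_per_sec / wakeups_per_sec


# Assumed rate: 30 million samples per minute (illustrative only).
wakeups, per_wakeup = poll_stats(30_000_000)
print(f"{wakeups:.0f} wakeups/s, ~{per_wakeup:.0f} samples per wakeup")
```

Under these assumptions the timer fires about 100 times per second and each
wakeup sees several thousand new samples on average, so both the wakeup
count and the data volume could plausibly show up as CPU time.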
> 
> > 
> > Thank You.
> 
> No, thank *you!*
> 
> Reviewed-by: Ben Olson <matthew.olson@intel.com>
> 
> > 
> > v8: a. Used div_u64() instead of / to fix 32-bit build issue.
> >     b. Changed copyright year in new files to 2025.
> >     c. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
> >     d. Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2
> > 
> > v7: a. Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
> >        to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
> >        OA. Renamed the corresponding internal variables.
> >     b. Fixed some commit messages based on review feedback.
> >     c. Changed sampling_rates from a pointer to flexible array.
> > 
> > v6: a. Changed the uAPI input to accept sampling rate in GPU cycles
> >        instead of sampling rate multiplier.
> >     b. Fixed buffer wrap-around overwrite bug (Matt Olson).
> >     c. Include EU stall sampling rates information and per XeCore buffer size in the query information.
> > 
> > v5: Addressed review feedback from v4 including
> >     a. Removed DRM_XE_EU_STALL_PROP_POLL_PERIOD from the uAPI (Ashutosh)
> >     b. Separated the patches for Xe_HPC and Xe2 (Matt R)
> >     c. Moved read() returning -EIO into a separate patch
> >     d. Removed spinlocks around set_bit() and clear_bit() (Matt R)
> >     e. Renamed several variables, structures and enums (Ashutosh and Matt R)
> >     f. Addressed other review feedback.
> > v4: Addressed review feedback from v3 including
> >     a. Split the patch into multiple patches (Matt R)
> >     b. Added a new device query to get EU stall info (Ashutosh)
> >     c. Renamed all Dss to xecore (Matt R)
> >     d. Removed buffer size and disable at open input properties. (Matt R)
> >     e. Removed the "_SHIFT" macros (Matt R)
> >     f. Allocate the EU stall buffer only on system memory.
> >     g. Changed the workarounds to OOB (Matt R)
> >     h. Other review feedback.
> > v3: a. Removed data header and changed read() to return -EIO when data is dropped by the HW.
> >     b. Added a new DRM_XE_OBSERVATION_IOCTL_INFO to query EU stall data record info
> >     c. Added struct drm_xe_eu_stall_data_pvc and struct drm_xe_eu_stall_data_xe2
> >        to xe_drm.h. These declarations would help user space to parse the
> >        EU stall data
> >     d. Addressed other review comments from v2
> > v2: Rename xe perf layer as xe observation layer (Ashutosh)
> > 
> > Cc: Felix Degrood <felix.j.degrood@intel.com>
> > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > 
> > Harish Chegondi (7):
> >   drm/xe/topology: Add a function to find the index of the last enabled
> >     DSS in a mask
> >   drm/xe/uapi: Introduce API for EU stall sampling
> >   drm/xe/eustall: Implement EU stall sampling APIs for Xe_HPC
> >   drm/xe/eustall: Return -EIO error from read() if HW drops data
> >   drm/xe/eustall: Add EU stall sampling support for Xe2
> >   drm/xe/uapi: Add a device query to get EU stall sampling information
> >   drm/xe/eustall: Add workaround 22016596838 which applies to PVC.
> > 
> >  drivers/gpu/drm/xe/Makefile                |    1 +
> >  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |   29 +
> >  drivers/gpu/drm/xe/xe_eu_stall.c           | 1103 ++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_eu_stall.h           |   61 ++
> >  drivers/gpu/drm/xe/xe_gt.c                 |    6 +
> >  drivers/gpu/drm/xe/xe_gt_topology.h        |   13 +
> >  drivers/gpu/drm/xe/xe_gt_types.h           |    3 +
> >  drivers/gpu/drm/xe/xe_observation.c        |   14 +
> >  drivers/gpu/drm/xe/xe_query.c              |   38 +
> >  drivers/gpu/drm/xe/xe_trace.h              |   33 +
> >  drivers/gpu/drm/xe/xe_wa_oob.rules         |    1 +
> >  include/uapi/drm/xe_drm.h                  |   74 ++
> >  12 files changed, 1376 insertions(+)
> >  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.c
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.h
> > 
> > -- 
> > 2.47.1
> > 
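
For user-space consumers, the changelog above says read() returns -EIO
when the hardware drops data. A minimal sketch of one way a reader could
handle that follows. The names read_stall_chunk and read_fn are
hypothetical, and treating -EIO as "a drop happened, keep reading" is an
assumption on my part rather than documented behaviour.

```python
import errno
import os


def read_stall_chunk(fd: int, size: int, read_fn=os.read) -> tuple[bytes, bool]:
    """Read up to `size` bytes from the stream fd.

    Returns (data, dropped). `dropped` is True when the read failed with
    EIO, which this sketch interprets as the hardware having dropped data.
    Any other error is re-raised unchanged.
    """
    try:
        return read_fn(fd, size), False
    except OSError as err:
        if err.errno == errno.EIO:
            return b"", True
        raise
```

A caller would typically count the drops and continue with the next read,
for example `data, dropped = read_stall_chunk(stream_fd, 4096)`, where
stream_fd is whatever file descriptor the EU stall stream was opened on.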

