From: Harish Chegondi <harish.chegondi@intel.com>
To: "Olson, Matthew" <matthew.olson@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <ashutosh.dixit@intel.com>,
	<james.ausmus@intel.com>, <felix.j.degrood@intel.com>,
	<matias.a.cabral@intel.com>, <joshua.santosh.ranjan@intel.com>,
	<shubham.kumar@intel.com>, <matthew.d.roper@intel.com>,
	<lucas.demarchi@intel.com>
Subject: Re: [PATCH v8 0/7] Add support for EU stall sampling
Date: Fri, 17 Jan 2025 21:19:00 -0800
Message-ID: <Z4s5xCVP6qx64kfo@intel.com>
In-Reply-To: <Z4l_LkmTYPaXjzZ7@bolson-desk>

On Thu, Jan 16, 2025 at 03:50:38PM -0600, Olson, Matthew wrote:
> On Wed, Jan 15, 2025 at 12:02:06PM -0800, Harish Chegondi wrote:
> > The following patch series adds support for EU stall sampling,
> > a new hardware feature first added in PVC and supported
> > in XE2 and later architecture GPUs. This feature enables
> > capturing EU stall data, which includes the IP address of the
> > stalled instruction and various stall reason counts.
> > 
> > Support for this feature is being added into Mesa:
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142
> > 
> > New IGT tests for EU stall sampling are being added:
> > https://patchwork.freedesktop.org/series/143030/
> > 
> > This patch series has undergone basic testing with the new IGT tests.
> 
> Our profiler, iaprof, also consumes EU stalls using this patch series and
> generates AI flamegraphs using them. I've been testing mostly with v7 of this
> patch series since it came out, and have had no issues with it. The stalls are
> reasonable (both the reasons and the GPU address that they point to), and we've
> been able to poll them well enough to run our profiler in the background.
> 
> I suspect the following was already discussed for one of the earlier versions of
> this series, but is it possible to have even lower sampling rates than what are
> currently provided? We're already selecting the slowest sampling rate (the last
> in the array), but CPU usage is too high for our liking, and we're still getting
> tens of millions of samples per minute.

That's the slowest sampling rate supported by the hardware.

During EU stall sampling, a timer thread in the driver keeps polling for
new EU stall data approximately once every 10 milliseconds. I wonder whether
this could also be contributing to the CPU usage.
> 
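To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes an illustrative 30 million samples per minute (the report above only says "tens of millions") and the approximately 10 ms polling period mentioned above. The 64-byte record size is also only an assumption for illustration; the real value should be taken from the EU stall query information. The function names are hypothetical.

```python
def samples_per_poll(samples_per_minute, poll_period_ms=10.0):
    """Average number of samples that accumulate between two polls."""
    polls_per_minute = 60_000.0 / poll_period_ms
    return samples_per_minute / polls_per_minute


def bytes_per_second(samples_per_minute, record_size_bytes):
    """Average data rate user space has to consume."""
    return samples_per_minute / 60.0 * record_size_bytes


# Illustrative inputs only; neither figure comes from a measurement.
print(samples_per_poll(30_000_000))      # samples per 10 ms poll
print(bytes_per_second(30_000_000, 64))  # bytes per second at 64 B/record
```

Under these assumptions each poll finds several thousand new records, so the cost of copying and parsing the data probably matters at least as much as the timer wakeups themselves.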
> > 
> > Thank You.
> 
> No, thank *you!*
> 
> Reviewed-by: Ben Olson <matthew.olson@intel.com>
> 
> > 
> > v8: a. Used div_u64() instead of / to fix 32-bit build issue.
> >     b. Changed copyright year in new files to 2025.
> >     c. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc
> >     d. Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2
> > 
> > v7: a. Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT
> >        to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with
> >        OA. Renamed the corresponding internal variables.
> >     b. Fixed some commit messages based on review feedback.
> >     c. Changed sampling_rates from a pointer to flexible array.
> > 
> > v6: a. Changed the uAPI input to accept sampling rate in GPU cycles
> >        instead of sampling rate multiplier.
> >     b. Fixed the buffer wrap-around overwrite bug (Matt Olson).
> >     c. Include EU stall sampling rates information and per XeCore buffer size in the query information.
> > 
> > v5: Addressed review feedback from v4 including
> >     a. Removed DRM_XE_EU_STALL_PROP_POLL_PERIOD from the uAPI (Ashutosh)
> >     b. Separated the patches for Xe_HPC and Xe2 (Matt R)
> >     c. Moved read() returning -EIO into a separate patch
> >     d. Removed spinlocks around set_bit() and clear_bit() (Matt R)
> >     e. Renamed several variables, structures and enums (Ashutosh and Matt R)
> >     f. Addressed other review feedback.
> > v4: Addressed review feedback from v3 including
> >     a. Split the patch into multiple patches (Matt R)
> >     b. Added a new device query to get EU stall info (Ashutosh)
> >     c. Renamed all Dss to xecore (Matt R)
> >     d. Removed buffer size and disable at open input properties. (Matt R)
> >     e. Removed the "_SHIFT" macros (Matt R)
> >     f. Allocate the EU stall buffer only on system memory.
> >     g. Changed the workarounds to OOB (Matt R)
> >     h. Other review feedback.
> > v3: a. Removed data header and changed read() to return -EIO when data is dropped by the HW.
> >     b. Added a new DRM_XE_OBSERVATION_IOCTL_INFO to query EU stall data record info
> >     c. Added struct drm_xe_eu_stall_data_pvc and struct drm_xe_eu_stall_data_xe2
> >        to xe_drm.h. These declarations would help user space to parse the
> >        EU stall data
> >     d. Addressed other review comments from v2
> > v2: Rename xe perf layer as xe observation layer (Ashutosh)
> > 
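For readers unfamiliar with the class of bug mentioned in v6 (b), the sketch below shows the general technique of copying unread data out of a circular buffer when the write offset has wrapped past the end. It is not the driver's code: the function name, the byte-offset representation and the convention that equal offsets mean "empty" are all assumptions made for illustration.

```python
def copy_from_ring(buf, read_off, write_off):
    """Return the unread bytes between read_off and write_off.

    buf is treated as a circular buffer. Equal offsets are taken to
    mean "no unread data" (an assumption; real hardware may use an
    extra bit to tell "empty" from "full").
    """
    size = len(buf)
    if not (0 <= read_off < size and 0 <= write_off < size):
        raise ValueError("offsets must lie inside the buffer")
    if write_off >= read_off:
        # No wrap: the unread data is one contiguous region.
        return bytes(buf[read_off:write_off])
    # Wrapped: copy the tail of the buffer, then the head.
    return bytes(buf[read_off:]) + bytes(buf[:write_off])


ring = bytes(range(8))
print(copy_from_ring(ring, 6, 2))  # tail [6, 7] followed by head [0, 1]
```

The important property is that the wrapped case is copied in two pieces and never reads or writes past the end of the buffer.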
> > Cc: Felix Degrood <felix.j.degrood@intel.com>
> > Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > 
> > Harish Chegondi (7):
> >   drm/xe/topology: Add a function to find the index of the last enabled
> >     DSS in a mask
> >   drm/xe/uapi: Introduce API for EU stall sampling
> >   drm/xe/eustall: Implement EU stall sampling APIs for Xe_HPC
> >   drm/xe/eustall: Return -EIO error from read() if HW drops data
> >   drm/xe/eustall: Add EU stall sampling support for Xe2
> >   drm/xe/uapi: Add a device query to get EU stall sampling information
> >   drm/xe/eustall: Add workaround 22016596838 which applies to PVC.
> > 
> >  drivers/gpu/drm/xe/Makefile                |    1 +
> >  drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h |   29 +
> >  drivers/gpu/drm/xe/xe_eu_stall.c           | 1103 ++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_eu_stall.h           |   61 ++
> >  drivers/gpu/drm/xe/xe_gt.c                 |    6 +
> >  drivers/gpu/drm/xe/xe_gt_topology.h        |   13 +
> >  drivers/gpu/drm/xe/xe_gt_types.h           |    3 +
> >  drivers/gpu/drm/xe/xe_observation.c        |   14 +
> >  drivers/gpu/drm/xe/xe_query.c              |   38 +
> >  drivers/gpu/drm/xe/xe_trace.h              |   33 +
> >  drivers/gpu/drm/xe/xe_wa_oob.rules         |    1 +
> >  include/uapi/drm/xe_drm.h                  |   74 ++
> >  12 files changed, 1376 insertions(+)
> >  create mode 100644 drivers/gpu/drm/xe/regs/xe_eu_stall_regs.h
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.c
> >  create mode 100644 drivers/gpu/drm/xe/xe_eu_stall.h
> > 
> > -- 
> > 2.47.1
> > 
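For user space consuming the stream, patch 4/7 makes read() return -EIO when the hardware drops data. A minimal sketch of one way a reader could account for that follows. It assumes that the stream stays readable after an -EIO and that an empty read means "nothing more for now"; both are assumptions, not statements about the final uAPI. The names drain, read_fn and max_drops are hypothetical.

```python
import errno


def drain(read_fn, chunk_size=4096, max_drops=16):
    """Read until read_fn returns b''.

    read_fn(n) should behave like os.read(fd, n): return bytes, or
    raise OSError. An EIO error is counted as one "data dropped"
    event and reading continues; any other error is re-raised.
    Returns (data, drops).
    """
    chunks = []
    drops = 0
    while True:
        try:
            chunk = read_fn(chunk_size)
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise
            drops += 1
            if drops > max_drops:
                # Avoid spinning forever if EIO turns out to be persistent.
                raise
            continue
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks), drops
```

With a real file descriptor this could be called as drain(lambda n: os.read(fd, n)). Counting the drops rather than aborting lets a profiler report that its data is incomplete while still using what it did receive.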

