From: Matthew Brost <matthew.brost@intel.com>
To: Boris Brezillon <boris.brezillon@collabora.com>
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	"Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Danilo Krummrich" <dakr@kernel.org>,
	"David Airlie" <airlied@gmail.com>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Philipp Stanner" <phasta@kernel.org>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer
Date: Tue, 24 Mar 2026 09:06:02 -0700	[thread overview]
Message-ID: <acK2apOn5DMJFb1+@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <20260324102345.17742bef@fedora>

On Tue, Mar 24, 2026 at 10:23:45AM +0100, Boris Brezillon wrote:
> On Mon, 23 Mar 2026 11:38:06 -0700
> Matthew Brost <matthew.brost@intel.com> wrote:
> 
> > 
> > Ok, getting stats is easier than I thought...
> > 
> > ./perf stat -a -e context-switches,cpu-migrations,task-clock,cycles,instructions /home/mbrost/xe/source/drivers.gpu.i915.igt-gpu-tools/build/tests/xe_exec_threads --r threads-basic
> > 
> > This test creates one thread per engine instance (7 instances on this
> > BMG device) and submits 1k exec IOCTLs per thread, each performing a DW
> > write. Each exec IOCTL typically has no unsignaled input dependencies.
> > 
> > With IRQ putting of jobs off + no bypass (drm_dep_queue_flags = 0):
> > 
> >              8,449      context-switches
> >                412      cpu-migrations
> >           2,531.43 msec task-clock
> >      1,847,846,588      cpu_atom/cycles/
> >      1,847,856,947      cpu_core/cycles/
> >    <not supported>      cpu_atom/instructions/
> >        460,744,020      cpu_core/instructions/
> > 
> > With IRQ putting of jobs off + bypass (drm_dep_queue_flags =
> > DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED):
> > 
> >              8,655      context-switches
> >                229      cpu-migrations
> >           2,571.33 msec task-clock
> >        855,900,607      cpu_atom/cycles/
> >        855,900,272      cpu_core/cycles/
> >    <not supported>      cpu_atom/instructions/
> >        403,651,469      cpu_core/instructions/
> > 
> > With IRQ putting of jobs on + bypass (drm_dep_queue_flags =
> > DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED |
> > DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE):
> > 
> >              5,361      context-switches
> >                169      cpu-migrations
> >           2,577.44 msec task-clock
> >        685,769,153      cpu_atom/cycles/
> >        685,768,407      cpu_core/cycles/
> >    <not supported>      cpu_atom/instructions/
> >        321,336,297      cpu_core/instructions/
> 
> Thanks for sharing those numbers. For completeness, can you also add the
> "With IRQ putting of jobs on + no bypass" case?
> 

Yes. I'll also share a DRM sched baseline, and I've figured out that power
can be measured too - the initial results confirm what I expected: less
power.

I'm putting together a doc based on running glxgears and another
benchmark on top of Ubuntu 24.10 + Wayland, which has explicit sync
(linux-drm-syncobj; behaves like SurfaceFlinger when the rendering flag
is set to not pass in fences to draw jobs).

Almost have all the data. Will share here once I have it.
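In the meantime, as a rough sanity check on the numbers above (my quick math, not from the doc): the cpu_atom cycle counts work out to roughly 54% fewer cycles with bypass alone and 63% fewer with bypass + IRQ-safe job put, relative to the no-bypass baseline:

```python
# Back-of-the-envelope on the cpu_atom/cycles/ counts quoted above.
baseline = 1_847_846_588   # no bypass, IRQ job put off
bypass   =   855_900_607   # bypass, IRQ job put off
irq_put  =   685_769_153   # bypass + IRQ-safe job put

for name, cycles in [("bypass", bypass), ("bypass+irq_put", irq_put)]:
    reduction = 100 * (1 - cycles / baseline)
    print(f"{name}: {reduction:.1f}% fewer cycles vs baseline")
```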

> I'm a bit surprised by the difference in the number of context switches,
> given I'd expect the local CPU to be picked preferentially, and so queuing
> work items on the same wq from another work item to be almost free in
> terms of scheduling. But I guess there's some load-balancing happening
> when you execute jobs at such a high rate.
> 
> Also, I don't know if that's just noise or if it's reproducible, but
> task-clock seems to be ~40 msec lower with the deferred cleanup and
> no-bypass (higher throughput because you're not blocking the dequeuing
> of the next job on the cleanup of the previous one, I suspect).

I think that is just noise from what the test is doing in user space -
that bounces around a bit.

Matt

> 

Thread overview: 65+ messages
2026-03-16  4:32 [RFC PATCH 00/12] Introduce DRM dep queue Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 01/12] workqueue: Add interface to teach lockdep to warn on reclaim violations Matthew Brost
2026-03-25 15:59   ` Tejun Heo
2026-03-26  1:49     ` Matthew Brost
2026-03-26  2:19       ` Tejun Heo
2026-03-27  4:33         ` Matthew Brost
2026-03-27 17:25           ` Tejun Heo
2026-03-16  4:32 ` [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer Matthew Brost
2026-03-16  9:16   ` Boris Brezillon
2026-03-17  5:22     ` Matthew Brost
2026-03-17  8:48       ` Boris Brezillon
2026-03-16 10:25   ` Danilo Krummrich
2026-03-17  5:10     ` Matthew Brost
2026-03-17 12:19       ` Danilo Krummrich
2026-03-18 23:02         ` Matthew Brost
2026-03-17  2:47   ` Daniel Almeida
2026-03-17  5:45     ` Matthew Brost
2026-03-17  7:17       ` Miguel Ojeda
2026-03-17  8:26         ` Matthew Brost
2026-03-17 12:04           ` Daniel Almeida
2026-03-17 19:41           ` Miguel Ojeda
2026-03-23 17:31             ` Matthew Brost
2026-03-23 17:42               ` Miguel Ojeda
2026-03-17 18:14       ` Matthew Brost
2026-03-17 19:48         ` Daniel Almeida
2026-03-17 20:43         ` Boris Brezillon
2026-03-18 22:40           ` Matthew Brost
2026-03-19  9:57             ` Boris Brezillon
2026-03-22  6:43               ` Matthew Brost
2026-03-23  7:58                 ` Matthew Brost
2026-03-23 10:06                   ` Boris Brezillon
2026-03-23 17:11                     ` Matthew Brost
2026-03-17 12:31     ` Danilo Krummrich
2026-03-17 14:25       ` Daniel Almeida
2026-03-17 14:33         ` Danilo Krummrich
2026-03-18 22:50           ` Matthew Brost
2026-03-17  8:47   ` Christian König
2026-03-17 14:55   ` Boris Brezillon
2026-03-18 23:28     ` Matthew Brost
2026-03-19  9:11       ` Boris Brezillon
2026-03-23  4:50         ` Matthew Brost
2026-03-23  9:55           ` Boris Brezillon
2026-03-23 17:08             ` Matthew Brost
2026-03-23 18:38               ` Matthew Brost
2026-03-24  9:23                 ` Boris Brezillon
2026-03-24 16:06                   ` Matthew Brost [this message]
2026-03-25  2:33                     ` Matthew Brost
2026-03-24  8:49               ` Boris Brezillon
2026-03-24 16:51                 ` Matthew Brost
2026-03-17 16:30   ` Shashank Sharma
2026-03-16  4:32 ` [RFC PATCH 03/12] drm/xe: Use WQ_MEM_WARN_ON_RECLAIM on all workqueues in the reclaim path Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 04/12] drm/xe: Issue GGTT invalidation under lock in ggtt_node_remove Matthew Brost
2026-03-26  5:45   ` Bhadane, Dnyaneshwar
2026-03-16  4:32 ` [RFC PATCH 05/12] drm/xe: Return fence from xe_sched_job_arm and adjust job references Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 06/12] drm/xe: Convert to DRM dep queue scheduler layer Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 07/12] drm/xe: Make scheduler message lock IRQ-safe Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 08/12] drm/xe: Rework exec queue object on top of DRM dep Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 09/12] drm/xe: Enable IRQ job put in " Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 10/12] drm/xe: Use DRM dep queue kill semantics Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 11/12] accel/amdxdna: Convert to drm_dep scheduler layer Matthew Brost
2026-03-16  4:32 ` [RFC PATCH 12/12] drm/panthor: " Matthew Brost
2026-03-16  4:52 ` ✗ CI.checkpatch: warning for Introduce DRM dep queue Patchwork
2026-03-16  4:53 ` ✓ CI.KUnit: success " Patchwork
2026-03-16  5:28 ` ✓ Xe.CI.BAT: " Patchwork
2026-03-16  8:09 ` ✗ Xe.CI.FULL: failure " Patchwork
