From: Boris Brezillon <boris.brezillon@collabora.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
"Tvrtko Ursulin" <tvrtko.ursulin@igalia.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Christian König" <christian.koenig@amd.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David Airlie" <airlied@gmail.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Philipp Stanner" <phasta@kernel.org>,
"Simona Vetter" <simona@ffwll.ch>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer
Date: Tue, 24 Mar 2026 10:23:45 +0100 [thread overview]
Message-ID: <20260324102345.17742bef@fedora> (raw)
In-Reply-To: <acGIjjLXuxUzv3aI@lstrano-desk.jf.intel.com>
On Mon, 23 Mar 2026 11:38:06 -0700
Matthew Brost <matthew.brost@intel.com> wrote:
>
> Ok, getting stats is easier than I thought...
>
> ./perf stat -a -e context-switches,cpu-migrations,task-clock,cycles,instructions /home/mbrost/xe/source/drivers.gpu.i915.igt-gpu-tools/build/tests/xe_exec_threads --r threads-basic
>
> This test creates one thread per engine instance (7 instances this BMG
> device) and submits 1k exec IOCTLs per thread, each performing a DW
> write. Each exec IOCTL typically does not have unsignaled input dependencies.
>
> With IRQ putting of jobs off + no bypass (drm_dep_queue_flags = 0):
>
> 8,449 context-switches
> 412 cpu-migrations
> 2,531.43 msec task-clock
> 1,847,846,588 cpu_atom/cycles/
> 1,847,856,947 cpu_core/cycles/
> <not supported> cpu_atom/instructions/
> 460,744,020 cpu_core/instructions/
>
> With IRQ putting of jobs off + bypass (drm_dep_queue_flags =
> DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED):
>
> 8,655 context-switches
> 229 cpu-migrations
> 2,571.33 msec task-clock
> 855,900,607 cpu_atom/cycles/
> 855,900,272 cpu_core/cycles/
> <not supported> cpu_atom/instructions/
> 403,651,469 cpu_core/instructions/
>
> With IRQ putting of jobs on + bypass (drm_dep_queue_flags =
> DRM_DEP_QUEUE_FLAGS_BYPASS_SUPPORTED |
> DRM_DEP_QUEUE_FLAGS_JOB_PUT_IRQ_SAFE):
>
> 5,361 context-switches
> 169 cpu-migrations
> 2,577.44 msec task-clock
> 685,769,153 cpu_atom/cycles/
> 685,768,407 cpu_core/cycles/
> <not supported> cpu_atom/instructions/
> 321,336,297 cpu_core/instructions/
Thanks for sharing those numbers. For completeness, can you also add the
"With IRQ putting of jobs on + no bypass" case?
I'm a bit surprised by the difference in number of context switches
given I'd expect the local-CPU to be picked in priority, and so queuing
work items on the same wq from another work item to be almost free in
term on scheduling. But I guess there's some load-balancing happening
when you execute jobs at such a high rate.
Also, I don't know if that's just noise or if it's reproducible, but
task-clock seems to be ~40usec lower with the deferred cleanup and
no-bypass (higher throughput because you're not blocking the dequeuing
of the next job on the cleanup of the previous one, I suspect).
next prev parent reply other threads:[~2026-03-24 9:23 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-16 4:32 [RFC PATCH 00/12] Introduce DRM dep queue Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 01/12] workqueue: Add interface to teach lockdep to warn on reclaim violations Matthew Brost
2026-03-25 15:59 ` Tejun Heo
2026-03-26 1:49 ` Matthew Brost
2026-03-26 2:19 ` Tejun Heo
2026-03-27 4:33 ` Matthew Brost
2026-03-27 17:25 ` Tejun Heo
2026-03-16 4:32 ` [RFC PATCH 02/12] drm/dep: Add DRM dependency queue layer Matthew Brost
2026-03-16 9:16 ` Boris Brezillon
2026-03-17 5:22 ` Matthew Brost
2026-03-17 8:48 ` Boris Brezillon
2026-03-16 10:25 ` Danilo Krummrich
2026-03-17 5:10 ` Matthew Brost
2026-03-17 12:19 ` Danilo Krummrich
2026-03-18 23:02 ` Matthew Brost
2026-03-17 2:47 ` Daniel Almeida
2026-03-17 5:45 ` Matthew Brost
2026-03-17 7:17 ` Miguel Ojeda
2026-03-17 8:26 ` Matthew Brost
2026-03-17 12:04 ` Daniel Almeida
2026-03-17 19:41 ` Miguel Ojeda
2026-03-23 17:31 ` Matthew Brost
2026-03-23 17:42 ` Miguel Ojeda
2026-03-17 18:14 ` Matthew Brost
2026-03-17 19:48 ` Daniel Almeida
2026-03-17 20:43 ` Boris Brezillon
2026-03-18 22:40 ` Matthew Brost
2026-03-19 9:57 ` Boris Brezillon
2026-03-22 6:43 ` Matthew Brost
2026-03-23 7:58 ` Matthew Brost
2026-03-23 10:06 ` Boris Brezillon
2026-03-23 17:11 ` Matthew Brost
2026-03-17 12:31 ` Danilo Krummrich
2026-03-17 14:25 ` Daniel Almeida
2026-03-17 14:33 ` Danilo Krummrich
2026-03-18 22:50 ` Matthew Brost
2026-03-17 8:47 ` Christian König
2026-03-17 14:55 ` Boris Brezillon
2026-03-18 23:28 ` Matthew Brost
2026-03-19 9:11 ` Boris Brezillon
2026-03-23 4:50 ` Matthew Brost
2026-03-23 9:55 ` Boris Brezillon
2026-03-23 17:08 ` Matthew Brost
2026-03-23 18:38 ` Matthew Brost
2026-03-24 9:23 ` Boris Brezillon [this message]
2026-03-24 16:06 ` Matthew Brost
2026-03-25 2:33 ` Matthew Brost
2026-03-24 8:49 ` Boris Brezillon
2026-03-24 16:51 ` Matthew Brost
2026-03-17 16:30 ` Shashank Sharma
2026-03-16 4:32 ` [RFC PATCH 03/12] drm/xe: Use WQ_MEM_WARN_ON_RECLAIM on all workqueues in the reclaim path Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 04/12] drm/xe: Issue GGTT invalidation under lock in ggtt_node_remove Matthew Brost
2026-03-26 5:45 ` Bhadane, Dnyaneshwar
2026-03-16 4:32 ` [RFC PATCH 05/12] drm/xe: Return fence from xe_sched_job_arm and adjust job references Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 06/12] drm/xe: Convert to DRM dep queue scheduler layer Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 07/12] drm/xe: Make scheduler message lock IRQ-safe Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 08/12] drm/xe: Rework exec queue object on top of DRM dep Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 09/12] drm/xe: Enable IRQ job put in " Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 10/12] drm/xe: Use DRM dep queue kill semantics Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 11/12] accel/amdxdna: Convert to drm_dep scheduler layer Matthew Brost
2026-03-16 4:32 ` [RFC PATCH 12/12] drm/panthor: " Matthew Brost
2026-03-16 4:52 ` ✗ CI.checkpatch: warning for Introduce DRM dep queue Patchwork
2026-03-16 4:53 ` ✓ CI.KUnit: success " Patchwork
2026-03-16 5:28 ` ✓ Xe.CI.BAT: " Patchwork
2026-03-16 8:09 ` ✗ Xe.CI.FULL: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260324102345.17742bef@fedora \
--to=boris.brezillon@collabora.com \
--cc=airlied@gmail.com \
--cc=christian.koenig@amd.com \
--cc=dakr@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=matthew.brost@intel.com \
--cc=mripard@kernel.org \
--cc=phasta@kernel.org \
--cc=rodrigo.vivi@intel.com \
--cc=simona@ffwll.ch \
--cc=sumit.semwal@linaro.org \
--cc=thomas.hellstrom@linux.intel.com \
--cc=tvrtko.ursulin@igalia.com \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.