Intel-XE Archive on lore.kernel.org
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org, stuart.summers@intel.com,
	 francois.dugast@intel.com, daniele.ceraolospurio@intel.com,
	 michal.wajdeczko@intel.com
Subject: Re: [PATCH v3 1/3] drm/xe: Split H2G and G2H into separate buffer objects
Date: Thu, 26 Feb 2026 13:08:45 +0100	[thread overview]
Message-ID: <cc0958e96c7204931c06aac064b29afa0d589d02.camel@linux.intel.com> (raw)
In-Reply-To: <aZ3N+CwyKRARMW6m@lstrano-desk.jf.intel.com>

On Tue, 2026-02-24 at 08:12 -0800, Matthew Brost wrote:
> On Tue, Feb 24, 2026 at 04:58:35PM +0100, Thomas Hellström wrote:
> > On Tue, 2026-02-17 at 20:33 -0800, Matthew Brost wrote:
> > > H2G and G2H buffers have different access patterns (H2G is
> > > CPU-write, GuC-read, while G2H is GPU-write, CPU-read). On dGPU,
> > > these patterns benefit from different memory placements: H2G in
> > > VRAM and G2H in system memory. Split the CT buffer into two
> > > separate buffers, one for H2G and one for G2H, and select the
> > > optimal placement for each.
> > > 
> > > This provides a significant performance improvement on the G2H
> > > read path, reducing a single read from ~20 µs to under 1 µs on
> > > BMG.
> > > 
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > 
> > Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > 
> > Perhaps one could experiment with reading the data from the g2h bo
> > using MOVNTDQA, like the write-combining memcopy. That would avoid
> > caching the data and the GuC having to invalidate the cache line
> > while
> > snooping on the next write.
> 
> We can try that, but G2H messages are variable-sized, so I believe
> it will get a little tricky. Once these are system-memory reads, I
> recall G2H handling being something like 15 per µs of page faults
> (maybe that isn't correct; I'll double-check), and that included my
> not-yet-posted caching implementation, which also takes a spinlock,
> examines the page-fault cache, and chains the fault onto a list. So
> I don't think this will end up in the critical path.

Actually, the docs say MOVNTDQA only has an effect on write-combining
mappings, so experimenting with that is probably a dead end.
There's also PREFETCHNTA, though, which may or may not have an effect.
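
For reference, the PREFETCHNTA idea can be sketched from userspace
with GCC/Clang's __builtin_prefetch, which emits PREFETCHNTA on x86
when the locality argument is 0. This is only an illustration of the
access pattern; g2h_read and the ring layout here are made up for the
sketch, not xe driver code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of a G2H-style read loop: issue a non-temporal prefetch
 * (PREFETCHNTA on x86) for the next cache line before copying the
 * current one, so the fetched lines are less likely to linger in
 * the cache and force the GuC to snoop-invalidate them on its next
 * write. Hypothetical helper, not an xe driver symbol.
 */
static void g2h_read(const uint8_t *ring, size_t head, size_t len,
		     uint8_t *out)
{
	for (size_t off = 0; off < len; off += 64) {
		/* locality 0 == non-temporal hint (PREFETCHNTA) */
		__builtin_prefetch(ring + head + off + 64, 0, 0);
		size_t chunk = (len - off) < 64 ? (len - off) : 64;
		memcpy(out + off, ring + head + off, chunk);
	}
}
```

Whether the hint actually helps would need measuring on BMG; the
prefetch is purely advisory and may be ignored by the core.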

/Thomas


Thread overview: 29+ messages
2026-02-18  4:33 [PATCH v3 0/3] dGPU memory optimizations Matthew Brost
2026-02-18  4:33 ` [PATCH v3 1/3] drm/xe: Split H2G and G2H into separate buffer objects Matthew Brost
2026-02-18 23:12   ` Summers, Stuart
2026-02-19  3:46     ` Matthew Brost
2026-02-24 15:58   ` Thomas Hellström
2026-02-24 16:12     ` Matthew Brost
2026-02-25 10:55       ` Thomas Hellström
2026-02-25 18:08         ` Matthew Brost
2026-02-26 12:08       ` Thomas Hellström [this message]
2026-02-18  4:33 ` [PATCH v3 2/3] drm/xe: Avoid unconditional VRAM reads in H2G path Matthew Brost
2026-02-18 23:20   ` Summers, Stuart
2026-02-26 12:47   ` Thomas Hellström
2026-02-18  4:33 ` [PATCH v3 3/3] drm/xe: Move LRC seqno to system memory to avoid slow dGPU reads Matthew Brost
2026-02-24  2:40   ` Matthew Brost
2026-02-26 12:25   ` Thomas Hellström
2026-02-26 17:11     ` Matthew Brost
2026-02-26 17:26       ` Matthew Brost
2026-02-26 17:56         ` Thomas Hellström
2026-02-26 12:43   ` Thomas Hellström
2026-02-26 16:55     ` Matthew Brost
2026-02-18  4:40 ` ✓ CI.KUnit: success for dGPU memory optimizations Patchwork
2026-02-18  5:23 ` ✗ Xe.CI.BAT: failure " Patchwork
2026-02-18  6:15 ` ✗ Xe.CI.FULL: " Patchwork
2026-02-18  7:07 ` ✓ CI.KUnit: success for dGPU memory optimizations (rev2) Patchwork
2026-02-18  7:36 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-18  7:53 ` ✓ Xe.CI.FULL: " Patchwork
2026-02-18 12:29 ` ✓ CI.KUnit: success for dGPU memory optimizations (rev3) Patchwork
2026-02-18 13:09 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-18 14:08 ` ✗ Xe.CI.FULL: failure " Patchwork
