Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>,
	intel-xe@lists.freedesktop.org,  Dave Airlie <airlied@gmail.com>,
	Simona Vetter <simona.vetter@ffwll.ch>,
	Joonas Lahtinen	 <joonas.lahtinen@linux.intel.com>,
	Maarten Lankhorst	 <maarten.lankhorst@intel.com>,
	Matthew Brost <matthew.brost@intel.com>,
	 Lucas De Marchi <lucas.demarchi@intel.com>
Subject: Re: [RFC PATCH] drm/xe/dma-buf: Allow pinning of p2p dma-buf
Date: Tue, 16 Sep 2025 23:05:19 +0200	[thread overview]
Message-ID: <5a4e9081c66c4d92a4b21321b12a28f071857ffc.camel@linux.intel.com> (raw)
In-Reply-To: <f9fea1925b39e6cb195784497ff031ba5e4967d6.camel@linux.intel.com>

On Tue, 2025-09-16 at 22:35 +0200, Thomas Hellström wrote:
> On Tue, 2025-09-16 at 15:53 -0400, Rodrigo Vivi wrote:
> > On Tue, Sep 16, 2025 at 03:06:48PM +0200, Thomas Hellström wrote:
> > > On Tue, 2025-09-16 at 14:03 +0100, Matthew Auld wrote:
> > > > On 16/09/2025 12:53, Thomas Hellström wrote:
> > > > > RDMA NICs typically requires the VRAM dma-bufs to be pinned
> > > > > in
> > > > > VRAM for pcie-p2p communication, since they don't fully
> > > > > support
> > > > > the move_notify() scheme. We would like to support that.
> > > > > 
> > > > > However allowing unaccounted pinning of VRAM creates a DOS
> > > > > vector
> > > > > so up until now we haven't allowed it.
> > > > > 
> > > > > However with cgroups support in TTM, the amount of VRAM
> > > > > allocated
> > > > > to a cgroup can be limited, and since also the pinned memory
> > > > > is
> > > > > accounted as allocated VRAM we should be safe.
> > > > > 
> > > > > An analogy with system memory can be made if we observe the
> > > > > similarity with kernel system memory that is allocated as the
> > > > > result of user-space action and that is accounted using
> > > > > __GFP_ACCOUNT.
> > > > > 
> > > > > Ideally, to be more flexible, we would add a "pinned_memory",
> > > > > or possibly "kernel_memory" limit to the dmem cgroups
> > > > > controller,
> > > > > that would additionally limit the memory that is pinned in
> > > > > this
> > > > > way.
> > > > > If we let that limit default to the dmem::max limit we can
> > > > > introduce that without needing to care about regressions.
> > > > > 
> > > > > Considering that we already pin VRAM in this way for at least
> > > > > page-table memory and LRC memory, and the above path to
> > > > > greater
> > > > > flexibility, allow this also for dma-bufs.
> > > > > 
> > > > > Cc: Dave Airlie <airlied@gmail.com>
> > > > > Cc: Simona Vetter <simona.vetter@ffwll.ch>
> > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > Cc: Maarten Lankhorst <maarten.lankhorst@intel.com>
> > > > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > > > > Signed-off-by: Thomas Hellström
> > > > > <thomas.hellstrom@linux.intel.com>
> > > > > ---
> > > > >   drivers/gpu/drm/xe/tests/xe_dma_buf.c | 13 +++++++++
> > > > >   drivers/gpu/drm/xe/xe_dma_buf.c       | 41
> > > > > +++++++++++++++++-
> > > > > ----
> > > > > -----
> > > > >   2 files changed, 39 insertions(+), 15 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > > b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > > index a7e548a2bdfb..1f88ca71820c 100644
> > > > > --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > > +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > > @@ -31,6 +31,7 @@ static void check_residency(struct kunit
> > > > > *test,
> > > > > struct xe_bo *exported,
> > > > >   			    struct drm_exec *exec)
> > > > >   {
> > > > >   	struct dma_buf_test_params *params =
> > > > > to_dma_buf_test_params(test->priv);
> > > > > +	struct dma_buf_attachment *attach;
> > > > >   	u32 mem_type;
> > > > >   	int ret;
> > > > >   
> > > > > @@ -88,6 +89,18 @@ static void check_residency(struct kunit
> > > > > *test,
> > > > > struct xe_bo *exported,
> > > > >   
> > > > >   	KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported,
> > > > > mem_type));
> > > > >   
> > > > > +	/* Check that we can pin without migrating. */
> > > > > +	attach = list_first_entry_or_null(&dmabuf-
> > > > > > attachments,
> > > > > typeof(*attach), node);
> > > > > +	if (attach) {
> > > > > +		int err = dma_buf_pin(attach);
> > > > > +
> > > > > +		if (!err) {
> > > > > +			KUNIT_EXPECT_TRUE(test,
> > > > > xe_bo_is_mem_type(exported, mem_type));
> > > > > +			dma_buf_unpin(attach);
> > > > > +		}
> > > > > +		KUNIT_EXPECT_EQ(test, err, 0);
> > > > > +	}
> > > > > +
> > > > >   	if (params->force_different_devices)
> > > > >   		KUNIT_EXPECT_TRUE(test,
> > > > > xe_bo_is_mem_type(imported, XE_PL_TT));
> > > > >   	else
> > > > > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > > b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > > index a7d67725c3ee..54e42960daad 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > > @@ -48,32 +48,43 @@ static void xe_dma_buf_detach(struct
> > > > > dma_buf
> > > > > *dmabuf,
> > > > >   
> > > > >   static int xe_dma_buf_pin(struct dma_buf_attachment
> > > > > *attach)
> > > > >   {
> > > > > -	struct drm_gem_object *obj = attach->dmabuf->priv;
> > > > > +	struct dma_buf *dmabuf = attach->dmabuf;
> > > > > +	struct drm_gem_object *obj = dmabuf->priv;
> > > > >   	struct xe_bo *bo = gem_to_xe_bo(obj);
> > > > >   	struct xe_device *xe = xe_bo_device(bo);
> > > > >   	struct drm_exec *exec = XE_VALIDATION_UNSUPPORTED;
> > > > > +	bool allow_vram = true;
> > > > >   	int ret;
> > > > >   
> > > > > -	/*
> > > > > -	 * For now only support pinning in TT memory, for
> > > > > two
> > > > > reasons:
> > > > > -	 * 1) Avoid pinning in a placement not accessible to
> > > > > some
> > > > > importers.
> > > > > -	 * 2) Pinning in VRAM requires PIN accounting which
> > > > > is
> > > > > a
> > > > > to-do.
> > > > > -	 */
> > > > > -	if (xe_bo_is_pinned(bo) && !xe_bo_is_mem_type(bo,
> > > > > XE_PL_TT)) {
> > > > > +	if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
> > 
> > I don't believe this check is the right one here.
> > Perhaps the config is enabled, but the driver importing
> > this dma-buf doesn't support it.
> 
> So this was actually shamelessly stolen from amdgpu. I'll see if I
> can
> sort out whether this works anyway even if the importer doesn't
> support
> move_notify().

It turns out that the dma-buf core requires a move_notify() callback
for p2p dma. So checking the attachments should be sufficient. However
the config check avoids that in the case MOVE_NOTIFY is not enabled.

/Thomas



> 
> > 
> > perhaps we want a new config here? so when we transition to cgroups
> > we "don't regress" ?!
> 
> We currently *do* use cgroups for VRAM. If enabled, all the cgroup's
> VRAM allocations (including this pinned memory) don't exceed the
> dmem:max limit.
> The plan moving forward is to, if needed, try to introduce a
> dmem:pinned_memory limit <= dmem:max and when we pin we account
> against
> that as well. If dmem:pinned_memory is set to dmem:max by default, we
> don't regress.
> 
> Thanks,
> Thomas
> 
> 
> > 
> > > > > +		allow_vram = false;
> > > > > +	} else {
> > > > > +		list_for_each_entry(attach, &dmabuf-
> > > > > > attachments,
> > > > > node) {
> > > > > +			if (!attach->peer2peer) {
> > > > > +				allow_vram = false;
> > > > > +				break;
> > > > > +			}
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	if (xe_bo_is_pinned(bo) && !xe_bo_is_mem_type(bo,
> > > > > XE_PL_TT) &&
> > > > > +	    !(xe_bo_is_vram(bo) && allow_vram)) {
> > > > >   		drm_dbg(&xe->drm, "Can't migrate pinned bo
> > > > > for
> > > > > dma-buf pin.\n");
> > > > >   		return -EINVAL;
> > > > >   	}
> > > > >   
> > > > > -	ret = xe_bo_migrate(bo, XE_PL_TT, NULL, exec);
> > > > > -	if (ret) {
> > > > > -		if (ret != -EINTR && ret != -ERESTARTSYS)
> > > > > -			drm_dbg(&xe->drm,
> > > > > -				"Failed migrating dma-buf to
> > > > > TT
> > > > > memory: %pe\n",
> > > > > -				ERR_PTR(ret));
> > > > > -		return ret;
> > > > > +	if (!allow_vram) {
> > > > > +		ret = xe_bo_migrate(bo, XE_PL_TT, NULL,
> > > > > exec);
> > > > > +		if (ret) {
> > > > > +			if (ret != -EINTR && ret != -
> > > > > ERESTARTSYS)
> > > > > +				drm_dbg(&xe->drm,
> > > > > +					"Failed migrating
> > > > > dma-
> > > > > buf
> > > > > to TT memory: %pe\n",
> > > > > +					ERR_PTR(ret));
> > > > > +			return ret;
> > > > > +		}
> > > > >   	}
> > > > >   
> > > > > -	ret = xe_bo_pin_external(bo, true, exec);
> > > > > +	ret = xe_bo_pin_external(bo, !allow_vram, exec);
> > > > 
> > > > Are we also missing save/restore support for such objects? Or
> > > > at
> > > > least I 
> > > > can't see where the save flow is happening for externally
> > > > pinned
> > > > VRAM?
> > > 
> > > Good point. I forgot about that. IIRC we once made a deliberate
> > > decision to leave that out since we didn't support it.
> > > 
> > > I'll have a look at that as well depending if we decide to go
> > > ahead with this.
> > > 
> > > /Thomas
> > > 
> > > > 
> > > > >   	xe_assert(xe, !ret);
> > > > >   
> > > > >   	return 0;
> > > > 
> > > 
> 


  reply	other threads:[~2025-09-16 21:05 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-16 11:53 [RFC PATCH] drm/xe/dma-buf: Allow pinning of p2p dma-buf Thomas Hellström
2025-09-16 12:10 ` ✓ CI.KUnit: success for " Patchwork
2025-09-16 12:43 ` ✓ Xe.CI.BAT: " Patchwork
2025-09-16 13:03 ` [RFC PATCH] " Matthew Auld
2025-09-16 13:06   ` Thomas Hellström
2025-09-16 18:02     ` Matthew Brost
2025-09-16 18:54       ` Thomas Hellström
2025-09-17  8:48         ` Maarten Lankhorst
2025-09-19  7:11           ` Thomas Hellström
2025-09-16 19:53     ` Rodrigo Vivi
2025-09-16 20:35       ` Thomas Hellström
2025-09-16 21:05         ` Thomas Hellström [this message]
2025-09-16 14:44 ` ✓ Xe.CI.Full: success for " Patchwork
2025-09-17  5:43 ` [RFC PATCH] " Niranjana Vishwanathapura

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5a4e9081c66c4d92a4b21321b12a28f071857ffc.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=airlied@gmail.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=lucas.demarchi@intel.com \
    --cc=maarten.lankhorst@intel.com \
    --cc=matthew.auld@intel.com \
    --cc=matthew.brost@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=simona.vetter@ffwll.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox