From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>,
intel-xe@lists.freedesktop.org, Dave Airlie <airlied@gmail.com>,
Simona Vetter <simona.vetter@ffwll.ch>,
Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
Maarten Lankhorst <maarten.lankhorst@intel.com>,
Matthew Brost <matthew.brost@intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>
Subject: Re: [RFC PATCH] drm/xe/dma-buf: Allow pinning of p2p dma-buf
Date: Tue, 16 Sep 2025 22:35:28 +0200 [thread overview]
Message-ID: <f9fea1925b39e6cb195784497ff031ba5e4967d6.camel@linux.intel.com> (raw)
In-Reply-To: <aMnAIlKMXsmRuXCW@intel.com>
On Tue, 2025-09-16 at 15:53 -0400, Rodrigo Vivi wrote:
> On Tue, Sep 16, 2025 at 03:06:48PM +0200, Thomas Hellström wrote:
> > On Tue, 2025-09-16 at 14:03 +0100, Matthew Auld wrote:
> > > On 16/09/2025 12:53, Thomas Hellström wrote:
> > > > RDMA NICs typically requires the VRAM dma-bufs to be pinned in
> > > > VRAM for pcie-p2p communication, since they don't fully support
> > > > the move_notify() scheme. We would like to support that.
> > > >
> > > > However allowing unaccounted pinning of VRAM creates a DOS
> > > > vector
> > > > so up until now we haven't allowed it.
> > > >
> > > > However with cgroups support in TTM, the amount of VRAM
> > > > allocated
> > > > to a cgroup can be limited, and since also the pinned memory is
> > > > accounted as allocated VRAM we should be safe.
> > > >
> > > > An analogy with system memory can be made if we observe the
> > > > similarity with kernel system memory that is allocated as the
> > > > result of user-space action and that is accounted using
> > > > __GFP_ACCOUNT.
> > > >
> > > > Ideally, to be more flexible, we would add a "pinned_memory",
> > > > or possibly "kernel_memory" limit to the dmem cgroups
> > > > controller,
> > > > that would additionally limit the memory that is pinned in this
> > > > way.
> > > > If we let that limit default to the dmem::max limit we can
> > > > introduce that without needing to care about regressions.
> > > >
> > > > Considering that we already pin VRAM in this way for at least
> > > > page-table memory and LRC memory, and the above path to greater
> > > > flexibility, allow this also for dma-bufs.
> > > >
> > > > Cc: Dave Airlie <airlied@gmail.com>
> > > > Cc: Simona Vetter <simona.vetter@ffwll.ch>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@intel.com>
> > > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > > > Signed-off-by: Thomas Hellström
> > > > <thomas.hellstrom@linux.intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/tests/xe_dma_buf.c | 13 +++++++++
> > > > drivers/gpu/drm/xe/xe_dma_buf.c | 41 +++++++++++++++++-
> > > > ----
> > > > -----
> > > > 2 files changed, 39 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > index a7e548a2bdfb..1f88ca71820c 100644
> > > > --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
> > > > @@ -31,6 +31,7 @@ static void check_residency(struct kunit
> > > > *test,
> > > > struct xe_bo *exported,
> > > > struct drm_exec *exec)
> > > > {
> > > > struct dma_buf_test_params *params =
> > > > to_dma_buf_test_params(test->priv);
> > > > + struct dma_buf_attachment *attach;
> > > > u32 mem_type;
> > > > int ret;
> > > >
> > > > @@ -88,6 +89,18 @@ static void check_residency(struct kunit
> > > > *test,
> > > > struct xe_bo *exported,
> > > >
> > > > KUNIT_EXPECT_TRUE(test, xe_bo_is_mem_type(exported,
> > > > mem_type));
> > > >
> > > > + /* Check that we can pin without migrating. */
> > > > + attach = list_first_entry_or_null(&dmabuf-
> > > > >attachments,
> > > > typeof(*attach), node);
> > > > + if (attach) {
> > > > + int err = dma_buf_pin(attach);
> > > > +
> > > > + if (!err) {
> > > > + KUNIT_EXPECT_TRUE(test,
> > > > xe_bo_is_mem_type(exported, mem_type));
> > > > + dma_buf_unpin(attach);
> > > > + }
> > > > + KUNIT_EXPECT_EQ(test, err, 0);
> > > > + }
> > > > +
> > > > if (params->force_different_devices)
> > > > KUNIT_EXPECT_TRUE(test,
> > > > xe_bo_is_mem_type(imported, XE_PL_TT));
> > > > else
> > > > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > index a7d67725c3ee..54e42960daad 100644
> > > > --- a/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c
> > > > @@ -48,32 +48,43 @@ static void xe_dma_buf_detach(struct
> > > > dma_buf
> > > > *dmabuf,
> > > >
> > > > static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
> > > > {
> > > > - struct drm_gem_object *obj = attach->dmabuf->priv;
> > > > + struct dma_buf *dmabuf = attach->dmabuf;
> > > > + struct drm_gem_object *obj = dmabuf->priv;
> > > > struct xe_bo *bo = gem_to_xe_bo(obj);
> > > > struct xe_device *xe = xe_bo_device(bo);
> > > > struct drm_exec *exec = XE_VALIDATION_UNSUPPORTED;
> > > > + bool allow_vram = true;
> > > > int ret;
> > > >
> > > > - /*
> > > > - * For now only support pinning in TT memory, for two
> > > > reasons:
> > > > - * 1) Avoid pinning in a placement not accessible to
> > > > some
> > > > importers.
> > > > - * 2) Pinning in VRAM requires PIN accounting which is
> > > > a
> > > > to-do.
> > > > - */
> > > > - if (xe_bo_is_pinned(bo) && !xe_bo_is_mem_type(bo,
> > > > XE_PL_TT)) {
> > > > + if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
>
> I don't believe this check is the right one here.
> Perhaps the config is enabled, but the driver importing
> this dma-buf doesn't support it.
So this was actually shamelessly stolen from amdgpu. I'll see if I can
sort out whether this works anyway even if the importer doesn't support
move_notify().
>
> perhaps we want a new config here? so when we transition to cgroups
> we "don't regress" ?!
We currently *do* use cgroups for VRAM. If enabled, all the cgroup's
VRAM allocations (including this pinned memory) don't exceed the
dmem:max limit.
The plan moving forward is to, if needed, try to introduce a
dmem:pinned_memory limit <= dmem:max and when we pin we account against
that as well. If dmem:pinned_memory is set to dmem:max by default, we
don't regress.
Thanks,
Thomas
>
> > > > + allow_vram = false;
> > > > + } else {
> > > > + list_for_each_entry(attach, &dmabuf-
> > > > >attachments,
> > > > node) {
> > > > + if (!attach->peer2peer) {
> > > > + allow_vram = false;
> > > > + break;
> > > > + }
> > > > + }
> > > > + }
> > > > +
> > > > + if (xe_bo_is_pinned(bo) && !xe_bo_is_mem_type(bo,
> > > > XE_PL_TT) &&
> > > > + !(xe_bo_is_vram(bo) && allow_vram)) {
> > > > drm_dbg(&xe->drm, "Can't migrate pinned bo for
> > > > dma-buf pin.\n");
> > > > return -EINVAL;
> > > > }
> > > >
> > > > - ret = xe_bo_migrate(bo, XE_PL_TT, NULL, exec);
> > > > - if (ret) {
> > > > - if (ret != -EINTR && ret != -ERESTARTSYS)
> > > > - drm_dbg(&xe->drm,
> > > > - "Failed migrating dma-buf to
> > > > TT
> > > > memory: %pe\n",
> > > > - ERR_PTR(ret));
> > > > - return ret;
> > > > + if (!allow_vram) {
> > > > + ret = xe_bo_migrate(bo, XE_PL_TT, NULL, exec);
> > > > + if (ret) {
> > > > + if (ret != -EINTR && ret != -
> > > > ERESTARTSYS)
> > > > + drm_dbg(&xe->drm,
> > > > + "Failed migrating dma-
> > > > buf
> > > > to TT memory: %pe\n",
> > > > + ERR_PTR(ret));
> > > > + return ret;
> > > > + }
> > > > }
> > > >
> > > > - ret = xe_bo_pin_external(bo, true, exec);
> > > > + ret = xe_bo_pin_external(bo, !allow_vram, exec);
> > >
> > > Are we also missing save/restore support for such objects? Or at
> > > least I
> > > can't see where the save flow is happening for externally pinned
> > > VRAM?
> >
> > Good point. I forgot about that. IIRC we once made a deliberate
> > decision to leave that out since we didn't support it.
> >
> > I'll have a look at that as well depending if we decide to go
> > ahead with this.
> >
> > /Thomas
> >
> > >
> > > > xe_assert(xe, !ret);
> > > >
> > > > return 0;
> > >
> >
next prev parent reply other threads:[~2025-09-16 20:35 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-16 11:53 [RFC PATCH] drm/xe/dma-buf: Allow pinning of p2p dma-buf Thomas Hellström
2025-09-16 12:10 ` ✓ CI.KUnit: success for " Patchwork
2025-09-16 12:43 ` ✓ Xe.CI.BAT: " Patchwork
2025-09-16 13:03 ` [RFC PATCH] " Matthew Auld
2025-09-16 13:06 ` Thomas Hellström
2025-09-16 18:02 ` Matthew Brost
2025-09-16 18:54 ` Thomas Hellström
2025-09-17 8:48 ` Maarten Lankhorst
2025-09-19 7:11 ` Thomas Hellström
2025-09-16 19:53 ` Rodrigo Vivi
2025-09-16 20:35 ` Thomas Hellström [this message]
2025-09-16 21:05 ` Thomas Hellström
2025-09-16 14:44 ` ✓ Xe.CI.Full: success for " Patchwork
2025-09-17 5:43 ` [RFC PATCH] " Niranjana Vishwanathapura
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f9fea1925b39e6cb195784497ff031ba5e4967d6.camel@linux.intel.com \
--to=thomas.hellstrom@linux.intel.com \
--cc=airlied@gmail.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=joonas.lahtinen@linux.intel.com \
--cc=lucas.demarchi@intel.com \
--cc=maarten.lankhorst@intel.com \
--cc=matthew.auld@intel.com \
--cc=matthew.brost@intel.com \
--cc=rodrigo.vivi@intel.com \
--cc=simona.vetter@ffwll.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox