Re: [RFC 0/3] cgroups: Add support for pinned device memory

From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: "Natalie Vock" <natalie.vock@gmx.de>,
	"Maarten Lankhorst" <dev@lankhorst.se>,
	"Lucas De Marchi" <lucas.demarchi@intel.com>,
	"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
	"David Airlie" <airlied@gmail.com>,
	"Simona Vetter" <simona@ffwll.ch>,
	"Maxime Ripard" <mripard@kernel.org>, "Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"'Michal Koutný'" <mkoutny@suse.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"David Hildenbrand" <david@redhat.com>,
	"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
	"'Liam R . Howlett'" <Liam.Howlett@oracle.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Mike Rapoport" <rppt@kernel.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Thomas Zimmermann" <tzimmermann@suse.de>
Cc: Michal Hocko <mhocko@suse.com>,
	intel-xe@lists.freedesktop.org,  dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org,  cgroups@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC 0/3] cgroups: Add support for pinned device memory
Date: Mon, 01 Sep 2025 16:37:51 +0200
Message-ID: <3713e6d83421fcf64978927a1cb40fae1e3c7a57.camel@linux.intel.com>
In-Reply-To: <25b42c8e-7233-4121-b253-e044e022b327@gmx.de>

Hi,

On Mon, 2025-09-01 at 14:45 +0200, Natalie Vock wrote:
> Hi,
> 
> On 8/19/25 13:49, Maarten Lankhorst wrote:
> > When exporting dma-bufs to other devices, even when it is allowed
> > to use
> > move_notify in some drivers, performance will degrade severely when
> > eviction happens.
> > 
> > A particular example where this can happen is in a multi-card
> > setup,
> > where PCI-E peer-to-peer is used to avoid going through system
> > memory.
> > 
> > If the buffer is evicted to system memory, not only the evicting
> > GPU where
> > the buffer resided is affected, but it will also stall the GPU that
> > is
> > waiting on the buffer.
> > 
> > It also makes sense for long-running jobs not to be preempted by
> > having
> > their buffers evicted, so it will make sense to have the ability to
> > pin
> > from system memory too.
> > 
> > This is dependent on patches by Dave Airlie, so it's not part of
> > this
> > series yet. But I'm planning on extending pinning to the memory
> > cgroup
> > controller in the future to handle this case.
> > 
> > Implementation details:
> > 
> > For each cgroup up until the root cgroup, the 'min' limit is
> > checked
> > against the currently effectively pinned value. If the value will go
> > above
> > 'min', the pinning attempt is rejected.
> 
> Why do you want to reject pins in this case? What happens in desktop 
> use cases (e.g. PRIME buffer sharing)? AFAIU, you kind of need to be
> able 
> to pin buffers and export them to other devices for that whole thing
> to 
> work, right? If the user doesn't explicitly set a min value, wouldn't
> the value being zero mean any pins will be rejected (and thus PRIME 
> would break)?

That's really the point. If an unprivileged malicious process is
allowed to pin arbitrary amounts of memory, that's a DoS vector.

However, drivers that allow unlimited pinning today need to take care
when implementing restrictions to avoid regressions, perhaps by
adding this behind a config option.

That said, IMO dma-buf clients should implement move_notify() whenever
possible to provide an option to avoid pinning unless necessary.

/Thomas



> 
> If your objective is to prevent pinned buffers from being evicted, 
> perhaps you could instead make TTM try to avoid evicting pinned
> buffers 
> and prefer unpinned buffers as long as there are unpinned buffers to 
> evict? As long as the total amount of pinned memory stays below min,
> no 
> pinned buffers should get evicted with that either.


> 
> Best,
> Natalie
> 
> > 
> > Pinned memory is handled slightly differently and affects calculating
> > effective min/low values. Pinned memory is subtracted from both,
> > and needs to be added afterwards when calculating.
> > 
> > This is because, as the amount of pinned memory increases, the
> > amount of free min/low memory decreases for all cgroups that are
> > part of the hierarchy.
> > 
> > Maarten Lankhorst (3):
> >    page_counter: Allow for pinning some amount of memory
> >    cgroup/dmem: Implement pinning device memory
> >    drm/xe: Add DRM_XE_GEM_CREATE_FLAG_PINNED flag and implementation
> > 
> >   drivers/gpu/drm/xe/xe_bo.c      | 66 +++++++++++++++++++++-
> >   drivers/gpu/drm/xe/xe_dma_buf.c | 10 +++-
> >   include/linux/cgroup_dmem.h     |  2 +
> >   include/linux/page_counter.h    |  8 +++
> >   include/uapi/drm/xe_drm.h       | 10 +++-
> >   kernel/cgroup/dmem.c            | 57 ++++++++++++++++++-
> >   mm/page_counter.c               | 98 ++++++++++++++++++++++++++++++---
> >   7 files changed, 237 insertions(+), 14 deletions(-)
> > 
>
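
For readers following the thread, the admission rule from the quoted
"Implementation details" (walk from the pinning cgroup towards the root
and reject the pin if any level would go above its 'min') can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the code from the series: struct pin_counter, try_pin() and unpin()
are hypothetical names, the top-most counter is assumed to be checked as
well, the "effectively pinned" value is simplified to a plain per-level
sum, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-cgroup device-memory counter. */
struct pin_counter {
	struct pin_counter *parent;	/* NULL for the top-most counter */
	unsigned long min;		/* configured 'min' protection, in pages */
	unsigned long pinned;		/* pages pinned at or below this level */
};

/*
 * Try to pin nr_pages on behalf of cg.
 *
 * First pass: walk from cg up to the top and fail if any level would end
 * up with more pinned pages than its 'min'. Second pass: charge every
 * level. Nothing is charged when the pin is rejected.
 */
static bool try_pin(struct pin_counter *cg, unsigned long nr_pages)
{
	struct pin_counter *c;

	assert(cg != NULL);

	for (c = cg; c; c = c->parent) {
		/* Written this way to avoid overflow in pinned + nr_pages. */
		if (nr_pages > c->min || c->pinned > c->min - nr_pages)
			return false;
	}

	for (c = cg; c; c = c->parent)
		c->pinned += nr_pages;

	return true;
}

/* Undo a successful try_pin() of the same size. */
static void unpin(struct pin_counter *cg, unsigned long nr_pages)
{
	for (; cg; cg = cg->parent) {
		assert(cg->pinned >= nr_pages);
		cg->pinned -= nr_pages;
	}
}
```

With a sketch like this, a counter whose 'min' was never raised from zero
rejects every pin, which is the PRIME concern raised above; a counter with
'min' set accepts pins only until its own budget, or that of any
ancestor, is used up.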