From: Daniel Vetter <daniel-/w4YWyX8dFk@public.gmane.org>
To: Kenny Ho <y2kenny-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: joseph.greathouse-5C7GfCeVMHo@public.gmane.org,
	"Kenny Ho" <Kenny.Ho-5C7GfCeVMHo@public.gmane.org>,
	jsparks-WVYJKLFxKCc@public.gmane.org,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	lkaplan-WVYJKLFxKCc@public.gmane.org,
	"Alex Deucher" <alexander.deucher-5C7GfCeVMHo@public.gmane.org>,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	"Daniel Vetter" <daniel-/w4YWyX8dFk@public.gmane.org>,
	"Tejun Heo" <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"Christian König" <christian.koenig-5C7GfCeVMHo@public.gmane.org>
Subject: Re: [RFC PATCH v3 04/11] drm, cgroup: Add total GEM buffer allocation limit
Date: Wed, 26 Jun 2019 23:41:13 +0200	[thread overview]
Message-ID: <20190626214113.GA12905@phenom.ffwll.local> (raw)
In-Reply-To: <CAOWid-eurCMx1F7ciUwx0e+p=s=NP8=UxQUhhF-hdK-iAna+fA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Wed, Jun 26, 2019 at 05:27:48PM -0400, Kenny Ho wrote:
> On Wed, Jun 26, 2019 at 12:05 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >
> > > drm.buffer.default
> > >         A read-only flat-keyed file which exists on the root cgroup.
> > >         Each entry is keyed by the drm device's major:minor.
> > >
> > >         Default limits on the total GEM buffer allocation in bytes.
> >
> > Don't we need a "0 means no limit" semantics here?
> 
> I believe the convention is to use the 'max' keyword.
> 
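
(Just to illustrate the convention, not from the patch: reading such a
flat-keyed file would give something like

	226:0 max
	226:1 268435456

where 226 is the DRM char device major, "max" means no limit and the
second entry is a 256 MiB limit; the minor numbers here are made up.)
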
> >
> > I think we need a new drm-cgroup.rst which contains all this
> > documentation.
> 
> Yes I planned to do that when things are more finalized.  I am
> actually writing the commit message following the current doc format
> so I can reuse it in the rst.

Awesome.

> > With multiple GPUs, do we need an overall GEM bo limit, across all gpus?
> > For other stuff later on like vram/tt/... and all that it needs to be
> > per-device, but I think one overall limit could be useful.
> 
> This one I am not sure about, but it should be fairly straightforward
> to add.  I'd love to hear more feedback on this as well.
> 
> > >       if (!amdgpu_bo_validate_size(adev, size, bp->domain))
> > >               return -ENOMEM;
> > >
> > > +     if (!drmcgrp_bo_can_allocate(current, adev->ddev, size))
> > > +             return -ENOMEM;
> >
> > So what happens when you start a lot of threads all at the same time,
> > allocating gem bo? Also would be nice if we could roll out at least the
> > accounting part of this cgroup to all GEM drivers.
> 
> When there is a large number of allocations, they will be checked in
> sequence within a device (since I used a per-device mutex in the
> check.)  Are you suggesting the overhead here is significant enough
> to be a bottleneck?  The accounting part should be available to all
> GEM drivers (unless I missed something) since the chg and unchg
> functions are called via the generic drm_gem_private_object_init and
> drm_gem_object_release.

thread 1: checks limits, still under the total

thread 2: checks limits, still under the total

thread 1: allocates, still under

thread 2: allocates, now over the limit

I think the check and the chg need to be one step, or this won't work.
Or I'm missing something somewhere.
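
Roughly what I have in mind, as a sketch with made-up names (struct
drmcg_device_resource, bo_limit, bo_charged and drmcg_try_charge are
not from your series), where the limit check and the charge happen
under the same lock:

	/* illustrative only: per-device state tracked by the cgroup */
	struct drmcg_device_resource {
		struct mutex	lock;
		u64		bo_limit;	/* configured limit in bytes */
		u64		bo_charged;	/* bytes currently charged */
	};

	static int drmcg_try_charge(struct drmcg_device_resource *res, u64 size)
	{
		int ret = 0;

		/* check and charge in one critical section, so two threads
		 * can never both pass the check before either has charged
		 */
		mutex_lock(&res->lock);
		if (res->bo_charged + size > res->bo_limit)
			ret = -ENOMEM;
		else
			res->bo_charged += size;
		mutex_unlock(&res->lock);

		return ret;
	}

The allocation path would then call this instead of a separate
can_allocate check, and unwind the charge if the bo creation fails
afterwards.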

Wrt rolling out the accounting for all drivers: Since you also roll out
enforcement in this patch I'm not sure whether the accounting part is
fully stand-alone. And as discussed a bit on an earlier patch, I think for
DRIVER_GEM we should set up the accounting cgroup automatically.
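
E.g. something like this in the core (drmcg_register_dev is a made-up
name, the feature check is the existing one):

	/* in drm_dev_register(), once the device is up: set up cgroup
	 * accounting for every GEM-capable driver, no per-driver opt-in
	 */
	if (drm_core_check_feature(dev, DRIVER_GEM))
		drmcg_register_dev(dev);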

> > > +     /* only allow bo from the same cgroup or its ancestor to be imported */
> > > +     if (drmcgrp != NULL &&
> >
> > Quite a serious limitation here ...
> >
> > > +                     !drmcgrp_is_self_or_ancestor(drmcgrp, obj->drmcgrp)) {
> >
> > Also what happens if you actually share across devices? Then importing in
> > the 2nd group is suddenly possible, and I think will be double-counted.
> >
> > What's the underlying technical reason for not allowing sharing across
> > cgroups?
> 
> With the current implementation, there shouldn't be double counting as
> the counting is done during the buffer init.

If you share across devices there will be two drm_gem_object structures,
one on each device, but only one underlying bo.

Now the bo limit is per-device too, so that's all fine, but for a global
bo limit we'd need to make sure we count these only once.
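
For a global limit, one way to count the underlying bo only once would
be to skip the charge on the import side, e.g. (import_attach is the
existing PRIME attachment pointer, drmcg_chg_bo_alloc is a made-up
name for your charge helper):

	static void drmcg_account_bo(struct drm_gem_object *obj)
	{
		/* an imported dma-buf is backed by a bo that the
		 * exporter's cgroup already charged, don't count it twice
		 */
		if (obj->import_attach)
			return;

		drmcg_chg_bo_alloc(current, obj->dev, obj->size);
	}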

> To be clear, sharing across cgroups is allowed, the buffer just needs
> to be allocated by a process whose cgroup is the importer's cgroup or
> an ancestor of it.  So in the case of Xorg allocating buffers for
> clients, Xorg would be in the root cgroup and the buffers can be
> passed around by different clients (in the root or other cgroups.)
> The idea here is to establish some form of ownership, otherwise there
> wouldn't be a way to account for or limit the usage.

But why? What's the problem if I allocate something and then hand it to
someone else?  E.g. one popular use of cgroups is to isolate clients, so
maybe you'd do a cgroup + namespace for each X11 client (OK, Wayland;
with X11 this is probably pointless).

But with your current limitation those clients can't pass buffers to the
compositor anymore, making cgroups useless. Your example here only works
if Xorg is in the root and allocates all the buffers. That's not even true
for DRI3 anymore.

So this is a pretty serious limitation on cgroups, and I don't really
understand why we need it. I think if we want to prevent buffer sharing,
what we need are some SELinux hooks and the like, so you can prevent an
import/access by someone who's not allowed to touch a buffer. But that
kind of access-rights management should be separate from resource
control imo.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

