From: Oded Gabbay <oded.gabbay@amd.com>
To: "Daniel Vetter" <daniel@ffwll.ch>,
	"Thomas Hellstrom" <thellstrom@vmware.com>,
	"Jérôme Glisse" <jglisse@redhat.com>
Cc: "Michel Dänzer" <michel@daenzer.net>,
	dri-devel@lists.freedesktop.org,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>
Subject: Re: GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)
Date: Wed, 13 Aug 2014 17:09:49 +0300
Message-ID: <53EB71AD.1070904@amd.com>
In-Reply-To: <20140813130108.GA10500@phenom.ffwll.local>



On 13/08/14 16:01, Daniel Vetter wrote:
> On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
>> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
>>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>>>> On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
>>>>> From: Jérôme Glisse <jglisse@redhat.com>
>>>>>
>>>>> When experiencing memory pressure we want to minimize the pool size so that
>>>>> memory we just shrank is not added back again right away.
>>>>>
>>>>> This halves the maximum pool size for each device each time
>>>>> the pool has to shrink. The limit is bumped again if the next allocation
>>>>> happens at least one second after the last shrink. The one-second delay is
>>>>> obviously an arbitrary choice.
>>>> Jérôme,
>>>>
>>>> I don't like this patch. It adds extra complexity and its usefulness is
>>>> highly questionable.
>>>> There are a number of caches in the system, and if all of them added
>>>> some sort of voluntary shrink heuristics like this, we'd end up with
>>>> impossible-to-debug unpredictable performance issues.
>>>>
>>>> We should let the memory subsystem decide when to reclaim pages from
>>>> caches and what caches to reclaim them from.
>>> Yeah, artificially limiting your cache from growing when your shrinker
>>> gets called will just break the equal-memory pressure the core mm uses to
>>> rebalance between all caches when workload changes. In i915 we let
>>> everything grow without artificial bounds and only rely upon the shrinker
>>> callbacks to ensure we don't consume more than our fair share of available
>>> memory overall.
>>> -Daniel
>>
>> Now when you bring i915 memory usage up, Daniel,
>> I can't refrain from bringing up the old user-space unreclaimable kernel
>> memory issue, for which gem open is a good example ;) Each time
>> user-space opens a gem handle, some un-reclaimable kernel memory is
>> allocated, for which there is no accounting, so theoretically I think a
>> user can bring a system to unusability this way.
>>
>> Typically there are various limits on unreclaimable objects like this,
>> like open file descriptors, and IIRC the kernel even has an internal
>> limit on the number of struct files you initialize, based on the
>> available system memory, so dma-buf / prime should already have some
>> sort of protection.
>
> Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> so there's not really a way to isolate gpu memory usage in a sane way for
> specific processes. But there's also zero limits on actual gpu usage
> itself (timeslices or whatever) so I guess no one asked for this yet.
>
> My comment really was about balancing mm users under the assumption that
> they're all unlimited.
> -Daniel
>
I think the point you brought up becomes very important for compute (HSA)
processes. I still don't know how to distinguish between legitimate use of GPU
local memory and misbehaving/malicious processes.

We have a requirement that HSA processes be allowed to allocate and pin GPU
local memory. They do it through an ioctl.
In the kernel driver, we keep an accounting of those memory allocations, meaning
that I can print a list of all the objects that were allocated by a certain
process, per device.
Therefore, in theory, I can reclaim any object, but that will probably break the
userspace app. If the app is misbehaving/malicious then that's OK, I guess. But
how do I know that? And what prevents that malicious app from respawning and
making the same allocations again?
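
To make the accounting idea above concrete, here is a minimal userspace sketch in C of per-process, per-device bookkeeping with a cap on pinned memory. Everything in it is an assumption for illustration: `struct proc_gpu_account`, `account_charge()`, `account_uncharge()`, `MAX_DEVICES` and the per-device limit are hypothetical names, not the actual driver's structures or ioctl path. The cap is only one possible policy; the thread does not settle on one.

```c
#include <stdint.h>

#define MAX_DEVICES 4 /* hypothetical number of GPUs being tracked */

/* Hypothetical accounting record for one process; not a real driver struct. */
struct proc_gpu_account {
	int      pid;
	uint64_t pinned_bytes[MAX_DEVICES]; /* bytes currently pinned, per device */
	uint64_t limit_bytes[MAX_DEVICES];  /* assumed per-device cap on pinned bytes */
};

/*
 * Charge a pinned allocation of 'size' bytes on device 'dev'.
 * Returns 0 on success, -1 if the device index is invalid or the
 * allocation would exceed the cap.
 */
static int account_charge(struct proc_gpu_account *acc, unsigned int dev,
			  uint64_t size)
{
	if (dev >= MAX_DEVICES)
		return -1;
	/* Compare against the remaining budget to avoid integer overflow. */
	if (acc->pinned_bytes[dev] > acc->limit_bytes[dev] ||
	    size > acc->limit_bytes[dev] - acc->pinned_bytes[dev])
		return -1;
	acc->pinned_bytes[dev] += size;
	return 0;
}

/* Release a previously charged allocation; clamps at zero. */
static void account_uncharge(struct proc_gpu_account *acc, unsigned int dev,
			     uint64_t size)
{
	if (dev >= MAX_DEVICES)
		return;
	if (size > acc->pinned_bytes[dev])
		size = acc->pinned_bytes[dev];
	acc->pinned_bytes[dev] -= size;
}
```

Note that a record keyed by process, as sketched here, resets when the process respawns, so it does not by itself answer the respawn question. Keying the same counters by user or by some other longer-lived entity would be one way around that, but that is also an assumption and not something proposed in this thread.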

	Oded
