From: Oded Gabbay <oded.gabbay@amd.com>
To: Jerome Glisse <j.glisse@gmail.com>, Alex Deucher <alexdeucher@gmail.com>
Cc: "Bridgman, John" <John.Bridgman@amd.com>,
"Jesse Barnes" <jbarnes@virtuousgeek.org>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>,
"Christian König" <deathsimple@vodafone.de>,
"Lewycky, Andrew" <Andrew.Lewycky@amd.com>,
"David Airlie" <airlied@linux.ie>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver
Date: Thu, 24 Jul 2014 21:57:16 +0300
Message-ID: <53D1570C.5000704@amd.com>
In-Reply-To: <20140724184739.GA6177@gmail.com>
On 24/07/14 21:47, Jerome Glisse wrote:
> On Thu, Jul 24, 2014 at 01:35:53PM -0400, Alex Deucher wrote:
>> On Thu, Jul 24, 2014 at 11:44 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
>>> On Thu, Jul 24, 2014 at 01:01:41AM +0300, Oded Gabbay wrote:
>>>> On 24/07/14 00:46, Bridgman, John wrote:
>>>>>
>>>>>> -----Original Message----- From: dri-devel
>>>>>> [mailto:dri-devel-bounces@lists.freedesktop.org] On Behalf Of Jesse
>>>>>> Barnes Sent: Wednesday, July 23, 2014 5:00 PM To:
>>>>>> dri-devel@lists.freedesktop.org Subject: Re: [PATCH v2 00/25]
>>>>>> AMDKFD kernel driver
>>>>>>
>>>>>> On Mon, 21 Jul 2014 19:05:46 +0200, Daniel Vetter
>>>>>> <daniel@ffwll.ch> wrote:
>>>>>>
>>>>>>> On Mon, Jul 21, 2014 at 11:58:52AM -0400, Jerome Glisse wrote:
>>>>>>>> On Mon, Jul 21, 2014 at 05:25:11PM +0200, Daniel Vetter wrote:
>>>>>>>>> On Mon, Jul 21, 2014 at 03:39:09PM +0200, Christian König
>>>>>>>>> wrote:
>>>>>>>>>> Am 21.07.2014 14:36, schrieb Oded Gabbay:
>>>>>>>>>>> On 20/07/14 20:46, Jerome Glisse wrote:
>>>>>>
>>>>>> [snip!!]
>>>>> My BlackBerry thumb thanks you ;)
>>>>>>
>>>>>>>>>>
>>>>>>>>>> The main questions here are whether it's avoidable to pin
>>>>>>>>>> down the memory, and whether the memory is pinned at driver
>>>>>>>>>> load, by request from userspace, or by anything else.
>>>>>>>>>>
>>>>>>>>>> As far as I can see only the "mqd per userspace queue"
>>>>>>>>>> might be a bit questionable, everything else sounds
>>>>>>>>>> reasonable.
>>>>>>>>>
>>>>>>>>> Aside, i915 perspective again (i.e. how we solved this):
>>>>>>>>> When scheduling away from contexts we unpin them and put them
>>>>>>>>> into the lru. And in the shrinker we have a last-ditch
>>>>>>>>> callback to switch to a default context (since you can't ever
>>>>>>>>> have no context once you've started) which means we can evict
>>>>>>>>> any context object if it's
>>>>>> getting in the way.
>>>>>>>>
>>>>>>>> So Intel hardware reports through some interrupt or some channel
>>>>>>>> when it is not using a context? ie the kernel side gets a
>>>>>>>> notification when some user context is done executing?
>>>>>>>
>>>>>>> Yes, as long as we do the scheduling with the cpu we get
>>>>>>> interrupts for context switches. The mechanism is already
>>>>>>> published in the execlist patches currently floating around. We
>>>>>>> get a special context switch interrupt.
>>>>>>>
>>>>>>> But we have this unpin logic already on the current code where
>>>>>>> we switch contexts through in-line cs commands from the kernel.
>>>>>>> There we obviously use the normal batch completion events.
>>>>>>
>>>>>> Yeah and we can continue that going forward. And of course if your
>>>>>> hw can do page faulting, you don't need to pin the normal data
>>>>>> buffers.
>>>>>>
>>>>>> Usually there are some special buffers that need to be pinned for
>>>>>> longer periods though, anytime the context could be active. Sounds
>>>>>> like in this case the userland queues, which makes some sense. But
>>>>>> maybe for smaller systems the size limit could be clamped to
>>>>>> something smaller than 128M. Or tie it into the rlimit somehow,
>>>>>> just like we do for mlock() stuff.
>>>>>>
>>>>> Yeah, even the queues are in pageable memory, it's just a ~256 byte
>>>>> structure per queue (the Memory Queue Descriptor) that describes the
>>>>> queue to hardware, plus a couple of pages for each process using HSA
>>>>> to hold things like doorbells. Current thinking is to limit #
>>>>> processes using HSA to ~256 and #queues per process to ~1024 by
>>>>> default in the initial code, although my guess is that we could take
>>>>> the #queues per process default limit even lower.
>>>>>
>>>>
>>>> So my mistake. struct cik_mqd is actually 604 bytes, and it is
>>>> allocated on a 256-byte boundary.
>>>> I had in mind to reserve 64MB of GART by default, which translates to
>>>> 512 queues per process, with 128 processes. Add 2 kernel module
>>>> parameters, max-queues-per-process and max-processes (defaults, as I
>>>> said, 512 and 128), to give the system admin better control.
>>>>
>>>
>>> So as I said somewhere else in this thread, this should not be reserved
>>> but use a special allocation. Any HSA GPU uses virtual address space for
>>> userspace, so the only issue is the kernel-side GTT.
>>>
>>> What I would like is to see radeon's pinned GTT allocations at the
>>> bottom of the GTT space (ie all ring buffers and the ib pool buffer),
>>> and then have an allocator that allocates new queues from the top of
>>> the GTT address space, growing toward the bottom.
>>>
>>> It should not statically reserve 64M or anything. When doing an
>>> allocation it should move any ttm buffers that are in the region it
>>> wants to allocate to a different location.
>>>
>>>
>>> As this needs some work, I am not against reserving some small amount
>>> (a couple of MB) as a first stage, but anything more would need a proper
>>> solution like the one I just described.
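[Editor's note: the two-ended scheme Jerome describes (pinned rings/IB pool carved from the bottom of the GTT address space, queue structures carved from the top and growing down, so no static reservation is needed) can be sketched roughly as below. This is a minimal illustrative sketch in plain C, not radeon/TTM code; all names are hypothetical, and it omits freeing and the buffer-migration step Jerome mentions.]

```c
#include <stdint.h>

/* Hypothetical sketch of a two-ended GTT address allocator: pinned
 * driver buffers (rings, ib pool) grow up from the bottom, per-process
 * queue structures grow down from the top.  The space between the two
 * ends stays available for ordinary, movable ttm buffers. */
struct gtt_space {
	uint64_t base;    /* start of GTT address space */
	uint64_t size;    /* total size in bytes */
	uint64_t bottom;  /* next free offset for pinned allocations */
	uint64_t top;     /* lowest offset already handed out from the top */
};

static void gtt_init(struct gtt_space *g, uint64_t base, uint64_t size)
{
	g->base = base;
	g->size = size;
	g->bottom = 0;
	g->top = size;
}

/* Ring/ib-pool style pinned allocation: carved from the bottom. */
static int gtt_alloc_pinned(struct gtt_space *g, uint64_t size, uint64_t *out)
{
	if (size > g->top - g->bottom)
		return -1;	/* the two ends would collide */
	*out = g->base + g->bottom;
	g->bottom += size;
	return 0;
}

/* Queue allocation: carved from the top, growing toward the bottom. */
static int gtt_alloc_queue(struct gtt_space *g, uint64_t size, uint64_t *out)
{
	if (size > g->top - g->bottom)
		return -1;
	g->top -= size;
	*out = g->base + g->top;
	return 0;
}
```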
>>
>> It's still a trade off. Even if we reserve a couple of megs it'll be
>> wasted if we are not running HSA apps. And even today if we run a
>> compute job using the current interfaces we could end up in the same
>> case. So while I think it's definitely a good goal to come up with
>> some solution for fragmentation, I don't think it should be a
>> show-stopper right now.
>>
>
> Seems I am having a hard time expressing myself. I am not saying it is a
> showstopper; I am saying that until a proper solution is implemented, KFD
> should limit its number of queues to consume at most a couple of MB, ie
> not 64MB or more but 2MB, 4MB, something in that ballpark.
So we thought internally about limiting ourselves through two kernel
module parameters, # of queues per process and # of processes. Default
values would be 128 queues per process and 32 processes. An mqd takes 768
bytes at most, so that gives us a maximum of 3MB.
For the absolute maximum, I think we should use the H/W limits, which are
1024 queues per process and 512 processes. That gives us 384MB.
Would that be acceptable?
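The arithmetic behind those numbers (the 604-byte struct cik_mqd rounded up to a 256-byte allocation boundary, ie 768 bytes per queue, times queues and processes) can be checked with a small sketch; the helper name is made up for illustration:

```c
#include <stdint.h>

#define MQD_SIZE  604u	/* sizeof(struct cik_mqd), per this thread */
#define MQD_ALIGN 256u	/* each mqd is allocated on a 256-byte boundary */

/* Worst-case GART footprint of the mqds for a given configuration. */
static uint64_t mqd_footprint(uint64_t queues_per_process, uint64_t processes)
{
	/* 604 rounded up to the next 256-byte boundary is 768 bytes. */
	uint64_t mqd_bytes = (MQD_SIZE + MQD_ALIGN - 1) / MQD_ALIGN * MQD_ALIGN;

	return mqd_bytes * queues_per_process * processes;
}
```

With the proposed defaults (128 queues, 32 processes) this gives 3MB; with the H/W limits (1024 queues, 512 processes) it gives 384MB.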
>
>> A better solution to deal with fragmentation of GTT and provide a
>> better way to allocate larger buffers in vram would be to break up
>> vram <-> system pool transfers into multiple transfers depending on
>> the available GTT size. Or use GPUVM dynamically for vram <-> system
>> transfers.
>
> Isn't the UVD engine still using the main GTT? I have not looked much at
> UVD in a while.
>
> Yes, there is a way to fix buffer migration, but I would also like to see
> address space fragmentation kept to a minimum, which is the main reason I
> utterly hate any design that forbids the kernel to take over and do its
> thing.
>
> Buffer pinning should really be only for the front buffer and things like
> rings, ie buffers that have a lifetime bound to the driver lifetime.
>
> Cheers,
> Jérôme
>
>>
>> Alex