From: Oded Gabbay <oded.gabbay@amd.com>
To: "Christian König" <christian.koenig@amd.com>,
dri-devel@lists.freedesktop.org, Alexander.Deucher@amd.com
Subject: Re: [PATCH 0/9] Replace use of radeon_sa with a new sub allocator
Date: Thu, 1 Jan 2015 09:46:06 +0200
Message-ID: <54A4FB3E.6020108@amd.com>
In-Reply-To: <54A42D44.90002@amd.com>
On 12/31/2014 07:07 PM, Christian König wrote:
>> The long-term solution
> That was the part that I missed in the description. Please note somewhere that
> we still need to improve this.
OK, I'll add it to the commit message of the relevant patch (and to the cover letter).
>
> Apart from that the patches look fine to me, but I need more time to review them
> in detail.
Thanks! I hope we can get it into 3.20.
>
> Regards,
> Christian.
>
> On 31.12.2014 at 15:06, Oded Gabbay wrote:
>>
>> On 12/31/2014 03:49 PM, Christian König wrote:
>>> On 31.12.2014 at 14:39, Oded Gabbay wrote:
>>>> Background:
>>>>
>>>> amdkfd needs GART memory for several things, such as runlist packets,
>>>> MQDs, HPDs and more. Unfortunately, all of this memory must always be
>>>> pinned (for several reasons that were discussed during the initial
>>>> review of amdkfd).
>>> In general this seems to be a good idea, but so far I still haven't seen a
>>> good explanation of why all that memory must be pinned. So please summarize
>>> that once more.
>>>
>>> Regards,
>>> Christian.
>>>
>> ok, once more :)
>>
>> The bulk of the allocations in the GART is for MQDs. MQDs represent active
>> user-mode queues, which are on the current runlist. It is important to
>> remember that an active queue is not necessarily a scheduled/running
>> queue, especially if the queues are over-subscribed or there is more than
>> a single HSA process.
>>
>> Because the scheduling of the user-mode queues is done by the CP firmware,
>> amdkfd has no indication of whether a queue is scheduled or not. If the
>> CP tries to schedule a queue whose MQD is not present, the CP will
>> probably hang permanently, because it will load garbage from the GART
>> (the address of the MQD is given to the CP inside the runlist packet).
>>
>> In addition, there are a couple of small allocations which also should
>> always be pinned: runlist packets (2 packets) and HPDs. Runlist packets can
>> be quite large, depending on the number of processes and queues.
>>
>> A few solutions were proposed, but in the end Jerome agreed that there is
>> no harm in limiting the total memory consumption to around 4MB.
>>
>> The long-term solution, which I will be working on, hopefully soon, is to
>> create a mechanism through which radeon/ttm can ask amdkfd to clear
>> GART/VRAM memory due to memory pressure. Then, amdkfd will preempt the
>> running queues and wait until the memory pressure is over. Then it will
>> reschedule the queues. But I'm getting ahead of myself. I hope to send an
>> RFC about that in the next couple of weeks.
>>
>> Oded
>>
>>
>>
>>>> Current Solution:
>>>>
>>>> The current (short/mid-term) solution, which was proposed by Jerome.G, is
>>>> to limit the amount of memory to a small size, roughly 4MB, and to allocate
>>>> this buffer at the start of the GART. To accommodate this, amdkfd has
>>>> two kernel module parameters, maximum number of HSA processes and
>>>> maximum number of queues per process, which require under 4MB of GART
>>>> memory when using their defaults, 32 and 128 respectively.
>>>>
>>>> Until now, amdkfd used the radeon sub-allocator module (radeon_sa)
>>>> to handle the sub-allocation of memory from this large buffer to
>>>> different modules inside amdkfd.
>>>>
>>>> However, while running the OpenCL conformance test suite, we found that
>>>> the radeon_sa module is not suitable for this kind of task, due to its
>>>> design:
>>>> 1. Every allocation increments its internal pointer, so the next
>>>> allocation is *always* done ahead of the previous allocation. This
>>>> causes the internal pointer to wrap around when it reaches the end of
>>>> the buffer.
>>>>
>>>> 2. When encountering an area that is already allocated, the module
>>>> waits for that area to be freed. If it is not freed in a timely manner
>>>> (or has no fence), the allocation fails. Simply put, it can't "skip"
>>>> the allocated area.
>>>>
>>>> Now, this is most probably good for graphics, but for amdkfd's needs,
>>>> the combination of the two behaviors mentioned above eventually causes
>>>> a denial of service. This is because some memory allocations
>>>> are *always* present and *never* freed (such as HPDs).
>>>> Therefore, given enough time and workload, radeon_sa eventually
>>>> wraps around, encounters an already allocated area and gets stuck.
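
To make the failure mode concrete, here is a minimal sketch of the two behaviors described above. It is a simplified model and not the radeon_sa code: the names (ring_sa, ring_alloc, ring_free, churn_until_stuck) and the 8-slot buffer are made up for illustration, and where the real module would wait on a fence, this model simply fails.

```c
#include <assert.h>
#include <string.h>

#define SLOTS 8u  /* hypothetical buffer size, in allocation slots */

struct ring_sa {
	unsigned char used[SLOTS];  /* 1 = slot is currently allocated */
	unsigned int next;          /* internal pointer: where the next allocation starts */
};

static void ring_init(struct ring_sa *sa)
{
	memset(sa, 0, sizeof(*sa));
}

/* Allocates n slots at the internal pointer; returns the first slot or -1. */
static int ring_alloc(struct ring_sa *sa, unsigned int n)
{
	unsigned int i, start;

	if (n == 0 || n > SLOTS)
		return -1;
	if (sa->next + n > SLOTS)
		sa->next = 0;  /* behavior 1: wrap around at the end of the buffer */

	start = sa->next;
	for (i = start; i < start + n; i++)
		if (sa->used[i])
			return -1;  /* behavior 2: cannot skip an allocated area */

	for (i = start; i < start + n; i++)
		sa->used[i] = 1;
	sa->next = start + n;  /* always move ahead of the previous allocation */
	return (int)start;
}

static void ring_free(struct ring_sa *sa, unsigned int start, unsigned int n)
{
	unsigned int i;

	assert(start + n <= SLOTS);
	for (i = start; i < start + n; i++)
		sa->used[i] = 0;
}

/*
 * Allocates and immediately frees one slot at a time. Returns how many
 * allocations succeed before the allocator gets stuck (or max_iters).
 */
static unsigned int churn_until_stuck(struct ring_sa *sa, unsigned int max_iters)
{
	unsigned int ok = 0;

	while (ok < max_iters) {
		int slot = ring_alloc(sa, 1);

		if (slot < 0)
			break;
		ring_free(sa, (unsigned int)slot, 1);
		ok++;
	}
	return ok;
}
```

If one slot is allocated right after ring_init() and never freed (standing in for an HPD), churn_until_stuck() returns 7 in this model: slots 1 to 7 are handed out and freed in turn, the pointer wraps to slot 0, and every later request fails even though seven of the eight slots are free.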
>>>>
>>>> Proposed new solution:
>>>>
>>>> To solve this, I have written a simple sub-allocator module inside
>>>> amdkfd. It allocates one or more fixed-size contiguous chunks and uses
>>>> a bitmap to manage the allocations. The search for the next allocation
>>>> always starts from the start of the GART buffer, and the module
>>>> knows how to skip allocated chunks.
>>>>
>>>> Because most allocations are MQDs, and MQDs are 512 bytes in size, I
>>>> set the default chunk size to be 512 bytes.
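
The idea described above can be sketched as a small first-fit bitmap allocator. This is an illustrative user-space model under stated assumptions, not the code from the patch series: the names (chunk_pool, pool_init, pool_alloc, pool_free) and the 128-chunk pool size are hypothetical, and only the 512-byte chunk size comes from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 512u  /* bytes per chunk, matching the MQD size */
#define NUM_CHUNKS 128u  /* hypothetical pool size: 128 * 512 B = 64 KiB */

struct chunk_pool {
	uint64_t bitmap[NUM_CHUNKS / 64];  /* bit i set => chunk i is allocated */
};

static void pool_init(struct chunk_pool *p)
{
	memset(p->bitmap, 0, sizeof(p->bitmap));
}

static int chunk_used(const struct chunk_pool *p, unsigned int i)
{
	return (int)((p->bitmap[i / 64] >> (i % 64)) & 1u);
}

static void chunk_mark(struct chunk_pool *p, unsigned int i, int used)
{
	uint64_t mask = (uint64_t)1 << (i % 64);

	if (used)
		p->bitmap[i / 64] |= mask;
	else
		p->bitmap[i / 64] &= ~mask;
}

/* First-fit search from chunk 0; returns a byte offset, or -1 if no run fits. */
static long pool_alloc(struct chunk_pool *p, size_t size)
{
	size_t need, run = 0;
	unsigned int i, j;

	if (size == 0 || size > (size_t)NUM_CHUNKS * CHUNK_SIZE)
		return -1;
	need = (size + CHUNK_SIZE - 1) / CHUNK_SIZE;  /* round up to whole chunks */

	for (i = 0; i < NUM_CHUNKS; i++) {
		run = chunk_used(p, i) ? 0 : run + 1;  /* a used chunk resets the run */
		if (run == need) {
			unsigned int first = i + 1 - (unsigned int)need;

			for (j = first; j <= i; j++)
				chunk_mark(p, j, 1);
			return (long)(first * CHUNK_SIZE);
		}
	}
	return -1;
}

/* Releases the chunks covered by an earlier pool_alloc(size) at offset. */
static void pool_free(struct chunk_pool *p, long offset, size_t size)
{
	size_t need = (size + CHUNK_SIZE - 1) / CHUNK_SIZE;
	size_t first, j;

	assert(offset >= 0 && (size_t)offset % CHUNK_SIZE == 0);
	first = (size_t)offset / CHUNK_SIZE;
	assert(first + need <= NUM_CHUNKS);
	for (j = first; j < first + need; j++)
		chunk_mark(p, (unsigned int)j, 0);
}
```

On an empty pool, requests of 512, 1024 and 512 bytes land at offsets 0, 512 and 1536. After the 1024-byte block is freed, a new 512-byte request is placed at offset 512 again, because the search restarts from the beginning and steps over the chunk that is still allocated at offset 0.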
>>>>
>>>> The basic GART memory allocation is still being done in the
>>>> amdkfd <--> radeon interface, and it still occupies less than 4MB.
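
As a rough sanity check on the 4MB figure, the sketch below counts MQD memory only, assuming one 512-byte MQD per queue and the default module parameters quoted earlier. The helper name mqd_bytes is made up, and this is not the calculation the driver performs (runlist packets and HPDs are ignored here).

```c
#include <stddef.h>

#define MQD_SIZE 512u  /* bytes per MQD, as stated above */

/* Worst-case MQD footprint if every process uses its full queue quota. */
static size_t mqd_bytes(unsigned int max_processes,
			unsigned int max_queues_per_process)
{
	return (size_t)max_processes * max_queues_per_process * MQD_SIZE;
}
```

With the defaults, mqd_bytes(32, 128) is 32 * 128 * 512 = 2097152 bytes (2 MiB), so under these assumptions the MQDs alone fit in the buffer with room left for the other pinned allocations.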
>>>>
>>>> I have chosen to implement a new allocator instead of changing
>>>> radeon_sa because the behavior of radeon_sa is very appropriate for
>>>> graphics, where allocations do not stay forever. Also, amdkfd doesn't
>>>> actually need the flexibility and features radeon_sa provides.
>>>>
>>>> Oded
>>>>
>>>> Oded Gabbay (9):
>>>> drm/amd: Add new kfd-->kgd interface for gart usage
>>>> drm/radeon: Impl. new gtt allocate/free functions
>>>> drm/amdkfd: Add gtt sa related data to kfd_dev struct
>>>> drm/amdkfd: Add kfd gtt sub-allocator functions
>>>> drm/amdkfd: Fixed calculation of gart buffer size
>>>> drm/amdkfd: Allocate gart memory using new interface
>>>> drm/amdkfd: Using new gtt sa in amdkfd
>>>> drm/radeon: Remove old radeon_sa usage from kfd-->kgd interface
>>>> drm/amd: Remove old radeon_sa funcs from kfd-->kgd interface
>>>>
>>>> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 217 ++++++++++++++++++++-
>>>> .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 23 +--
>>>> drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 41 ++--
>>>> drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 16 +-
>>>> drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 10 +-
>>>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 28 ++-
>>>> drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 23 +--
>>>> drivers/gpu/drm/radeon/radeon_kfd.c | 128 ++++++------
>>>> 8 files changed, 329 insertions(+), 157 deletions(-)
>>>>
>