* [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs
@ 2014-10-28 9:35 Michel Dänzer
2014-10-28 9:35 ` [PATCH 2/5] drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN Michel Dänzer
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Michel Dänzer @ 2014-10-28 9:35 UTC (permalink / raw)
To: dri-devel; +Cc: Lauri Kasanen
From: Michel Dänzer <michel.daenzer@amd.com>
I wasn't sure if TTM_PL_FLAG_TOPDOWN works correctly with non-0 lpfn, but
AFAICT it does.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
---
drivers/gpu/drm/radeon/radeon_object.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 9c92976..d23b0dd 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -179,9 +179,7 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
* improve fragmentation quality.
* 512kb was measured as the most optimal number.
*/
- if (!((rbo->flags & RADEON_GEM_CPU_ACCESS) &&
- (rbo->placements[i].flags & TTM_PL_FLAG_VRAM)) &&
- rbo->tbo.mem.size > 512 * 1024) {
+ if (rbo->tbo.mem.size > 512 * 1024) {
for (i = 0; i < c; i++) {
rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
}
--
2.1.1
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 2/5] drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN 2014-10-28 9:35 [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Michel Dänzer @ 2014-10-28 9:35 ` Michel Dänzer 2014-10-28 9:35 ` [PATCH 3/5] drm/ttm: Use only " Michel Dänzer ` (3 subsequent siblings) 4 siblings, 0 replies; 14+ messages in thread From: Michel Dänzer @ 2014-10-28 9:35 UTC (permalink / raw) To: dri-devel; +Cc: Lauri Kasanen From: Michel Dänzer <michel.daenzer@amd.com> If the BO should be placed at the top of the area, we should start looking for holes from the top. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> --- drivers/gpu/drm/ttm/ttm_bo_manager.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c index 964387f..1e93f6c 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c @@ -55,6 +55,7 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, struct ttm_range_manager *rman = (struct ttm_range_manager *) man->priv; struct drm_mm *mm = &rman->mm; struct drm_mm_node *node = NULL; + enum drm_mm_search_flags sflags = DRM_MM_SEARCH_BEST; enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT; unsigned long lpfn; int ret; @@ -67,15 +68,16 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, if (!node) return -ENOMEM; - if (place->flags & TTM_PL_FLAG_TOPDOWN) + if (place->flags & TTM_PL_FLAG_TOPDOWN) { + sflags |= DRM_MM_SEARCH_BELOW; aflags = DRM_MM_CREATE_TOP; + } spin_lock(&rman->lock); ret = drm_mm_insert_node_in_range_generic(mm, node, mem->num_pages, mem->page_alignment, 0, place->fpfn, lpfn, - DRM_MM_SEARCH_BEST, - aflags); + sflags, aflags); spin_unlock(&rman->lock); if (unlikely(ret)) { -- 2.1.1 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/5] drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN 2014-10-28 9:35 [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Michel Dänzer 2014-10-28 9:35 ` [PATCH 2/5] drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN Michel Dänzer @ 2014-10-28 9:35 ` Michel Dänzer 2014-10-28 11:10 ` Daniel Vetter 2014-10-28 9:35 ` [PATCH 4/5] drm/ttm: Add TTM_PL_FLAG_BOTTOMUP Michel Dänzer ` (2 subsequent siblings) 4 siblings, 1 reply; 14+ messages in thread From: Michel Dänzer @ 2014-10-28 9:35 UTC (permalink / raw) To: dri-devel; +Cc: Lauri Kasanen From: Michel Dänzer <michel.daenzer@amd.com> DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems against the idea of TTM_PL_FLAG_TOPDOWN: * The smallest hole may be in the overall bottom of the area * If the hole isn't much larger than the BO, it doesn't make much difference whether the BO is placed at the bottom or at the top of the hole Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> --- drivers/gpu/drm/ttm/ttm_bo_manager.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c index 1e93f6c..aa0bd05 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c @@ -69,7 +69,7 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, return -ENOMEM; if (place->flags & TTM_PL_FLAG_TOPDOWN) { - sflags |= DRM_MM_SEARCH_BELOW; + sflags = DRM_MM_SEARCH_BELOW; aflags = DRM_MM_CREATE_TOP; } -- 2.1.1 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN 2014-10-28 9:35 ` [PATCH 3/5] drm/ttm: Use only " Michel Dänzer @ 2014-10-28 11:10 ` Daniel Vetter 2014-10-30 8:41 ` Michel Dänzer 2014-10-30 15:07 ` Thomas Hellstrom 0 siblings, 2 replies; 14+ messages in thread From: Daniel Vetter @ 2014-10-28 11:10 UTC (permalink / raw) To: Michel Dänzer; +Cc: Lauri Kasanen, dri-devel On Tue, Oct 28, 2014 at 06:35:04PM +0900, Michel Dänzer wrote: > From: Michel Dänzer <michel.daenzer@amd.com> > > DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems > against the idea of TTM_PL_FLAG_TOPDOWN: > > * The smallest hole may be in the overall bottom of the area > * If the hole isn't much larger than the BO, it doesn't make much > difference whether the BO is placed at the bottom or at the top of the > hole > > Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> tbh I think SEARCH_BEST is pretty much always a bad idea - it rips apart allocations from the same execbuf, and usually those get recycled around the same time. Which means you'll just fragment your mm even more if you try to find the best hole instead of just picking one and the stuffing the entire execbuf into it. So imo we might as well just kill it. Another one that I've advertised a bunch of times already is the scan roaster in drm_mm.c: Currently ttm just evicts until there's a big enough hole, which is fairly awful if you have quasi-segmented memory like with top-down/bottom-up schemes and different ranges for different units. With the roaster you just walk the lru and build up potential holes until there's a suitable one, and then only evict those buffers. Which means if you have a certain range of memory under very high pressure (e.g. the 256M which uvd can use or whatever it is), then you wont thrash all the other vram too. Cheers, Daniel > --- > drivers/gpu/drm/ttm/ttm_bo_manager.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c > index 1e93f6c..aa0bd05 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c > @@ -69,7 +69,7 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, > return -ENOMEM; > > if (place->flags & TTM_PL_FLAG_TOPDOWN) { > - sflags |= DRM_MM_SEARCH_BELOW; > + sflags = DRM_MM_SEARCH_BELOW; > aflags = DRM_MM_CREATE_TOP; > } > > -- > 2.1.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN 2014-10-28 11:10 ` Daniel Vetter @ 2014-10-30 8:41 ` Michel Dänzer 2014-10-30 15:07 ` Thomas Hellstrom 1 sibling, 0 replies; 14+ messages in thread From: Michel Dänzer @ 2014-10-30 8:41 UTC (permalink / raw) To: Daniel Vetter; +Cc: Lauri Kasanen, dri-devel On 28.10.2014 20:10, Daniel Vetter wrote: > On Tue, Oct 28, 2014 at 06:35:04PM +0900, Michel Dänzer wrote: >> From: Michel Dänzer <michel.daenzer@amd.com> >> >> DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems >> against the idea of TTM_PL_FLAG_TOPDOWN: >> >> * The smallest hole may be in the overall bottom of the area >> * If the hole isn't much larger than the BO, it doesn't make much >> difference whether the BO is placed at the bottom or at the top of the >> hole >> >> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> > > tbh I think SEARCH_BEST is pretty much always a bad idea - it rips apart > allocations from the same execbuf, and usually those get recycled around > the same time. Which means you'll just fragment your mm even more if you > try to find the best hole instead of just picking one and the stuffing the > entire execbuf into it. So imo we might as well just kill it. Sent out a new patch 4 which nukes it. > Another one that I've advertised a bunch of times already is the scan > roaster in drm_mm.c: Currently ttm just evicts until there's a big enough > hole, which is fairly awful if you have quasi-segmented memory like with > top-down/bottom-up schemes and different ranges for different units. With > the roaster you just walk the lru and build up potential holes until > there's a suitable one, and then only evict those buffers. Which means if > you have a certain range of memory under very high pressure (e.g. the 256M > which uvd can use or whatever it is), then you wont thrash all the other > vram too. I recently made some changes in TTM and radeon to alleviate that particular issue, but the scan roaster does sound useful. Unfortunately, modifying TTM to use it is a bit too involved for me right now, any volunteers? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN 2014-10-28 11:10 ` Daniel Vetter 2014-10-30 8:41 ` Michel Dänzer @ 2014-10-30 15:07 ` Thomas Hellstrom 2014-10-31 17:19 ` Daniel Vetter 1 sibling, 1 reply; 14+ messages in thread From: Thomas Hellstrom @ 2014-10-30 15:07 UTC (permalink / raw) To: dri-devel On 10/28/2014 12:10 PM, Daniel Vetter wrote: > On Tue, Oct 28, 2014 at 06:35:04PM +0900, Michel Dänzer wrote: >> From: Michel Dänzer <michel.daenzer@amd.com> >> >> DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems >> against the idea of TTM_PL_FLAG_TOPDOWN: >> >> * The smallest hole may be in the overall bottom of the area >> * If the hole isn't much larger than the BO, it doesn't make much >> difference whether the BO is placed at the bottom or at the top of the >> hole >> >> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> > tbh I think SEARCH_BEST is pretty much always a bad idea - it rips apart > allocations from the same execbuf, and usually those get recycled around > the same time. Which means you'll just fragment your mm even more if you > try to find the best hole instead of just picking one and the stuffing the > entire execbuf into it. So imo we might as well just kill it. > > Another one that I've advertised a bunch of times already is the scan > roaster in drm_mm.c: Currently ttm just evicts until there's a big enough > hole, which is fairly awful if you have quasi-segmented memory like with > top-down/bottom-up schemes and different ranges for different units. With > the roaster you just walk the lru and build up potential holes until > there's a suitable one, and then only evict those buffers. Which means if > you have a certain range of memory under very high pressure (e.g. the 256M > which uvd can use or whatever it is), then you wont thrash all the other > vram too. > > Cheers, Daniel > And I think I've commented each time why that is currently not possible or at least worthile with TTM ;). Although the idea is great, even if you make a nice list of suitable buffers to evict, as soon as you drop the LRU spinlock, that list is invalid. Worse, while you start evicting your list, other processes may steal part of the hole you've created. As some point we should perhaps figure out how to protect each managed area with a separate mutex, but that would create trouble if, for example, evicting from VRAM to managed GART. /Thomas _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/5] drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN 2014-10-30 15:07 ` Thomas Hellstrom @ 2014-10-31 17:19 ` Daniel Vetter 0 siblings, 0 replies; 14+ messages in thread From: Daniel Vetter @ 2014-10-31 17:19 UTC (permalink / raw) To: Thomas Hellstrom; +Cc: dri-devel On Thu, Oct 30, 2014 at 04:07:05PM +0100, Thomas Hellstrom wrote: > On 10/28/2014 12:10 PM, Daniel Vetter wrote: > > On Tue, Oct 28, 2014 at 06:35:04PM +0900, Michel Dänzer wrote: > >> From: Michel Dänzer <michel.daenzer@amd.com> > >> > >> DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems > >> against the idea of TTM_PL_FLAG_TOPDOWN: > >> > >> * The smallest hole may be in the overall bottom of the area > >> * If the hole isn't much larger than the BO, it doesn't make much > >> difference whether the BO is placed at the bottom or at the top of the > >> hole > >> > >> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> > > tbh I think SEARCH_BEST is pretty much always a bad idea - it rips apart > > allocations from the same execbuf, and usually those get recycled around > > the same time. Which means you'll just fragment your mm even more if you > > try to find the best hole instead of just picking one and the stuffing the > > entire execbuf into it. So imo we might as well just kill it. > > > > Another one that I've advertised a bunch of times already is the scan > > roaster in drm_mm.c: Currently ttm just evicts until there's a big enough > > hole, which is fairly awful if you have quasi-segmented memory like with > > top-down/bottom-up schemes and different ranges for different units. With > > the roaster you just walk the lru and build up potential holes until > > there's a suitable one, and then only evict those buffers. Which means if > > you have a certain range of memory under very high pressure (e.g. the 256M > > which uvd can use or whatever it is), then you wont thrash all the other > > vram too. > > > > Cheers, Daniel > > > And I think I've commented each time why that is currently not possible > or at least worthile with TTM ;). > > Although the idea is great, even if you make a nice list of suitable > buffers to evict, as soon as you drop the LRU spinlock, that list is > invalid. Worse, while you start evicting your list, other processes may > steal part of the hole you've created. > > As some point we should perhaps figure out how to protect each managed > area with a separate mutex, but that would create trouble if, for > example, evicting from VRAM to managed GART. Yeah, you need to do the entire scanning/evicting under the lru lock, which means you need to jump over all the buffers you can't immediately trylock. Which also means that you need a slowpath fallback that rips out everything in case the buffer doesn't fit anywhere. Wrt the lru lock you need to at least overwrite the generic placement/eviction code since you need to a bit more logic and state-keeping for drm_mm than what's currently exposed in the super-generic ttm managed area allocator (which apparently could also manage an idr). But shouldn't be more fuzz than a new vfunc with the current code as default implementation. For lru lock recursion (due to vram->gtt eviction or similar) I guess either rework stuff to allow the entire eviction process under lru. Or keep the drm_mm allocations on some separate list until you've evicted everything (dropping the lock meanwhile) and the mass-free the entire list under the lru lock and immediately do your allocation. I'm not sure whether the long hold times for the lru lock really would be a problem (they'll happen even when you drop it for actual eviction since walking the lru lists is costly when you jump over lots of buffers that don't get you towards your goal), since evicting all buffers is probably going to thrash the system much worse anyway. And for the worst case a regular sched for lru lock contention or preemption should prevent interactivity disasters for parallel cmd submission. So yeah some details to take care of, but I don't think there's anything insurmountable. It's just that I don't have time (nor justification) for it. And it will be indeed quite a bit of work to get going. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 4/5] drm/ttm: Add TTM_PL_FLAG_BOTTOMUP 2014-10-28 9:35 [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Michel Dänzer 2014-10-28 9:35 ` [PATCH 2/5] drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN Michel Dänzer 2014-10-28 9:35 ` [PATCH 3/5] drm/ttm: Use only " Michel Dänzer @ 2014-10-28 9:35 ` Michel Dänzer 2014-10-30 8:36 ` [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST Michel Dänzer 2014-10-28 9:35 ` [PATCH 5/5] drm/radeon: Use TTM_PL_FLAG_BOTTOMUP Michel Dänzer 2014-10-28 10:46 ` [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Lauri Kasanen 4 siblings, 1 reply; 14+ messages in thread From: Michel Dänzer @ 2014-10-28 9:35 UTC (permalink / raw) To: dri-devel; +Cc: Lauri Kasanen From: Michel Dänzer <michel.daenzer@amd.com> For placing a BO strictly from the bottom up, i.e. in the lowest hole which can hold it. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> --- drivers/gpu/drm/ttm/ttm_bo_manager.c | 5 +++++ include/drm/ttm/ttm_placement.h | 5 ++++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c index aa0bd05..a64397b 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c @@ -73,6 +73,11 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, aflags = DRM_MM_CREATE_TOP; } + if (place->flags & TTM_PL_FLAG_BOTTOMUP) { + sflags = DRM_MM_SEARCH_DEFAULT; + aflags = DRM_MM_CREATE_DEFAULT; + } + spin_lock(&rman->lock); ret = drm_mm_insert_node_in_range_generic(mm, node, mem->num_pages, mem->page_alignment, 0, diff --git a/include/drm/ttm/ttm_placement.h b/include/drm/ttm/ttm_placement.h index 8ed44f9..c9ae86a 100644 --- a/include/drm/ttm/ttm_placement.h +++ b/include/drm/ttm/ttm_placement.h @@ -66,7 +66,9 @@ * TTM_PL_FLAG_NO_EVICT means that the buffer may never * be evicted to make room for other buffers. * TTM_PL_FLAG_TOPDOWN requests to be placed from the - * top of the memory area, instead of the bottom. + * top of the memory area. + * TTM_PL_FLAG_BOTTOMUP requests to be placed from the + * bottom of the memory area. */ #define TTM_PL_FLAG_CACHED (1 << 16) @@ -75,6 +77,7 @@ #define TTM_PL_FLAG_SHARED (1 << 20) #define TTM_PL_FLAG_NO_EVICT (1 << 21) #define TTM_PL_FLAG_TOPDOWN (1 << 22) +#define TTM_PL_FLAG_BOTTOMUP (1 << 23) #define TTM_PL_MASK_CACHING (TTM_PL_FLAG_CACHED | \ TTM_PL_FLAG_UNCACHED | \ -- 2.1.1 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST 2014-10-28 9:35 ` [PATCH 4/5] drm/ttm: Add TTM_PL_FLAG_BOTTOMUP Michel Dänzer @ 2014-10-30 8:36 ` Michel Dänzer 2014-10-31 18:10 ` Alex Deucher 0 siblings, 1 reply; 14+ messages in thread From: Michel Dänzer @ 2014-10-30 8:36 UTC (permalink / raw) To: dri-devel From: Michel Dänzer <michel.daenzer@amd.com> In the words of Daniel Vetter: «I think SEARCH_BEST is pretty much always a bad idea - it rips apart allocations from the same execbuf, and usually those get recycled around the same time. Which means you'll just fragment your mm even more if you try to find the best hole instead of just picking one and the stuffing the entire execbuf into it. So imo we might as well just kill it.» Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> --- This patch replaces the previous patches 4 and 5. Not sure about the function header comment hunk, maybe what it says is true even with DRM_MM_SEARCH_BELOW? drivers/gpu/drm/drm_mm.c | 36 +++++------------------------------- drivers/gpu/drm/ttm/ttm_bo_manager.c | 2 +- include/drm/drm_mm.h | 3 +-- 3 files changed, 7 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index 04a209e..3e0ef8c 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -409,16 +409,11 @@ static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm, enum drm_mm_search_flags flags) { struct drm_mm_node *entry; - struct drm_mm_node *best; unsigned long adj_start; unsigned long adj_end; - unsigned long best_size; BUG_ON(mm->scanned_blocks); - best = NULL; - best_size = ~0UL; - __drm_mm_for_each_hole(entry, mm, adj_start, adj_end, flags & DRM_MM_SEARCH_BELOW) { unsigned long hole_size = adj_end - adj_start; @@ -429,19 +424,11 @@ static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm, continue; } - if (!check_free_hole(adj_start, adj_end, size, alignment)) - continue; - - if (!(flags & DRM_MM_SEARCH_BEST)) + if (check_free_hole(adj_start, adj_end, size, alignment)) return entry; - - if (hole_size < best_size) { - best = entry; - best_size = hole_size; - } } - return best; + return NULL; } static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm, @@ -453,16 +440,11 @@ static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_ enum drm_mm_search_flags flags) { struct drm_mm_node *entry; - struct drm_mm_node *best; unsigned long adj_start; unsigned long adj_end; - unsigned long best_size; BUG_ON(mm->scanned_blocks); - best = NULL; - best_size = ~0UL; - __drm_mm_for_each_hole(entry, mm, adj_start, adj_end, flags & DRM_MM_SEARCH_BELOW) { unsigned long hole_size = adj_end - adj_start; @@ -478,19 +460,11 @@ static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_ continue; } - if (!check_free_hole(adj_start, adj_end, size, alignment)) - continue; - - if (!(flags & DRM_MM_SEARCH_BEST)) + if (check_free_hole(adj_start, adj_end, size, alignment)) return entry; - - if (hole_size < best_size) { - best = entry; - best_size = hole_size; - } } - return best; + return NULL; } /** @@ -679,7 +653,7 @@ EXPORT_SYMBOL(drm_mm_scan_add_block); * corrupted. * * When the scan list is empty, the selected memory nodes can be freed. An - * immediately following drm_mm_search_free with !DRM_MM_SEARCH_BEST will then + * immediately following drm_mm_search_free with !DRM_MM_SEARCH_BELOW will then * return the just freed block (because its at the top of the free_stack list). * * Returns: diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c index aa0bd05..42c2aa9 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c @@ -55,7 +55,7 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, struct ttm_range_manager *rman = (struct ttm_range_manager *) man->priv; struct drm_mm *mm = &rman->mm; struct drm_mm_node *node = NULL; - enum drm_mm_search_flags sflags = DRM_MM_SEARCH_BEST; + enum drm_mm_search_flags sflags = DRM_MM_SEARCH_DEFAULT; enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT; unsigned long lpfn; int ret; diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h index a24addf..f38f7a0 100644 --- a/include/drm/drm_mm.h +++ b/include/drm/drm_mm.h @@ -46,8 +46,7 @@ enum drm_mm_search_flags { DRM_MM_SEARCH_DEFAULT = 0, - DRM_MM_SEARCH_BEST = 1 << 0, - DRM_MM_SEARCH_BELOW = 1 << 1, + DRM_MM_SEARCH_BELOW = 1 << 0, }; enum drm_mm_allocator_flags { -- 2.1.1 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST 2014-10-30 8:36 ` [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST Michel Dänzer @ 2014-10-31 18:10 ` Alex Deucher 2014-11-04 8:47 ` Michel Dänzer 0 siblings, 1 reply; 14+ messages in thread From: Alex Deucher @ 2014-10-31 18:10 UTC (permalink / raw) To: Michel Dänzer; +Cc: Maling list - DRI developers On Thu, Oct 30, 2014 at 4:36 AM, Michel Dänzer <michel@daenzer.net> wrote: > From: Michel Dänzer <michel.daenzer@amd.com> > > In the words of Daniel Vetter: > > «I think SEARCH_BEST is pretty much always a bad idea - it rips apart > allocations from the same execbuf, and usually those get recycled around > the same time. Which means you'll just fragment your mm even more if you > try to find the best hole instead of just picking one and the stuffing > the entire execbuf into it. So imo we might as well just kill it.» > > Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Applied this series to my -next tree. Thanks, Alex > --- > > This patch replaces the previous patches 4 and 5. > > Not sure about the function header comment hunk, maybe what it says is > true even with DRM_MM_SEARCH_BELOW? > > drivers/gpu/drm/drm_mm.c | 36 +++++------------------------------- > drivers/gpu/drm/ttm/ttm_bo_manager.c | 2 +- > include/drm/drm_mm.h | 3 +-- > 3 files changed, 7 insertions(+), 34 deletions(-) > > diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c > index 04a209e..3e0ef8c 100644 > --- a/drivers/gpu/drm/drm_mm.c > +++ b/drivers/gpu/drm/drm_mm.c > @@ -409,16 +409,11 @@ static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm, > enum drm_mm_search_flags flags) > { > struct drm_mm_node *entry; > - struct drm_mm_node *best; > unsigned long adj_start; > unsigned long adj_end; > - unsigned long best_size; > > BUG_ON(mm->scanned_blocks); > > - best = NULL; > - best_size = ~0UL; > - > __drm_mm_for_each_hole(entry, mm, adj_start, adj_end, > flags & DRM_MM_SEARCH_BELOW) { > unsigned long hole_size = adj_end - adj_start; > @@ -429,19 +424,11 @@ static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm, > continue; > } > > - if (!check_free_hole(adj_start, adj_end, size, alignment)) > - continue; > - > - if (!(flags & DRM_MM_SEARCH_BEST)) > + if (check_free_hole(adj_start, adj_end, size, alignment)) > return entry; > - > - if (hole_size < best_size) { > - best = entry; > - best_size = hole_size; > - } > } > > - return best; > + return NULL; > } > > static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm, > @@ -453,16 +440,11 @@ static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_ > enum drm_mm_search_flags flags) > { > struct drm_mm_node *entry; > - struct drm_mm_node *best; > unsigned long adj_start; > unsigned long adj_end; > - unsigned long best_size; > > BUG_ON(mm->scanned_blocks); > > - best = NULL; > - best_size = ~0UL; > - > __drm_mm_for_each_hole(entry, mm, adj_start, adj_end, > flags & DRM_MM_SEARCH_BELOW) { > unsigned long hole_size = adj_end - adj_start; > @@ -478,19 +460,11 @@ static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_ > continue; > } > > - if (!check_free_hole(adj_start, adj_end, size, alignment)) > - continue; > - > - if (!(flags & DRM_MM_SEARCH_BEST)) > + if (check_free_hole(adj_start, adj_end, size, alignment)) > return entry; > - > - if (hole_size < best_size) { > - best = entry; > - best_size = hole_size; > - } > } > > - return best; > + return NULL; > } > > /** > @@ -679,7 +653,7 @@ EXPORT_SYMBOL(drm_mm_scan_add_block); > * corrupted. > * > * When the scan list is empty, the selected memory nodes can be freed. An > - * immediately following drm_mm_search_free with !DRM_MM_SEARCH_BEST will then > + * immediately following drm_mm_search_free with !DRM_MM_SEARCH_BELOW will then > * return the just freed block (because its at the top of the free_stack list). > * > * Returns: > diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c > index aa0bd05..42c2aa9 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_manager.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c > @@ -55,7 +55,7 @@ static int ttm_bo_man_get_node(struct ttm_mem_type_manager *man, > struct ttm_range_manager *rman = (struct ttm_range_manager *) man->priv; > struct drm_mm *mm = &rman->mm; > struct drm_mm_node *node = NULL; > - enum drm_mm_search_flags sflags = DRM_MM_SEARCH_BEST; > + enum drm_mm_search_flags sflags = DRM_MM_SEARCH_DEFAULT; > enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT; > unsigned long lpfn; > int ret; > diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h > index a24addf..f38f7a0 100644 > --- a/include/drm/drm_mm.h > +++ b/include/drm/drm_mm.h > @@ -46,8 +46,7 @@ > > enum drm_mm_search_flags { > DRM_MM_SEARCH_DEFAULT = 0, > - DRM_MM_SEARCH_BEST = 1 << 0, > - DRM_MM_SEARCH_BELOW = 1 << 1, > + DRM_MM_SEARCH_BELOW = 1 << 0, > }; > > enum drm_mm_allocator_flags { > -- > 2.1.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST 2014-10-31 18:10 ` Alex Deucher @ 2014-11-04 8:47 ` Michel Dänzer 2014-11-04 14:30 ` Alex Deucher 0 siblings, 1 reply; 14+ messages in thread From: Michel Dänzer @ 2014-11-04 8:47 UTC (permalink / raw) To: Alex Deucher; +Cc: Daniel Vetter, Maling list - DRI developers On 01.11.2014 03:10, Alex Deucher wrote: > On Thu, Oct 30, 2014 at 4:36 AM, Michel Dänzer <michel@daenzer.net> wrote: >> From: Michel Dänzer <michel.daenzer@amd.com> >> >> In the words of Daniel Vetter: >> >> «I think SEARCH_BEST is pretty much always a bad idea - it rips apart >> allocations from the same execbuf, and usually those get recycled around >> the same time. Which means you'll just fragment your mm even more if you >> try to find the best hole instead of just picking one and the stuffing >> the entire execbuf into it. So imo we might as well just kill it.» >> >> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> > > Applied this series to my -next tree. Thanks. Unfortunately, the removal of DRM_MM_SEARCH_BEST caused performance regressions, presumably due to excessive eviction from VRAM, see https://bugs.freedesktop.org/show_bug.cgi?id=84662#c49 . I can reproduce a significant bad effect with the UE4 Elemental demo on my development machine. Please drop this patch. The original patches 4 & 5 have the same effect, so they shouldn't be applied either. Maybe we can try this again once TTM is using the MM scan roaster for eviction. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST 2014-11-04 8:47 ` Michel Dänzer @ 2014-11-04 14:30 ` Alex Deucher 0 siblings, 0 replies; 14+ messages in thread From: Alex Deucher @ 2014-11-04 14:30 UTC (permalink / raw) To: Michel Dänzer; +Cc: Daniel Vetter, Maling list - DRI developers On Tue, Nov 4, 2014 at 3:47 AM, Michel Dänzer <michel@daenzer.net> wrote: > On 01.11.2014 03:10, Alex Deucher wrote: >> >> On Thu, Oct 30, 2014 at 4:36 AM, Michel Dänzer <michel@daenzer.net> wrote: >>> >>> From: Michel Dänzer <michel.daenzer@amd.com> >>> >>> In the words of Daniel Vetter: >>> >>> «I think SEARCH_BEST is pretty much always a bad idea - it rips apart >>> allocations from the same execbuf, and usually those get recycled around >>> the same time. Which means you'll just fragment your mm even more if you >>> try to find the best hole instead of just picking one and the stuffing >>> the entire execbuf into it. So imo we might as well just kill it.» >>> >>> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> >> >> >> Applied this series to my -next tree. > > > Thanks. Unfortunately, the removal of DRM_MM_SEARCH_BEST caused performance > regressions, presumably due to excessive eviction from VRAM, see > https://bugs.freedesktop.org/show_bug.cgi?id=84662#c49 . I can reproduce a > significant bad effect with the UE4 Elemental demo on my development > machine. Please drop this patch. > > The original patches 4 & 5 have the same effect, so they shouldn't be > applied either. Maybe we can try this again once TTM is using the MM scan > roaster for eviction. > I dropped this patch. Thanks, Alex _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 5/5] drm/radeon: Use TTM_PL_FLAG_BOTTOMUP 2014-10-28 9:35 [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Michel Dänzer ` (2 preceding siblings ...) 2014-10-28 9:35 ` [PATCH 4/5] drm/ttm: Add TTM_PL_FLAG_BOTTOMUP Michel Dänzer @ 2014-10-28 9:35 ` Michel Dänzer 2014-10-28 10:46 ` [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Lauri Kasanen 4 siblings, 0 replies; 14+ messages in thread From: Michel Dänzer @ 2014-10-28 9:35 UTC (permalink / raw) To: dri-devel; +Cc: Lauri Kasanen From: Michel Dänzer <michel.daenzer@amd.com> We want to allocate small BOs strictly from the bottom up, just as large BOs are allocated strictly from the top down. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> --- drivers/gpu/drm/radeon/radeon_object.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index d23b0dd..d9f7a02 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -180,9 +180,11 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain) * 512kb was measured as the most optimal number. */ if (rbo->tbo.mem.size > 512 * 1024) { - for (i = 0; i < c; i++) { + for (i = 0; i < c; i++) rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN; - } + } else { + for (i = 0; i < c; i++) + rbo->placements[i].flags |= TTM_PL_FLAG_BOTTOMUP; } } -- 2.1.1 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs 2014-10-28 9:35 [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Michel Dänzer ` (3 preceding siblings ...) 2014-10-28 9:35 ` [PATCH 5/5] drm/radeon: Use TTM_PL_FLAG_BOTTOMUP Michel Dänzer @ 2014-10-28 10:46 ` Lauri Kasanen 4 siblings, 0 replies; 14+ messages in thread From: Lauri Kasanen @ 2014-10-28 10:46 UTC (permalink / raw) To: Michel Dänzer; +Cc: dri-devel On Tue, 28 Oct 2014 18:35:02 +0900 Michel Dänzer <michel@daenzer.net> wrote: > From: Michel Dänzer <michel.daenzer@amd.com> > > I wasn't sure if TTM_PL_FLAG_TOPDOWN works correctly with non-0 lpfn, but > AFAICT it does. > > Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> This series is Reviewed-by: Lauri Kasanen <cand@gmx.com> - Lauri _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-11-04 14:30 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-10-28 9:35 [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Michel Dänzer 2014-10-28 9:35 ` [PATCH 2/5] drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN Michel Dänzer 2014-10-28 9:35 ` [PATCH 3/5] drm/ttm: Use only " Michel Dänzer 2014-10-28 11:10 ` Daniel Vetter 2014-10-30 8:41 ` Michel Dänzer 2014-10-30 15:07 ` Thomas Hellstrom 2014-10-31 17:19 ` Daniel Vetter 2014-10-28 9:35 ` [PATCH 4/5] drm/ttm: Add TTM_PL_FLAG_BOTTOMUP Michel Dänzer 2014-10-30 8:36 ` [PATCH 4/4] drm/mm: Remove DRM_MM_SEARCH_BEST Michel Dänzer 2014-10-31 18:10 ` Alex Deucher 2014-11-04 8:47 ` Michel Dänzer 2014-11-04 14:30 ` Alex Deucher 2014-10-28 9:35 ` [PATCH 5/5] drm/radeon: Use TTM_PL_FLAG_BOTTOMUP Michel Dänzer 2014-10-28 10:46 ` [PATCH 1/5] drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs Lauri Kasanen
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.