From mboxrd@z Thu Jan  1 00:00:00 1970
From: Minchan Kim
Subject: Re: [PATCH] CMA/HOTPLUG: clear buffer-head lru before page migration
Date: Mon, 21 Jul 2014 16:36:51 +0900
Message-ID: <20140721073651.GA15912@bbox>
References: <53C8C290.90503@lge.com> <20140721025047.GA7707@bbox> <53CCB02A.7070301@lge.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Cc: Andrew Morton, '김준수', Laura Abbott, Michal Nazarewicz, Marek Szyprowski, Alexander Viro, Johannes Weiner, Mel Gorman, linux-kernel@vger.kernel.org, linux-mm@kvack.org, 이건호, 'Chanho Min', linux-fsdevel@vger.kernel.org
To: Gioh Kim
Content-Disposition: inline
In-Reply-To: <53CCB02A.7070301@lge.com>
Sender: owner-linux-mm@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

On Mon, Jul 21, 2014 at 03:16:10PM +0900, Gioh Kim wrote:
>
>
> On 2014-07-21 at 11:50 AM, Minchan Kim wrote:
> >Hi Gioh,
> >
> >On Fri, Jul 18, 2014 at 03:45:36PM +0900, Gioh Kim wrote:
> >>
> >>Hi,
> >>
> >>For page migration of CMA, the buffer-heads in the lru should be dropped.
> >>Please refer to https://lkml.org/lkml/2014/7/4/101 for the history.
> >
> >Just a nit:
> >Please describe the *problem* in the description instead of giving a URL link.
> >
> >>
> >>I have two solutions to drop the bhs.
> >>One is invalidating the entire lru.
> >
> >You mean all of the percpu bh_lrus, so if the system has N CPUs,
> >it invalidates N * 8 entries?
>
> Yes, every bh_lru of all cpus.
>
> >
> >>Another is searching the lru and dropping only one bh, which Laura proposed
> >>at https://lkml.org/lkml/2012/8/31/313.
> >>
> >>I'm not sure which has better performance.
> >
> >For whom? The system or the requestor of CMA?
>
> For system performance.
>
> >
> >>So I ran a performance test on my cortex-a7 platform with Lmbench,
> >>which has a "File & VM system latencies" test.
> >>I am attaching the results.
> >>The first line is for invalidating the entire lru and the second is for dropping the selected bh.
> >
> >You mean you ran Lmbench with CMA allocation in the background?
> >Could you describe it in detail?
>
> I'm sorry for not mentioning the background.
> I ran the test without CMA allocation because I wanted to check how the change affects system performance.
>
> The first test, invalidating the entire lru, adds invalidate_bh_lrus() to alloc_contig_range().
> This does not affect system performance because alloc_contig_range() is not called
> for usual file-system management.
> The result of the first test is the *default system performance.*
>
> The second test, dropping all bh in lru, adds code to drop_buffers().
> Every call of drop_buffers() drops all bhs in the lru of every cpu.
> It could affect system performance. *But* it does not affect system performance,
> because it drops only the bh of migrated pages.
>
>
> >
> >>
> >>File & VM system latencies in microseconds - smaller is better
> >>-------------------------------------------------------------------------------
> >>Host                 OS    0K File       10K File     Mmap   Prot   Page   100fd
> >>                        Create Delete Create Delete Latency Fault  Fault  selct
> >>--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
> >>10.178.33 Linux 3.10.19   25.1   19.6   32.6   19.7  5098.0 0.666 3.45880 6.506
> >>10.178.33 Linux 3.10.19   24.9   19.5   32.3   19.4  5059.0 0.563 3.46380 6.521
> >>
> >>
> >>I tried several times and the results show that they are the same within a 1% gap,
> >>except for Protection Fault.
> >>But the latency of Protection Fault is very small and I think it has little effect.
> >>
> >>Therefore we can choose either, but I choose invalidating the entire lru.
> >
> >Not sure we can conclude that.
> >
> >A few weeks ago, I saw a patch which increases the size of bh_lrus.
> >https://lkml.org/lkml/2014/7/4/107
> >IOW, some workloads are really affected by the percpu bh_lrus, so it would be
> >better to be careful about draining them, I think.
> >
> >You want to argue that CMA allocation is rare so the cost is marginal.
> >It might be, but some usecases might call it frequently with small requests
> >(e.g., 8K, 16K).
> >
> >Anyway, why can't CMA bear the cost without affecting other subsystems?
> >I mean it's okay for CMA to spend more time shooting out the bh
> >instead of simply invalidating all bh_lrus, because big-order allocation is
> >kind of a slow thing in the VM, everybody already knows that, and it even
> >fails sometimes, so it's okay to add more code to that extremely slow path.
>
> There are 2 reasons to invalidate the entire bh_lru.
>
> 1. I think CMA allocation is very rare, so invalidating bh_lru affects the system little.
> What do you think about it? My platform does not call CMA allocation often.
> Is CMA allocation or Memory-Hotplug called often?

It depends on the usecase and you can't assume anything, because we can't
tell everyone in the world, "Please ask us whenever you try to use CMA".

The key point is how maintainable the patch is.
If it's too complicated to maintain, maybe we could go with the simple solution,
but if it's not too complicated, we can go with the smarter thing to cover
other cases in the future. Why not?

Another point is how a user can detect where a regression comes from.
If we cannot notice the regression, it's not a good idea to go with the simple
version.

>
> 2. Adding code in drop_buffers() can affect the system more than adding code in alloc_contig_range()
> because drop_buffers() does not have a way to distinguish the migrate type.
> Even though the lmbench results show that it has almost the same performance,
> I am afraid that it can change.
> As you said, if the bh_lru size can be changed, it will affect more than now.
> So I do not want to touch non-CMA related code.

I'm not saying to add a hook in drop_buffers.
What I suggest is to handle the failure caused by bh_lrus in migrate_pages,
because it's not a problem only in CMA.
There is already retry logic in migrate_pages, so I think you could
handle it there.

>
>
> >
> >>try_to_free_buffers(), which calls drop_buffers(), is called by a lot of filesystem code.
> >>So I think inserting code in drop_buffers() can affect the system.
> >>And also we cannot distinguish the migration type in drop_buffers().
> >>
> >>In alloc_contig_range() we can distinguish the migration type and invalidate the lru if needed.
> >>I think alloc_contig_range() is the proper place to deal with bh, as in the following patch.
> >>
> >>Laura, can I have your name on the Acked-by line?
> >>Please let me express my thanks.
> >>
> >>Thanks for any feedback.
> >>
> >>------------------------------- 8< ----------------------------------
> >>
> >>>From 33c894b1bab9bc26486716f0c62c452d3a04d35d Mon Sep 17 00:00:00 2001
> >>From: Gioh Kim
> >>Date: Fri, 18 Jul 2014 13:40:01 +0900
> >>Subject: [PATCH] CMA/HOTPLUG: clear buffer-head lru before page migration
> >>
> >>The bh must be free to migrate a page at which bh is mapped.
> >>The reference count of bh is increased when it is installed
> >>into lru so that the bh of lru must be freed before migrating the page.
> >>
> >>This frees every bh of lru. We could free only bh of migrating page.
> >>But searching lru costs more than invalidating entire lru.
> >>
> >>Signed-off-by: Gioh Kim
> >>Acked-by: Laura Abbott
> >>---
> >> mm/page_alloc.c |    3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >>diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>index b99643d4..3b474e0 100644
> >>--- a/mm/page_alloc.c
> >>+++ b/mm/page_alloc.c
> >>@@ -6369,6 +6369,9 @@ int alloc_contig_range(unsigned long start, unsigned long end,
> >> 	if (ret)
> >> 		return ret;
> >>
> >>+	if (migratetype == MIGRATE_CMA || migratetype == MIGRATE_MOVABLE)
> >>+		invalidate_bh_lrus();
> >>+
> >
> >Q1. Is it only a CMA problem? Is Memory-Hotplug not a problem? Or other places?
> >
> >I mean it would be better to handle it in a generic way.
>
> Only CMA and Memory-Hotplug need it.

Does memory-hotplug use alloc_contig_range?
You are adding the logic in alloc_contig_range, and it is used for hugetlb
and cma.

> And I think invalidate_bh_lrus() is general.

It couldn't handle memory-hotplug.

>
> >
> >Q2. Why do you call it right before calling __alloc_contig_migrate_range?
> >
> >Some pages will go into bh_lrus during __alloc_contig_migrate_range.
> >In that case, it is useless without the caller's retry logic.
> >Even if you do it from the caller's retry logic, it's not a good idea because
> >you make a new binding between alloc_contig_range and the upper layer.
> >
> >So, IMHO, it would be better to handle it in migrate_pages.
> >Maybe we could define a new API, try_to_drop_buffers, which calls
> >try_to_free_buffers and then, only if that function fails due to the
> >percpu lru count, drains only that bh from the percpu lru lists instead of
> >draining all bh. And places in the migration path should use it rather than
> >try_to_release_page.
> >
> >But the problem is that this approach invents a new API which has to be
> >maintained, so I'm not sure Andrew thinks it's worth it.
> >Maybe we should see the code and diffstat.
>
> I am also considering making a new function, drop_bh_of_migrate_page, in migrate_pages(), just before unmap_and_move().
> migrate_pages() has an argument, reason, that distinguishes the migrate type: MR_CMA, MR_MEMORY_HOTPLUG or others.

Yes, that's what I suggested. If you see -EAGAIN, maybe you could do it.
We could even enhance it by invalidating only the target bh instead of
all bhs, so you could make two patches.

1. use invalidate_bh_lrus in migrate_pages
2. invalidate only the failed bh instead of flushing the percpu bh_lrus of all CPUs.

So, if people hate 2, which is rather overdesigned, we could drop 2 and 1 is
still mergeable.

>
> But I DO NOT WANT TO touch non-CMA related code.
> The current CMA and Memory-Hotplug code is not mature, so I am not sure it is ok to touch non-CMA related code for CMA/Memory-Hotplug.
>
> My point is:
> 1. CMA/Memory-hotplug is rare and invalidating bh-lru is also rare.
> 2. Only change CMA/Memory-hotplug related code.
>
> >
> >Overengineering?
> >
> >> 	ret = __alloc_contig_migrate_range(&cc, start, end);
> >> 	if (ret)
> >> 		goto done;
> >>--
> >>1.7.9.5
> >>
> >>--
> >>To unsubscribe, send a message with 'unsubscribe linux-mm' in
> >>the body to majordomo@kvack.org.  For more info on Linux MM,
> >>see: http://www.linux-mm.org/ .
> >>Don't email: email@kvack.org
> >
>

--
Kind regards,
Minchan Kim
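
The two options debated in this thread (flushing every percpu bh_lru versus dropping only the bh that pin the page being migrated) can be pictured with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: `struct toy_bh`, `toy_lrus`, `toy_lru_install()`, `toy_invalidate_all()`, `toy_invalidate_page()` and `toy_can_migrate()` are hypothetical names invented for illustration, the CPU count is arbitrary, and locking, eviction and per-CPU access rules are all ignored. The only things taken from the discussion are that each CPU keeps 8 entries and that an entry in the lru holds a reference on its bh.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS     4  /* hypothetical CPU count for this model */
#define TOY_BH_LRU_SIZE 8  /* slots per CPU, the "8" in "N * 8" */

/* Toy stand-in for struct buffer_head: only what the model needs. */
struct toy_bh {
	int page_id;   /* page this buffer belongs to */
	int refcount;  /* each LRU slot that caches the bh adds one */
};

/* One small LRU array per CPU; a NULL slot is empty. */
static struct toy_bh *toy_lrus[TOY_NR_CPUS][TOY_BH_LRU_SIZE];

/* Cache bh on one CPU, taking a reference; fails if that LRU is full. */
bool toy_lru_install(int cpu, struct toy_bh *bh)
{
	for (int i = 0; i < TOY_BH_LRU_SIZE; i++) {
		if (toy_lrus[cpu][i] == NULL) {
			toy_lrus[cpu][i] = bh;
			bh->refcount++;
			return true;
		}
	}
	return false;
}

/* Shared walker: drop slots matching page_id, or every slot if page_id < 0. */
static int toy_drop(int page_id)
{
	int dropped = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		for (int i = 0; i < TOY_BH_LRU_SIZE; i++) {
			struct toy_bh *bh = toy_lrus[cpu][i];

			if (bh == NULL || (page_id >= 0 && bh->page_id != page_id))
				continue;
			bh->refcount--;
			toy_lrus[cpu][i] = NULL;
			dropped++;
		}
	}
	return dropped;
}

/* Option 1: flush every slot on every CPU. */
int toy_invalidate_all(void)
{
	return toy_drop(-1);
}

/* Option 2: drop only the slots that pin the page being migrated. */
int toy_invalidate_page(int page_id)
{
	return toy_drop(page_id);
}

/* In this model a page can migrate only when no reference pins its bh. */
bool toy_can_migrate(const struct toy_bh *bh)
{
	assert(bh->refcount >= 0);
	return bh->refcount == 0;
}
```

In this model, if one bh is cached on two CPUs and a second bh on one CPU, toy_invalidate_page() for the first page drops two slots and leaves the second bh cached, while toy_invalidate_all() empties every slot regardless of which page is being migrated. That is the trade-off argued above: the targeted drop keeps unrelated cache entries, at the price of more code to maintain.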