[PATCH] mm/huge_memory: fix the memory leak due to the race

* [PATCH] mm/huge_memory: fix the memory leak due to the race
@ 2016-06-21 14:05 zhongjiang
  2016-06-21 14:32 ` kbuild test robot
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: zhongjiang @ 2016-06-21 14:05 UTC (permalink / raw)
  To: mhocko, akpm; +Cc: linux-mm, linux-kernel

From: zhong jiang <zhongjiang@huawei.com>

Under heavy memory pressure, I ran some test cases. As a result, I found
that a THP is not freed; this is detected by check_mm().

BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
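
For readers decoding the report above, here is a small sketch of the
arithmetic. It assumes x86-64 with 4 KiB base pages and 2 MiB PMD-mapped huge
pages, and it assumes the v4.5-era counter order in which index 1 is
MM_ANONPAGES. The helper leaked_huge_pages() is hypothetical and exists only
for illustration. Under those assumptions, val:512 corresponds to exactly one
huge page that is still accounted as anonymous memory.

```c
#include <assert.h>

/* Assumed x86-64 geometry: 4 KiB base pages, 2 MiB PMD-mapped huge pages. */
#define PAGE_SHIFT      12
#define HPAGE_PMD_SHIFT 21
#define HPAGE_PMD_NR    (1UL << (HPAGE_PMD_SHIFT - PAGE_SHIFT))

/* Assumed v4.5-era counter order; check include/linux/mm_types.h in your tree. */
enum { MM_FILEPAGES, MM_ANONPAGES, MM_SWAPENTS, MM_SHMEMPAGES, NR_MM_COUNTERS };

/* Hypothetical helper: whole huge pages that a leftover counter value covers. */
static unsigned long leaked_huge_pages(unsigned long val)
{
	assert(val % HPAGE_PMD_NR == 0); /* the sketch only handles whole THPs */
	return val / HPAGE_PMD_NR;
}
```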

Consider the following race :

	CPU0                               CPU1
  __handle_mm_fault()
        wp_huge_pmd()
   	    do_huge_pmd_wp_page()
		pmdp_huge_clear_flush_notify()
                (pmd_none = true)
					exit_mmap()
					   unmap_vmas()
					     zap_pmd_range()
						pmd_none_or_trans_huge_or_clear_bad()
						   (result in memory leak)
                set_pmd_at()

CPU0 has already allocated the huge page before calling
pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
the memory leak can occur.

The patch fixes the scenario in which the pmd entry can become none.
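
The ordering claimed in the diagram can be replayed in a single-threaded
user-space toy model. This is only a sketch: toy_mm, toy_zap() and
toy_replay() are made-up names, the entry values are arbitrary, and nothing
here is kernel code. It shows why a zap that runs while the entry is none
would leave the counter behind; it does not show that the kernel permits this
ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one PMD slot; 0 means "none". Not the kernel's pmd_t. */
typedef unsigned long toy_pmd_t;

struct toy_mm {
	toy_pmd_t pmd;     /* the single PMD entry under test */
	long anon_pages;   /* models the MM_ANONPAGES rss counter */
};

/* Models the zap path: a "none" entry is skipped, so nothing is uncharged. */
static void toy_zap(struct toy_mm *mm)
{
	if (mm->pmd == 0)
		return;
	mm->pmd = 0;
	mm->anon_pages -= 512;
}

/* Replays clear, optional zap inside the window, then set. */
static long toy_replay(bool zap_in_window)
{
	struct toy_mm mm = { .pmd = 0x1000, .anon_pages = 512 };

	mm.pmd = 0;              /* stands for pmdp_huge_clear_flush_notify() */
	if (zap_in_window)
		toy_zap(&mm);    /* sees a none entry and skips it */
	mm.pmd = 0x2000;         /* stands for set_pmd_at() with the new page */
	if (!zap_in_window)
		toy_zap(&mm);    /* ordinary teardown after the fault finished */
	return mm.anon_pages;    /* the value a check_mm()-style audit would see */
}
```

In this model toy_replay(true) leaves 512 pages accounted, while
toy_replay(false) returns to zero.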

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 mm/huge_memory.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e10a4fe..ef04b94 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1340,11 +1340,11 @@ alloc:
 		pmd_t entry;
 		entry = mk_huge_pmd(new_page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
-		pmdp_huge_clear_flush_notify(vma, haddr, pmd);
+		pmdp_invalidate(vma, haddr, pmd);	
 		page_add_new_anon_rmap(new_page, vma, haddr, true);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_active_or_unevictable(new_page, vma);
-		set_pmd_at(mm, haddr, pmd, entry);
+		pmd_populate(mm, pmd, entry);
 		update_mmu_cache_pmd(vma, address, pmd);
 		if (!page) {
 			add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 14:05 [PATCH] mm/huge_memory: fix the memory leak due to the race zhongjiang
@ 2016-06-21 14:32 ` kbuild test robot
  2016-06-21 14:37 ` Kirill A. Shutemov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: kbuild test robot @ 2016-06-21 14:32 UTC (permalink / raw)
  To: zhongjiang; +Cc: kbuild-all, mhocko, akpm, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2494 bytes --]

Hi,

[auto build test ERROR on v4.7-rc4]
[also build test ERROR on next-20160621]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/zhongjiang/mm-huge_memory-fix-the-memory-leak-due-to-the-race/20160621-221736
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 5.3.1-8) 5.3.1 20160205
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64 

All errors (new ones prefixed by >>):

   In file included from arch/sparc/include/asm/pgalloc.h:4:0,
                    from arch/sparc/include/asm/tlb_64.h:6,
                    from arch/sparc/include/asm/tlb.h:4,
                    from mm/huge_memory.c:34:
   mm/huge_memory.c: In function 'do_huge_pmd_wp_page':
>> mm/huge_memory.c:1383:25: error: incompatible type for argument 3 of 'pmd_set'
      pmd_populate(mm, pmd, entry);
                            ^
   arch/sparc/include/asm/pgalloc_64.h:72:54: note: in definition of macro 'pmd_populate'
    #define pmd_populate(MM, PMD, PTE)  pmd_set(MM, PMD, PTE)
                                                         ^
   In file included from arch/sparc/include/asm/pgtable.h:4:0,
                    from include/linux/mm.h:68,
                    from mm/huge_memory.c:10:
   arch/sparc/include/asm/pgtable_64.h:796:20: note: expected 'pte_t * {aka struct <anonymous> *}' but argument is of type 'pmd_t {aka struct <anonymous>}'
    static inline void pmd_set(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
                       ^

vim +/pmd_set +1383 mm/huge_memory.c

  1377			entry = mk_huge_pmd(new_page, vma->vm_page_prot);
  1378			entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
  1379			pmdp_invalidate(vma, haddr, pmd);	
  1380			page_add_new_anon_rmap(new_page, vma, haddr, true);
  1381			mem_cgroup_commit_charge(new_page, memcg, false, true);
  1382			lru_cache_add_active_or_unevictable(new_page, vma);
> 1383			pmd_populate(mm, pmd, entry);
  1384			update_mmu_cache_pmd(vma, address, pmd);
  1385			if (!page) {
  1386				add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 46455 bytes --]

* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 14:05 [PATCH] mm/huge_memory: fix the memory leak due to the race zhongjiang
  2016-06-21 14:32 ` kbuild test robot
@ 2016-06-21 14:37 ` Kirill A. Shutemov
  2016-06-21 15:19   ` zhong jiang
  2016-06-21 14:37 ` Michal Hocko
  2016-06-21 14:42 ` kbuild test robot
  3 siblings, 1 reply; 10+ messages in thread
From: Kirill A. Shutemov @ 2016-06-21 14:37 UTC (permalink / raw)
  To: zhongjiang; +Cc: mhocko, akpm, linux-mm, linux-kernel

On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
> From: zhong jiang <zhongjiang@huawei.com>
> 
> Under heavy memory pressure, I ran some test cases. As a result, I found
> that a THP is not freed; this is detected by check_mm().
> 
> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
> 
> Consider the following race :
> 
> 	CPU0                               CPU1
>   __handle_mm_fault()
>         wp_huge_pmd()
>    	    do_huge_pmd_wp_page()
> 		pmdp_huge_clear_flush_notify()
>                 (pmd_none = true)
> 					exit_mmap()
> 					   unmap_vmas()
> 					     zap_pmd_range()
> 						pmd_none_or_trans_huge_or_clear_bad()
> 						   (result in memory leak)
>                 set_pmd_at()
> 
> CPU0 has already allocated the huge page before calling
> pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
> the memory leak can occur.
> 
> The patch fixes the scenario in which the pmd entry can become none.

I don't think the scenario is possible.

exit_mmap() is called only when all mm users have gone, so no parallel
threads exist.
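
A toy user-space model of that argument, for illustration only: every task
sharing the address space holds one mm_users reference, and only the final
put reaches the teardown. The toy_ names are made up, and the real mmput()
does considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the mm user count; not kernel code. */
struct toy_mm {
	int mm_users;     /* one reference per task using the address space */
	bool torn_down;   /* set once the toy exit_mmap() has run */
};

static void toy_exit_mmap(struct toy_mm *mm)
{
	assert(mm->mm_users == 0); /* nobody is left who could take a fault */
	mm->torn_down = true;
}

/* Shape of mmput(): only the final put tears the mappings down. */
static void toy_mmput(struct toy_mm *mm)
{
	assert(mm->mm_users > 0);
	if (--mm->mm_users == 0)
		toy_exit_mmap(mm);
}
```

With two users, the first toy_mmput() leaves the mappings alone and only the
second one runs the teardown, so a fault handler and the teardown never
overlap in this model.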

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 14:05 [PATCH] mm/huge_memory: fix the memory leak due to the race zhongjiang
  2016-06-21 14:32 ` kbuild test robot
  2016-06-21 14:37 ` Kirill A. Shutemov
@ 2016-06-21 14:37 ` Michal Hocko
  2016-06-21 14:42 ` kbuild test robot
  3 siblings, 0 replies; 10+ messages in thread
From: Michal Hocko @ 2016-06-21 14:37 UTC (permalink / raw)
  To: zhongjiang; +Cc: akpm, linux-mm, linux-kernel, Kirill A. Shutemov

[CCing Kirill]

On Tue 21-06-16 22:05:56, zhongjiang wrote:
> From: zhong jiang <zhongjiang@huawei.com>
> 
> Under heavy memory pressure, I ran some test cases. As a result, I found
> that a THP is not freed; this is detected by check_mm().
> 
> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
> 
> Consider the following race :
> 
> 	CPU0                               CPU1
>   __handle_mm_fault()
>         wp_huge_pmd()
>    	    do_huge_pmd_wp_page()
> 		pmdp_huge_clear_flush_notify()
>                 (pmd_none = true)
> 					exit_mmap()
> 					   unmap_vmas()
> 					     zap_pmd_range()
> 						pmd_none_or_trans_huge_or_clear_bad()
> 						   (result in memory leak)
>                 set_pmd_at()
>
> CPU0 has already allocated the huge page before calling
> pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
> the memory leak can occur.

I do not understand this description. CPU1 is in the exit path with the last
mm user gone. So CPU0 is a different process with its own mm. How can
they influence each other? But maybe I am just missing your point.

> The patch fixes the scenario in which the pmd entry can become none.
> 
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> ---
>  mm/huge_memory.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e10a4fe..ef04b94 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1340,11 +1340,11 @@ alloc:
>  		pmd_t entry;
>  		entry = mk_huge_pmd(new_page, vma->vm_page_prot);
>  		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> -		pmdp_huge_clear_flush_notify(vma, haddr, pmd);
> +		pmdp_invalidate(vma, haddr, pmd);	
>  		page_add_new_anon_rmap(new_page, vma, haddr, true);
>  		mem_cgroup_commit_charge(new_page, memcg, false, true);
>  		lru_cache_add_active_or_unevictable(new_page, vma);
> -		set_pmd_at(mm, haddr, pmd, entry);
> +		pmd_populate(mm, pmd, entry);
>  		update_mmu_cache_pmd(vma, address, pmd);
>  		if (!page) {
>  			add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 14:05 [PATCH] mm/huge_memory: fix the memory leak due to the race zhongjiang
                   ` (2 preceding siblings ...)
  2016-06-21 14:37 ` Michal Hocko
@ 2016-06-21 14:42 ` kbuild test robot
  3 siblings, 0 replies; 10+ messages in thread
From: kbuild test robot @ 2016-06-21 14:42 UTC (permalink / raw)
  To: zhongjiang; +Cc: kbuild-all, mhocko, akpm, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2041 bytes --]

Hi,

[auto build test ERROR on v4.7-rc4]
[also build test ERROR on next-20160621]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/zhongjiang/mm-huge_memory-fix-the-memory-leak-due-to-the-race/20160621-221736
config: s390-allyesconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 5.3.1-8) 5.3.1 20160205
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=s390 

All errors (new ones prefixed by >>):

   mm/huge_memory.c: In function 'do_huge_pmd_wp_page':
>> mm/huge_memory.c:1383:25: error: incompatible type for argument 3 of 'pmd_populate'
      pmd_populate(mm, pmd, entry);
                            ^
   In file included from arch/s390/include/asm/tlbflush.h:7:0,
                    from include/linux/hugetlb.h:21,
                    from mm/huge_memory.c:13:
   arch/s390/include/asm/pgalloc.h:120:20: note: expected 'pgtable_t {aka struct <anonymous> *}' but argument is of type 'pmd_t {aka struct <anonymous>}'
    static inline void pmd_populate(struct mm_struct *mm,
                       ^

vim +/pmd_populate +1383 mm/huge_memory.c

  1377			entry = mk_huge_pmd(new_page, vma->vm_page_prot);
  1378			entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
  1379			pmdp_invalidate(vma, haddr, pmd);	
  1380			page_add_new_anon_rmap(new_page, vma, haddr, true);
  1381			mem_cgroup_commit_charge(new_page, memcg, false, true);
  1382			lru_cache_add_active_or_unevictable(new_page, vma);
> 1383			pmd_populate(mm, pmd, entry);
  1384			update_mmu_cache_pmd(vma, address, pmd);
  1385			if (!page) {
  1386				add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR);

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 41724 bytes --]

* [PATCH] mm/huge_memory: fix the memory leak due to the race
@ 2016-06-21 14:57 zhongjiang
  0 siblings, 0 replies; 10+ messages in thread
From: zhongjiang @ 2016-06-21 14:57 UTC (permalink / raw)
  To: mhocko, akpm; +Cc: linux-mm, linux-kernel

From: zhong jiang <zhongjiang@huawei.com>

Under heavy memory pressure, I ran some test cases. As a result, I found
that a THP is not freed; this is detected by check_mm().

BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512

Consider the following race :

	CPU0                               CPU1
  __handle_mm_fault()
        wp_huge_pmd()
   	    do_huge_pmd_wp_page()
		pmdp_huge_clear_flush_notify()
                (pmd_none = true)
					exit_mmap()
					   unmap_vmas()
					     zap_pmd_range()
						pmd_none_or_trans_huge_or_clear_bad()
						   (result in memory leak)
                set_pmd_at()

CPU0 has already allocated the huge page before calling
pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
the memory leak can occur.

The patch fixes the scenario in which the pmd entry can become none.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e10a4fe..95c7dfe 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1340,7 +1340,7 @@ alloc:
 		pmd_t entry;
 		entry = mk_huge_pmd(new_page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
-		pmdp_huge_clear_flush_notify(vma, haddr, pmd);
+		pmdp_invalidate(vma, haddr, pmd);
 		page_add_new_anon_rmap(new_page, vma, haddr, true);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_active_or_unevictable(new_page, vma);
-- 
1.8.3.1


* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 14:37 ` Kirill A. Shutemov
@ 2016-06-21 15:19   ` zhong jiang
  2016-06-21 15:29     ` Kirill A. Shutemov
  0 siblings, 1 reply; 10+ messages in thread
From: zhong jiang @ 2016-06-21 15:19 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: mhocko, akpm, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5529 bytes --]

On 2016/6/21 22:37, Kirill A. Shutemov wrote:
> On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
>> From: zhong jiang <zhongjiang@huawei.com>
>>
>> Under heavy memory pressure, I ran some test cases. As a result, I found
>> that a THP is not freed; this is detected by check_mm().
>>
>> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
>>
>> Consider the following race :
>>
>> 	CPU0                               CPU1
>>   __handle_mm_fault()
>>         wp_huge_pmd()
>>    	    do_huge_pmd_wp_page()
>> 		pmdp_huge_clear_flush_notify()
>>                 (pmd_none = true)
>> 					exit_mmap()
>> 					   unmap_vmas()
>> 					     zap_pmd_range()
>> 						pmd_none_or_trans_huge_or_clear_bad()
>> 						   (result in memory leak)
>>                 set_pmd_at()
>>
>> CPU0 has already allocated the huge page before calling
>> pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
>> the memory leak can occur.
>>
>> The patch fixes the scenario in which the pmd entry can become none.
> I don't think the scenario is possible.
>
> exit_mmap() is called only when all mm users have gone, so no parallel
> threads exist.
>
 Forget this patch. It's my fault; the race indeed does not exist.
 But I hit the following problem: we can see the memory leak when the process exits.

 Any suggestions will be appreciated.
 Thanks
 zhongjiang

Authorized users only. All activities may be monitored and reported.
cluster-103 login: [23966.710772] mm/pgtable-generic.c:33: bad pmd ffff88217f4bdcd8(0000012c4d6001e2)
[29611.096341] BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
[29611.103071] BUG: non-zero nr_ptes on freeing mm: 1
[35333.076266] mm/pgtable-generic.c:33: bad pmd ffff88218c2719c8(0000012ed7a001e2)
[35929.241588] mm/pgtable-generic.c:33: bad pmd ffff8811ba295bb8(0000092cd10001e2)
[36398.205178] mm/pgtable-generic.c:33: bad pmd ffff8821b94a4f20(00000014bae001e2)
[36469.518251] mm/pgtable-generic.c:33: bad pmd ffff8827dc401e78(0000190e000001e2)
[37856.015724] mm/pgtable-generic.c:33: bad pmd ffff8821a7468a68(0000032d40e001e2)
[40630.459617] mm/pgtable-generic.c:33: bad pmd ffff8820a53b4f68(000001264aa001e2)
[41973.235225] mm/pgtable-generic.c:33: bad pmd ffff8827d57d3b48(00000926f86001e2)
[42943.434794] mm/pgtable-generic.c:33: bad pmd ffff8827d14b4d40(000009268b6001e2)
[43142.718195] mm/pgtable-generic.c:33: bad pmd ffff8827e8efb0f8(00000014f8a001e2)
[43366.878885] mm/pgtable-generic.c:33: bad pmd ffff8827fc40e000(00000013cb8001e2)
[44153.258076] mm/pgtable-generic.c:33: bad pmd ffff8821aa8fee88(0000082f07e001e2)
[44693.401966] mm/pgtable-generic.c:33: bad pmd ffff8814a55d1dc0(0000092f558001e2)
[44835.648216] general protection fault: 0000 [#1] SMP
i tg3 libahci ptp libata pps_core megaraid_sas dm_mirror dm_region_hash dm_log dm_mod
[44835.698547] CPU: 366 PID: 613011 Comm: sh Not tainted 4.5.0-bisect+ #7
[44835.705073] Hardware name: To be filled by O.E.M. FusionServer9032/IT91SMUB, BIOS BLXSV102 04/26/2016
[44835.714289] task: ffff882813bc8000 ti: ffff8827fb7bc000 task.ti: ffff8827fb7bc000
[44835.721768] RIP: 0010:[<ffffffff8169aaef>] [<ffffffff8169aaef>] down_write+0x1f/0x40
[44835.729687] RSP: 0018:ffff8827fb7bfb48 EFLAGS: 00010246
[44835.735000] RAX: 8000082fddd9a02f RBX: 8000082fddd9a02f RCX: ffffea04b1358000
[44835.742127] RDX: ffffffff00000001 RSI: ffff88219bb8a760 RDI: 8000082fddd9a02f
[44835.749250] RBP: ffff8827fb7bfb50 R08: ffffffff81a64bf0 R09: ffffffff81a68c90
[44835.756379] R10: ffffffff81a68c7f R11: 0000000000000000 R12: 0000000000000000
[44835.763501] R13: ffffea0031f585c0 R14: ffffea04b1f50200 R15: ffffea04b1358000
[44835.770630] FS: 00007f0514771740(0000) GS:ffff8828f0e80000(0000) knlGS:0000000000000000
[44835.778698] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44835.784445] CR2: 00007f0514778000 CR3: 00000021715fd000 CR4: 00000000001406e0
[44835.791577] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[44835.798701] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[44835.805823] Stack:
[44835.807859] ffffea04b1358000 ffff8827fb7bfc08 ffffffff811f5346 ffff8827cf0de180
[44835.815349] 0000000000f38713 ffff8827fb7bfbc8 ffffffff811c4a4f ffff8827d15f3c30
[44835.822844] 000000000000062d 000000000062d000 ffff882ffffbc000 ffffea04b1357fc0
[44835.830346] Call Trace:
[44835.832900] [<ffffffff811f5346>] split_huge_page_to_list+0x66/0xa20
[44835.839314] [<ffffffff811c4a4f>] ? rmap_walk+0x28f/0x3a0
[44835.844742] [<ffffffff811ed6ec>] migrate_pages+0x8dc/0x950
[44835.850364] [<ffffffff812023f0>] ? test_pages_isolated+0x1d0/0x1d0
[44835.856683] [<ffffffff816926db>] __offline_pages.constprop.28+0x4bb/0x7f0
[44835.863595] [<ffffffff811eac11>] offline_pages+0x11/0x20
[44835.869033] [<ffffffff81475527>] memory_subsys_offline+0x47/0x70
[44835.875184] [<ffffffff8145e10a>] device_offline+0x8a/0xb0
[44835.880696] [<ffffffff814752d6>] store_mem_state+0xc6/0xe0
[44835.886309] [<ffffffff8145b228>] dev_attr_store+0x18/0x30
[44835.891857] [<ffffffff8128958a>] sysfs_kf_write+0x3a/0x50
[44835.897361] [<ffffffff81288bf0>] kernfs_fop_write+0x120/0x170
[44835.903243] [<ffffffff8120b3f7>] __vfs_write+0x37/0x100
[44835.908609] [<ffffffff812b71dd>] ? security_file_permission+0x3d/0xc0
[44835.915209] [<ffffffff810c973f>] ? percpu_down_read+0x1f/0x50
[44835.921084] [<ffffffff8120c322>] vfs_write+0xa2/0x1a0
[44835.926276] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
[44835.932654] [<ffffffff8120d265>] SyS_write+0x55/0xc0
[44835.937723] [<ffffffff8169c66e>] entry_SYSCALL_64_fastpath+0x12/0x71


* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 15:19   ` zhong jiang
@ 2016-06-21 15:29     ` Kirill A. Shutemov
  2016-06-22  2:55       ` zhong jiang
  2016-06-22  9:52       ` zhong jiang
  0 siblings, 2 replies; 10+ messages in thread
From: Kirill A. Shutemov @ 2016-06-21 15:29 UTC (permalink / raw)
  To: zhong jiang; +Cc: mhocko, akpm, linux-mm, linux-kernel

On Tue, Jun 21, 2016 at 11:19:07PM +0800, zhong jiang wrote:
> On 2016/6/21 22:37, Kirill A. Shutemov wrote:
> > On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
> >> From: zhong jiang <zhongjiang@huawei.com>
> >>
> >> Under heavy memory pressure, I ran some test cases. As a result, I found
> >> that a THP is not freed; this is detected by check_mm().
> >>
> >> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
> >>
> >> Consider the following race :
> >>
> >> 	CPU0                               CPU1
> >>   __handle_mm_fault()
> >>         wp_huge_pmd()
> >>    	    do_huge_pmd_wp_page()
> >> 		pmdp_huge_clear_flush_notify()
> >>                 (pmd_none = true)
> >> 					exit_mmap()
> >> 					   unmap_vmas()
> >> 					     zap_pmd_range()
> >> 						pmd_none_or_trans_huge_or_clear_bad()
> >> 						   (result in memory leak)
> >>                 set_pmd_at()
> >>
> >> CPU0 has already allocated the huge page before calling
> >> pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
> >> the memory leak can occur.
> >>
> >> The patch fixes the scenario in which the pmd entry can become none.
> > I don't think the scenario is possible.
> >
> > exit_mmap() is called only when all mm users have gone, so no parallel
> > threads exist.
> >
>  Forget this patch. It's my fault; the race indeed does not exist.
>  But I hit the following problem: we can see the memory leak when the process exits.
>
>  Any suggestions will be appreciated.

Could you try this:

http://lkml.kernel.org/r/20160621150433.GA7536@node.shutemov.name

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 15:29     ` Kirill A. Shutemov
@ 2016-06-22  2:55       ` zhong jiang
  2016-06-22  9:52       ` zhong jiang
  1 sibling, 0 replies; 10+ messages in thread
From: zhong jiang @ 2016-06-22  2:55 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: mhocko, akpm, linux-mm, linux-kernel

On 2016/6/21 23:29, Kirill A. Shutemov wrote:
> On Tue, Jun 21, 2016 at 11:19:07PM +0800, zhong jiang wrote:
>> On 2016/6/21 22:37, Kirill A. Shutemov wrote:
>>> On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
>>>> From: zhong jiang <zhongjiang@huawei.com>
>>>>
>>>> Under heavy memory pressure, I ran some test cases. As a result, I found
>>>> that a THP is not freed; this is detected by check_mm().
>>>>
>>>> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
>>>>
>>>> Consider the following race :
>>>>
>>>> 	CPU0                               CPU1
>>>>   __handle_mm_fault()
>>>>         wp_huge_pmd()
>>>>    	    do_huge_pmd_wp_page()
>>>> 		pmdp_huge_clear_flush_notify()
>>>>                 (pmd_none = true)
>>>> 					exit_mmap()
>>>> 					   unmap_vmas()
>>>> 					     zap_pmd_range()
>>>> 						pmd_none_or_trans_huge_or_clear_bad()
>>>> 						   (result in memory leak)
>>>>                 set_pmd_at()
>>>>
>>>> CPU0 has already allocated the huge page before calling
>>>> pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
>>>> the memory leak can occur.
>>>>
>>>> The patch fixes the scenario in which the pmd entry can become none.
>>> I don't think the scenario is possible.
>>>
>>> exit_mmap() is called only when all mm users have gone, so no parallel
>>> threads exist.
>>>
>>  Forget this patch. It's my fault; the race indeed does not exist.
>>  But I hit the following problem: we can see the memory leak when the process exits.
>>
>>  Any suggestions will be appreciated.
> Could you try this:
>
> http://lkml.kernel.org/r/20160621150433.GA7536@node.shutemov.name
>
 I failed to open it. Could you paste the patch inline or send it as an attachment? :-)  Thanks


* Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
  2016-06-21 15:29     ` Kirill A. Shutemov
  2016-06-22  2:55       ` zhong jiang
@ 2016-06-22  9:52       ` zhong jiang
  1 sibling, 0 replies; 10+ messages in thread
From: zhong jiang @ 2016-06-22  9:52 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: mhocko, akpm, linux-mm, linux-kernel

On 2016/6/21 23:29, Kirill A. Shutemov wrote:
> On Tue, Jun 21, 2016 at 11:19:07PM +0800, zhong jiang wrote:
>> On 2016/6/21 22:37, Kirill A. Shutemov wrote:
>>> On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
>>>> From: zhong jiang <zhongjiang@huawei.com>
>>>>
>>>> Under heavy memory pressure, I ran some test cases. As a result, I found
>>>> that a THP is not freed; this is detected by check_mm().
>>>>
>>>> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
>>>>
>>>> Consider the following race :
>>>>
>>>> 	CPU0                               CPU1
>>>>   __handle_mm_fault()
>>>>         wp_huge_pmd()
>>>>    	    do_huge_pmd_wp_page()
>>>> 		pmdp_huge_clear_flush_notify()
>>>>                 (pmd_none = true)
>>>> 					exit_mmap()
>>>> 					   unmap_vmas()
>>>> 					     zap_pmd_range()
>>>> 						pmd_none_or_trans_huge_or_clear_bad()
>>>> 						   (result in memory leak)
>>>>                 set_pmd_at()
>>>>
>>>> CPU0 has already allocated the huge page before calling
>>>> pmdp_huge_clear_flush_notify(), which makes the pmd entry none. Therefore,
>>>> the memory leak can occur.
>>>>
>>>> The patch fixes the scenario in which the pmd entry can become none.
>>> I don't think the scenario is possible.
>>>
>>> exit_mmap() is called only when all mm users have gone, so no parallel
>>> threads exist.
>>>
>>  Forget this patch. It's my fault; the race indeed does not exist.
>>  But I hit the following problem: we can see the memory leak when the process exits.
>>
>>  Any suggestions will be appreciated.
> Could you try this:
>
> http://lkml.kernel.org/r/20160621150433.GA7536@node.shutemov.name
 I have seen the patch, but I do not think it can fix the problem. If that race occurs, the pmd entry that points to
 the huge page will be changed, and the pmd split in freeze_page() will fail. The subsequent vm_bug_on() will then fire.

 freeze_page()
     try_to_unmap()
         split_huge_pmd_address() (fails, so page_mapcount stays non-zero)
 vm_bug_on()
