From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752035AbcFVDBt (ORCPT <rfc822;w@1wt.eu>);
	Tue, 21 Jun 2016 23:01:49 -0400
Received: from szxga01-in.huawei.com ([58.251.152.64]:28518 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751545AbcFVDBq (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 21 Jun 2016 23:01:46 -0400
Message-ID: <5769FE1C.9070102@huawei.com>
Date: Wed, 22 Jun 2016 10:55:24 +0800
From: zhong jiang <zhongjiang@huawei.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: "Kirill A. Shutemov" <kirill@shutemov.name>
CC: <mhocko@kernel.org>, <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
        <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm/huge_memory: fix the memory leak due to the race
References: <1466517956-13875-1-git-send-email-zhongjiang@huawei.com> <20160621143701.GA6139@node.shutemov.name> <57695AEB.8030509@huawei.com> <20160621152920.GA7760@node.shutemov.name>
In-Reply-To: <20160621152920.GA7760@node.shutemov.name>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.177.29.68]
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0),
	refid=str=0001.0A090203.5769FE22.002C,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0,
	ip=0.0.0.0,
	so=2013-06-18 04:22:30,
	dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 8f4dc7b223beeab4c65be92feb467af1
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016/6/21 23:29, Kirill A. Shutemov wrote:
> On Tue, Jun 21, 2016 at 11:19:07PM +0800, zhong jiang wrote:
>> On 2016/6/21 22:37, Kirill A. Shutemov wrote:
>>> On Tue, Jun 21, 2016 at 10:05:56PM +0800, zhongjiang wrote:
>>>> From: zhong jiang <zhongjiang@huawei.com>
>>>>
>>>> with great pressure, I run some test cases. As a result, I found
>>>> that the THP is not freed, it is detected by check_mm().
>>>>
>>>> BUG: Bad rss-counter state mm:ffff8827edb70000 idx:1 val:512
>>>>
>>>> Consider the following race :
>>>>
>>>> 	CPU0                               CPU1
>>>>   __handle_mm_fault()
>>>>         wp_huge_pmd()
>>>>    	    do_huge_pmd_wp_page()
>>>> 		pmdp_huge_clear_flush_notify()
>>>>                 (pmd_none = true)
>>>> 					exit_mmap()
>>>> 					   unmap_vmas()
>>>> 					     zap_pmd_range()
>>>> 						pmd_none_or_trans_huge_or_clear_bad()
>>>> 						   (result in memory leak)
>>>>                 set_pmd_at()
>>>>
>>>> because of CPU0 have allocated huge page before pmdp_huge_clear_notify,
>>>> and it make the pmd entry to be null. Therefore, The memory leak can occur.
>>>>
>>>> The patch fix the scenario that the pmd entry can lead to be null.
>>> I don't think the scenario is possible.
>>>
>>> exit_mmap() called when all mm users have gone, so no parallel threads
>>> exist.
>>>
>>  Forget  this patch.  It 's my fault , it indeed don not exist.
>>  But I  hit the following problem.  we can see the memory leak when the process exit.
>>  
>>  
>>  Any suggestion will be apprecaited.
> Could you try this:
>
> http://lkml.kernel.org/r/20160621150433.GA7536@node.shutemov.name
>
 I fails to open it.  can you  display or add  attachmemts ? :-)  thx