Message-ID: <59256545.9010608@huawei.com>
Date: Wed, 24 May 2017 18:49:41 +0800
From: Xishi Qiu
To: Vlastimil Babka
CC: Yisheng Xie, Kefeng Wang, zhongjiang
Subject: Re: [Question] Mlocked count will not be decreased
References: <85591559-2a99-f46b-7a5a-bc7affb53285@huawei.com> <93f1b063-6288-d109-117d-d3c1cf152a8e@suse.cz>
In-Reply-To: <93f1b063-6288-d109-117d-d3c1cf152a8e@suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2017/5/24 18:32, Vlastimil Babka wrote:
> On 05/24/2017 10:32 AM, Yisheng Xie wrote:
>> Hi Kefeng,
>> Could you please try this patch.
>>
>> Thanks
>> Yisheng Xie
>> -------------
>> From a70ae975756e8e97a28d49117ab25684da631689 Mon Sep 17 00:00:00 2001
>> From: Yisheng Xie
>> Date: Wed, 24 May 2017 16:01:24 +0800
>> Subject: [PATCH] mlock: fix mlock count can not decrease in race condition
>>
>> Kefeng reported that when running the following test, the Mlocked count
>> in /proc/meminfo does not decrease:
>> [1] testcase
>> linux:~ # cat test_mlockal
>> grep Mlocked /proc/meminfo
>> for j in `seq 0 10`
>> do
>> 	for i in `seq 4 15`
>> 	do
>> 		./p_mlockall >> log &
>> 	done
>> 	sleep 0.2
>> done
>> sleep 5 # wait some time to let mlock decrease
>> grep Mlocked /proc/meminfo
>>
>> linux:~ # cat p_mlockall.c
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <sys/mman.h>
>>
>> #define SPACE_LEN 4096
>>
>> int main(int argc, char ** argv)
>> {
>> 	int ret;
>> 	void *adr = malloc(SPACE_LEN);
>> 	if (!adr)
>> 		return -1;
>>
>> 	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
>> 	printf("mlockall ret = %d\n", ret);
>>
>> 	ret = munlockall();
>> 	printf("munlockall ret = %d\n", ret);
>>
>> 	free(adr);
>> 	return 0;
>> }
>>
>> In __munlock_pagevec(), a page's PG_mlocked flag can be cleared but LRU
>> isolation can then fail in a race condition, and such pages are not
>> counted into delta_munlocked, which makes the mlock
>
> Race condition with what? Who else would isolate our pages?
>
>> counter incorrect: the PG_mlocked flag has already been cleared, so the
>> count can never be decreased for these pages later.
>>
>> Fix it by counting every page whose PG_mlocked flag is cleared.
>>
>> Reported-by: Kefeng Wang
>> Signed-off-by: Yisheng Xie
>
> Weird, I can reproduce the issue on my desktop's 4.11 distro kernel, but
> not in qemu and small kernel build, for some reason. So I couldn't test
> the patch yet. But it's true that before 7225522bb429 ("mm: munlock:
> batch non-THP page isolation and munlock+putback using pagevec") we
> decreased NR_MLOCK for each page that passed TestClearPageMlocked(),
> and that unintentionally changed with my patch. There should be a Fixes:
> tag for that.
>

Hi Vlastimil,

Why is the page marked Mlocked but not on the LRU list?

	if (TestClearPageMlocked(page)) {
		/*
		 * We already have pin from follow_page_mask()
		 * so we can spare the get_page() here.
		 */
		if (__munlock_isolate_lru_page(page, false))
			continue;
		else
			__munlock_isolation_failed(page);  // How does this happen?
	}

Thanks,
Xishi Qiu

>> ---
>>  mm/mlock.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/mlock.c b/mm/mlock.c
>> index c483c5c..71ba5cf 100644
>> --- a/mm/mlock.c
>> +++ b/mm/mlock.c
>> @@ -284,7 +284,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>  {
>>  	int i;
>>  	int nr = pagevec_count(pvec);
>> -	int delta_munlocked;
>> +	int munlocked = 0;
>>  	struct pagevec pvec_putback;
>>  	int pgrescued = 0;
>>
>> @@ -296,6 +296,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>  		struct page *page = pvec->pages[i];
>>
>>  		if (TestClearPageMlocked(page)) {
>> +			munlocked--;
>>  			/*
>>  			 * We already have pin from follow_page_mask()
>>  			 * so we can spare the get_page() here.
>> @@ -315,8 +316,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>  		pagevec_add(&pvec_putback, pvec->pages[i]);
>>  		pvec->pages[i] = NULL;
>>  	}
>> -	delta_munlocked = -nr + pagevec_count(&pvec_putback);
>> -	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
>> +	if (munlocked)
>
> You don't have to if () this, it should be very rare that munlocked will
> be 0, and the code works fine even if it is.
>
>> +		__mod_zone_page_state(zone, NR_MLOCK, munlocked);
>>  	spin_unlock_irq(zone_lru_lock(zone));
>>
>>  	/* Now we can release pins of pages that we are not munlocking */