Re: kernel BUG in munlock_vma_pages_range

From: Bob Liu <bob.liu@oracle.com>
To: Sasha Levin <sasha.levin@oracle.com>, Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	joern@logfs.org, mgorman@suse.de,
	Michel Lespinasse <walken@google.com>,
	riel@redhat.com, LKML <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: kernel BUG in munlock_vma_pages_range
Date: Fri, 13 Dec 2013 16:49:15 +0800
Message-ID: <52AACA0B.6080602@oracle.com>
In-Reply-To: <52AA2510.8080908@oracle.com>

On 12/13/2013 05:05 AM, Sasha Levin wrote:
> On 12/12/2013 07:41 AM, Vlastimil Babka wrote:
>> On 12/12/2013 06:03 AM, Bob Liu wrote:
>>>
>>> On 12/12/2013 11:16 AM, Sasha Levin wrote:
>>>> On 12/11/2013 05:59 PM, Vlastimil Babka wrote:
>>>>> On 12/09/2013 09:26 PM, Sasha Levin wrote:
>>>>>> On 12/09/2013 12:12 PM, Vlastimil Babka wrote:
>>>>>>> On 12/09/2013 06:05 PM, Sasha Levin wrote:
>>>>>>>> On 12/09/2013 04:34 AM, Vlastimil Babka wrote:
>>>>>>>>> Hello, I will look at it, thanks.
>>>>>>>>> Do you have specific reproduction instructions?
>>>>>>>>
>>>>>>>> Not really, the fuzzer hit it once and I've been unable to trigger
>>>>>>>> it again. Looking at
>>>>>>>> the piece of code involved it might have had something to do with
>>>>>>>> hugetlbfs, so I'll crank
>>>>>>>> up testing on that part.
>>>>>>>
>>>>>>> Thanks. Do you have trinity log and the .config file? I'm currently
>>>>>>> unable to even boot linux-next
>>>>>>> with my config/setup due to a GPF.
>>>>>>> Looking at code I wouldn't expect that it could encounter a tail
>>>>>>> page, without first encountering a
>>>>>>> head page and skipping the whole huge page. At least in THP case, as
>>>>>>> THP pages should be split when
>>>>>>> a vma is split. As for hugetlbfs, it should be skipped for
>>>>>>> mlock/munlock operations completely. One
>>>>>>> of these assumptions is probably failing here...
>>>>>>
>>>>>> If it helps, I've added a dump_page() in case we hit a tail page
>>>>>> there and got:
>>>>>>
>>>>>> [  980.172299] page:ffffea003e5e8040 count:0 mapcount:1 mapping:          (null) index:0x0
>>>>>> [  980.173412] page flags: 0x2fffff80008000(tail)
>>>>>>
>>>>>> I can also add anything else in there to get other debug output if
>>>>>> you think of something else useful.
>>>>>
>>>>> Please try the following. Thanks in advance.
>>>>
>>>> [  428.499889] page:ffffea003e5c0040 count:0 mapcount:4 mapping:          (null) index:0x0
>>>> [  428.499889] page flags: 0x2fffff80008000(tail)
>>>> [  428.499889] start=140117131923456 pfn=16347137 orig_start=140117130543104 page_increm=1 vm_start=140117130543104 vm_end=140117134688256 vm_flags=135266419
>>>> [  428.499889] first_page pfn=16347136
>>>> [  428.499889] page:ffffea003e5c0000 count:204 mapcount:44 mapping:ffff880fb5c466c1 index:0x7f6f8fe00
>>>> [  428.499889] page flags: 0x2fffff80084068(uptodate|lru|active|head|swapbacked)
>>>
>>>   From this print, it looks like the page is still a huge page.
>>> One situation I guess is a huge page which isn't PageMlocked and passed
>>> to munlock_vma_page(). I'm not sure whether this will happen.
>>
>> Yes that's quite likely the case. It's not illegal to happen I would say.
>>
>>> Please give this patch a try.
>>
>> I've made a simpler version that does away with the ugly page_mask
>> thing completely.
>> Please try that as well. Thanks.
>>
>> Also when working on this I think I found another potential but much
>> rarer problem
>> when munlock_vma_page races with a THP split. That would however
>> manifest such that
>> part of the former tail pages would stay PageMlocked. But that still
>> needs more thought.
>> The bug at hand should however be fixed by this patch.
> 
> Yup, this patch seems to fix the issue previously reported.
> 
> However, I'll piggyback another thing that popped up now that the vm
> could run for a while which
> also seems to be caused by the original patch. It looks like a pretty
> straightforward deadlock, but

It looks like put_page() in __munlock_pagevec() needs to take
zone->lru_lock, which __munlock_pagevec() is already holding at that point.
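
To make the pattern concrete, here is a minimal userspace sketch. It is
not kernel code and makes several assumptions: an error-checking pthread
mutex stands in for zone->lru_lock, and fake_put_page(),
munlock_wide_lock(), munlock_narrow_lock() and run_demo() are
hypothetical names invented for illustration. The error-checking mutex
reports EDEADLK where a non-recursive spinlock would simply hang.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for zone->lru_lock. */
static pthread_mutex_t lru_lock;

/* Use an error-checking mutex so relocking by the owner returns EDEADLK. */
static void lru_lock_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&lru_lock, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);
	(void)rc;
}

/*
 * Stand-in for put_page() dropping the last reference: the release
 * path has to take lru_lock itself. Returns 0 on success, or EDEADLK
 * if the calling thread already owns the lock.
 */
static int fake_put_page(void)
{
	int err = pthread_mutex_lock(&lru_lock);

	if (err)
		return err;
	/* ... the page would be taken off the LRU here ... */
	pthread_mutex_unlock(&lru_lock);
	return 0;
}

/* Wide critical section: the put happens with lru_lock still held. */
static int munlock_wide_lock(void)
{
	int err;

	pthread_mutex_lock(&lru_lock);
	err = fake_put_page();		/* tries to relock: EDEADLK */
	pthread_mutex_unlock(&lru_lock);
	return err;
}

/* Narrow critical section: lock only around the LRU manipulation. */
static int munlock_narrow_lock(void)
{
	pthread_mutex_lock(&lru_lock);
	/* ... isolate the page from the LRU ... */
	pthread_mutex_unlock(&lru_lock);
	return fake_put_page();		/* lock is free again */
}

/* Returns 0 if both shapes behave as described above. */
int run_demo(void)
{
	int wide, narrow;

	lru_lock_init();
	wide = munlock_wide_lock();
	narrow = munlock_narrow_lock();
	pthread_mutex_destroy(&lru_lock);
	return (wide == EDEADLK && narrow == 0) ? 0 : 1;
}
```

Calling run_demo() from a main() built with -pthread should return 0:
the wide variant trips over its own lock, the narrow one does not.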

How about a fix like this?

Thanks,
-Bob

diff --git a/mm/mlock.c b/mm/mlock.c
index d480cd6..5880d63 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -291,7 +291,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 	int pgrescued = 0;

 	/* Phase 1: page isolation */
-	spin_lock_irq(&zone->lru_lock);
 	for (i = 0; i < nr; i++) {
 		struct page *page = pvec->pages[i];

@@ -300,6 +299,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 			int lru;

 			if (PageLRU(page)) {
+				spin_lock_irq(&zone->lru_lock);
 				lruvec = mem_cgroup_page_lruvec(page, zone);
 				lru = page_lru(page);
 				/*
@@ -308,6 +308,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 				 */
 				ClearPageLRU(page);
 				del_page_from_lru_list(page, lruvec, lru);
+				spin_unlock_irq(&zone->lru_lock);
 			} else {
 				__munlock_isolation_failed(page);
 				goto skip_munlock;
@@ -325,8 +326,7 @@ skip_munlock:
 			delta_munlocked++;
 		}
 	}
-	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
-	spin_unlock_irq(&zone->lru_lock);
+	mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);

 	/* Phase 2: page munlock */
 	pagevec_init(&pvec_putback, 0);

Thread overview: 35+ messages
2013-12-08  1:52 kernel BUG in munlock_vma_pages_range Sasha Levin
2013-12-09  9:34 ` Vlastimil Babka
2013-12-09 17:05   ` Sasha Levin
2013-12-09 17:12     ` Vlastimil Babka
2013-12-09 17:15       ` Sasha Levin
2013-12-09 20:26       ` Sasha Levin
2013-12-11 22:59         ` Vlastimil Babka
2013-12-12  3:16           ` Sasha Levin
2013-12-12  5:03             ` Bob Liu
2013-12-12 12:41               ` Vlastimil Babka
2013-12-12 21:05                 ` Sasha Levin
2013-12-13  8:49                   ` Bob Liu [this message]
2013-12-13  9:08                     ` Vlastimil Babka
2013-12-15 19:49                       ` Sasha Levin
2013-12-16 10:14                         ` [PATCH 0/3] Fix bugs in munlock Vlastimil Babka
2013-12-16 10:14                           ` [PATCH 1/3] mm: munlock: fix a bug where THP tail page is encountered Vlastimil Babka
2013-12-17  1:26                             ` Bob Liu
2013-12-17 13:00                               ` Vlastimil Babka
2013-12-18  0:48                                 ` Bob Liu
2014-03-14 23:55                                 ` Sasha Levin
2014-03-15  3:06                                   ` Sasha Levin
2014-03-17 12:38                                     ` Vlastimil Babka
2014-03-17 21:08                                       ` Sasha Levin
2014-03-17 22:20                                         ` Vlastimil Babka
2014-03-17 22:58                                           ` Sasha Levin
2014-03-17 23:30                                             ` Vlastimil Babka
2014-03-18 10:41                                             ` Vlastimil Babka
2013-12-16 10:14                           ` [PATCH 2/3] mm: munlock: fix deadlock in __munlock_pagevec() Vlastimil Babka
2013-12-17  0:31                             ` Andrew Morton
2013-12-17 13:08                               ` Vlastimil Babka
2013-12-16 10:14                           ` [RFC PATCH 3/3] mm: munlock: fix potential race with THP page split Vlastimil Babka
2014-03-21  1:53                       ` kernel BUG in munlock_vma_pages_range Sasha Levin
2014-03-21  9:02                         ` Vlastimil Babka
2013-12-09 21:16     ` Jiri Kosina
2013-12-11 15:55       ` Sasha Levin
