Re: hugepage related lockdep trace.

public inbox for linux-mm@kvack.org

From: Minchan Kim <minchan@kernel.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Dave Jones <davej@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Rik van Riel <riel@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Hillf Danton <dhillf@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: hugepage related lockdep trace.
Date: Wed, 24 Jul 2013 11:44:28 +0900
Message-ID: <20130724024428.GA14795@bbox>
In-Reply-To: <20130723140120.GG8677@dhcp22.suse.cz>

On Tue, Jul 23, 2013 at 04:01:20PM +0200, Michal Hocko wrote:
> On Fri 19-07-13 09:13:03, Minchan Kim wrote:
> > On Thu, Jul 18, 2013 at 11:12:24PM +0530, Aneesh Kumar K.V wrote:
> [...]
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 83aff0a..2cb1be3 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -3266,8 +3266,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
> > >  		put_page(virt_to_page(spte));
> > >  	spin_unlock(&mm->page_table_lock);
> > >  out:
> > > -	pte = (pte_t *)pmd_alloc(mm, pud, addr);
> > >  	mutex_unlock(&mapping->i_mmap_mutex);
> > > +	pte = (pte_t *)pmd_alloc(mm, pud, addr);
> > >  	return pte;
> > 
> > I am blind on hugetlb but not sure it doesn't break eb48c071.
> > Michal?
> 
> Well, it is some time since I debugged the race and all the details
> vanished in the meantime. But this part of the changelog suggests that
> this indeed breaks the fix:
> "
>     This patch addresses the issue by moving pmd_alloc into huge_pmd_share
>     which guarantees that the shared pud is populated in the same critical
>     section as pmd.  This also means that huge_pte_offset test in
>     huge_pmd_share is serialized correctly now which in turn means that the
>     success of the sharing will be higher as the racing tasks see the pud
>     and pmd populated together.
> "
> 
> Besides that I fail to see how moving pmd_alloc down changes anything.
> Even if pmd_alloc triggered reclaim then we cannot trip over the same
> i_mmap_mutex as hugetlb pages are not reclaimable because they are not
> on the LRU.

I thought we could map one part of a binary with normal pages and another
part with MAP_HUGETLB, so that an address space could have both normal pages
and HugeTLB pages. Okay, it's impossible, so HugeTLB pages are not on the LRU.
Then the above lockdep warning is a total false positive.
The best solution is to avoid pmd_alloc while holding i_mmap_mutex, but we need
it for the eb48c071 fix, so how about this if we can't come up with a better
idea?


diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 83aff0a..e7c3a15 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3240,7 +3240,16 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 	if (!vma_shareable(vma, addr))
 		return (pte_t *)pmd_alloc(mm, pud, addr);
 
-	mutex_lock(&mapping->i_mmap_mutex);
+	/*
+	 * Annotate the lock to silence a lockdep warning on i_mmap_mutex.
+	 * The pmd_alloc below allocates memory with GFP_KERNEL while
+	 * holding i_mmap_mutex, so it could enter the direct reclaim path,
+	 * where rmap tries to take i_mmap_mutex again. That is no problem
+	 * for hugetlb because hugetlb pages never live on the LRU, so the
+	 * warning is a false positive. I hope someone fixes this by avoiding
+	 * pmd_alloc under i_mmap_mutex rather than by nesting annotation.
+	 */
+	mutex_lock_nested(&mapping->i_mmap_mutex, SINGLE_DEPTH_NESTING);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
 			continue;
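
To illustrate why the annotation quiets the warning, here is a rough userspace
sketch. It is a toy model, not the kernel's lockdep: it only rejects taking a
lock whose class and subclass are already held by the same task, while real
lockdep tracks far more (dependency chains, reclaim state and so on). All names
in it (toy_acquire, reclaim_under_i_mmap_ok, I_MMAP_CLASS and the SUBCLASS_*
values) are made up for the example.

```c
#include <stdbool.h>

#define MAX_HELD 8

/* Hypothetical ids for this sketch; real lockdep derives classes from lock keys. */
enum { I_MMAP_CLASS = 1 };
enum { SUBCLASS_DEFAULT = 0, SUBCLASS_NESTED = 1 };

struct held_lock {
	int class_id;	/* every i_mmap_mutex instance shares one class */
	int subclass;	/* 0 for a plain lock, 1 for the nested annotation */
};

struct task_locks {
	struct held_lock held[MAX_HELD];
	int depth;
};

/*
 * Record an acquisition. Returns false where the toy validator would
 * complain: the same class and subclass is already held by this task.
 */
static bool toy_acquire(struct task_locks *t, int class_id, int subclass)
{
	for (int i = 0; i < t->depth; i++) {
		if (t->held[i].class_id == class_id &&
		    t->held[i].subclass == subclass)
			return false;
	}
	if (t->depth == MAX_HELD)
		return false;
	t->held[t->depth++] = (struct held_lock){ class_id, subclass };
	return true;
}

/*
 * Model the path discussed above: the outer lock is the hugetlbfs
 * mapping's i_mmap_mutex, the inner one is what rmap would take for
 * some other mapping if the allocation entered direct reclaim.
 * Returns true if the inner acquisition is accepted.
 */
static bool reclaim_under_i_mmap_ok(int outer_subclass)
{
	struct task_locks t = { .depth = 0 };

	if (!toy_acquire(&t, I_MMAP_CLASS, outer_subclass))
		return false;
	return toy_acquire(&t, I_MMAP_CLASS, SUBCLASS_DEFAULT);
}
```

In this model reclaim_under_i_mmap_ok(SUBCLASS_DEFAULT) is rejected, which
corresponds to the plain lock, and reclaim_under_i_mmap_ok(SUBCLASS_NESTED) is
accepted, which corresponds to the nested annotation in the patch. The
annotation only changes what the validator sees, not the locking itself.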

> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim


Thread overview: 20+ messages
2013-07-17 15:32 hugepage related lockdep trace Dave Jones
2013-07-18  0:09 ` Minchan Kim
2013-07-18 17:42   ` Aneesh Kumar K.V
2013-07-19  0:13     ` Minchan Kim
2013-07-23  7:24       ` Hush Bensen
2013-07-24  2:57         ` Minchan Kim
2013-07-23 14:01       ` Michal Hocko
2013-07-24  2:44         ` Minchan Kim [this message]
2013-07-25 13:30           ` Michal Hocko
2013-07-29  8:24             ` Minchan Kim
2013-07-29 14:53               ` Michal Hocko
2013-07-29 15:20                 ` Peter Zijlstra
2013-07-30 14:29                   ` Michal Hocko
2013-07-30 14:46                     ` [PATCH] hugetlb: fix lockdep splat caused by pmd sharing Michal Hocko
2013-07-30 14:58                       ` Peter Zijlstra
2013-07-30 15:23                         ` Michal Hocko
2013-07-30 23:35                           ` Minchan Kim
2013-07-30 23:37                             ` Dave Jones
2013-07-19  2:08     ` hugepage related lockdep trace Hillf Danton
2013-07-19  3:20       ` Aneesh Kumar K.V
