From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Konstantin Khlebnikov <koct9i@gmail.com>,
Mel Gorman <mgorman@suse.de>, Bob Liu <bob.liu@oracle.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Christoph Lameter <cl@gentwo.org>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Dave Jones <davej@redhat.com>
Subject: Re: mm: NULL ptr deref in remove_migration_pte
Date: Tue, 10 Jun 2014 13:09:11 +0300 [thread overview]
Message-ID: <20140610100911.GA1718@node.dhcp.inet.fi> (raw)
In-Reply-To: <alpine.LSU.2.11.1406092104330.12382@eggly.anvils>
On Mon, Jun 09, 2014 at 09:20:33PM -0700, Hugh Dickins wrote:
> [PATCH] mm: let mm_find_pmd fix buggy race with THP fault
>
> Trinity has reported:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1))
> CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G W
> 3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398
> lock_acquire (arch/x86/include/asm/current.h:14
> kernel/locking/lockdep.c:3602)
> _raw_spin_lock (include/linux/spinlock_api_smp.h:143
> kernel/locking/spinlock.c:151)
> remove_migration_pte (mm/migrate.c:137)
> rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699)
> remove_migration_ptes (mm/migrate.c:224)
> migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126)
> migrate_misplaced_page (mm/migrate.c:1733)
> __handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925)
> handle_mm_fault (mm/memory.c:3948)
> __get_user_pages (mm/memory.c:1851)
> __mlock_vma_pages_range (mm/mlock.c:255)
> __mm_populate (mm/mlock.c:711)
> SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791)
>
> I believe this comes about because, whereas collapsing and splitting
> THP functions take anon_vma lock in write mode (which excludes
> concurrent rmap walks), faulting THP functions (write protection and
> misplaced NUMA) do not - and mostly they do not need to.
>
> But they do use a pmdp_clear_flush(), set_pmd_at() sequence which,
> for an instant (indeed, for a long instant, given the inter-CPU
> TLB flush in there), leaves *pmd neither present not trans_huge.
>
> Which can confuse a concurrent rmap walk, as when removing migration
> ptes, seen in the dumped trace. Although that rmap walk has a 4k
> page to insert, anon_vmas containing THPs are in no way segregated
> from 4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to
> cope with that instant when a trans_huge pmd is temporarily absent.
>
> I don't think we need strengthen the locking at the THP end: it's
> easily handled with an ACCESS_ONCE() before testing both conditions.
>
> And since mm_find_pmd() had only one caller who wanted a THP rather
> than a pmd, let's slightly repurpose it to fail when it hits a THP
> or non-present pmd, and open code split_huge_page_address() again.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
--
Kirill A. Shutemov
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Konstantin Khlebnikov <koct9i@gmail.com>,
Mel Gorman <mgorman@suse.de>, Bob Liu <bob.liu@oracle.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Christoph Lameter <cl@gentwo.org>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Dave Jones <davej@redhat.com>
Subject: Re: mm: NULL ptr deref in remove_migration_pte
Date: Tue, 10 Jun 2014 13:09:11 +0300 [thread overview]
Message-ID: <20140610100911.GA1718@node.dhcp.inet.fi> (raw)
In-Reply-To: <alpine.LSU.2.11.1406092104330.12382@eggly.anvils>
On Mon, Jun 09, 2014 at 09:20:33PM -0700, Hugh Dickins wrote:
> [PATCH] mm: let mm_find_pmd fix buggy race with THP fault
>
> Trinity has reported:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1))
> CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G W
> 3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398
> lock_acquire (arch/x86/include/asm/current.h:14
> kernel/locking/lockdep.c:3602)
> _raw_spin_lock (include/linux/spinlock_api_smp.h:143
> kernel/locking/spinlock.c:151)
> remove_migration_pte (mm/migrate.c:137)
> rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699)
> remove_migration_ptes (mm/migrate.c:224)
> migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126)
> migrate_misplaced_page (mm/migrate.c:1733)
> __handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925)
> handle_mm_fault (mm/memory.c:3948)
> __get_user_pages (mm/memory.c:1851)
> __mlock_vma_pages_range (mm/mlock.c:255)
> __mm_populate (mm/mlock.c:711)
> SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791)
>
> I believe this comes about because, whereas collapsing and splitting
> THP functions take anon_vma lock in write mode (which excludes
> concurrent rmap walks), faulting THP functions (write protection and
> misplaced NUMA) do not - and mostly they do not need to.
>
> But they do use a pmdp_clear_flush(), set_pmd_at() sequence which,
> for an instant (indeed, for a long instant, given the inter-CPU
> TLB flush in there), leaves *pmd neither present not trans_huge.
>
> Which can confuse a concurrent rmap walk, as when removing migration
> ptes, seen in the dumped trace. Although that rmap walk has a 4k
> page to insert, anon_vmas containing THPs are in no way segregated
> from 4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to
> cope with that instant when a trans_huge pmd is temporarily absent.
>
> I don't think we need strengthen the locking at the THP end: it's
> easily handled with an ACCESS_ONCE() before testing both conditions.
>
> And since mm_find_pmd() had only one caller who wanted a THP rather
> than a pmd, let's slightly repurpose it to fail when it hits a THP
> or non-present pmd, and open code split_huge_page_address() again.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
--
Kirill A. Shutemov
next prev parent reply other threads:[~2014-06-10 10:09 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-16 14:59 mm: NULL ptr deref in remove_migration_pte Sasha Levin
2014-04-16 14:59 ` Sasha Levin
2014-05-05 15:51 ` Sasha Levin
2014-05-05 15:51 ` Sasha Levin
2014-05-24 0:38 ` Sasha Levin
2014-05-24 0:38 ` Sasha Levin
2014-05-26 20:05 ` Hugh Dickins
2014-05-26 20:05 ` Hugh Dickins
2014-05-27 13:52 ` Sasha Levin
2014-05-27 13:52 ` Sasha Levin
2014-06-10 4:20 ` Hugh Dickins
2014-06-10 4:20 ` Hugh Dickins
2014-06-10 10:09 ` Kirill A. Shutemov [this message]
2014-06-10 10:09 ` Kirill A. Shutemov
2014-06-16 21:04 ` Sasha Levin
2014-06-16 21:04 ` Sasha Levin
2014-06-17 14:09 ` Christoph Lameter
2014-06-17 14:09 ` Christoph Lameter
2014-06-18 3:36 ` Hugh Dickins
2014-06-18 3:36 ` Hugh Dickins
2014-06-18 11:24 ` Sasha Levin
2014-06-18 11:24 ` Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140610100911.GA1718@node.dhcp.inet.fi \
--to=kirill@shutemov.name \
--cc=akpm@linux-foundation.org \
--cc=bob.liu@oracle.com \
--cc=cl@gentwo.org \
--cc=davej@redhat.com \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=koct9i@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=sasha.levin@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.