linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Hugh Dickins <hughd@google.com>
Cc: Pawel Sikora <pluto@agmk.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm@vger.kernel.org, jpiszcz@lucidpixels.com,
	arekm@pld-linux.org, linux-kernel@vger.kernel.org
Subject: Re: kernel 3.0: BUG: soft lockup: find_get_pages+0x51/0x110
Date: Wed, 19 Oct 2011 09:30:36 +0200	[thread overview]
Message-ID: <20111019073036.GA3410@suse.de> (raw)
In-Reply-To: <alpine.LSU.2.00.1110131547550.1346@sister.anvils>

On Thu, Oct 13, 2011 at 04:16:01PM -0700, Hugh Dickins wrote:
> <SNIP>
> 
> I guess this is the only time you've seen this?  In which case, ideally
> I would try to devise a testcase to demonstrate the issue below instead;

Considering that mremap workloads have been tested fairly heavily and
this hasn't triggered before (or at least not reported), I would not be
confident it can be easily reproduced. Maybe reproducing is easier if
interrupts are also high.

> but that may involve more ingenuity than I can find time for, let's see
> see if people approve of this patch anyway (it applies to 3.1 or 3.0,
> and earlier releases except that i_mmap_mutex used to be i_mmap_lock).
> 
> 
> [PATCH] mm: add anon_vma locking to mremap move
> 
> I don't usually pay much attention to the stale "? " addresses in
> stack backtraces, but this lucky report from Pawel Sikora hints that
> mremap's move_ptes() has inadequate locking against page migration.
> 
>  3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
>  kernel BUG at include/linux/swapops.h:105!

This check is triggered if migration PTEs are left behind. In the few
cases I saw this during compaction development, it was because a VMA was
unreachable during remove_migration_pte. With the anon_vma changes, the
locking during VMA insertion is meant to protect it and the order VMAs
are linked is important so the right anon_vma lock is found.

I don't think it is an unreachable VMA problem because if it was, the
problem would trigger much more frequently and not be exclusive to
mremap.

>  RIP: 0010:[<ffffffff81127b76>]  [<ffffffff81127b76>]
>                        migration_entry_wait+0x156/0x160
>   [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0
>   [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120
>   [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310
>   [<ffffffff81102a31>] handle_mm_fault+0x181/0x310
>   [<ffffffff81106097>] ? vma_adjust+0x537/0x570
>   [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0
>   [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570
>   [<ffffffff81421d5f>] page_fault+0x1f/0x30
> 
> mremap's down_write of mmap_sem, together with i_mmap_mutex/lock,
> and pagetable locks, were good enough before page migration (with its
> requirement that every migration entry be found) came in; and enough
> while migration always held mmap_sem.  But not enough nowadays, when
> there's memory hotremove and compaction: anon_vma lock is also needed,
> to make sure a migration entry is not dodging around behind our back.
> 

migration holds the anon_vma lock while it unmaps the pages and keeps holding
it until after remove_migration_ptes is called.  There are two anon vmas
that should exist during mremap that were created for the move. They
should not be able to disappear while migration runs and right now, I'm
not seeing how the VMA can get lost :(

I think a consequence of this patch is that migration and mremap are now
serialised by anon_vma lock. As a result, it might still fix the problem
if there is some race between mremap and migration simply by stopping
them playing with each other.

> It appears that Mel's a8bef8ff6ea1 "mm: migration: avoid race between
> shift_arg_pages() and rmap_walk() during migration by not migrating
> temporary stacks" was actually a workaround for this in the special
> common case of exec's use of move_pagetables(); and we should probably
> now remove that VM_STACK_INCOMPLETE_SETUP stuff as a separate cleanup.
> 

The problem was that there was only one VMA for two page table
ranges. The neater fix was to create a second VMA but that required a
kmalloc and additional VMA work during exec which was considered too
heavy. VM_STACK_INCOMPLETE_SETUP is less clean but it is faster.

  parent reply	other threads:[~2011-10-19  7:30 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-12 18:12 kernel 3.0: BUG: soft lockup: find_get_pages+0x51/0x110 Paweł Sikora
2011-10-13 23:16 ` Hugh Dickins
2011-10-13 23:30   ` Hugh Dickins
2011-10-16 16:11     ` Christoph Hellwig
2011-10-16 23:54     ` Andrea Arcangeli
2011-10-17 18:51       ` Hugh Dickins
2011-10-17 22:05         ` Andrea Arcangeli
2011-10-19  7:43         ` Mel Gorman
2011-10-19 13:39           ` Linus Torvalds
2011-10-19 19:42             ` Hugh Dickins
2011-10-20  6:30               ` Paweł Sikora
2011-10-20  6:51                 ` Linus Torvalds
2011-10-21  6:54                 ` Nai Xia
2011-10-21  7:35                   ` Pawel Sikora
2011-10-20 12:51               ` Nai Xia
     [not found]                 ` <CANsGZ6a6_q8+88FRV2froBsVEq7GhtKd9fRnB-0M2MD3a7tnSw@mail.gmail.com>
2011-10-21  6:22                   ` Nai Xia
2011-10-21  8:07                     ` Pawel Sikora
2011-10-21  9:07                       ` Nai Xia
2011-10-21 21:36                         ` Paweł Sikora
2011-10-22  6:21                           ` Nai Xia
2011-10-22 16:42                             ` Paweł Sikora
     [not found]                               ` <CAPQyPG5HJKTo8AEy_khdJeciTgtNQepK6XLcpzvPF8PYS0V-Lw@mail.gmail.com>
2011-10-25  7:33                                 ` Pawel Sikora
2011-10-20  9:11       ` Nai Xia
2011-10-21 15:56         ` Mel Gorman
2011-10-21 17:21           ` Nai Xia
2011-10-21 17:41           ` Andrea Arcangeli
2011-10-21 22:50             ` Andrea Arcangeli
2011-10-22  5:52               ` Nai Xia
2011-10-31 17:14                 ` Andrea Arcangeli
2011-10-31 17:27                   ` [PATCH] mremap: enforce rmap src/dst vma ordering in case of vma_merge succeeding in copy_vma Andrea Arcangeli
2011-11-01 12:07                     ` Mel Gorman
2011-11-01 14:35                     ` Nai Xia
2011-11-04  7:31                     ` Hugh Dickins
2011-11-04 14:34                       ` Nai Xia
2011-11-04 15:59                         ` Pawel Sikora
2011-11-05  2:21                           ` Nai Xia
2011-11-04 19:16                         ` Hugh Dickins
2011-11-04 20:54                           ` Andrea Arcangeli
2011-11-05  0:09                             ` Nai Xia
2011-11-05  2:21                               ` Hugh Dickins
2011-11-05  3:07                                 ` Andrea Arcangeli
2011-11-05 17:06                                 ` Andrea Arcangeli
2011-12-08  3:24                                   ` David Rientjes
2011-12-08 12:42                                     ` Andrea Arcangeli
2011-12-09  0:08                                   ` Andrew Morton
2011-12-09  1:55                                     ` Andrea Arcangeli
2011-11-04 23:56                       ` Andrea Arcangeli
2011-11-05  0:21                         ` Nai Xia
2011-11-05  0:59                           ` Nai Xia
2011-11-05  1:33                           ` Andrea Arcangeli
2011-11-05  2:00                             ` Nai Xia
2011-11-07 13:14                               ` Mel Gorman
2011-11-07 15:42                                 ` Andrea Arcangeli
2011-11-07 16:28                                   ` Mel Gorman
2011-11-09  1:25                                     ` Andrea Arcangeli
2011-11-11  9:14                                       ` Nai Xia
2011-11-16 14:00                                       ` Andrea Arcangeli
2011-11-17  0:16                                         ` Hugh Dickins
2011-11-17  2:49                                           ` Nai Xia
2011-11-17  6:21                                           ` Nai Xia
2011-11-17 18:42                                           ` Andrea Arcangeli
2011-11-18  1:42                                             ` Nai Xia
2011-11-18  2:17                                               ` Andrea Arcangeli
2011-11-19  9:15                                                 ` Nai Xia
2011-10-22  5:07             ` kernel 3.0: BUG: soft lockup: find_get_pages+0x51/0x110 Nai Xia
2011-10-31 16:34               ` Andrea Arcangeli
2011-10-16 22:37   ` Linus Torvalds
2011-10-17  3:02     ` Hugh Dickins
2011-10-17  3:09       ` Linus Torvalds
2011-10-18 19:17   ` Paweł Sikora
2011-10-19  7:30   ` Mel Gorman [this message]
2011-10-21 12:44     ` Mel Gorman
  -- strict thread matches above, loose matches on Subject: below --
2011-08-15 16:43 Justin Piszcz
2011-08-15 18:18 ` Hugh Dickins
2011-08-15 19:02   ` Justin Piszcz
2011-08-15 19:53     ` Hugh Dickins
2011-08-21 17:59 ` Maciej Rutecki
2011-08-21 18:49   ` Justin Piszcz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111019073036.GA3410@suse.de \
    --to=mgorman@suse.de \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=arekm@pld-linux.org \
    --cc=hughd@google.com \
    --cc=jpiszcz@lucidpixels.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@vger.kernel.org \
    --cc=pluto@agmk.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).