Re: Handling NUMA page migration

From: Robin Holt <holt@sgi.com>
To: Frank Mehnert <frank.mehnert@oracle.com>, linux-mm@kvack.org
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>
Subject: Re: Handling NUMA page migration
Date: Tue, 4 Jun 2013 06:58:07 -0500
Message-ID: <20130604115807.GF3672@sgi.com>
In-Reply-To: <201306040922.10235.frank.mehnert@oracle.com>

This question is probably better directed at the linux-mm
mailing list.

On Tue, Jun 04, 2013 at 09:22:10AM +0200, Frank Mehnert wrote:
> Hi,
> 
> our memory management on Linux hosts conflicts with NUMA page migration.
> I assume this problem has existed for a long time, but Linux 3.8 introduced
> automatic NUMA page balancing which makes the problem visible on
> multi-node hosts leading to kernel oopses.
> 
> NUMA page migration means that the physical address of a page changes.
> This is fatal if the application assumes that this never happens for
> that page as it was supposed to be pinned.
> 
> We have two kinds of pinned memory:
> 
> A) 1. allocate memory in userland with mmap()
>    2. madvise(MADV_DONTFORK)
>    3. pin with get_user_pages().
>    4. flush_dcache_page()
>    5. vm_flags |= (VM_DONTCOPY | VM_LOCKED)
>       (resulting flags are VM_MIXEDMAP | VM_DONTDUMP | VM_DONTEXPAND |
>        VM_DONTCOPY | VM_LOCKED | 0xff)
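
The userland half of scheme A (steps 1 and 2) might look roughly like the
sketch below. It is only an illustration under stated assumptions: the
helper name alloc_dontfork() is hypothetical, and the use of an anonymous
private mapping is a guess, since the original only says "mmap()". Steps 3
to 5 happen in the kernel and are not shown.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: steps 1 and 2 of scheme A.
 * Maps len bytes (assumed here to be anonymous, private memory) and
 * marks the range MADV_DONTFORK so a forked child does not inherit it.
 * Returns the mapping, or NULL on failure.
 */
void *alloc_dontfork(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    if (madvise(p, len, MADV_DONTFORK) != 0) {
        /* Undo the mapping if the advice could not be applied. */
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

The caller releases the region with munmap(p, len) once the kernel side has
dropped its pins.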

I don't think this type of allocation should be affected.  The
get_user_pages() call should elevate the pages' reference count, which
should prevent migration from completing.  I would, however, wait for
a more definitive answer.
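
To make the reference-count argument concrete, here is a toy userspace
model in C. It is not kernel code: struct toy_page, toy_pin(), toy_unpin()
and toy_can_migrate() are hypothetical names, and the model simply assumes
that migration gives up whenever a page holds more references than the
migrator expects to find.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page: only the reference count is modelled. */
struct toy_page {
    int refcount;   /* total references currently held on the page */
};

/* Models a get_user_pages()-style pin: take one extra reference. */
static inline void toy_pin(struct toy_page *p)
{
    p->refcount++;
}

/* Models dropping that pin again. */
static inline void toy_unpin(struct toy_page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}

/*
 * Models the migration-time check (an assumption of this sketch):
 * the migrator proceeds only if the page holds exactly the number of
 * references it can account for. An unexpected extra reference, such
 * as a pin, makes the attempt fail.
 */
static inline bool toy_can_migrate(const struct toy_page *p, int expected_refs)
{
    return p->refcount == expected_refs;
}
```

In this model a page with one expected reference is migratable until
toy_pin() is called on it, and becomes migratable again after toy_unpin().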

> B) 1. allocate memory with alloc_pages()
>    2. SetPageReserved()
>    3. vm_mmap() to allocate a userspace mapping
>    4. vm_insert_page()
>    5. vm_flags |= (VM_DONTEXPAND | VM_DONTDUMP)
>       (resulting flags are VM_MIXEDMAP | VM_DONTDUMP | VM_DONTEXPAND | 0xff)
> 
> At least the memory allocated like B) is affected by automatic NUMA page
> migration. I'm not sure about A).
> 
> 1. How can I prevent automatic NUMA page migration on this memory?
> 2. Can NUMA page migration also be handled on this kind of memory without
>    preventing migration?
> 
> Thanks,
> 
> Frank
> -- 
> Dr.-Ing. Frank Mehnert | Software Development Director, VirtualBox
> ORACLE Deutschland B.V. & Co. KG | Werkstr. 24 | 71384 Weinstadt, Germany
> 
> Hauptverwaltung: Riesstr. 25, D-80992 München
> Registergericht: Amtsgericht München, HRA 95603
> Geschäftsführer: Jürgen Kunz
> 
> Komplementärin: ORACLE Deutschland Verwaltung B.V.
> Hertogswetering 163/167, 3543 AS Utrecht, Niederlande
> Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697
> Geschäftsführer: Alexander van der Ven, Astrid Kepper, Val Maher
