alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails

From: Max Laier <max@laiers.net>
To: linux-numa@vger.kernel.org
Cc: christoph@lameter.com
Subject: alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails
Date: Fri, 29 May 2009 02:30:15 +0200
Message-ID: <200905290230.16306.max@laiers.net>

[-- Attachment #1: Type: text/plain, Size: 1703 bytes --]

Hello,

I'm having a bit of trouble with the NUMA allocator in the kernel.  This 
is in a numa=fake test-setup (though this shouldn't matter, I guess).

I'm trying to allocate pages for KVM VMs from selected nodes (minimal PoC
diff attached - hard-coding the preferred node to 7 [the last node in my
setup, but that doesn't matter - it just demonstrates the point most
effectively]).

The call of interest is:

  page = alloc_pages_node(7, GFP_KERNEL | GFP_THISNODE, 0)

The problem is that - while page_to_nid() reports that the page is from
node 7 - "numactl --hardware" doesn't show any allocations from node 7.
In fact, it seems that the memory is allocated from the first node with
free pages until these run out.  Only then are pages from the selected
(last) node given out.  Once the selected node is full,
alloc_pages_node(... GFP_THISNODE ...) returns NULL - as it should - and
I fall back to a normal allocation that then also reports a different
node ID from page_to_nid() (cf. the attached diff).

The strange thing is that a simple test module (attached as well) works
as expected.  The allocation succeeds, page_to_nid() reports the selected
node, *and* the free memory that "numactl --hardware" reports for the
selected node decreases.
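
One way to cross-check the per-node free memory without numactl is to read
the node's sysfs meminfo file directly.  The following is only a minimal
userspace sketch, not part of the original report: it assumes the kernel
exposes /sys/devices/system/node/node<N>/meminfo with lines of the form
"Node <N> MemFree: <n> kB", and the helper names node_memfree_kb() and
read_node_memfree_kb() are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find the "Node <nid> MemFree: <n> kB" line in the text of a per-node
 * meminfo file (format assumed as described above).
 * Returns the free memory in kB, or -1 if no matching line is found.
 */
long node_memfree_kb(const char *meminfo, int nid)
{
    const char *line = meminfo;

    while (line && *line) {
        int node;
        long kb;

        /* Only a MemFree line for the requested node yields two matches. */
        if (sscanf(line, " Node %d MemFree: %ld kB", &node, &kb) == 2 &&
            node == nid)
            return kb;

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/*
 * Read the sysfs meminfo file for one node and return its MemFree value
 * in kB, or -1 if the file cannot be read or has no MemFree line.
 */
long read_node_memfree_kb(int nid)
{
    char path[64];
    char buf[4096];
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/meminfo", nid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return node_memfree_kb(buf, nid);
}
```

Calling read_node_memfree_kb(7) before and after loading the test module
should show a drop of roughly 204800 kB (51200 pages, assuming 4 KiB
pages) if the pages really are taken from node 7.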

Any insight as to why the KVM allocation might be special would be much
appreciated.  I tried to follow the call path, but didn't find any red
flags that would indicate the difference.

Thanks.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News

[-- Attachment #2: kvm.mmu.c.diff --]
[-- Type: text/x-patch, Size: 704 bytes --]

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 479e748..82a3f56 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -300,9 +300,13 @@ static int mmu_topup_memory_cache_page(struct kvm_mmu_memory_cache *cache,
 	if (cache->nobjs >= min)
 		return 0;
 	while (cache->nobjs < ARRAY_SIZE(cache->objects)) {
-		page = alloc_page(GFP_KERNEL);
-		if (!page)
-			return -ENOMEM;
+		page = alloc_pages_node(7, GFP_KERNEL | GFP_THISNODE, 0);
+		if (!page) {
+			page = alloc_page(GFP_KERNEL);
+			if (!page)                
+				return -ENOMEM;
+			printk("Page from node %d\n", page_to_nid(page));
+		}
 		set_page_private(page, 0);
 		cache->objects[cache->nobjs++] = page_address(page);
 	}

[-- Attachment #3: nodemem.c --]
[-- Type: text/x-csrc, Size: 814 bytes --]

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/gfp.h>
#include <linux/mm.h>

MODULE_LICENSE("Dual BSD/GPL");

/* 200 MB, assuming 4 KiB pages */
#define NUM_PAGES (51200)

/* Pages held by the module until it is unloaded */
static struct page *pages[NUM_PAGES];

static int __init nodemem_init(void) {
  int i;

  printk(KERN_INFO "Trying to allocate %d pages from node 7\n", NUM_PAGES);

  for (i = 0; i < NUM_PAGES; i++) {
    /* GFP_THISNODE: fail rather than fall back to another node */
    pages[i] = alloc_pages_node(7, GFP_KERNEL | GFP_THISNODE, 0);
    if (!pages[i]) {
      /* Release everything allocated so far */
      for (i--; i >= 0; i--)
        __free_pages(pages[i], 0);
      return -ENOMEM;
    }
    printk(KERN_INFO "Page %d from node %d\n", i, page_to_nid(pages[i]));
  }

  return 0;
}

static void __exit nodemem_exit(void) {
  int i;

  for (i = 0; i < NUM_PAGES; i++)
    __free_pages(pages[i], 0);
}

module_init(nodemem_init);
module_exit(nodemem_exit);

Thread overview: 15+ messages
2009-05-29  0:30 Max Laier [this message]
     [not found] ` <f568093c0905281743i63e1a24ak681df87bc83826ce@mail.gmail.com>
2009-05-29  0:46   ` alloc_pages_node(... GFP_KERNEL | GFP_THISNODE ...) fails Christoph Lameter
2009-05-29  1:09     ` Max Laier
2009-05-29 13:54       ` Christoph Lameter
2009-05-29 15:01         ` Andi Kleen
2009-05-29 16:18           ` Max Laier
2009-05-29 16:36             ` Andi Kleen
2009-05-29 16:45               ` Max Laier
2009-05-29 18:26                 ` Andi Kleen
2009-05-29 20:39                   ` Max Laier
2009-06-02 20:13                     ` Andi Kleen
2009-06-02 22:59                     ` David Rientjes
2009-06-03 14:04                       ` Christoph Lameter
2009-06-03 18:24                         ` David Rientjes
2009-06-14  4:50                           ` Max Laier
