All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
	David Rientjes <rientjes@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Michal Hocko <mhocko@suse.cz>
Subject: Re: [PATCH] mm, thp: respect MPOL_PREFERRED policy with non-local node
Date: Fri, 19 Jun 2015 14:30:43 +0530	[thread overview]
Message-ID: <871th89io4.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <1434639273-9527-1-git-send-email-vbabka@suse.cz>

Vlastimil Babka <vbabka@suse.cz> writes:

> Since commit 077fcf116c8c ("mm/thp: allocate transparent hugepages on local
> node"), we handle THP allocations on page fault in a special way - for
> non-interleave memory policies, the allocation is only attempted on the node
> local to the current CPU, if the policy's nodemask allows the node.
>
> This is motivated by the assumption that THP benefits cannot offset the cost
> of remote accesses, so it's better to fallback to base pages on the local node
> (which might still be available, while huge pages are not due to
> fragmentation) than to allocate huge pages on a remote node.
>
> The nodemask check prevents us from violating e.g. MPOL_BIND policies where
> the local node is not among the allowed nodes. However, the current
> implementation can still give surprising results for the MPOL_PREFERRED policy
> when the preferred node is different than the current CPU's local node.
>
> In such case we should honor the preferred node and not use the local node,
> which is what this patch does. If hugepage allocation on the preferred node
> fails, we fall back to base pages and don't try other nodes, with the same
> motivation as is done for the local node hugepage allocations.
> The patch also moves the MPOL_INTERLEAVE check around to simplify the hugepage
> specific test.
>
> The difference can be demonstrated using in-tree transhuge-stress test on the
> following 2-node machine where half memory on one node was occupied to show
> the difference.
>
>
.....

> Without -p parameter, hugepage restriction to CPU-local node works as before.
>
> Fixes: 077fcf116c8c ("mm/thp: allocate transparent hugepages on local node")
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Michal Hocko <mhocko@suse.cz>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  

-aneesh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Vlastimil Babka <vbabka@suse.cz>,
	David Rientjes <rientjes@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Michal Hocko <mhocko@suse.cz>
Subject: Re: [PATCH] mm, thp: respect MPOL_PREFERRED policy with non-local node
Date: Fri, 19 Jun 2015 14:30:43 +0530	[thread overview]
Message-ID: <871th89io4.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <1434639273-9527-1-git-send-email-vbabka@suse.cz>

Vlastimil Babka <vbabka@suse.cz> writes:

> Since commit 077fcf116c8c ("mm/thp: allocate transparent hugepages on local
> node"), we handle THP allocations on page fault in a special way - for
> non-interleave memory policies, the allocation is only attempted on the node
> local to the current CPU, if the policy's nodemask allows the node.
>
> This is motivated by the assumption that THP benefits cannot offset the cost
> of remote accesses, so it's better to fallback to base pages on the local node
> (which might still be available, while huge pages are not due to
> fragmentation) than to allocate huge pages on a remote node.
>
> The nodemask check prevents us from violating e.g. MPOL_BIND policies where
> the local node is not among the allowed nodes. However, the current
> implementation can still give surprising results for the MPOL_PREFERRED policy
> when the preferred node is different than the current CPU's local node.
>
> In such case we should honor the preferred node and not use the local node,
> which is what this patch does. If hugepage allocation on the preferred node
> fails, we fall back to base pages and don't try other nodes, with the same
> motivation as is done for the local node hugepage allocations.
> The patch also moves the MPOL_INTERLEAVE check around to simplify the hugepage
> specific test.
>
> The difference can be demonstrated using in-tree transhuge-stress test on the
> following 2-node machine where half memory on one node was occupied to show
> the difference.
>
>
.....

> Without -p parameter, hugepage restriction to CPU-local node works as before.
>
> Fixes: 077fcf116c8c ("mm/thp: allocate transparent hugepages on local node")
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Michal Hocko <mhocko@suse.cz>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  

-aneesh


  parent reply	other threads:[~2015-06-19  9:01 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-18 14:54 [PATCH] mm, thp: respect MPOL_PREFERRED policy with non-local node Vlastimil Babka
2015-06-18 14:54 ` Vlastimil Babka
2015-06-18 19:32 ` David Rientjes
2015-06-18 19:32   ` David Rientjes
2015-06-19 11:26   ` Vlastimil Babka
2015-06-19 11:26     ` Vlastimil Babka
2015-06-19  9:00 ` Aneesh Kumar K.V [this message]
2015-06-19  9:00   ` Aneesh Kumar K.V

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871th89io4.fsf@linux.vnet.ibm.com \
    --to=aneesh.kumar@linux.vnet.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=rientjes@google.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.