linux-mm.kvack.org archive mirror
* [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
@ 2016-12-05  9:17 Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 1/4] mm: hugetlb: rename some allocation functions Huang Shijie
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-05  9:17 UTC (permalink / raw)
  To: akpm, catalin.marinas
  Cc: n-horiguchi, mhocko, kirill.shutemov, aneesh.kumar,
	gerald.schaefer, mike.kravetz, linux-mm, will.deacon,
	steve.capper, kaly.xin, nd, linux-arm-kernel, vbabka,
	Huang Shijie

(1) Background
   On arm64, the hugetlb page size can be 32M (PMD size with the
   Contiguous bit).  In a 4K base-page environment, the maximum buddy
   page order is 10 (MAX_ORDER - 1), so a 32M page is a gigantic page.

   The arm64 MMU supports a Contiguous bit, a hint that the TTE is one
   of a set of contiguous entries which can be cached in a single TLB
   entry.  Please refer to the ARMv8 Architecture Reference Manual:
       DDI0487A_f_armv8_arm.pdf (page D4-1811)

(2) The bug
   When testing libhugetlbfs, I found that the test case "counters.sh"
   fails with gigantic pages (the 32M page on an arm64 board).

   counters.sh is just a wrapper for counters.c.
   You can find them in:
       https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.c
       https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.sh

   The error log shows below:

   ----------------------------------------------------------
        ...........................................
	LD_PRELOAD=libhugetlbfs.so shmoverride_unlinked (32M: 64):	PASS
	LD_PRELOAD=libhugetlbfs.so HUGETLB_SHM=yes shmoverride_unlinked (32M: 64):	PASS
	quota.sh (32M: 64):	PASS
	counters.sh (32M: 64):	FAIL mmap failed: Invalid argument
	********** TEST SUMMARY
	*                      32M           
	*                      32-bit 64-bit 
	*     Total testcases:     0     87   
	*             Skipped:     0      0   
	*                PASS:     0     86   
	*                FAIL:     0      1   
	*    Killed by signal:     0      0   
	*   Bad configuration:     0      0   
	*       Expected FAIL:     0      0   
	*     Unexpected PASS:     0      0   
	* Strange test result:     0      0   
	**********
   ----------------------------------------------------------

   The failure is caused by:
    1) The kernel fails to allocate a gigantic page for the surplus
       case, so gather_surplus_pages() returns NULL in the end.

    2) The condition checks in some functions are wrong:
        return_unused_surplus_pages()
        nr_overcommit_hugepages_store()
        hugetlb_overcommit_handler()

   This patch set adds support for gigantic surplus hugetlb pages,
   allowing the counters.sh unit test to pass.
   This patch set was tested on a Juno-r1 board.

   	
v2 --> v3:
   1.) In patch 2, change the argument "no_init" to "do_prep".
   2.) In patch 3, also change alloc_fresh_huge_page();
       in v2, that patch only changed alloc_fresh_gigantic_page().
   3.) Merge old patches #4 and #5 into the last one.
   4.) Follow Vlastimil Babka's suggestion: do the NULL check for @mask.
   5.) Others.


v1 --> v2:
   1.) Fix the compiler error on x86.
   2.) Add new patches for NUMA support.
       Patches #2 ~ #5 are new.

Huang Shijie (4):
  mm: hugetlb: rename some allocation functions
  mm: hugetlb: add a new parameter for some functions
  mm: hugetlb: change the return type for some functions
  mm: hugetlb: support gigantic surplus pages

 include/linux/mempolicy.h |   8 +++
 mm/hugetlb.c              | 146 +++++++++++++++++++++++++++++++++++-----------
 mm/mempolicy.c            |  44 ++++++++++++++
 3 files changed, 163 insertions(+), 35 deletions(-)

-- 
2.5.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org


* [PATCH v3 1/4] mm: hugetlb: rename some allocation functions
  2016-12-05  9:17 [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
@ 2016-12-05  9:17 ` Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 2/4] mm: hugetlb: add a new parameter for some functions Huang Shijie
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-05  9:17 UTC (permalink / raw)
  To: akpm, catalin.marinas
  Cc: n-horiguchi, mhocko, kirill.shutemov, aneesh.kumar,
	gerald.schaefer, mike.kravetz, linux-mm, will.deacon,
	steve.capper, kaly.xin, nd, linux-arm-kernel, vbabka,
	Huang Shijie

After a future patch, __alloc_buddy_huge_page() will not necessarily
use the buddy allocator.

So this patch removes "buddy" from these function names:
	__alloc_buddy_huge_page -> __alloc_huge_page
	__alloc_buddy_huge_page_no_mpol -> __alloc_huge_page_no_mpol
	__alloc_buddy_huge_page_with_mpol -> __alloc_huge_page_with_mpol

This patch also adds a description for alloc_gigantic_page().

This patch is preparation for a later patch.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 mm/hugetlb.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5f228cd..5f4213d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1089,6 +1089,12 @@ static bool zone_spans_last_pfn(const struct zone *zone,
 	return zone_spans_pfn(zone, last_pfn);
 }
 
+/*
+ * Allocate a gigantic page from @nid node.
+ *
+ * Scan the zones of @nid node, and try to allocate a number of contiguous
+ * pages (1 << order).
+ */
 static struct page *alloc_gigantic_page(int nid, unsigned int order)
 {
 	unsigned long nr_pages = 1 << order;
@@ -1157,6 +1163,10 @@ static int alloc_fresh_gigantic_page(struct hstate *h,
 
 static inline bool gigantic_page_supported(void) { return true; }
 #else
+static inline struct page *alloc_gigantic_page(int nid, unsigned int order)
+{
+	return NULL;
+}
 static inline bool gigantic_page_supported(void) { return false; }
 static inline void free_gigantic_page(struct page *page, unsigned int order) { }
 static inline void destroy_compound_gigantic_page(struct page *page,
@@ -1568,7 +1578,7 @@ static struct page *__hugetlb_alloc_buddy_huge_page(struct hstate *h,
  * For (2), we ignore 'vma' and 'addr' and use 'nid' exclusively. This
  * implies that memory policies will not be taken in to account.
  */
-static struct page *__alloc_buddy_huge_page(struct hstate *h,
+static struct page *__alloc_huge_page(struct hstate *h,
 		struct vm_area_struct *vma, unsigned long addr, int nid)
 {
 	struct page *page;
@@ -1649,21 +1659,21 @@ static struct page *__alloc_buddy_huge_page(struct hstate *h,
  * anywhere.
  */
 static
-struct page *__alloc_buddy_huge_page_no_mpol(struct hstate *h, int nid)
+struct page *__alloc_huge_page_no_mpol(struct hstate *h, int nid)
 {
 	unsigned long addr = -1;
 
-	return __alloc_buddy_huge_page(h, NULL, addr, nid);
+	return __alloc_huge_page(h, NULL, addr, nid);
 }
 
 /*
  * Use the VMA's mpolicy to allocate a huge page from the buddy.
  */
 static
-struct page *__alloc_buddy_huge_page_with_mpol(struct hstate *h,
+struct page *__alloc_huge_page_with_mpol(struct hstate *h,
 		struct vm_area_struct *vma, unsigned long addr)
 {
-	return __alloc_buddy_huge_page(h, vma, addr, NUMA_NO_NODE);
+	return __alloc_huge_page(h, vma, addr, NUMA_NO_NODE);
 }
 
 /*
@@ -1681,7 +1691,7 @@ struct page *alloc_huge_page_node(struct hstate *h, int nid)
 	spin_unlock(&hugetlb_lock);
 
 	if (!page)
-		page = __alloc_buddy_huge_page_no_mpol(h, nid);
+		page = __alloc_huge_page_no_mpol(h, nid);
 
 	return page;
 }
@@ -1711,7 +1721,7 @@ static int gather_surplus_pages(struct hstate *h, int delta)
 retry:
 	spin_unlock(&hugetlb_lock);
 	for (i = 0; i < needed; i++) {
-		page = __alloc_buddy_huge_page_no_mpol(h, NUMA_NO_NODE);
+		page = __alloc_huge_page_no_mpol(h, NUMA_NO_NODE);
 		if (!page) {
 			alloc_ok = false;
 			break;
@@ -2027,7 +2037,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve, gbl_chg);
 	if (!page) {
 		spin_unlock(&hugetlb_lock);
-		page = __alloc_buddy_huge_page_with_mpol(h, vma, addr);
+		page = __alloc_huge_page_with_mpol(h, vma, addr);
 		if (!page)
 			goto out_uncharge_cgroup;
 		if (!avoid_reserve && vma_has_reserves(vma, gbl_chg)) {
@@ -2285,7 +2295,7 @@ static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
 	 * First take pages out of surplus state.  Then make up the
 	 * remaining difference by allocating fresh huge pages.
 	 *
-	 * We might race with __alloc_buddy_huge_page() here and be unable
+	 * We might race with __alloc_huge_page() here and be unable
 	 * to convert a surplus huge page to a normal huge page. That is
 	 * not critical, though, it just means the overall size of the
 	 * pool might be one hugepage larger than it needs to be, but
@@ -2331,7 +2341,7 @@ static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
 	 * By placing pages into the surplus state independent of the
 	 * overcommit value, we are allowing the surplus pool size to
 	 * exceed overcommit. There are few sane options here. Since
-	 * __alloc_buddy_huge_page() is checking the global counter,
+	 * __alloc_huge_page() is checking the global counter,
 	 * though, we'll note that we're not allowed to exceed surplus
 	 * and won't grow the pool anywhere else. Not until one of the
 	 * sysctls are changed, or the surplus pages go out of use.
-- 
2.5.5



* [PATCH v3 2/4] mm: hugetlb: add a new parameter for some functions
  2016-12-05  9:17 [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 1/4] mm: hugetlb: rename some allocation functions Huang Shijie
@ 2016-12-05  9:17 ` Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 3/4] mm: hugetlb: change the return type " Huang Shijie
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-05  9:17 UTC (permalink / raw)
  To: akpm, catalin.marinas
  Cc: n-horiguchi, mhocko, kirill.shutemov, aneesh.kumar,
	gerald.schaefer, mike.kravetz, linux-mm, will.deacon,
	steve.capper, kaly.xin, nd, linux-arm-kernel, vbabka,
	Huang Shijie

This patch adds a new parameter, "do_prep", to these functions:
   alloc_fresh_gigantic_page_node()
   alloc_fresh_gigantic_page()

prep_new_huge_page() does some initialization for the new page.
But sometimes we do not need it, such as in the surplus case
handled in a later patch.

With this parameter, prep_new_huge_page() can be called as needed:
   if "do_prep" is true, alloc_fresh_gigantic_page_node() calls
   prep_new_huge_page().

This patch is preparation for the later patches.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 mm/hugetlb.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5f4213d..b7c73a1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1133,27 +1133,29 @@ static struct page *alloc_gigantic_page(int nid, unsigned int order)
 static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);
 static void prep_compound_gigantic_page(struct page *page, unsigned int order);
 
-static struct page *alloc_fresh_gigantic_page_node(struct hstate *h, int nid)
+static struct page *alloc_fresh_gigantic_page_node(struct hstate *h,
+					int nid, bool do_prep)
 {
 	struct page *page;
 
 	page = alloc_gigantic_page(nid, huge_page_order(h));
 	if (page) {
 		prep_compound_gigantic_page(page, huge_page_order(h));
-		prep_new_huge_page(h, page, nid);
+		if (do_prep)
+			prep_new_huge_page(h, page, nid);
 	}
 
 	return page;
 }
 
 static int alloc_fresh_gigantic_page(struct hstate *h,
-				nodemask_t *nodes_allowed)
+				nodemask_t *nodes_allowed, bool do_prep)
 {
 	struct page *page = NULL;
 	int nr_nodes, node;
 
 	for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
-		page = alloc_fresh_gigantic_page_node(h, node);
+		page = alloc_fresh_gigantic_page_node(h, node, do_prep);
 		if (page)
 			return 1;
 	}
@@ -1172,7 +1174,7 @@ static inline void free_gigantic_page(struct page *page, unsigned int order) { }
 static inline void destroy_compound_gigantic_page(struct page *page,
 						unsigned int order) { }
 static inline int alloc_fresh_gigantic_page(struct hstate *h,
-					nodemask_t *nodes_allowed) { return 0; }
+		nodemask_t *nodes_allowed, bool do_prep) { return 0; }
 #endif
 
 static void update_and_free_page(struct hstate *h, struct page *page)
@@ -2319,7 +2321,7 @@ static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
 		cond_resched();
 
 		if (hstate_is_gigantic(h))
-			ret = alloc_fresh_gigantic_page(h, nodes_allowed);
+			ret = alloc_fresh_gigantic_page(h, nodes_allowed, true);
 		else
 			ret = alloc_fresh_huge_page(h, nodes_allowed);
 		spin_lock(&hugetlb_lock);
-- 
2.5.5



* [PATCH v3 3/4] mm: hugetlb: change the return type for some functions
  2016-12-05  9:17 [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 1/4] mm: hugetlb: rename some allocation functions Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 2/4] mm: hugetlb: add a new parameter for some functions Huang Shijie
@ 2016-12-05  9:17 ` Huang Shijie
  2016-12-05  9:17 ` [PATCH v3 4/4] mm: hugetlb: support gigantic surplus pages Huang Shijie
  2016-12-05  9:31 ` [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Michal Hocko
  4 siblings, 0 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-05  9:17 UTC (permalink / raw)
  To: akpm, catalin.marinas
  Cc: n-horiguchi, mhocko, kirill.shutemov, aneesh.kumar,
	gerald.schaefer, mike.kravetz, linux-mm, will.deacon,
	steve.capper, kaly.xin, nd, linux-arm-kernel, vbabka,
	Huang Shijie

This patch changes the return type of alloc_fresh_gigantic_page()
and alloc_fresh_huge_page() from int to "struct page *".

This patch is preparation for a later patch.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 mm/hugetlb.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b7c73a1..1395bef 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1148,7 +1148,7 @@ static struct page *alloc_fresh_gigantic_page_node(struct hstate *h,
 	return page;
 }
 
-static int alloc_fresh_gigantic_page(struct hstate *h,
+static struct page *alloc_fresh_gigantic_page(struct hstate *h,
 				nodemask_t *nodes_allowed, bool do_prep)
 {
 	struct page *page = NULL;
@@ -1157,10 +1157,10 @@ static int alloc_fresh_gigantic_page(struct hstate *h,
 	for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
 		page = alloc_fresh_gigantic_page_node(h, node, do_prep);
 		if (page)
-			return 1;
+			return page;
 	}
 
-	return 0;
+	return NULL;
 }
 
 static inline bool gigantic_page_supported(void) { return true; }
@@ -1173,8 +1173,8 @@ static inline bool gigantic_page_supported(void) { return false; }
 static inline void free_gigantic_page(struct page *page, unsigned int order) { }
 static inline void destroy_compound_gigantic_page(struct page *page,
 						unsigned int order) { }
-static inline int alloc_fresh_gigantic_page(struct hstate *h,
-		nodemask_t *nodes_allowed, bool do_prep) { return 0; }
+static inline struct page *alloc_fresh_gigantic_page(struct hstate *h,
+		nodemask_t *nodes_allowed, bool do_prep) { return NULL; }
 #endif
 
 static void update_and_free_page(struct hstate *h, struct page *page)
@@ -1387,26 +1387,24 @@ static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
 	return page;
 }
 
-static int alloc_fresh_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
+static struct page *alloc_fresh_huge_page(struct hstate *h,
+					nodemask_t *nodes_allowed)
 {
-	struct page *page;
+	struct page *page = NULL;
 	int nr_nodes, node;
-	int ret = 0;
 
 	for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
 		page = alloc_fresh_huge_page_node(h, node);
-		if (page) {
-			ret = 1;
+		if (page)
 			break;
-		}
 	}
 
-	if (ret)
+	if (page)
 		count_vm_event(HTLB_BUDDY_PGALLOC);
 	else
 		count_vm_event(HTLB_BUDDY_PGALLOC_FAIL);
 
-	return ret;
+	return page;
 }
 
 /*
@@ -2321,9 +2319,10 @@ static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
 		cond_resched();
 
 		if (hstate_is_gigantic(h))
-			ret = alloc_fresh_gigantic_page(h, nodes_allowed, true);
+			ret = !!alloc_fresh_gigantic_page(h, nodes_allowed,
+							true);
 		else
-			ret = alloc_fresh_huge_page(h, nodes_allowed);
+			ret = !!alloc_fresh_huge_page(h, nodes_allowed);
 		spin_lock(&hugetlb_lock);
 		if (!ret)
 			goto out;
-- 
2.5.5



* [PATCH v3 4/4] mm: hugetlb: support gigantic surplus pages
  2016-12-05  9:17 [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
                   ` (2 preceding siblings ...)
  2016-12-05  9:17 ` [PATCH v3 3/4] mm: hugetlb: change the return type " Huang Shijie
@ 2016-12-05  9:17 ` Huang Shijie
  2016-12-05  9:31 ` [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Michal Hocko
  4 siblings, 0 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-05  9:17 UTC (permalink / raw)
  To: akpm, catalin.marinas
  Cc: n-horiguchi, mhocko, kirill.shutemov, aneesh.kumar,
	gerald.schaefer, mike.kravetz, linux-mm, will.deacon,
	steve.capper, kaly.xin, nd, linux-arm-kernel, vbabka,
	Huang Shijie

When testing gigantic pages, whose order is too large for the buddy
allocator, the libhugetlbfs test case "counters.sh" fails.

counters.sh is just a wrapper for counters.c; you can find them in:
  https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.c
  https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/counters.sh

Please see the error log below:
 ............................................
    ........
    quota.sh (32M: 64):	PASS
    counters.sh (32M: 64):	FAIL mmap failed: Invalid argument
    ********** TEST SUMMARY
    *                      32M
    *                      32-bit 64-bit
    *     Total testcases:     0     87
    *             Skipped:     0      0
    *                PASS:     0     86
    *                FAIL:     0      1
    *    Killed by signal:     0      0
    *   Bad configuration:     0      0
    *       Expected FAIL:     0      0
    *     Unexpected PASS:     0      0
    * Strange test result:     0      0
    **********
 ............................................

The failure is caused by:
 1) The kernel fails to allocate a gigantic page for the surplus case,
    so gather_surplus_pages() returns NULL in the end.

 2) The condition checks for "over-commit" are wrong.

This patch does the following:

 1) It changes the condition checks in:
     return_unused_surplus_pages()
     nr_overcommit_hugepages_store()
     hugetlb_overcommit_handler()

 2) It introduces two helper functions:
    huge_nodemask() and __hugetlb_alloc_gigantic_page().
    Please see the descriptions in those two functions.

 3) It uses __hugetlb_alloc_gigantic_page() to allocate the
    gigantic page in __alloc_huge_page(). After this patch,
    gather_surplus_pages() can return a gigantic page for the
    surplus case.

After this patch, counters.sh passes for gigantic pages.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 include/linux/mempolicy.h |  8 +++++
 mm/hugetlb.c              | 77 +++++++++++++++++++++++++++++++++++++++++++----
 mm/mempolicy.c            | 44 +++++++++++++++++++++++++++
 3 files changed, 123 insertions(+), 6 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 5f4d828..6539fbb 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -146,6 +146,8 @@ extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new,
 				enum mpol_rebind_step step);
 extern void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
 
+extern bool huge_nodemask(struct vm_area_struct *vma,
+				unsigned long addr, nodemask_t *mask);
 extern struct zonelist *huge_zonelist(struct vm_area_struct *vma,
 				unsigned long addr, gfp_t gfp_flags,
 				struct mempolicy **mpol, nodemask_t **nodemask);
@@ -269,6 +271,12 @@ static inline void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 {
 }
 
+static inline bool huge_nodemask(struct vm_area_struct *vma,
+				unsigned long addr, nodemask_t *mask)
+{
+	return false;
+}
+
 static inline struct zonelist *huge_zonelist(struct vm_area_struct *vma,
 				unsigned long addr, gfp_t gfp_flags,
 				struct mempolicy **mpol, nodemask_t **nodemask)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1395bef..04440b8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1506,6 +1506,69 @@ int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
 
 /*
  * There are 3 ways this can get called:
+ *
+ * 1. When the NUMA is not enabled, use alloc_gigantic_page() to get
+ *    the gigantic page.
+ *
+ * 2. The NUMA is enabled, but the vma is NULL.
+ *    Initialize the @mask, and use alloc_fresh_gigantic_page() to get
+ *    the gigantic page.
+ *
+ * 3. The NUMA is enabled, and the vma is valid.
+ *    Use the @vma's memory policy.
+ *    Get @mask by huge_nodemask(), and use alloc_fresh_gigantic_page()
+ *    to get the gigantic page.
+ */
+static struct page *__hugetlb_alloc_gigantic_page(struct hstate *h,
+		struct vm_area_struct *vma, unsigned long addr, int nid)
+{
+	NODEMASK_ALLOC(nodemask_t, mask, GFP_KERNEL | __GFP_NORETRY);
+	struct page *page = NULL;
+
+	/* Not NUMA */
+	if (!IS_ENABLED(CONFIG_NUMA)) {
+		if (nid == NUMA_NO_NODE)
+			nid = numa_mem_id();
+
+		page = alloc_gigantic_page(nid, huge_page_order(h));
+		if (page)
+			prep_compound_gigantic_page(page, huge_page_order(h));
+		goto got_page;
+	}
+
+	/* NUMA && !vma */
+	if (!vma) {
+		/* First, check the mask */
+		if (!mask) {
+			mask = &node_states[N_MEMORY];
+		} else {
+			if (nid == NUMA_NO_NODE) {
+				if (!init_nodemask_of_mempolicy(mask)) {
+					NODEMASK_FREE(mask);
+					mask = &node_states[N_MEMORY];
+				}
+			} else {
+				init_nodemask_of_node(mask, nid);
+			}
+		}
+
+		page = alloc_fresh_gigantic_page(h, mask, false);
+		goto got_page;
+	}
+
+	/* NUMA && vma */
+	if (mask && huge_nodemask(vma, addr, mask))
+		page = alloc_fresh_gigantic_page(h, mask, false);
+
+got_page:
+	if (mask != &node_states[N_MEMORY])
+		NODEMASK_FREE(mask);
+
+	return page;
+}
+
+/*
+ * There are 3 ways this can get called:
  * 1. With vma+addr: we use the VMA's memory policy
  * 2. With !vma, but nid=NUMA_NO_NODE:  We try to allocate a huge
  *    page from any node, and let the buddy allocator itself figure
@@ -1584,7 +1647,7 @@ static struct page *__alloc_huge_page(struct hstate *h,
 	struct page *page;
 	unsigned int r_nid;
 
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return NULL;
 
 	/*
@@ -1629,7 +1692,10 @@ static struct page *__alloc_huge_page(struct hstate *h,
 	}
 	spin_unlock(&hugetlb_lock);
 
-	page = __hugetlb_alloc_buddy_huge_page(h, vma, addr, nid);
+	if (hstate_is_gigantic(h))
+		page = __hugetlb_alloc_gigantic_page(h, vma, addr, nid);
+	else
+		page = __hugetlb_alloc_buddy_huge_page(h, vma, addr, nid);
 
 	spin_lock(&hugetlb_lock);
 	if (page) {
@@ -1796,8 +1862,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	/* Uncommit the reservation */
 	h->resv_huge_pages -= unused_resv_pages;
 
-	/* Cannot return gigantic pages currently */
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return;
 
 	nr_pages = min(unused_resv_pages, h->surplus_huge_pages);
@@ -2514,7 +2579,7 @@ static ssize_t nr_overcommit_hugepages_store(struct kobject *kobj,
 	unsigned long input;
 	struct hstate *h = kobj_to_hstate(kobj, NULL);
 
-	if (hstate_is_gigantic(h))
+	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return -EINVAL;
 
 	err = kstrtoul(buf, 10, &input);
@@ -2966,7 +3031,7 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
 
 	tmp = h->nr_overcommit_huge_pages;
 
-	if (write && hstate_is_gigantic(h))
+	if (write && hstate_is_gigantic(h) && !gigantic_page_supported())
 		return -EINVAL;
 
 	table->data = &tmp;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 6d3639e..3550a29 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1800,6 +1800,50 @@ static inline unsigned interleave_nid(struct mempolicy *pol,
 
 #ifdef CONFIG_HUGETLBFS
 /*
+ * huge_nodemask(@vma, @addr, @mask)
+ * @vma: virtual memory area whose policy is sought
+ * @addr: address in @vma
+ * @mask: should be a valid nodemask pointer, not NULL
+ *
+ * Return true if we can succeed in extracting the policy nodemask
+ * for 'bind' or 'interleave' policy into the argument @mask, or
+ * initializing the argument @mask to contain the single node for
+ * 'preferred' or 'local' policy.
+ */
+bool huge_nodemask(struct vm_area_struct *vma, unsigned long addr,
+			nodemask_t *mask)
+{
+	struct mempolicy *mpol;
+	bool ret = true;
+	int nid;
+
+	mpol = get_vma_policy(vma, addr);
+
+	switch (mpol->mode) {
+	case MPOL_PREFERRED:
+		if (mpol->flags & MPOL_F_LOCAL)
+			nid = numa_node_id();
+		else
+			nid = mpol->v.preferred_node;
+		init_nodemask_of_node(mask, nid);
+		break;
+
+	case MPOL_BIND:
+		/* Fall through */
+	case MPOL_INTERLEAVE:
+		*mask = mpol->v.nodes;
+		break;
+
+	default:
+		ret = false;
+		break;
+	}
+	mpol_cond_put(mpol);
+
+	return ret;
+}
+
+/*
  * huge_zonelist(@vma, @addr, @gfp_flags, @mpol)
  * @vma: virtual memory area whose policy is sought
  * @addr: address in @vma for shared policy lookup and interleave policy
-- 
2.5.5



* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-05  9:17 [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
                   ` (3 preceding siblings ...)
  2016-12-05  9:17 ` [PATCH v3 4/4] mm: hugetlb: support gigantic surplus pages Huang Shijie
@ 2016-12-05  9:31 ` Michal Hocko
  2016-12-06 10:03   ` Huang Shijie
  2016-12-07  8:46   ` Huang Shijie
  4 siblings, 2 replies; 12+ messages in thread
From: Michal Hocko @ 2016-12-05  9:31 UTC (permalink / raw)
  To: Huang Shijie
  Cc: akpm, catalin.marinas, n-horiguchi, kirill.shutemov, aneesh.kumar,
	gerald.schaefer, mike.kravetz, linux-mm, will.deacon,
	steve.capper, kaly.xin, nd, linux-arm-kernel, vbabka

On Mon 05-12-16 17:17:07, Huang Shijie wrote:
[...]
>    The failure is caused by:
>     1) The kernel fails to allocate a gigantic page for the surplus
>        case, so gather_surplus_pages() returns NULL in the end.
> 
>     2) The condition checks in some functions are wrong:
>         return_unused_surplus_pages()
>         nr_overcommit_hugepages_store()
>         hugetlb_overcommit_handler()

OK, so how is this any different from gigantic (1G) hugetlb pages on
x86_64? Do we need the same functionality or is it just 32MB not being
handled in the same way as 1G?

Thanks!
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-05  9:31 ` [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Michal Hocko
@ 2016-12-06 10:03   ` Huang Shijie
  2016-12-07 15:02     ` Michal Hocko
  2016-12-07  8:46   ` Huang Shijie
  1 sibling, 1 reply; 12+ messages in thread
From: Huang Shijie @ 2016-12-06 10:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm@linux-foundation.org, Catalin Marinas,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	aneesh.kumar@linux.vnet.ibm.com, gerald.schaefer@de.ibm.com,
	mike.kravetz@oracle.com, linux-mm@kvack.org, Will Deacon,
	Steve Capper, Kaly Xin, nd, linux-arm-kernel@lists.infradead.org,
	vbabka@suze.cz

On Mon, Dec 05, 2016 at 05:31:01PM +0800, Michal Hocko wrote:
> On Mon 05-12-16 17:17:07, Huang Shijie wrote:
> [...]
> >    The failure is caused by:
> >     1) The kernel fails to allocate a gigantic page for the surplus
> >        case, so gather_surplus_pages() returns NULL in the end.
> > 
> >     2) The condition checks in some functions are wrong:
> >         return_unused_surplus_pages()
> >         nr_overcommit_hugepages_store()
> >         hugetlb_overcommit_handler()
> 
> OK, so how is this any different from gigantic (1G) hugetlb pages on
I think there is no difference from gigantic (1G) hugetlb pages on
x86_64. Has anyone ever tested the 1G hugetlb pages on x86_64 with
"counters.sh" before?

> x86_64? Do we need the same functionality or is it just 32MB not being
> handled in the same way as 1G?
Yes, we need this functionality for gigantic pages, whether on x86_64,
s390, or arm64, and whether the page is 32MB or 1G. :)

But anyway, I will try to find a machine and test the 1G gigantic page
on arm64.

Thanks
Huang Shijie



* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-05  9:31 ` [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Michal Hocko
  2016-12-06 10:03   ` Huang Shijie
@ 2016-12-07  8:46   ` Huang Shijie
  1 sibling, 0 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-07  8:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm@linux-foundation.org, Catalin Marinas,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	aneesh.kumar@linux.vnet.ibm.com, gerald.schaefer@de.ibm.com,
	mike.kravetz@oracle.com, linux-mm@kvack.org, Will Deacon,
	Steve Capper, Kaly Xin, nd, linux-arm-kernel@lists.infradead.org,
	vbabka@suze.cz

On Mon, Dec 05, 2016 at 05:31:01PM +0800, Michal Hocko wrote:
> On Mon 05-12-16 17:17:07, Huang Shijie wrote:
> [...]
> >    The failure is caused by:
> >     1) The kernel fails to allocate a gigantic page for the surplus
> >        case, so gather_surplus_pages() returns NULL in the end.
> > 
> >     2) The condition checks in some functions are wrong:
> >         return_unused_surplus_pages()
> >         nr_overcommit_hugepages_store()
> >         hugetlb_overcommit_handler()
> 
> OK, so how is this any different from gigantic (1G) hugetlb pages on
> x86_64? Do we need the same functionality or is it just 32MB not being
> handled in the same way as 1G?
I tested this patch set on a SoftIron board (ARM64) which has 16G of memory.
I appended "hugepagesz=1G hugepages=6" to the kernel cmdline, so the arm64
will use PUD_SIZE for the hugetlb page.

The 1G page size works well; the log is below:

--------------------------------------------------------
	counters.sh (1024M: 64):        PASS
	********** TEST SUMMARY
	*                      1024M         
	*                      32-bit 64-bit 
	*     Total testcases:     0      1   
	*             Skipped:     0      0   
	*                PASS:     0      1   
	*                FAIL:     0      0   
	*    Killed by signal:     0      0   
	*   Bad configuration:     0      0   
	*       Expected FAIL:     0      0   
	*     Unexpected PASS:     0      0   
	* Strange test result:     0      0   
	**********
--------------------------------------------------------

My desktop is x86_64, but it has only 8G of memory.
I will expand its memory capacity and continue the
test on x86_64.

Thanks
Huang Shijie


* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-06 10:03   ` Huang Shijie
@ 2016-12-07 15:02     ` Michal Hocko
  2016-12-08  9:36       ` Huang Shijie
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2016-12-07 15:02 UTC (permalink / raw)
  To: Huang Shijie
  Cc: akpm@linux-foundation.org, Catalin Marinas,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	aneesh.kumar@linux.vnet.ibm.com, gerald.schaefer@de.ibm.com,
	mike.kravetz@oracle.com, linux-mm@kvack.org, Will Deacon,
	Steve Capper, Kaly Xin, nd, linux-arm-kernel@lists.infradead.org,
	vbabka@suze.cz

On Tue 06-12-16 18:03:59, Huang Shijie wrote:
> On Mon, Dec 05, 2016 at 05:31:01PM +0800, Michal Hocko wrote:
> > On Mon 05-12-16 17:17:07, Huang Shijie wrote:
> > [...]
> > >    The failure is caused by:
> > >     1) kernel fails to allocate a gigantic page for the surplus case.
> > >        And the gather_surplus_pages() will return NULL in the end.
> > > 
> > >     2) The condition checks for some functions are wrong:
> > >         return_unused_surplus_pages()
> > >         nr_overcommit_hugepages_store()
> > >         hugetlb_overcommit_handler()
> > 
> > OK, so how is this any different from gigantic (1G) hugetlb pages on
> I think there is no difference from gigantic (1G) hugetlb pages on
> x86_64. Has anyone ever tested the 1G hugetlb pages on x86_64 with "counter.sh"
> before?

I suspect nobody has, because the gigantic page support is still somewhat
coarse and, from a quick look into the code, we only support pre-allocated
giga pages. In other words, surplus pages and their accounting are not
supported at all.

I haven't yet checked your patchset but I can tell you one thing.
Surplus and subpool pages code is tricky as hell. And it is not just a
matter of teaching the huge page allocation code to do the right thing.
There are subtle details all over the place. E.g. we currently
do not free giga pages AFAICS. In fact I believe that the giga pages are
kind of implanted to the existing code without any higher level
consistency. This should change long term. But I am worried it is much
more work.

Now, I might be wrong because I might misremember things which may have
been changed recently, but if you want to pursue this route, please make
sure you describe the current state of giga pages, and your changes to
them, much better when touching this area...

Thanks!
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-07 15:02     ` Michal Hocko
@ 2016-12-08  9:36       ` Huang Shijie
  2016-12-08  9:52         ` Michal Hocko
  0 siblings, 1 reply; 12+ messages in thread
From: Huang Shijie @ 2016-12-08  9:36 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm@linux-foundation.org, Catalin Marinas,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	aneesh.kumar@linux.vnet.ibm.com, gerald.schaefer@de.ibm.com,
	mike.kravetz@oracle.com, linux-mm@kvack.org, Will Deacon,
	Steve Capper, Kaly Xin, nd, linux-arm-kernel@lists.infradead.org,
	vbabka@suze.cz

On Wed, Dec 07, 2016 at 11:02:38PM +0800, Michal Hocko wrote:
> On Tue 06-12-16 18:03:59, Huang Shijie wrote:
> > On Mon, Dec 05, 2016 at 05:31:01PM +0800, Michal Hocko wrote:
> > > On Mon 05-12-16 17:17:07, Huang Shijie wrote:
> > > [...]
> > > >    The failure is caused by:
> > > >     1) kernel fails to allocate a gigantic page for the surplus case.
> > > >        And the gather_surplus_pages() will return NULL in the end.
> > > > 
> > > >     2) The condition checks for some functions are wrong:
> > > >         return_unused_surplus_pages()
> > > >         nr_overcommit_hugepages_store()
> > > >         hugetlb_overcommit_handler()
> > > 
> > > OK, so how is this any different from gigantic (1G) hugetlb pages on
> > I think there is no difference from gigantic (1G) hugetlb pages on
> > x86_64. Has anyone ever tested the 1G hugetlb pages on x86_64 with "counter.sh"
> > before?
> 
> I suspect nobody has, because the gigantic page support is still somewhat
> coarse and, from a quick look into the code, we only support pre-allocated
Yes; x86_64 does not even support runtime-allocated gigantic pages by
default, since the default x86_64_defconfig does not enable CONFIG_CMA.

I enabled CONFIG_CMA and ran the gigantic page test on x86_64
(appending "hugepagesz=1G hugepages=4" to the kernel cmdline).
The result below is from my 16G x86_64 desktop:

   -------------------------------------------------
	counters.sh (1024M: 32):        FAIL mmap failed: Cannot allocate memory
	counters.sh (1024M: 64):        PASS
	********** TEST SUMMARY
	*                      1024M         
	*                      32-bit 64-bit 
	*     Total testcases:     1      1   
	*             Skipped:     0      0   
	*                PASS:     0      1   
	*                FAIL:     1      0   
	*    Killed by signal:     0      0   
	*   Bad configuration:     0      0   
	*       Expected FAIL:     0      0   
	*     Unexpected PASS:     0      0   
	*    Test not present:     0      0   
	* Strange test result:     0      0   
	**********
   -------------------------------------------------

The test passes for 64-bit but fails for 32-bit; I think that is okay,
since a 1G hugetlb page is too large for the 32-bit address space.

> giga pages. In other words surplus pages and their accounting is not
> supported at all.
Yes.

> 
> I haven't yet checked your patchset but I can tell you one thing.
Could you please review the patch set when you have time? Thanks a lot.

> Surplus and subpool pages code is tricky as hell. And it is not just a
Agree. 

Do we really need so many accountings, such as reserve/overcommit/surplus?

> matter of teaching the huge page allocation code to do the right thing.
> There are subtle details all over the place. E.g. we currently
> do not free giga pages AFAICS. In fact I believe that the giga pages are
Please correct me if I am wrong. :)

I think the free-giga-pages can work well.
Please see the code in update_and_free_page(). 

Could you please list all the subtle details where you think the code is
wrong? I can check them one by one.


> kind of implanted to the existing code without any higher level
> consistency. This should change long term. But I am worried it is much
What type of "higher level consistency" should we care about?

Thanks
Huang Shijie
> more work.
> 
> Now I might be wrong because I might misremember things which might have
> been changed recently but please make sure you describe the current
> state and changes of giga pages when touching this area much better if
> you want to pursue this route...
> 


* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-08  9:36       ` Huang Shijie
@ 2016-12-08  9:52         ` Michal Hocko
  2016-12-09  9:58           ` Huang Shijie
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2016-12-08  9:52 UTC (permalink / raw)
  To: Huang Shijie
  Cc: akpm@linux-foundation.org, Catalin Marinas,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	aneesh.kumar@linux.vnet.ibm.com, gerald.schaefer@de.ibm.com,
	mike.kravetz@oracle.com, linux-mm@kvack.org, Will Deacon,
	Steve Capper, Kaly Xin, nd, linux-arm-kernel@lists.infradead.org,
	vbabka@suze.cz

On Thu 08-12-16 17:36:24, Huang Shijie wrote:
> On Wed, Dec 07, 2016 at 11:02:38PM +0800, Michal Hocko wrote:
[...]
> > I haven't yet checked your patchset but I can tell you one thing.
>
> Could you please review the patch set when you have time? Thanks a lot.

From a quick glance you do not handle the reservation code at all. You
just make sure that the allocation doesn't fail unconditionally. I might
be wrong here and Naoya resp. Mike will know much better but this seems
far from enough to me.

> 
> > Surplus and subpool pages code is tricky as hell. And it is not just a
> Agree. 
> 
> Do we really need so many accountings? such as reserve/ovorcommit/surplus.

If we want to make giga pages a first-class citizen then the whole
reservation/surplus code has to be independent of the page size.

> > matter of teaching the huge page allocation code to do the right thing.
> > There are subtle details all over the place. E.g. we currently
> > do not free giga pages AFAICS. In fact I believe that the giga pages are
> Please correct me if I am wrong. :)
> 
> I think the free-giga-pages can work well.
> Please see the code in update_and_free_page(). 

Hmm, I have missed that part. I guess you are right but I would have to
look much closer. Hugetlb code tends to be full of surprises.

> Could you please list all the subtle details you think the code is wrong?
> I can check them one by one.

Well, this would take me quite some time, and basically mean restudying the
whole hugetlb code. What you are trying to achieve is not a simple "fix
a test case" thing. You are trying to implement full featured giga page
support. And as I've said, this requires a deeper understanding of the
current code and cleaning it up considerably wrt. giga pages. This is
definitely a desirable plan long term and I would like to encourage you in
that, but it is not a simple project at the same time.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v3 0/4]  mm: fix the "counter.sh" failure for libhugetlbfs
  2016-12-08  9:52         ` Michal Hocko
@ 2016-12-09  9:58           ` Huang Shijie
  0 siblings, 0 replies; 12+ messages in thread
From: Huang Shijie @ 2016-12-09  9:58 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm@linux-foundation.org, Catalin Marinas,
	n-horiguchi@ah.jp.nec.com, kirill.shutemov@linux.intel.com,
	aneesh.kumar@linux.vnet.ibm.com, gerald.schaefer@de.ibm.com,
	mike.kravetz@oracle.com, linux-mm@kvack.org, Will Deacon,
	Steve Capper, Kaly Xin, nd, linux-arm-kernel@lists.infradead.org,
	vbabka@suze.cz

On Thu, Dec 08, 2016 at 10:52:54AM +0100, Michal Hocko wrote:
> On Thu 08-12-16 17:36:24, Huang Shijie wrote:
> > On Wed, Dec 07, 2016 at 11:02:38PM +0800, Michal Hocko wrote:
> [...]
> > > I haven't yet checked your patchset but I can tell you one thing.
> >
> > Could you please review the patch set when you have time? Thanks a lot.
> 
> From a quick glance you do not handle the reservation code at all. You
Thanks, I will study the code again and try to find out what we need to do
with the reservation code.

> just make sure that the allocation doesn't fail unconditionally. I might
> be wrong here and Naoya resp. Mike will know much better but this seems
> far from enough to me.
> 
> Well, this would take me quite some time, and basically mean restudying the
> whole hugetlb code. What you are trying to achieve is not a simple "fix
> a test case" thing. You are trying to implement full featured giga page
> support. And as I've said, this requires a deeper understanding of the
> current code and cleaning it up considerably wrt. giga pages. This is
> definitely a desirable plan long term and I would like to encourage you in
> that, but it is not a simple project at the same time.
Okay, I will try to implement the full featured giga page support. :)

But I am confused about "full featured". If the patch set can pass
all the giga page tests in libhugetlbfs, can we say it is "full
featured"? Or does someone need to review this patch set and confirm
that it is full featured support for giga pages?

Thanks
Huang Shijie


end of thread, other threads:[~2016-12-09  9:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-05  9:17 [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Huang Shijie
2016-12-05  9:17 ` [PATCH v3 1/4] mm: hugetlb: rename some allocation functions Huang Shijie
2016-12-05  9:17 ` [PATCH v3 2/4] mm: hugetlb: add a new parameter for some functions Huang Shijie
2016-12-05  9:17 ` [PATCH v3 3/4] mm: hugetlb: change the return type " Huang Shijie
2016-12-05  9:17 ` [PATCH v3 4/4] mm: hugetlb: support gigantic surplus pages Huang Shijie
2016-12-05  9:31 ` [PATCH v3 0/4] mm: fix the "counter.sh" failure for libhugetlbfs Michal Hocko
2016-12-06 10:03   ` Huang Shijie
2016-12-07 15:02     ` Michal Hocko
2016-12-08  9:36       ` Huang Shijie
2016-12-08  9:52         ` Michal Hocko
2016-12-09  9:58           ` Huang Shijie
2016-12-07  8:46   ` Huang Shijie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).