* [PATCH 1/1] mm: allocate order 0 page from pcp before zone_watermark_ok
From: vichy @ 2016-06-29 14:44 UTC
To: linux-mm
hi all:
In the normal case, an allocation of any order only proceeds after
zone_watermark_ok() succeeds. But when pcp->count of the zone is not
zero, why not serve order-0 allocations from the pcp lists before the
zone_watermark_ok() check? That way an order-0 page can still be
allocated even when free_pages is below zone->watermark.
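For context, buffered_rmqueue() already serves order-0 requests from the
pcp lists in roughly the following way (a simplified paraphrase, not the
exact source; the refill of an empty pcp list from the buddy allocator is
omitted). The patch simply lets the same kind of lookup happen before the
watermark check:

	if (likely(order == 0)) {
		local_irq_save(flags);
		pcp = &this_cpu_ptr(zone->pageset)->pcp;
		list = &pcp->lists[migratetype];
		if (!list_empty(list)) {
			/* hot pages come from the head, cold from the tail */
			if (cold)
				page = list_last_entry(list, struct page, lru);
			else
				page = list_first_entry(list, struct page, lru);
			list_del(&page->lru);
			pcp->count--;
		}
		local_irq_restore(flags);
	}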
The patch below implements this idea, for your reference.
Signed-off-by: pierre kuo <vichy.kuo@gmail.com>
---
mm/page_alloc.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c1069ef..406655f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2622,6 +2622,14 @@ static void reset_alloc_batches(struct zone *preferred_zone)
 	} while (zone++ != preferred_zone);
 }
 
+static struct page *
+__get_hot_cold_page(bool cold, struct list_head *list)
+{
+	if (cold)
+		return list_last_entry(list, struct page, lru);
+	else
+		return list_first_entry(list, struct page, lru);
+}
 /*
  * get_page_from_freelist goes through the zonelist trying to allocate
  * a page.
@@ -2695,6 +2703,24 @@ zonelist_scan:
 		if (ac->spread_dirty_pages && !zone_dirty_ok(zone))
 			continue;
 
+		if (likely(order == 0)) {
+			struct per_cpu_pages *pcp;
+			struct list_head *list;
+			unsigned long flags;
+			bool cold = ((gfp_mask & __GFP_COLD) != 0);
+
+			local_irq_save(flags);
+			pcp = &this_cpu_ptr(zone->pageset)->pcp;
+			list = &pcp->lists[ac->migratetype];
+			if (!list_empty(list)) {
+				page = __get_hot_cold_page(cold, list);
+				list_del(&page->lru);
+				pcp->count--;
+			}
+			local_irq_restore(flags);
+			if (page)
+				goto get_page_order0;
+		}
 		mark = zone->watermark[alloc_flags & ALLOC_WMARK_MASK];
 		if (!zone_watermark_ok(zone, order, mark,
 				       ac->classzone_idx, alloc_flags)) {
@@ -2730,6 +2756,7 @@ zonelist_scan:
 try_this_zone:
 		page = buffered_rmqueue(ac->preferred_zone, zone, order,
 				gfp_mask, alloc_flags, ac->migratetype);
+get_page_order0:
 		if (page) {
 			if (prep_new_page(page, order, gfp_mask, alloc_flags))
 				goto try_this_zone;
--
1.9.1
* Re: [PATCH 1/1] mm: allocate order 0 page from pcp before zone_watermark_ok
From: Michal Hocko @ 2016-06-30 12:35 UTC
To: vichy; +Cc: linux-mm
On Wed 29-06-16 22:44:19, vichy wrote:
> hi all:
> In the normal case, an allocation of any order only proceeds after
> zone_watermark_ok() succeeds. But when pcp->count of the zone is not
> zero, why not serve order-0 allocations from the pcp lists before the
> zone_watermark_ok() check? That way an order-0 page can still be
> allocated even when free_pages is below zone->watermark.
The watermark check is there for a good reason. It protects the memory
reserves which are used by important users or in emergency situations.
The mere fact that there are pages sitting on the pcp lists doesn't mean
we should break this protection. Note that those emergency users might
want order-0 pages as well.
So NAK to the patch.
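To make the point concrete, here is a rough sketch of what the check
guards (my simplification, not the kernel source): callers with reserve
rights such as ALLOC_HIGH or ALLOC_HARDER may dip below the nominal mark,
everybody else, order-0 included, has to stay above it.

	/* Simplified paraphrase of __zone_watermark_ok(). */
	static bool watermark_ok_sketch(long free_pages, long mark,
					long lowmem_reserve, int alloc_flags)
	{
		long min = mark;

		/* __GFP_HIGH callers may use up to half of the reserve. */
		if (alloc_flags & ALLOC_HIGH)
			min -= min / 2;
		/* Atomic/realtime callers may dig a quarter deeper. */
		if (alloc_flags & ALLOC_HARDER)
			min -= min / 4;

		/* Everyone else must leave min + lowmem_reserve pages free. */
		return free_pages > min + lowmem_reserve;
	}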
--
Michal Hocko
SUSE Labs
* Re: [PATCH 1/1] mm: allocate order 0 page from pcp before zone_watermark_ok
From: pierre kuo @ 2016-07-05 1:39 UTC
To: Michal Hocko; +Cc: linux-mm
hi Michal
2016-06-30 20:35 GMT+08:00 Michal Hocko <mhocko@kernel.org>:
> On Wed 29-06-16 22:44:19, vichy wrote:
>> hi all:
>> In the normal case, an allocation of any order only proceeds after
>> zone_watermark_ok() succeeds. But when pcp->count of the zone is not
>> zero, why not serve order-0 allocations from the pcp lists before the
>> zone_watermark_ok() check? That way an order-0 page can still be
>> allocated even when free_pages is below zone->watermark.
>
> The watermark check is there for a good reason. It protects the memory
> reserves which are used by important users or in emergency situations.
> The mere fact that there are pages sitting on the pcp lists doesn't mean
> we should break this protection. Note that those emergency users might
> want order-0 pages as well.
Got it.
Thanks to your reminder I found the "emergency users" you mean: the
cases in gfp_to_alloc_flags() that return with ALLOC_NO_WATERMARKS set,
as below:
	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
		if (gfp_mask & __GFP_MEMALLOC)
			alloc_flags |= ALLOC_NO_WATERMARKS;
		else if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
			alloc_flags |= ALLOC_NO_WATERMARKS;
		else if (!in_interrupt() &&
				((current->flags & PF_MEMALLOC) ||
				 unlikely(test_thread_flag(TIF_MEMDIE))))
			alloc_flags |= ALLOC_NO_WATERMARKS;
	}
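If I read get_page_from_freelist() correctly, the flag is then consumed
roughly like this (my simplified sketch, not the exact code): a failed
watermark check no longer disqualifies the zone for those callers.

	if (!zone_watermark_ok(zone, order, mark,
			       ac->classzone_idx, alloc_flags)) {
		/* Emergency callers are allowed to eat into the reserves. */
		if (alloc_flags & ALLOC_NO_WATERMARKS)
			goto try_this_zone;
		/* ... everyone else tries zone_reclaim() or the next zone ... */
	}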
Appreciate your kind review,