* [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single
@ 2010-01-11 4:37 Huang Shijie
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
` (3 more replies)
0 siblings, 4 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 4:37 UTC (permalink / raw)
To: akpm; +Cc: mel, kosaki.motohiro, linux-mm, Huang Shijie
There is only one place that calls rmqueue_bulk, and it only
allocates single (order-0) pages. So rename it to rmqueue_single
and remove the order argument.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
---
mm/page_alloc.c | 18 ++++++++----------
1 files changed, 8 insertions(+), 10 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4e9f5cc..23df1ed 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -930,25 +930,24 @@ retry_reserve:
}
/*
- * Obtain a specified number of elements from the buddy allocator, all under
+ * Obtain a specified number of single pages from the buddy allocator, all under
* a single hold of the lock, for efficiency. Add them to the supplied list.
* Returns the number of new pages which were placed at *list.
*/
-static int rmqueue_bulk(struct zone *zone, unsigned int order,
- unsigned long count, struct list_head *list,
- int migratetype, int cold)
+static int rmqueue_single(struct zone *zone, unsigned long count,
+ struct list_head *list, int migratetype, int cold)
{
int i;
spin_lock(&zone->lock);
for (i = 0; i < count; ++i) {
- struct page *page = __rmqueue(zone, order, migratetype);
+ struct page *page = __rmqueue(zone, 0, migratetype);
if (unlikely(page == NULL))
break;
/*
* Split buddy pages returned by expand() are received here
- * in physical page order. The page is added to the callers and
+ * in order zero. The page is added to the callers and
* list and the list head then moves forward. From the callers
* perspective, the linked list is ordered by page number in
* some conditions. This is useful for IO devices that can
@@ -962,7 +961,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
set_page_private(page, migratetype);
list = &page->lru;
}
- __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
+ __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
spin_unlock(&zone->lock);
return i;
}
@@ -1192,9 +1191,8 @@ again:
list = &pcp->lists[migratetype];
local_irq_save(flags);
if (list_empty(list)) {
- pcp->count += rmqueue_bulk(zone, 0,
- pcp->batch, list,
- migratetype, cold);
+ pcp->count += rmqueue_single(zone, pcp->batch, list,
+ migratetype, cold);
if (unlikely(list_empty(list)))
goto failed;
}
--
1.6.5.2
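The refill pattern this patch narrows can be sketched in plain userspace C. This is a toy model, not kernel code: `pool_t`, `refill_single` and `cache_count` are made-up names standing in for the zone's buddy free area, the renamed helper and the per-cpu page cache, and a pthread mutex stands in for zone->lock. The point is that all `count` order-0 allocations happen under a single lock hold, so the helper no longer needs an order parameter:

```c
#include <assert.h>
#include <pthread.h>

/* Toy model of rmqueue_single(): pull `count` order-0 "pages" from a
 * shared pool under ONE lock hold and hand them to the caller's cache.
 * pool_t, refill_single and cache_count are illustrative names, not
 * kernel API; the mutex stands in for zone->lock. */
typedef struct {
	pthread_mutex_t lock;
	int nr_free;		/* pages left in the shared pool */
} pool_t;

/* Returns how many pages were actually moved, like rmqueue_single(). */
static int refill_single(pool_t *pool, int count, int *cache_count)
{
	int i;

	pthread_mutex_lock(&pool->lock);
	for (i = 0; i < count; i++) {
		if (pool->nr_free == 0)	/* pool exhausted: partial refill */
			break;
		pool->nr_free--;	/* always exactly one page: order 0 */
		(*cache_count)++;
	}
	pthread_mutex_unlock(&pool->lock);
	return i;
}
```

Taking the lock once per batch rather than once per page is the whole reason rmqueue_bulk existed; the rename just records that the only remaining user is the order-0 per-cpu cache refill.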
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation
2010-01-11 4:37 [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Huang Shijie
@ 2010-01-11 4:37 ` Huang Shijie
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
` (3 more replies)
2010-01-11 5:00 ` [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Minchan Kim
` (2 subsequent siblings)
3 siblings, 4 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 4:37 UTC (permalink / raw)
To: akpm; +Cc: mel, kosaki.motohiro, linux-mm, Huang Shijie
__mod_zone_page_state() only requires that irqs be disabled;
it does not require the zone's spinlock. So move it out of
the spinlock's critical section to relieve lock pressure on
the allocation path.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
---
mm/page_alloc.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23df1ed..00aa83a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -961,8 +961,8 @@ static int rmqueue_single(struct zone *zone, unsigned long count,
set_page_private(page, migratetype);
list = &page->lru;
}
- __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
spin_unlock(&zone->lock);
+ __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
return i;
}
--
1.6.5.2
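The change above shrinks the critical section: the freelist manipulation needs zone->lock, but the counter update only needs some form of safe arithmetic (irq-off per-cpu math in the kernel), so it can run after the unlock. A hedged userspace sketch of the same shape, with illustrative names and a C11 atomic standing in for the irq-disabled per-cpu update:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Sketch of the patch's idea: the freelist needs the lock, but the
 * NR_FREE_PAGES-style counter only needs a safe update (irq-off
 * per-cpu math in the kernel; an atomic in this toy), so it can be
 * done after the unlock. zone_model and take_pages are illustrative. */
typedef struct {
	pthread_mutex_t lock;	/* protects list_len only */
	int list_len;		/* freelist length, under ->lock */
	atomic_int nr_free_stat;/* statistics counter, lock-free */
} zone_model;

static int take_pages(zone_model *z, int want)
{
	int taken = 0;

	pthread_mutex_lock(&z->lock);
	while (taken < want && z->list_len > 0) {
		z->list_len--;	/* list manipulation stays locked */
		taken++;
	}
	pthread_mutex_unlock(&z->lock);

	/* Counter update moved out of the critical section. */
	atomic_fetch_sub(&z->nr_free_stat, taken);
	return taken;
}
```

In the kernel the equivalent safety comes from interrupts being disabled around the per-cpu counter update, not from an atomic; the atomic here is only a userspace stand-in.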
* [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
@ 2010-01-11 4:37 ` Huang Shijie
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
` (3 more replies)
2010-01-11 5:02 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Minchan Kim
` (2 subsequent siblings)
3 siblings, 4 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 4:37 UTC (permalink / raw)
To: akpm; +Cc: mel, kosaki.motohiro, linux-mm, Huang Shijie
Change the return type of __free_one_page.
It now returns 1 on success, and 0 when the
compound-page check fails.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
---
mm/page_alloc.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 00aa83a..290dfc3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -445,17 +445,18 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
* triggers coalescing into a block of larger size.
*
* -- wli
+ *
+ * Returns 1 on success, 0 otherwise.
*/
-static inline void __free_one_page(struct page *page,
- struct zone *zone, unsigned int order,
- int migratetype)
+static inline int __free_one_page(struct page *page, struct zone *zone,
+ unsigned int order, int migratetype)
{
unsigned long page_idx;
if (unlikely(PageCompound(page)))
if (unlikely(destroy_compound_page(page, order)))
- return;
+ return 0;
VM_BUG_ON(migratetype == -1);
@@ -485,6 +486,7 @@ static inline void __free_one_page(struct page *page,
list_add(&page->lru,
&zone->free_area[order].free_list[migratetype]);
zone->free_area[order].nr_free++;
+ return 1;
}
/*
--
1.6.5.2
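The new contract can be modelled in a few lines of userspace C. Everything here is illustrative: `fake_page` and `bad_compound` are invented names standing in for a page whose destroy_compound_page() check fails, and the caller uses the return value to keep its bookkeeping conditional:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy version of the new contract: return 1 when the page really went
 * into the free area, 0 when the sanity check rejected it. fake_page
 * and bad_compound are illustrative, standing in for a page whose
 * compound-page check fails. */
struct fake_page {
	bool bad_compound;
};

static int free_one(struct fake_page *page, int *nr_free)
{
	if (page->bad_compound)
		return 0;	/* check failed: nothing was freed */
	(*nr_free)++;		/* page entered the free area */
	return 1;
}
```

The follow-up patch relies on exactly this: callers sum the return values to learn how many pages were actually freed before touching zone-wide state.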
* [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
@ 2010-01-11 4:37 ` Huang Shijie
2010-01-11 5:20 ` Minchan Kim
` (2 more replies)
2010-01-11 5:04 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Minchan Kim
` (2 subsequent siblings)
3 siblings, 3 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 4:37 UTC (permalink / raw)
To: akpm; +Cc: mel, kosaki.motohiro, linux-mm, Huang Shijie
Move __mod_zone_page_state() out of the spinlock's critical
section to relieve lock pressure on the memory-free path.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
---
mm/page_alloc.c | 26 ++++++++++++++++----------
1 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 290dfc3..34b9a3a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
{
int migratetype = 0;
int batch_free = 0;
+ int free_ok = 0;
spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
- zone->pages_scanned = 0;
-
- __mod_zone_page_state(zone, NR_FREE_PAGES, count);
while (count) {
struct page *page;
struct list_head *list;
@@ -558,23 +555,32 @@ static void free_pcppages_bulk(struct zone *zone, int count,
page = list_entry(list->prev, struct page, lru);
/* must delete as __free_one_page list manipulates */
list_del(&page->lru);
- __free_one_page(page, zone, 0, migratetype);
+ free_ok += __free_one_page(page, zone, 0, migratetype);
trace_mm_page_pcpu_drain(page, 0, migratetype);
} while (--count && --batch_free && !list_empty(list));
}
+
+ if (likely(free_ok)) {
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->pages_scanned = 0;
+ }
spin_unlock(&zone->lock);
+ __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
}
static void free_one_page(struct zone *zone, struct page *page, int order,
int migratetype)
{
- spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
- zone->pages_scanned = 0;
+ int free_ok;
- __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
- __free_one_page(page, zone, order, migratetype);
+ spin_lock(&zone->lock);
+ free_ok = __free_one_page(page, zone, order, migratetype);
+ if (likely(free_ok)) {
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->pages_scanned = 0;
+ }
spin_unlock(&zone->lock);
+ __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
}
static void __free_pages_ok(struct page *page, unsigned int order)
--
1.6.5.2
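The shape of this patch — count the successful frees while holding the lock, then do the reclaim-flag reset and the statistics update once, outside it, and only if anything was freed — can be sketched in userspace C. All names here are illustrative, and the per-page success values are passed in directly rather than computed by a real buddy free:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy model of the reworked free path: accumulate how many frees
 * really succeeded (free_ok) under the lock, then reset the reclaim
 * state and update the counter after the unlock, and only if
 * something was freed. zone_model, free_bulk and page_ok are
 * illustrative; page_ok[i] plays the role of __free_one_page()'s
 * return value for page i. */
typedef struct {
	pthread_mutex_t lock;
	int nr_free_stat;
	bool all_unreclaimable;
} zone_model;

static int free_bulk(zone_model *z, const int *page_ok, int count)
{
	int free_ok = 0;
	int i;

	pthread_mutex_lock(&z->lock);
	for (i = 0; i < count; i++)
		free_ok += page_ok[i];	/* 1 per successful free */
	pthread_mutex_unlock(&z->lock);

	if (free_ok) {			/* skip both updates if none freed */
		z->all_unreclaimable = false;
		z->nr_free_stat += free_ok;
	}
	return free_ok;
}
```

The conditional also fixes a subtle point in the old code: previously the counter was bumped by `count` before any page was freed, so a failed compound-page check would leave the statistic wrong.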
* Re: [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single
2010-01-11 4:37 [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Huang Shijie
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
@ 2010-01-11 5:00 ` Minchan Kim
2010-01-12 2:52 ` KOSAKI Motohiro
2010-01-18 11:21 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: Minchan Kim @ 2010-01-11 5:00 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, mel, kosaki.motohiro, linux-mm
On Mon, 11 Jan 2010 12:37:11 +0800
Huang Shijie <shijie8@gmail.com> wrote:
> There is only one place that calls rmqueue_bulk, and it only
> allocates single (order-0) pages. So rename it to rmqueue_single
> and remove the order argument.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
--
Kind regards,
Minchan Kim
* Re: [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
@ 2010-01-11 5:02 ` Minchan Kim
2010-01-11 5:13 ` Huang Shijie
2010-01-12 2:54 ` KOSAKI Motohiro
2010-01-18 11:24 ` Mel Gorman
3 siblings, 1 reply; 41+ messages in thread
From: Minchan Kim @ 2010-01-11 5:02 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, mel, kosaki.motohiro, linux-mm
On Mon, 11 Jan 2010 12:37:12 +0800
Huang Shijie <shijie8@gmail.com> wrote:
> __mod_zone_page_state() only requires that irqs be disabled;
> it does not require the zone's spinlock. So move it out of
> the spinlock's critical section to relieve lock pressure on
> the allocation path.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 23df1ed..00aa83a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -961,8 +961,8 @@ static int rmqueue_single(struct zone *zone, unsigned long count,
> set_page_private(page, migratetype);
> list = &page->lru;
> }
> - __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> spin_unlock(&zone->lock);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> return i;
> }
>
> --
> 1.6.5.2
>
How about moving this patch into [4/4]?
Otherwise, looks good to me.
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
--
Kind regards,
Minchan Kim
* Re: [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
@ 2010-01-11 5:04 ` Minchan Kim
2010-01-12 2:56 ` KOSAKI Motohiro
2010-01-18 11:25 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: Minchan Kim @ 2010-01-11 5:04 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, mel, kosaki.motohiro, linux-mm
On Mon, 11 Jan 2010 12:37:13 +0800
Huang Shijie <shijie8@gmail.com> wrote:
> Change the return type of __free_one_page.
> It now returns 1 on success, and 0 when the
> compound-page check fails.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
--
Kind regards,
Minchan Kim
* Re: [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation
2010-01-11 5:02 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Minchan Kim
@ 2010-01-11 5:13 ` Huang Shijie
0 siblings, 0 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 5:13 UTC (permalink / raw)
To: Minchan Kim; +Cc: akpm, mel, kosaki.motohiro, linux-mm
>> __mod_zone_page_state() only requires that irqs be disabled;
>> it does not require the zone's spinlock. So move it out of
>> the spinlock's critical section to relieve lock pressure on
>> the allocation path.
>>
>> Signed-off-by: Huang Shijie<shijie8@gmail.com>
>> ---
>> mm/page_alloc.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 23df1ed..00aa83a 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -961,8 +961,8 @@ static int rmqueue_single(struct zone *zone, unsigned long count,
>> set_page_private(page, migratetype);
>> list =&page->lru;
>> }
>> - __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
>> spin_unlock(&zone->lock);
>> + __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
>> return i;
>> }
>>
>> --
>> 1.6.5.2
>>
>>
> How about moving this patch into [4/4]?
> Otherwise, looks good to me.
>
>
I just want to make the patch split clearer: two patches for
memory allocation, two for memory free.
> Reviewed-by: Minchan Kim<minchan.kim@gmail.com>
>
>
thanks a lot.
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
@ 2010-01-11 5:20 ` Minchan Kim
2010-01-11 6:01 ` Huang Shijie
2010-01-11 6:27 ` Huang Shijie
2010-01-12 2:51 ` Huang Shijie
2 siblings, 1 reply; 41+ messages in thread
From: Minchan Kim @ 2010-01-11 5:20 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, mel, kosaki.motohiro, linux-mm
On Mon, 11 Jan 2010 12:37:14 +0800
Huang Shijie <shijie8@gmail.com> wrote:
> Move __mod_zone_page_state() out of the spinlock's critical
> section to relieve lock pressure on the memory-free path.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 26 ++++++++++++++++----------
> 1 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 290dfc3..34b9a3a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> {
> int migratetype = 0;
> int batch_free = 0;
> + int free_ok = 0;
>
> spin_lock(&zone->lock);
> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> - zone->pages_scanned = 0;
We don't use zone->lock to protect pages_scanned in shrink_[in]active_list,
so at the least we can move the zone->pages_scanned update out of the lock.
In addition,
sometimes we test ZONE_ALL_UNRECLAIMABLE without any lock, and
sometimes we do take zone->lock (e.g. zoneinfo_show_print).
So the locking for ZONE_ALL_UNRECLAIMABLE is not consistent.
I think we should use zone->lru_lock rather than zone->lock,
since this state is related to reclaim.
A few days ago, Mel and Kosaki discussed zone->lock.
Any thoughts?
> -
> - __mod_zone_page_state(zone, NR_FREE_PAGES, count);
> while (count) {
> struct page *page;
> struct list_head *list;
> @@ -558,23 +555,32 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> page = list_entry(list->prev, struct page, lru);
> /* must delete as __free_one_page list manipulates */
> list_del(&page->lru);
> - __free_one_page(page, zone, 0, migratetype);
> + free_ok += __free_one_page(page, zone, 0, migratetype);
> trace_mm_page_pcpu_drain(page, 0, migratetype);
> } while (--count && --batch_free && !list_empty(list));
> }
> +
> + if (likely(free_ok)) {
> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> + zone->pages_scanned = 0;
> + }
> spin_unlock(&zone->lock);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
> }
>
> static void free_one_page(struct zone *zone, struct page *page, int order,
> int migratetype)
> {
> - spin_lock(&zone->lock);
> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> - zone->pages_scanned = 0;
> + int free_ok;
>
> - __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
> - __free_one_page(page, zone, order, migratetype);
> + spin_lock(&zone->lock);
> + free_ok = __free_one_page(page, zone, order, migratetype);
> + if (likely(free_ok)) {
> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> + zone->pages_scanned = 0;
> + }
> spin_unlock(&zone->lock);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
> }
>
> static void __free_pages_ok(struct page *page, unsigned int order)
> --
> 1.6.5.2
>
--
Kind regards,
Minchan Kim
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 5:20 ` Minchan Kim
@ 2010-01-11 6:01 ` Huang Shijie
0 siblings, 0 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 6:01 UTC (permalink / raw)
To: Minchan Kim; +Cc: akpm, mel, kosaki.motohiro, linux-mm
>> Move __mod_zone_page_state() out of the spinlock's critical
>> section to relieve lock pressure on the memory-free path.
>>
>> Signed-off-by: Huang Shijie<shijie8@gmail.com>
>> ---
>> mm/page_alloc.c | 26 ++++++++++++++++----------
>> 1 files changed, 16 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 290dfc3..34b9a3a 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>> {
>> int migratetype = 0;
>> int batch_free = 0;
>> + int free_ok = 0;
>>
>> spin_lock(&zone->lock);
>> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
>> - zone->pages_scanned = 0;
>>
>
> We don't use zone->lock to protect pages_scanned in shrink_[in]active_list,
> so at the least we can move the zone->pages_scanned update out of the lock.
>
> In addition, sometimes we test ZONE_ALL_UNRECLAIMABLE without any lock,
> and sometimes we do take zone->lock (e.g. zoneinfo_show_print).
>
> So the locking for ZONE_ALL_UNRECLAIMABLE is not consistent.
>
> I think we should use zone->lru_lock rather than zone->lock,
> since this state is related to reclaim.
>
>
Agreed.
I think zone->lru_lock is more appropriate.
> A few days ago, Mel and Kosaki discussed zone->lock.
> Any thoughts?
>
>
>> -
>> - __mod_zone_page_state(zone, NR_FREE_PAGES, count);
>> while (count) {
>> struct page *page;
>> struct list_head *list;
>> @@ -558,23 +555,32 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>> page = list_entry(list->prev, struct page, lru);
>> /* must delete as __free_one_page list manipulates */
>> list_del(&page->lru);
>> - __free_one_page(page, zone, 0, migratetype);
>> + free_ok += __free_one_page(page, zone, 0, migratetype);
>> trace_mm_page_pcpu_drain(page, 0, migratetype);
>> } while (--count&& --batch_free&& !list_empty(list));
>> }
>> +
>> + if (likely(free_ok)) {
>> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
>> + zone->pages_scanned = 0;
>> + }
>> spin_unlock(&zone->lock);
>> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
>> }
>>
>> static void free_one_page(struct zone *zone, struct page *page, int order,
>> int migratetype)
>> {
>> - spin_lock(&zone->lock);
>> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
>> - zone->pages_scanned = 0;
>> + int free_ok;
>>
>> - __mod_zone_page_state(zone, NR_FREE_PAGES, 1<< order);
>> - __free_one_page(page, zone, order, migratetype);
>> + spin_lock(&zone->lock);
>> + free_ok = __free_one_page(page, zone, order, migratetype);
>> + if (likely(free_ok)) {
>> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
>> + zone->pages_scanned = 0;
>> + }
>> spin_unlock(&zone->lock);
>> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok<< order);
>> }
>>
>> static void __free_pages_ok(struct page *page, unsigned int order)
>> --
>> 1.6.5.2
>>
* [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
2010-01-11 5:20 ` Minchan Kim
@ 2010-01-11 6:27 ` Huang Shijie
2010-01-11 6:38 ` Minchan Kim
2010-01-12 2:51 ` Huang Shijie
2 siblings, 1 reply; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 6:27 UTC (permalink / raw)
To: akpm; +Cc: mel, kosaki.motohiro, minchan.kim, linux-mm, Huang Shijie
Move __mod_zone_page_state() out of the spinlock's critical
section to relieve lock pressure on the memory-free path.
Use zone->lru_lock instead of zone->lock to protect
zone->pages_scanned and the zone flag ZONE_ALL_UNRECLAIMABLE.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
---
mm/page_alloc.c | 33 +++++++++++++++++++++++----------
1 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 290dfc3..dfd4be0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
{
int migratetype = 0;
int batch_free = 0;
+ int free_ok = 0;
spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
- zone->pages_scanned = 0;
-
- __mod_zone_page_state(zone, NR_FREE_PAGES, count);
while (count) {
struct page *page;
struct list_head *list;
@@ -558,23 +555,39 @@ static void free_pcppages_bulk(struct zone *zone, int count,
page = list_entry(list->prev, struct page, lru);
/* must delete as __free_one_page list manipulates */
list_del(&page->lru);
- __free_one_page(page, zone, 0, migratetype);
+ free_ok += __free_one_page(page, zone, 0, migratetype);
trace_mm_page_pcpu_drain(page, 0, migratetype);
} while (--count && --batch_free && !list_empty(list));
}
spin_unlock(&zone->lock);
+
+ if (likely(free_ok)) {
+ spin_lock(&zone->lru_lock);
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->pages_scanned = 0;
+ spin_unlock(&zone->lru_lock);
+
+ __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
+ }
}
static void free_one_page(struct zone *zone, struct page *page, int order,
int migratetype)
{
- spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
- zone->pages_scanned = 0;
+ int free_ok;
- __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
- __free_one_page(page, zone, order, migratetype);
+ spin_lock(&zone->lock);
+ free_ok = __free_one_page(page, zone, order, migratetype);
spin_unlock(&zone->lock);
+
+ if (likely(free_ok)) {
+ spin_lock(&zone->lru_lock);
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->pages_scanned = 0;
+ spin_unlock(&zone->lru_lock);
+
+ __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
+ }
}
static void __free_pages_ok(struct page *page, unsigned int order)
--
1.6.5.2
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 6:27 ` Huang Shijie
@ 2010-01-11 6:38 ` Minchan Kim
2010-01-11 6:59 ` Huang Shijie
2010-01-12 0:47 ` KAMEZAWA Hiroyuki
0 siblings, 2 replies; 41+ messages in thread
From: Minchan Kim @ 2010-01-11 6:38 UTC (permalink / raw)
To: Huang Shijie
Cc: akpm, mel, kosaki.motohiro, minchan.kim, linux-mm, Rik van Riel,
Wu Fengguang, KAMEZAWA Hiroyuki
On Mon, 11 Jan 2010 14:27:57 +0800
Huang Shijie <shijie8@gmail.com> wrote:
> Move __mod_zone_page_state() out of the spinlock's critical
> section to relieve lock pressure on the memory-free path.
>
> Use zone->lru_lock instead of zone->lock to protect
> zone->pages_scanned and the zone flag ZONE_ALL_UNRECLAIMABLE.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 33 +++++++++++++++++++++++----------
> 1 files changed, 23 insertions(+), 10 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 290dfc3..dfd4be0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> {
> int migratetype = 0;
> int batch_free = 0;
> + int free_ok = 0;
>
> spin_lock(&zone->lock);
> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> - zone->pages_scanned = 0;
> -
> - __mod_zone_page_state(zone, NR_FREE_PAGES, count);
> while (count) {
> struct page *page;
> struct list_head *list;
> @@ -558,23 +555,39 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> page = list_entry(list->prev, struct page, lru);
> /* must delete as __free_one_page list manipulates */
> list_del(&page->lru);
> - __free_one_page(page, zone, 0, migratetype);
> + free_ok += __free_one_page(page, zone, 0, migratetype);
> trace_mm_page_pcpu_drain(page, 0, migratetype);
> } while (--count && --batch_free && !list_empty(list));
> }
> spin_unlock(&zone->lock);
> +
> + if (likely(free_ok)) {
> + spin_lock(&zone->lru_lock);
> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> + zone->pages_scanned = 0;
> + spin_unlock(&zone->lru_lock);
> +
> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
> + }
> }
>
> static void free_one_page(struct zone *zone, struct page *page, int order,
> int migratetype)
> {
> - spin_lock(&zone->lock);
> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> - zone->pages_scanned = 0;
> + int free_ok;
>
> - __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
> - __free_one_page(page, zone, order, migratetype);
> + spin_lock(&zone->lock);
> + free_ok = __free_one_page(page, zone, order, migratetype);
> spin_unlock(&zone->lock);
> +
> + if (likely(free_ok)) {
> + spin_lock(&zone->lru_lock);
> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> + zone->pages_scanned = 0;
> + spin_unlock(&zone->lru_lock);
> +
> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
> + }
> }
>
> static void __free_pages_ok(struct page *page, unsigned int order)
> --
> 1.6.5.2
>
Thanks, Huang.
Frankly speaking, I am not sure this is the right way.
This patch adds fine-grained locking overhead, and
as you know, these functions are on one of the hot paths.
In addition, we haven't seen any problem until now.
Does that mean the unsynchronized accesses to ZONE_ALL_UNRECLAIMABLE
and pages_scanned are all right?
If they are, we can move them out of zone->lock, too.
If they aren't, then we need one more lock.
Let's hear the other mm guys' opinions.
--
Kind regards,
Minchan Kim
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 6:38 ` Minchan Kim
@ 2010-01-11 6:59 ` Huang Shijie
2010-01-12 0:47 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-11 6:59 UTC (permalink / raw)
To: Minchan Kim
Cc: akpm, mel, kosaki.motohiro, linux-mm, Rik van Riel, Wu Fengguang,
KAMEZAWA Hiroyuki
> Frankly speaking, I am not sure this is the right way.
> This patch adds fine-grained locking overhead, and
> as you know, these functions are on one of the hot paths.
>
Yes. But the PCP path bears most of the pressure, I think.
free_one_page() mostly handles the (order != 0) cases, which
are executed relatively rarely.
> In addition, we haven't seen any problem until now.
> Does that mean the unsynchronized accesses to ZONE_ALL_UNRECLAIMABLE
> and pages_scanned are all right?
>
>
Maybe it has already caused a problem that is simply hard to
observe. :)
> If they are, we can move them out of zone->lock, too.
> If they aren't, then we need one more lock.
>
> Let's hear the other mm guys' opinions.
>
>
Ok.
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 6:38 ` Minchan Kim
2010-01-11 6:59 ` Huang Shijie
@ 2010-01-12 0:47 ` KAMEZAWA Hiroyuki
2010-01-12 2:02 ` Huang Shijie
2010-01-12 2:27 ` Wu Fengguang
1 sibling, 2 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-01-12 0:47 UTC (permalink / raw)
To: Minchan Kim
Cc: Huang Shijie, akpm, mel, kosaki.motohiro, linux-mm, Rik van Riel,
Wu Fengguang
On Mon, 11 Jan 2010 15:38:02 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:
> On Mon, 11 Jan 2010 14:27:57 +0800
> Huang Shijie <shijie8@gmail.com> wrote:
>
> > Move __mod_zone_page_state() out of the spinlock's critical
> > section to relieve lock pressure on the memory-free path.
> >
> > Use zone->lru_lock instead of zone->lock to protect
> > zone->pages_scanned and the zone flag ZONE_ALL_UNRECLAIMABLE.
> >
> > Signed-off-by: Huang Shijie <shijie8@gmail.com>
> > ---
> > mm/page_alloc.c | 33 +++++++++++++++++++++++----------
> > 1 files changed, 23 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 290dfc3..dfd4be0 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> > {
> > int migratetype = 0;
> > int batch_free = 0;
> > + int free_ok = 0;
> >
> > spin_lock(&zone->lock);
> > - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> > - zone->pages_scanned = 0;
> > -
> > - __mod_zone_page_state(zone, NR_FREE_PAGES, count);
> > while (count) {
> > struct page *page;
> > struct list_head *list;
> > @@ -558,23 +555,39 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> > page = list_entry(list->prev, struct page, lru);
> > /* must delete as __free_one_page list manipulates */
> > list_del(&page->lru);
> > - __free_one_page(page, zone, 0, migratetype);
> > + free_ok += __free_one_page(page, zone, 0, migratetype);
> > trace_mm_page_pcpu_drain(page, 0, migratetype);
> > } while (--count && --batch_free && !list_empty(list));
> > }
> > spin_unlock(&zone->lock);
> > +
> > + if (likely(free_ok)) {
> > + spin_lock(&zone->lru_lock);
> > + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> > + zone->pages_scanned = 0;
> > + spin_unlock(&zone->lru_lock);
> > +
> > + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
> > + }
> > }
> >
> > static void free_one_page(struct zone *zone, struct page *page, int order,
> > int migratetype)
> > {
> > - spin_lock(&zone->lock);
> > - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> > - zone->pages_scanned = 0;
> > + int free_ok;
> >
> > - __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
> > - __free_one_page(page, zone, order, migratetype);
> > + spin_lock(&zone->lock);
> > + free_ok = __free_one_page(page, zone, order, migratetype);
> > spin_unlock(&zone->lock);
> > +
> > + if (likely(free_ok)) {
> > + spin_lock(&zone->lru_lock);
> > + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> > + zone->pages_scanned = 0;
> > + spin_unlock(&zone->lru_lock);
> > +
> > + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
> > + }
> > }
> >
> > static void __free_pages_ok(struct page *page, unsigned int order)
> > --
> > 1.6.5.2
> >
>
> Thanks, Huang.
>
> Frankly speaking, I am not sure this ir right way.
> This patch is adding to fine-grained locking overhead
>
> As you know, this functions are one of hot pathes.
> In addition, we didn't see the any problem, until now.
> It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
> and pages_scanned are all right?
>
> If it is, we can move them out of zone->lock, too.
> If it isn't, we need one more lock, then.
>
I don't want to see an additional spin_lock here.
As for ZONE_ALL_UNRECLAIMABLE, it doesn't need to be handled in an atomic way.
If you have concerns about the other flags, please make this a single word
instead of a bit field.
Thanks,
-Kame
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 0:47 ` KAMEZAWA Hiroyuki
@ 2010-01-12 2:02 ` Huang Shijie
2010-01-12 2:07 ` KAMEZAWA Hiroyuki
2010-01-12 2:27 ` Wu Fengguang
1 sibling, 1 reply; 41+ messages in thread
From: Huang Shijie @ 2010-01-12 2:02 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Minchan Kim, akpm, mel, kosaki.motohiro, linux-mm, Rik van Riel,
Wu Fengguang
>> Thanks, Huang.
>>
>> Frankly speaking, I am not sure this ir right way.
>> This patch is adding to fine-grained locking overhead
>>
>> As you know, this functions are one of hot pathes.
>> In addition, we didn't see the any problem, until now.
>> It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
>> and pages_scanned are all right?
>>
>> If it is, we can move them out of zone->lock, too.
>> If it isn't, we need one more lock, then.
>>
>>
> I don't want to see additional spin_lock, here.
>
I don't want it either.
> About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
> If you have concerns with other flags, please modify this with single word,
> instead of a bit field.
>
>
What about `pages_scanned'?
It's protected by zone->lru_lock in shrink_{in}active_list().
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:02 ` Huang Shijie
@ 2010-01-12 2:07 ` KAMEZAWA Hiroyuki
2010-01-12 2:32 ` Huang Shijie
0 siblings, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-01-12 2:07 UTC (permalink / raw)
To: Huang Shijie
Cc: Minchan Kim, akpm, mel, kosaki.motohiro, linux-mm, Rik van Riel,
Wu Fengguang
On Tue, 12 Jan 2010 10:02:49 +0800
Huang Shijie <shijie8@gmail.com> wrote:
>
> >> Thanks, Huang.
> >>
> >> Frankly speaking, I am not sure this ir right way.
> >> This patch is adding to fine-grained locking overhead
> >>
> >> As you know, this functions are one of hot pathes.
> >> In addition, we didn't see the any problem, until now.
> >> It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
> >> and pages_scanned are all right?
> >>
> >> If it is, we can move them out of zone->lock, too.
> >> If it isn't, we need one more lock, then.
> >>
> >>
> > I don't want to see additional spin_lock, here.
> >
> I don't want it either.
> > About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
> > If you have concerns with other flags, please modify this with single word,
> > instead of a bit field.
> >
> >
> How about the `pages_scanned' ?
> It's protected by the zone->lru_lock in shrink_{in}active_list().
>
Zero-clearing pages_scanned is done by a single write (atomic). The possible
race is with this update:
	zone->pages_scanned += scanned;
which can cause a reset to be lost. But IMHO, failing to reset this counter
is not a big problem: we will finally reset it when we free the next
page. So, I have no concerns about resetting this counter.
My only concern is a race with the other flags.
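The benign lost-reset scenario described above can be modeled in user space
(a hypothetical sketch with illustrative names, not kernel code):

```c
#include <assert.h>

/* User-space model of the race discussed above: the free path resets
 * pages_scanned with a plain store while reclaim adds to it. If the
 * add lands right after the reset, the zero does not "stick" -- but
 * the next free resets the counter again, so the stale value is
 * short-lived. All names here are illustrative, not kernel code. */
static unsigned long pages_scanned;

static void reclaim_scan(unsigned long scanned)
{
	pages_scanned += scanned;	/* done under zone->lru_lock */
}

static void free_path_reset(void)
{
	pages_scanned = 0;		/* plain store, no lock */
}

static unsigned long demo_lost_reset(void)
{
	reclaim_scan(100);
	free_path_reset();		/* reset... */
	reclaim_scan(50);		/* ...overtaken by more scanning */
	return pages_scanned;		/* 50, not 0: the reset was "lost" */
}
```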
Thanks,
-Kame
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 0:47 ` KAMEZAWA Hiroyuki
2010-01-12 2:02 ` Huang Shijie
@ 2010-01-12 2:27 ` Wu Fengguang
2010-01-12 2:56 ` KOSAKI Motohiro
2010-01-12 4:05 ` Minchan Kim
1 sibling, 2 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-12 2:27 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Minchan Kim, Huang Shijie, akpm, mel, kosaki.motohiro, linux-mm,
Rik van Riel
On Tue, Jan 12, 2010 at 09:47:08AM +0900, KAMEZAWA Hiroyuki wrote:
> > Thanks, Huang.
> >
> > Frankly speaking, I am not sure this ir right way.
> > This patch is adding to fine-grained locking overhead
> >
> > As you know, this functions are one of hot pathes.
> > In addition, we didn't see the any problem, until now.
> > It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
> > and pages_scanned are all right?
> >
> > If it is, we can move them out of zone->lock, too.
> > If it isn't, we need one more lock, then.
> >
> I don't want to see additional spin_lock, here.
>
> About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
> If you have concerns with other flags, please modify this with single word,
> instead of a bit field.
I'd second it. It's not a big problem to reset ZONE_ALL_UNRECLAIMABLE
and pages_scanned outside of zone->lru_lock.
Clearing ZONE_ALL_UNRECLAIMABLE is already atomic; and if we lose one
pages_scanned = 0 due to a race, there are plenty of page free events
ahead to reset it before pages_scanned hits the huge threshold of
zone_reclaimable_pages() * 6.
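For context, the threshold referred to above can be sketched as a simple
predicate (a user-space sketch; the field names are hypothetical stand-ins
for the kernel's):

```c
#include <assert.h>

/* Sketch of the heuristic referenced above: a zone counts as
 * reclaimable while pages_scanned stays below 6x its reclaimable
 * pages. Field names are illustrative, not the kernel's layout. */
struct zone_model {
	unsigned long pages_scanned;
	unsigned long reclaimable;	/* stand-in for zone_reclaimable_pages() */
};

static int zone_is_reclaimable(const struct zone_model *z)
{
	return z->pages_scanned < z->reclaimable * 6;
}
```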
Thanks,
Fengguang
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:07 ` KAMEZAWA Hiroyuki
@ 2010-01-12 2:32 ` Huang Shijie
0 siblings, 0 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-12 2:32 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Minchan Kim, akpm, mel, kosaki.motohiro, linux-mm, Rik van Riel,
Wu Fengguang
>> How about the `pages_scanned' ?
>> It's protected by the zone->lru_lock in shrink_{in}active_list().
>>
>>
> Zero-clear by page-scanned is done by a write (atomic). Then, possible race
> will be this update,
>
> zone->pages_scanend += scanned;
>
> And failing to reset the number. But, IMHO, failure to reset this counter
> is not a big problem. We'll finally reset this when we free the next
>
That is a good enough reason for me. Thanks a lot.
> page. So, I have no concerns about resetting this counter.
>
> My only concern is race with other flags.
>
> Thanks,
> -Kame
* [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
2010-01-11 5:20 ` Minchan Kim
2010-01-11 6:27 ` Huang Shijie
@ 2010-01-12 2:51 ` Huang Shijie
2010-01-12 3:03 ` Wu Fengguang
2010-01-12 3:05 ` KOSAKI Motohiro
2 siblings, 2 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-12 2:51 UTC (permalink / raw)
To: akpm
Cc: mel, kosaki.motohiro, kamezawa.hiroyu, minchan.kim, linux-mm,
riel, fengguang.wu, Huang Shijie
Move the {__mod_zone_page_state, pages_scanned, clear zone's flags}
out of the guard region of the spinlock to relieve the lock pressure on memory free.
Signed-off-by: Huang Shijie <shijie8@gmail.com>
---
mm/page_alloc.c | 27 +++++++++++++++++----------
1 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 290dfc3..1c7e32e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
{
int migratetype = 0;
int batch_free = 0;
+ int free_ok = 0;
spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
- zone->pages_scanned = 0;
-
- __mod_zone_page_state(zone, NR_FREE_PAGES, count);
while (count) {
struct page *page;
struct list_head *list;
@@ -558,23 +555,33 @@ static void free_pcppages_bulk(struct zone *zone, int count,
page = list_entry(list->prev, struct page, lru);
/* must delete as __free_one_page list manipulates */
list_del(&page->lru);
- __free_one_page(page, zone, 0, migratetype);
+ free_ok += __free_one_page(page, zone, 0, migratetype);
trace_mm_page_pcpu_drain(page, 0, migratetype);
} while (--count && --batch_free && !list_empty(list));
}
spin_unlock(&zone->lock);
+
+ if (likely(free_ok)) {
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->pages_scanned = 0;
+ __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
+ }
}
static void free_one_page(struct zone *zone, struct page *page, int order,
int migratetype)
{
- spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
- zone->pages_scanned = 0;
+ int free_ok;
- __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
- __free_one_page(page, zone, order, migratetype);
+ spin_lock(&zone->lock);
+ free_ok = __free_one_page(page, zone, order, migratetype);
spin_unlock(&zone->lock);
+
+ if (likely(free_ok)) {
+ zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->pages_scanned = 0;
+ __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
+ }
}
static void __free_pages_ok(struct page *page, unsigned int order)
--
1.6.5.2
* Re: [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single
2010-01-11 4:37 [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Huang Shijie
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
2010-01-11 5:00 ` [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Minchan Kim
@ 2010-01-12 2:52 ` KOSAKI Motohiro
2010-01-18 11:21 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 2:52 UTC (permalink / raw)
To: Huang Shijie; +Cc: kosaki.motohiro, akpm, mel, linux-mm
> There is only one place calls rmqueue_bulk to allocate the single
> pages. So rename it to rmqueue_single, and remove an argument
> order.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
Hmm. `_bulk' doesn't mean a high-order page; it means allocating multiple
pages at once. Nobody would imagine that `_single' means "allocate multiple pages".
> ---
> mm/page_alloc.c | 18 ++++++++----------
> 1 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4e9f5cc..23df1ed 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -930,25 +930,24 @@ retry_reserve:
> }
>
> /*
> - * Obtain a specified number of elements from the buddy allocator, all under
> + * Obtain a specified number of single page from the buddy allocator, all under
> * a single hold of the lock, for efficiency. Add them to the supplied list.
> * Returns the number of new pages which were placed at *list.
> */
> -static int rmqueue_bulk(struct zone *zone, unsigned int order,
> - unsigned long count, struct list_head *list,
> - int migratetype, int cold)
> +static int rmqueue_single(struct zone *zone, unsigned long count,
> + struct list_head *list, int migratetype, int cold)
> {
> int i;
>
> spin_lock(&zone->lock);
> for (i = 0; i < count; ++i) {
> - struct page *page = __rmqueue(zone, order, migratetype);
> + struct page *page = __rmqueue(zone, 0, migratetype);
> if (unlikely(page == NULL))
> break;
>
> /*
> * Split buddy pages returned by expand() are received here
> - * in physical page order. The page is added to the callers and
> + * in order zero. The page is added to the callers and
> * list and the list head then moves forward. From the callers
> * perspective, the linked list is ordered by page number in
> * some conditions. This is useful for IO devices that can
> @@ -962,7 +961,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
> set_page_private(page, migratetype);
> list = &page->lru;
> }
> - __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> spin_unlock(&zone->lock);
> return i;
> }
> @@ -1192,9 +1191,8 @@ again:
> list = &pcp->lists[migratetype];
> local_irq_save(flags);
> if (list_empty(list)) {
> - pcp->count += rmqueue_bulk(zone, 0,
> - pcp->batch, list,
> - migratetype, cold);
> + pcp->count += rmqueue_single(zone, pcp->batch, list,
> + migratetype, cold);
> if (unlikely(list_empty(list)))
> goto failed;
> }
> --
> 1.6.5.2
>
* Re: [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
2010-01-11 5:02 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Minchan Kim
@ 2010-01-12 2:54 ` KOSAKI Motohiro
2010-01-18 11:24 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 2:54 UTC (permalink / raw)
To: Huang Shijie; +Cc: kosaki.motohiro, akpm, mel, linux-mm
> The __mod_zone_page_state() only require irq disabling,
> it does not require the zone's spinlock. So move it out of
> the guard region of the spinlock to relieve the pressure for
> allocation.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
> mm/page_alloc.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 23df1ed..00aa83a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -961,8 +961,8 @@ static int rmqueue_single(struct zone *zone, unsigned long count,
> set_page_private(page, migratetype);
> list = &page->lru;
> }
> - __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> spin_unlock(&zone->lock);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> return i;
> }
>
> --
> 1.6.5.2
>
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:27 ` Wu Fengguang
@ 2010-01-12 2:56 ` KOSAKI Motohiro
2010-01-12 3:02 ` Huang Shijie
2010-01-12 4:05 ` Minchan Kim
1 sibling, 1 reply; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 2:56 UTC (permalink / raw)
To: Wu Fengguang
Cc: kosaki.motohiro, KAMEZAWA Hiroyuki, Minchan Kim, Huang Shijie,
akpm, mel, linux-mm, Rik van Riel
> On Tue, Jan 12, 2010 at 09:47:08AM +0900, KAMEZAWA Hiroyuki wrote:
> > > Thanks, Huang.
> > >
> > > Frankly speaking, I am not sure this ir right way.
> > > This patch is adding to fine-grained locking overhead
> > >
> > > As you know, this functions are one of hot pathes.
> > > In addition, we didn't see the any problem, until now.
> > > It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
> > > and pages_scanned are all right?
> > >
> > > If it is, we can move them out of zone->lock, too.
> > > If it isn't, we need one more lock, then.
> > >
> > I don't want to see additional spin_lock, here.
> >
> > About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
> > If you have concerns with other flags, please modify this with single word,
> > instead of a bit field.
>
> I'd second it. It's not a big problem to reset ZONE_ALL_UNRECLAIMABLE
> and pages_scanned outside of zone->lru_lock.
>
> Clear of ZONE_ALL_UNRECLAIMABLE is already atomic; if we lose one
> pages_scanned=0 due to races, there are plenty of page free events
> ahead to reset it, before pages_scanned hit the huge
> zone_reclaimable_pages() * 6.
Yes, this patch should be rejected.
* Re: [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
2010-01-11 5:04 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Minchan Kim
@ 2010-01-12 2:56 ` KOSAKI Motohiro
2010-01-18 11:25 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 2:56 UTC (permalink / raw)
To: Huang Shijie; +Cc: kosaki.motohiro, akpm, mel, linux-mm
> Modify the return type for __free_one_page.
> It will return 1 on success, and return 0 when
> the check of the compound page is failed.
This patch should be merged into [4/4], but I really dislike 4/4...
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 00aa83a..290dfc3 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -445,17 +445,18 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
> * triggers coalescing into a block of larger size.
> *
> * -- wli
> + *
> + * Returns 1 on success, else return 0;
> */
>
> -static inline void __free_one_page(struct page *page,
> - struct zone *zone, unsigned int order,
> - int migratetype)
> +static inline int __free_one_page(struct page *page, struct zone *zone,
> + unsigned int order, int migratetype)
> {
> unsigned long page_idx;
>
> if (unlikely(PageCompound(page)))
> if (unlikely(destroy_compound_page(page, order)))
> - return;
> + return 0;
>
> VM_BUG_ON(migratetype == -1);
>
> @@ -485,6 +486,7 @@ static inline void __free_one_page(struct page *page,
> list_add(&page->lru,
> &zone->free_area[order].free_list[migratetype]);
> zone->free_area[order].nr_free++;
> + return 1;
> }
>
> /*
> --
> 1.6.5.2
>
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:56 ` KOSAKI Motohiro
@ 2010-01-12 3:02 ` Huang Shijie
0 siblings, 0 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-12 3:02 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: Wu Fengguang, KAMEZAWA Hiroyuki, Minchan Kim, akpm, mel, linux-mm,
Rik van Riel
>>> I don't want to see additional spin_lock, here.
>>>
>>> About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
>>> If you have concerns with other flags, please modify this with single word,
>>> instead of a bit field.
>>>
>> I'd second it. It's not a big problem to reset ZONE_ALL_UNRECLAIMABLE
>> and pages_scanned outside of zone->lru_lock.
>>
>> Clear of ZONE_ALL_UNRECLAIMABLE is already atomic; if we lose one
>> pages_scanned=0 due to races, there are plenty of page free events
>> ahead to reset it, before pages_scanned hit the huge
>> zone_reclaimable_pages() * 6.
>>
> Yes, this patch should be rejected.
What about the new version?
http://marc.info/?l=linux-mm&m=126326472530210&w=2
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:51 ` Huang Shijie
@ 2010-01-12 3:03 ` Wu Fengguang
2010-01-12 3:05 ` KOSAKI Motohiro
1 sibling, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-12 3:03 UTC (permalink / raw)
To: Huang Shijie
Cc: akpm@linux-foundation.org, mel@csn.ul.ie,
kosaki.motohiro@jp.fujitsu.com, kamezawa.hiroyu@jp.fujitsu.com,
minchan.kim@gmail.com, linux-mm@kvack.org, riel@redhat.com
Hi Shijie,
> + int free_ok;
>
> - __mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
> - __free_one_page(page, zone, order, migratetype);
> + spin_lock(&zone->lock);
> + free_ok = __free_one_page(page, zone, order, migratetype);
> spin_unlock(&zone->lock);
> +
> + if (likely(free_ok)) {
> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> + zone->pages_scanned = 0;
> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok << order);
> + }
If we do
	__mod_zone_page_state(zone, NR_FREE_PAGES, -count);
in __free_one_page() on error, we can remove the likely(free_ok) test.
This sounds a bit hacky, though.
Thanks,
Fengguang
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:51 ` Huang Shijie
2010-01-12 3:03 ` Wu Fengguang
@ 2010-01-12 3:05 ` KOSAKI Motohiro
1 sibling, 0 replies; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 3:05 UTC (permalink / raw)
To: Huang Shijie
Cc: kosaki.motohiro, akpm, mel, kamezawa.hiroyu, minchan.kim,
linux-mm, riel, fengguang.wu
> Move the {__mod_zone_page_state, pages_scanned, clear zone's flags}
> out the guard region of the spinlock to relieve the pressure for memory free.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 27 +++++++++++++++++----------
> 1 files changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 290dfc3..1c7e32e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -530,12 +530,9 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> {
> int migratetype = 0;
> int batch_free = 0;
> + int free_ok = 0;
>
> spin_lock(&zone->lock);
> - zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> - zone->pages_scanned = 0;
> -
> - __mod_zone_page_state(zone, NR_FREE_PAGES, count);
> while (count) {
> struct page *page;
> struct list_head *list;
> @@ -558,23 +555,33 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> page = list_entry(list->prev, struct page, lru);
> /* must delete as __free_one_page list manipulates */
> list_del(&page->lru);
> - __free_one_page(page, zone, 0, migratetype);
> + free_ok += __free_one_page(page, zone, 0, migratetype);
> trace_mm_page_pcpu_drain(page, 0, migratetype);
> } while (--count && --batch_free && !list_empty(list));
> }
> spin_unlock(&zone->lock);
> +
> + if (likely(free_ok)) {
> + zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> + zone->pages_scanned = 0;
> + __mod_zone_page_state(zone, NR_FREE_PAGES, free_ok);
> + }
> }
No. Introducing a new branch carries a real risk of performance regression,
and I don't think this patch improves performance.
Why can't we remove this "if (free_ok)" statement?
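One branchless shape implied by that question might look as follows (a
hypothetical user-space sketch with simplified names, not the actual patch):
since free_ok counts the pages actually freed, adding it to the free-page
counter is already a no-op when nothing was freed, and the reset is an
idempotent plain store — at the cost of also clearing pages_scanned on the
rare failure path.

```c
#include <assert.h>

/* Hypothetical branchless version of the post-unlock bookkeeping:
 * free_ok == 0 makes the counter update a no-op, and the reset of
 * pages_scanned is an idempotent store either way. Simplified
 * user-space names, not kernel code. */
static long nr_free_pages;
static unsigned long pages_scanned = 123;

static void account_frees(long free_ok)
{
	nr_free_pages += free_ok;	/* no-op when free_ok == 0 */
	pages_scanned = 0;		/* idempotent, no branch needed */
}
```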
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 2:27 ` Wu Fengguang
2010-01-12 2:56 ` KOSAKI Motohiro
@ 2010-01-12 4:05 ` Minchan Kim
2010-01-12 4:21 ` Wu Fengguang
1 sibling, 1 reply; 41+ messages in thread
From: Minchan Kim @ 2010-01-12 4:05 UTC (permalink / raw)
To: Wu Fengguang
Cc: KAMEZAWA Hiroyuki, Huang Shijie, akpm, mel, kosaki.motohiro,
linux-mm, Rik van Riel
On Tue, Jan 12, 2010 at 11:27 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Tue, Jan 12, 2010 at 09:47:08AM +0900, KAMEZAWA Hiroyuki wrote:
>> > Thanks, Huang.
>> >
>> > Frankly speaking, I am not sure this ir right way.
>> > This patch is adding to fine-grained locking overhead
>> >
>> > As you know, this functions are one of hot pathes.
>> > In addition, we didn't see the any problem, until now.
>> > It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
>> > and pages_scanned are all right?
>> >
>> > If it is, we can move them out of zone->lock, too.
>> > If it isn't, we need one more lock, then.
>> >
>> I don't want to see additional spin_lock, here.
>>
>> About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
>> If you have concerns with other flags, please modify this with single word,
>> instead of a bit field.
>
> I'd second it. It's not a big problem to reset ZONE_ALL_UNRECLAIMABLE
> and pages_scanned outside of zone->lru_lock.
>
> Clear of ZONE_ALL_UNRECLAIMABLE is already atomic; if we lose one
"I'd second it"? What does that mean? I can't understand your point since
I am not a native English speaker.
BTW, hmm, it's not atomic, as Kame pointed out.
Now, zone->flags holds several bits:
* ZONE_ALL_UNRECLAIMABLE
* ZONE_RECLAIM_LOCKED
* ZONE_OOM_LOCKED
I think these flags are likely to race when memory pressure is high.
If we don't prevent the race, concurrent reclaim and OOM killing could happen.
So I think resetting zone->flags outside of zone->lock would invalidate our
efforts to prevent concurrent reclaim and killing.
> Thanks,
> Fengguang
>
--
Kind regards,
Minchan Kim
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 4:05 ` Minchan Kim
@ 2010-01-12 4:21 ` Wu Fengguang
2010-01-12 4:32 ` KAMEZAWA Hiroyuki
2010-01-12 4:48 ` Minchan Kim
0 siblings, 2 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-12 4:21 UTC (permalink / raw)
To: Minchan Kim
Cc: KAMEZAWA Hiroyuki, Huang Shijie, akpm@linux-foundation.org,
mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
Rik van Riel
On Tue, Jan 12, 2010 at 12:05:32PM +0800, Minchan Kim wrote:
> On Tue, Jan 12, 2010 at 11:27 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > On Tue, Jan 12, 2010 at 09:47:08AM +0900, KAMEZAWA Hiroyuki wrote:
> >> > Thanks, Huang.
> >> >
> >> > Frankly speaking, I am not sure this ir right way.
> >> > This patch is adding to fine-grained locking overhead
> >> >
> >> > As you know, this functions are one of hot pathes.
> >> > In addition, we didn't see the any problem, until now.
> >> > It means out of synchronization in ZONE_ALL_UNRECLAIMABLE
> >> > and pages_scanned are all right?
> >> >
> >> > If it is, we can move them out of zone->lock, too.
> >> > If it isn't, we need one more lock, then.
> >> >
> >> I don't want to see additional spin_lock, here.
> >>
> >> About ZONE_ALL_UNRECLAIMABLE, it's not necessary to be handled in atomic way.
> >> If you have concerns with other flags, please modify this with single word,
> >> instead of a bit field.
> >
> > I'd second it. It's not a big problem to reset ZONE_ALL_UNRECLAIMABLE
> > and pages_scanned outside of zone->lru_lock.
> >
> > Clear of ZONE_ALL_UNRECLAIMABLE is already atomic; if we lose one
>
> I'd second it? What's meaning? I can't understand your point since I am not
> english native.
=> "I agree with Kame." :)
> BTW,
> Hmm. It's not atomic as Kame pointed out.
>
> Now, zone->flags have several bit.
> * ZONE_ALL_UNRECLAIMALBE
> * ZONE_RECLAIM_LOCKED
> * ZONE_OOM_LOCKED.
>
> I think this flags are likely to race when the memory pressure is high.
> If we don't prevent race, concurrent reclaim and killing could be happened.
> So I think reset zone->flags outside of zone->lock would make our efforts which
> prevent current reclaim and killing invalidate.
zone_set_flag()/zone_clear_flag() call set_bit()/clear_bit(), which are
atomic. Do you mean a higher-level exclusion?
Thanks,
Fengguang
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 4:21 ` Wu Fengguang
@ 2010-01-12 4:32 ` KAMEZAWA Hiroyuki
2010-01-12 4:59 ` Wu Fengguang
2010-01-12 5:10 ` KOSAKI Motohiro
2010-01-12 4:48 ` Minchan Kim
1 sibling, 2 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-01-12 4:32 UTC (permalink / raw)
To: Wu Fengguang
Cc: Minchan Kim, Huang Shijie, akpm@linux-foundation.org,
mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
Rik van Riel
On Tue, 12 Jan 2010 12:21:16 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> > BTW,
> > Hmm. It's not atomic as Kame pointed out.
> >
> > Now, zone->flags have several bit.
> > * ZONE_ALL_UNRECLAIMALBE
> > * ZONE_RECLAIM_LOCKED
> > * ZONE_OOM_LOCKED.
> >
> > I think this flags are likely to race when the memory pressure is high.
> > If we don't prevent race, concurrent reclaim and killing could be happened.
> > So I think reset zone->flags outside of zone->lock would make our efforts which
> > prevent current reclaim and killing invalidate.
>
> zone_set_flag()/zone_clear_flag() calls set_bit()/clear_bit() which is
> atomic. Do you mean more high level exclusion?
>
Ah, sorry, I missed that.
In my memory, this wasn't atomic ;) ...maybe a recent change.
I don't want to see atomic ops here... So, how about making this back into an
independent zone->all_unreclaimable word?
Clearing it doesn't need to be atomic, because it is cleared at every
page free.
Thanks,
-Kame
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 4:21 ` Wu Fengguang
2010-01-12 4:32 ` KAMEZAWA Hiroyuki
@ 2010-01-12 4:48 ` Minchan Kim
1 sibling, 0 replies; 41+ messages in thread
From: Minchan Kim @ 2010-01-12 4:48 UTC (permalink / raw)
To: Wu Fengguang
Cc: KAMEZAWA Hiroyuki, Huang Shijie, akpm@linux-foundation.org,
mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
Rik van Riel
On Tue, Jan 12, 2010 at 1:21 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
>> Hmm. It's not atomic as Kame pointed out.
>>
>> Now, zone->flags have several bit.
>> * ZONE_ALL_UNRECLAIMALBE
>> * ZONE_RECLAIM_LOCKED
>> * ZONE_OOM_LOCKED.
>>
>> I think this flags are likely to race when the memory pressure is high.
>> If we don't prevent race, concurrent reclaim and killing could be happened.
>> So I think reset zone->flags outside of zone->lock would make our efforts which
>> prevent current reclaim and killing invalidate.
>
> zone_set_flag()/zone_clear_flag() calls set_bit()/clear_bit() which is
> atomic. Do you mean more high level exclusion?
No. I was wrong. I thought it wasn't an atomic operation.
I confused it with __set_bit(). :)
Sorry for the noise.
Thanks, Wu. :)
--
Kind regards,
Minchan Kim
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 4:32 ` KAMEZAWA Hiroyuki
@ 2010-01-12 4:59 ` Wu Fengguang
2010-01-12 5:09 ` Wu Fengguang
2010-01-12 5:10 ` KOSAKI Motohiro
1 sibling, 1 reply; 41+ messages in thread
From: Wu Fengguang @ 2010-01-12 4:59 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Minchan Kim, Huang Shijie, akpm@linux-foundation.org,
mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
Rik van Riel, David Rientjes, Christoph Lameter, Andrea Arcangeli
On Tue, Jan 12, 2010 at 12:32:23PM +0800, KAMEZAWA Hiroyuki wrote:
> On Tue, 12 Jan 2010 12:21:16 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > BTW,
> > > Hmm. It's not atomic as Kame pointed out.
> > >
> > > Now, zone->flags have several bit.
> > > * ZONE_ALL_UNRECLAIMALBE
> > > * ZONE_RECLAIM_LOCKED
> > > * ZONE_OOM_LOCKED.
> > >
> > > I think this flags are likely to race when the memory pressure is high.
> > > If we don't prevent race, concurrent reclaim and killing could be happened.
> > > So I think reset zone->flags outside of zone->lock would make our efforts which
> > > prevent current reclaim and killing invalidate.
> >
> > zone_set_flag()/zone_clear_flag() calls set_bit()/clear_bit() which is
> > atomic. Do you mean more high level exclusion?
> >
> Ah, sorry, I missed that.
> In my memory, this wasn't atomic ;) ...maybe recent change.
>
> I don't want to see atomic_ops here...So, how about making this back to be
> zone->all_unreclaimable word ?
>
> Clearing this is not necessary to be atomic because this is cleard at every
> page freeing.
Yes, the cost is a bit high. This was introduced by David Rientjes
in commit e815af95f; let's hear his opinion :)
Thanks,
Fengguang
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 4:59 ` Wu Fengguang
@ 2010-01-12 5:09 ` Wu Fengguang
0 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-12 5:09 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Minchan Kim, Huang Shijie, akpm@linux-foundation.org,
mel@csn.ul.ie, kosaki.motohiro@jp.fujitsu.com, linux-mm@kvack.org,
Rik van Riel, David Rientjes, Christoph Lameter, Andrea Arcangeli
[corrected CC to Christoph and Andrea]
On Tue, Jan 12, 2010 at 12:32:23PM +0800, KAMEZAWA Hiroyuki wrote:
> On Tue, 12 Jan 2010 12:21:16 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > BTW,
> > > Hmm. It's not atomic as Kame pointed out.
> > >
> > > Now, zone->flags have several bit.
> > > * ZONE_ALL_UNRECLAIMALBE
> > > * ZONE_RECLAIM_LOCKED
> > > * ZONE_OOM_LOCKED.
> > >
> > > I think this flags are likely to race when the memory pressure is high.
> > > If we don't prevent race, concurrent reclaim and killing could be happened.
> > > So I think reset zone->flags outside of zone->lock would make our efforts which
> > > prevent current reclaim and killing invalidate.
> >
> > zone_set_flag()/zone_clear_flag() calls set_bit()/clear_bit() which is
> > atomic. Do you mean more high level exclusion?
> >
> Ah, sorry, I missed that.
> In my memory, this wasn't atomic ;) ...maybe recent change.
>
> I don't want to see atomic_ops here...So, how about making this back to be
> zone->all_unreclaimable word ?
>
> Clearing this is not necessary to be atomic because this is cleard at every
> page freeing.
Yes, the cost is a bit high. This was introduced by David Rientjes
in commit e815af95f; let's hear his opinion :)
Thanks,
Fengguang
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 4:32 ` KAMEZAWA Hiroyuki
2010-01-12 4:59 ` Wu Fengguang
@ 2010-01-12 5:10 ` KOSAKI Motohiro
2010-01-12 7:36 ` David Rientjes
1 sibling, 1 reply; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 5:10 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: kosaki.motohiro, Wu Fengguang, Minchan Kim, Huang Shijie,
akpm@linux-foundation.org, mel@csn.ul.ie, linux-mm@kvack.org,
Rik van Riel
> On Tue, 12 Jan 2010 12:21:16 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > BTW,
> > > Hmm. It's not atomic as Kame pointed out.
> > >
> > > Now, zone->flags have several bit.
> > > * ZONE_ALL_UNRECLAIMALBE
> > > * ZONE_RECLAIM_LOCKED
> > > * ZONE_OOM_LOCKED.
> > >
> > > I think this flags are likely to race when the memory pressure is high.
> > > If we don't prevent race, concurrent reclaim and killing could be happened.
> > > So I think reset zone->flags outside of zone->lock would make our efforts which
> > > prevent current reclaim and killing invalidate.
> >
> > zone_set_flag()/zone_clear_flag() calls set_bit()/clear_bit() which is
> > atomic. Do you mean more high level exclusion?
> >
> Ah, sorry, I missed that.
> In my memory, this wasn't atomic ;) ...maybe recent change.
>
> I don't want to see atomic_ops here...So, how about making this back to be
> zone->all_unreclaimable word ?
>
> Clearing this is not necessary to be atomic because this is cleard at every
> page freeing.
I agree. How about this?
From 751f197ad256c7245151681d7aece591b1dab343 Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Date: Tue, 12 Jan 2010 13:53:47 +0900
Subject: [PATCH] mm: Restore zone->all_unreclaimable to independence word
commit e815af95 (change all_unreclaimable zone member to flags) changed
the all_unreclaimable member into a bit flag, but that has an undesirable
side effect:
free_one_page() is one of the hottest paths in the kernel, and adding
atomic ops to it can reduce kernel performance a bit.
Thus, this patch partially reverts that commit; at least
all_unreclaimable shouldn't share a memory word with the other zone flags.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
include/linux/mmzone.h | 7 +------
mm/page_alloc.c | 6 +++---
mm/vmscan.c | 20 ++++++++------------
mm/vmstat.c | 2 +-
4 files changed, 13 insertions(+), 22 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 30fe668..4f0c6f1 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -341,6 +341,7 @@ struct zone {
unsigned long pages_scanned; /* since last reclaim */
unsigned long flags; /* zone flags, see below */
+ int all_unreclaimable; /* All pages pinned */
/* Zone statistics */
atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
@@ -425,7 +426,6 @@ struct zone {
} ____cacheline_internodealigned_in_smp;
typedef enum {
- ZONE_ALL_UNRECLAIMABLE, /* all pages pinned */
ZONE_RECLAIM_LOCKED, /* prevents concurrent reclaim */
ZONE_OOM_LOCKED, /* zone is in OOM killer zonelist */
} zone_flags_t;
@@ -445,11 +445,6 @@ static inline void zone_clear_flag(struct zone *zone, zone_flags_t flag)
clear_bit(flag, &zone->flags);
}
-static inline int zone_is_all_unreclaimable(const struct zone *zone)
-{
- return test_bit(ZONE_ALL_UNRECLAIMABLE, &zone->flags);
-}
-
static inline int zone_is_reclaim_locked(const struct zone *zone)
{
return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4e9f5cc..19a5b0e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -530,7 +530,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
int batch_free = 0;
spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->all_unreclaimable = 0;
zone->pages_scanned = 0;
__mod_zone_page_state(zone, NR_FREE_PAGES, count);
@@ -567,7 +567,7 @@ static void free_one_page(struct zone *zone, struct page *page, int order,
int migratetype)
{
spin_lock(&zone->lock);
- zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+ zone->all_unreclaimable = 0;
zone->pages_scanned = 0;
__mod_zone_page_state(zone, NR_FREE_PAGES, 1 << order);
@@ -2270,7 +2270,7 @@ void show_free_areas(void)
K(zone_page_state(zone, NR_BOUNCE)),
K(zone_page_state(zone, NR_WRITEBACK_TEMP)),
zone->pages_scanned,
- (zone_is_all_unreclaimable(zone) ? "yes" : "no")
+ (zone->all_unreclaimable ? "yes" : "no")
);
printk("lowmem_reserve[]:");
for (i = 0; i < MAX_NR_ZONES; i++)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 885207a..8057d36 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1694,8 +1694,7 @@ static void shrink_zones(int priority, struct zonelist *zonelist,
continue;
note_zone_scanning_priority(zone, priority);
- if (zone_is_all_unreclaimable(zone) &&
- priority != DEF_PRIORITY)
+ if (zone->all_unreclaimable && priority != DEF_PRIORITY)
continue; /* Let kswapd poll it */
sc->all_unreclaimable = 0;
} else {
@@ -2009,8 +2008,7 @@ loop_again:
if (!populated_zone(zone))
continue;
- if (zone_is_all_unreclaimable(zone) &&
- priority != DEF_PRIORITY)
+ if (zone->all_unreclaimable && priority != DEF_PRIORITY)
continue;
/*
@@ -2053,8 +2051,7 @@ loop_again:
if (!populated_zone(zone))
continue;
- if (zone_is_all_unreclaimable(zone) &&
- priority != DEF_PRIORITY)
+ if (zone->all_unreclaimable && priority != DEF_PRIORITY)
continue;
if (!zone_watermark_ok(zone, order,
@@ -2084,12 +2081,11 @@ loop_again:
lru_pages);
sc.nr_reclaimed += reclaim_state->reclaimed_slab;
total_scanned += sc.nr_scanned;
- if (zone_is_all_unreclaimable(zone))
+ if (zone->all_unreclaimable)
continue;
- if (nr_slab == 0 && zone->pages_scanned >=
- (zone_reclaimable_pages(zone) * 6))
- zone_set_flag(zone,
- ZONE_ALL_UNRECLAIMABLE);
+ if (nr_slab == 0 &&
+ zone->pages_scanned >= (zone_reclaimable_pages(zone) * 6))
+ zone->all_unreclaimable = 1;
/*
* If we've done a decent amount of scanning and
* the reclaim ratio is low, start doing writepage
@@ -2612,7 +2608,7 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
zone_page_state(zone, NR_SLAB_RECLAIMABLE) <= zone->min_slab_pages)
return ZONE_RECLAIM_FULL;
- if (zone_is_all_unreclaimable(zone))
+ if (zone->all_unreclaimable)
return ZONE_RECLAIM_FULL;
/*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6051fba..8175c64 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -761,7 +761,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
"\n prev_priority: %i"
"\n start_pfn: %lu"
"\n inactive_ratio: %u",
- zone_is_all_unreclaimable(zone),
+ zone->all_unreclaimable,
zone->prev_priority,
zone->zone_start_pfn,
zone->inactive_ratio);
--
1.6.5.2
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 5:10 ` KOSAKI Motohiro
@ 2010-01-12 7:36 ` David Rientjes
2010-01-12 8:56 ` KOSAKI Motohiro
0 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-01-12 7:36 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: KAMEZAWA Hiroyuki, Wu Fengguang, Minchan Kim, Huang Shijie,
Andrew Morton, Mel Gorman, linux-mm, Rik van Riel
On Tue, 12 Jan 2010, KOSAKI Motohiro wrote:
> From 751f197ad256c7245151681d7aece591b1dab343 Mon Sep 17 00:00:00 2001
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Tue, 12 Jan 2010 13:53:47 +0900
> Subject: [PATCH] mm: Restore zone->all_unreclaimable to independence word
>
> commit e815af95 (change all_unreclaimable zone member to flags) chage
> all_unreclaimable member to bit flag. but It have undesireble side
> effect.
> free_one_page() is one of most hot path in linux kernel and increasing
> atomic ops in it can reduce kernel performance a bit.
>
Could you please elaborate on "a bit" in the changelog with some data? If
it's so egregious, it should easily be quantifiable.
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 7:36 ` David Rientjes
@ 2010-01-12 8:56 ` KOSAKI Motohiro
2010-01-12 21:39 ` David Rientjes
0 siblings, 1 reply; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-12 8:56 UTC (permalink / raw)
To: David Rientjes
Cc: kosaki.motohiro, KAMEZAWA Hiroyuki, Wu Fengguang, Minchan Kim,
Huang Shijie, Andrew Morton, Mel Gorman, linux-mm, Rik van Riel
> On Tue, 12 Jan 2010, KOSAKI Motohiro wrote:
>
> > From 751f197ad256c7245151681d7aece591b1dab343 Mon Sep 17 00:00:00 2001
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > Date: Tue, 12 Jan 2010 13:53:47 +0900
> > Subject: [PATCH] mm: Restore zone->all_unreclaimable to independence word
> >
> > commit e815af95 (change all_unreclaimable zone member to flags) chage
> > all_unreclaimable member to bit flag. but It have undesireble side
> > effect.
> > free_one_page() is one of most hot path in linux kernel and increasing
> > atomic ops in it can reduce kernel performance a bit.
> >
>
> Could you please elaborate on "a bit" in the changelog with some data? If
> it's so egregious, it should be easily be quantifiable.
Unfortunately I can't. Atomic ops are mainly an issue on large machines, and
I can't access such a machine now. But I'm sure we shouldn't take unnecessary
atomic ops.
That's the fundamental space vs. performance tradeoff. If we were talking about
struct page, or another structure instantiated in large numbers, space
efficiency would be very important. But struct zone isn't such a structure.
Or do you have a strong argument for using bitops when space efficiency
doesn't matter?
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 8:56 ` KOSAKI Motohiro
@ 2010-01-12 21:39 ` David Rientjes
2010-01-13 0:01 ` KOSAKI Motohiro
0 siblings, 1 reply; 41+ messages in thread
From: David Rientjes @ 2010-01-12 21:39 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: KAMEZAWA Hiroyuki, Wu Fengguang, Minchan Kim, Huang Shijie,
Andrew Morton, Mel Gorman, linux-mm, Rik van Riel
On Tue, 12 Jan 2010, KOSAKI Motohiro wrote:
> > > commit e815af95 (change all_unreclaimable zone member to flags) chage
> > > all_unreclaimable member to bit flag. but It have undesireble side
> > > effect.
> > > free_one_page() is one of most hot path in linux kernel and increasing
> > > atomic ops in it can reduce kernel performance a bit.
> > >
> >
> > Could you please elaborate on "a bit" in the changelog with some data? If
> > it's so egregious, it should be easily be quantifiable.
>
> Unfortunately I can't. atomic ops is mainly the issue of large machine. but
> I can't access such machine now. but I'm sure we shouldn't take unnecessary
> atomic ops.
>
e815af95 was intended to consolidate all bit flags into a single word
merely for space efficiency and cleanliness. At that time, we only had
one member of struct zone that could be converted, and that was
all_unreclaimable. That said, it was part of a larger patchset that
later added another zone flag meant to serialize the oom killer by
zonelist. So no consideration was given at the time concerning any
penalty incurred by moving all_unreclaimable to an atomic op.
> That's fundamental space vs performance tradeoff thing. if we talked about
> struct page or similar lots created struct, space efficient is very important.
> but struct zone isn't such one.
>
> Or, do you have strong argue to use bitops without space efficiency?
>
I'd suggest using a non-atomic variation within zone->flags that may still
be reordered so that it does not incur any performance penalty. In other
words, instead of readding zone->all_unreclaimable, we should add
__zone_set_flag(), __zone_test_and_set_flag(), and __zone_clear_flag()
variants to wrap non-atomic bitops.
* Re: [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free
2010-01-12 21:39 ` David Rientjes
@ 2010-01-13 0:01 ` KOSAKI Motohiro
0 siblings, 0 replies; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-01-13 0:01 UTC (permalink / raw)
To: David Rientjes
Cc: kosaki.motohiro, KAMEZAWA Hiroyuki, Wu Fengguang, Minchan Kim,
Huang Shijie, Andrew Morton, Mel Gorman, linux-mm, Rik van Riel
> On Tue, 12 Jan 2010, KOSAKI Motohiro wrote:
>
> > > > commit e815af95 (change all_unreclaimable zone member to flags) chage
> > > > all_unreclaimable member to bit flag. but It have undesireble side
> > > > effect.
> > > > free_one_page() is one of most hot path in linux kernel and increasing
> > > > atomic ops in it can reduce kernel performance a bit.
> > > >
> > >
> > > Could you please elaborate on "a bit" in the changelog with some data? If
> > > it's so egregious, it should be easily be quantifiable.
> >
> > Unfortunately I can't. atomic ops is mainly the issue of large machine. but
> > I can't access such machine now. but I'm sure we shouldn't take unnecessary
> > atomic ops.
> >
>
> e815af95 was intended to consolidate all bit flags into a single word
> merely for space efficiency and cleanliness. At that time, we only had
> one member of struct zone that could be converted, and that was
> all_unreclaimable. That said, it was part of a larger patchset that
> later added another zone flag meant to serialize the oom killer by
> zonelist. So no consideration was given at the time concerning any
> penalty incurred by moving all_unreclaimable to an atomic op.
I agree that ZONE_OOM_LOCKED has a lot of worth.
> > That's fundamental space vs performance tradeoff thing. if we talked about
> > struct page or similar lots created struct, space efficient is very important.
> > but struct zone isn't such one.
> >
> > Or, do you have strong argue to use bitops without space efficiency?
>
> I'd suggest using a non-atomic variation within zone->flags that may still
> be reordered so that it does not incur any performance penalty. In other
> words, instead of readding zone->all_unreclaimable, we should add
> __zone_set_flag(), __zone_test_and_set_flag(), and __zone_clear_flag()
> variants to wrap non-atomic bitops.
No. Non-atomic ops assume all flags in the word are protected by the same
lock, but all_unreclaimable and ZONE_OOM_LOCKED have no such common lock.
IOW, your suggestion might cause a ZONE_OOM_LOCKED update to be lost.
* Re: [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single
2010-01-11 4:37 [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Huang Shijie
` (2 preceding siblings ...)
2010-01-12 2:52 ` KOSAKI Motohiro
@ 2010-01-18 11:21 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: Mel Gorman @ 2010-01-18 11:21 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, kosaki.motohiro, linux-mm
On Mon, Jan 11, 2010 at 12:37:11PM +0800, Huang Shijie wrote:
> There is only one place calls rmqueue_bulk to allocate the single
> pages. So rename it to rmqueue_single, and remove an argument
> order.
>
Why do this? The name rmqueue_bulk means "remove a number of pages in bulk
with the lock held" i.e. count is the important parameter to this function,
not order. rmqueue_batch might make more sense as the count is
pcp->batch. If this patch is to be anything, it would just remove the
"order" parameter as being unnecessary but leave the naming alone. I
doubt the performance difference would be measurable though.
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 18 ++++++++----------
> 1 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4e9f5cc..23df1ed 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -930,25 +930,24 @@ retry_reserve:
> }
>
> /*
> - * Obtain a specified number of elements from the buddy allocator, all under
> + * Obtain a specified number of single page from the buddy allocator, all under
> * a single hold of the lock, for efficiency. Add them to the supplied list.
> * Returns the number of new pages which were placed at *list.
> */
> -static int rmqueue_bulk(struct zone *zone, unsigned int order,
> - unsigned long count, struct list_head *list,
> - int migratetype, int cold)
> +static int rmqueue_single(struct zone *zone, unsigned long count,
> + struct list_head *list, int migratetype, int cold)
> {
> int i;
>
> spin_lock(&zone->lock);
> for (i = 0; i < count; ++i) {
> - struct page *page = __rmqueue(zone, order, migratetype);
> + struct page *page = __rmqueue(zone, 0, migratetype);
> if (unlikely(page == NULL))
> break;
>
> /*
> * Split buddy pages returned by expand() are received here
> - * in physical page order. The page is added to the callers and
> + * in order zero. The page is added to the callers and
> * list and the list head then moves forward. From the callers
> * perspective, the linked list is ordered by page number in
> * some conditions. This is useful for IO devices that can
> @@ -962,7 +961,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
> set_page_private(page, migratetype);
> list = &page->lru;
> }
> - __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> spin_unlock(&zone->lock);
> return i;
> }
> @@ -1192,9 +1191,8 @@ again:
> list = &pcp->lists[migratetype];
> local_irq_save(flags);
> if (list_empty(list)) {
> - pcp->count += rmqueue_bulk(zone, 0,
> - pcp->batch, list,
> - migratetype, cold);
> + pcp->count += rmqueue_single(zone, pcp->batch, list,
> + migratetype, cold);
> if (unlikely(list_empty(list)))
> goto failed;
> }
> --
> 1.6.5.2
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
` (2 preceding siblings ...)
2010-01-12 2:54 ` KOSAKI Motohiro
@ 2010-01-18 11:24 ` Mel Gorman
3 siblings, 0 replies; 41+ messages in thread
From: Mel Gorman @ 2010-01-18 11:24 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, kosaki.motohiro, linux-mm
On Mon, Jan 11, 2010 at 12:37:12PM +0800, Huang Shijie wrote:
> The __mod_zone_page_state() only require irq disabling,
> it does not require the zone's spinlock. So move it out of
> the guard region of the spinlock to relieve the pressure for
> allocation.
>
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
Looks fine. Even if patch 1 is dropped and rmqueue_bulk remains, it
still makes sense.
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
> ---
> mm/page_alloc.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 23df1ed..00aa83a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -961,8 +961,8 @@ static int rmqueue_single(struct zone *zone, unsigned long count,
> set_page_private(page, migratetype);
> list = &page->lru;
> }
> - __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> spin_unlock(&zone->lock);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -i);
> return i;
> }
>
> --
> 1.6.5.2
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
` (2 preceding siblings ...)
2010-01-12 2:56 ` KOSAKI Motohiro
@ 2010-01-18 11:25 ` Mel Gorman
2010-01-19 1:49 ` Huang Shijie
3 siblings, 1 reply; 41+ messages in thread
From: Mel Gorman @ 2010-01-18 11:25 UTC (permalink / raw)
To: Huang Shijie; +Cc: akpm, kosaki.motohiro, linux-mm
On Mon, Jan 11, 2010 at 12:37:13PM +0800, Huang Shijie wrote:
> Modify the return type for __free_one_page.
> It will return 1 on success, and return 0 when
> the check of the compound page is failed.
>
Why?
I assume it's something to do with patch 4, but it's unclear at this
point why it's necessary. A brief explanation is needed in the
changelog.
> Signed-off-by: Huang Shijie <shijie8@gmail.com>
> ---
> mm/page_alloc.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 00aa83a..290dfc3 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -445,17 +445,18 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
> * triggers coalescing into a block of larger size.
> *
> * -- wli
> + *
> + * Returns 1 on success, else return 0;
> */
>
> -static inline void __free_one_page(struct page *page,
> - struct zone *zone, unsigned int order,
> - int migratetype)
> +static inline int __free_one_page(struct page *page, struct zone *zone,
> + unsigned int order, int migratetype)
> {
> unsigned long page_idx;
>
> if (unlikely(PageCompound(page)))
> if (unlikely(destroy_compound_page(page, order)))
> - return;
> + return 0;
>
> VM_BUG_ON(migratetype == -1);
>
> @@ -485,6 +486,7 @@ static inline void __free_one_page(struct page *page,
> list_add(&page->lru,
> &zone->free_area[order].free_list[migratetype]);
> zone->free_area[order].nr_free++;
> + return 1;
> }
>
> /*
> --
> 1.6.5.2
>
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page
2010-01-18 11:25 ` Mel Gorman
@ 2010-01-19 1:49 ` Huang Shijie
0 siblings, 0 replies; 41+ messages in thread
From: Huang Shijie @ 2010-01-19 1:49 UTC (permalink / raw)
To: Mel Gorman; +Cc: akpm, kosaki.motohiro, linux-mm
> On Mon, Jan 11, 2010 at 12:37:13PM +0800, Huang Shijie wrote:
>
>> Modify the return type for __free_one_page.
>> It will return 1 on success, and return 0 when
>> the check of the compound page is failed.
>>
>>
> Why?
>
> I assume it's something to do with patch 4, but it's unclear at this
> point why it's necessary. A brief explanation is needed in the
> changelog.
>
>
Just ignore this patch. Changing the return value of __free_one_page()
is not needed, even when the compound-page check fails.
Just as Wu Fengguang pointed out, a hackish method may be better.
>> Signed-off-by: Huang Shijie<shijie8@gmail.com>
>> ---
>> mm/page_alloc.c | 10 ++++++----
>> 1 files changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 00aa83a..290dfc3 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -445,17 +445,18 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
>> * triggers coalescing into a block of larger size.
>> *
>> * -- wli
>> + *
>> + * Returns 1 on success, else return 0;
>> */
>>
>> -static inline void __free_one_page(struct page *page,
>> - struct zone *zone, unsigned int order,
>> - int migratetype)
>> +static inline int __free_one_page(struct page *page, struct zone *zone,
>> + unsigned int order, int migratetype)
>> {
>> unsigned long page_idx;
>>
>> if (unlikely(PageCompound(page)))
>> if (unlikely(destroy_compound_page(page, order)))
>> - return;
>> + return 0;
>>
>> VM_BUG_ON(migratetype == -1);
>>
>> @@ -485,6 +486,7 @@ static inline void __free_one_page(struct page *page,
>> list_add(&page->lru,
>> &zone->free_area[order].free_list[migratetype]);
>> zone->free_area[order].nr_free++;
>> + return 1;
>> }
>>
>> /*
>> --
>> 1.6.5.2
>>
>>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
Thread overview: 41+ messages
-- links below jump to the message on this page --
2010-01-11 4:37 [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Huang Shijie
2010-01-11 4:37 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Huang Shijie
2010-01-11 4:37 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Huang Shijie
2010-01-11 4:37 ` [PATCH 4/4] mm/page_alloc : relieve zone->lock's pressure for memory free Huang Shijie
2010-01-11 5:20 ` Minchan Kim
2010-01-11 6:01 ` Huang Shijie
2010-01-11 6:27 ` Huang Shijie
2010-01-11 6:38 ` Minchan Kim
2010-01-11 6:59 ` Huang Shijie
2010-01-12 0:47 ` KAMEZAWA Hiroyuki
2010-01-12 2:02 ` Huang Shijie
2010-01-12 2:07 ` KAMEZAWA Hiroyuki
2010-01-12 2:32 ` Huang Shijie
2010-01-12 2:27 ` Wu Fengguang
2010-01-12 2:56 ` KOSAKI Motohiro
2010-01-12 3:02 ` Huang Shijie
2010-01-12 4:05 ` Minchan Kim
2010-01-12 4:21 ` Wu Fengguang
2010-01-12 4:32 ` KAMEZAWA Hiroyuki
2010-01-12 4:59 ` Wu Fengguang
2010-01-12 5:09 ` Wu Fengguang
2010-01-12 5:10 ` KOSAKI Motohiro
2010-01-12 7:36 ` David Rientjes
2010-01-12 8:56 ` KOSAKI Motohiro
2010-01-12 21:39 ` David Rientjes
2010-01-13 0:01 ` KOSAKI Motohiro
2010-01-12 4:48 ` Minchan Kim
2010-01-12 2:51 ` Huang Shijie
2010-01-12 3:03 ` Wu Fengguang
2010-01-12 3:05 ` KOSAKI Motohiro
2010-01-11 5:04 ` [PATCH 3/4] mm/page_alloc : modify the return type of __free_one_page Minchan Kim
2010-01-12 2:56 ` KOSAKI Motohiro
2010-01-18 11:25 ` Mel Gorman
2010-01-19 1:49 ` Huang Shijie
2010-01-11 5:02 ` [PATCH 2/4] mm/page_alloc : relieve the zone->lock's pressure for allocation Minchan Kim
2010-01-11 5:13 ` Huang Shijie
2010-01-12 2:54 ` KOSAKI Motohiro
2010-01-18 11:24 ` Mel Gorman
2010-01-11 5:00 ` [PATCH 1/4] mm/page_alloc : rename rmqueue_bulk to rmqueue_single Minchan Kim
2010-01-12 2:52 ` KOSAKI Motohiro
2010-01-18 11:21 ` Mel Gorman