linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin
@ 2012-06-14  1:12 Minchan Kim
  2012-06-14  1:12 ` [PATCH v2 2/2] mm: clean up __count_immobile_pages Minchan Kim
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Minchan Kim @ 2012-06-14  1:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim, Andrea Arcangeli, Mel Gorman,
	Michal Hocko, KAMEZAWA Hiroyuki, Wanpeng Li

d179e84ba fixed the problem[1] in vmscan.c but same problem is here.
Let's fix it.

[1] http://comments.gmane.org/gmane.linux.kernel.mm/65844

I copy and paste d179e84ba's contents for description.

"It is unsafe to run page_count during the physical pfn scan because
compound_head could trip on a dangling pointer when reading
page->first_page if the compound page is being freed by another CPU."

* changelog from v1
  - Add comment about skip tail page of THP - Andrea
  - fix typo - Wanpeng Li
  - based on next-20120613

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wanpeng Li <liwp.linux@gmail.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 266f267..543cc2d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5496,11 +5496,18 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
 			continue;
 
 		page = pfn_to_page(check);
-		if (!page_count(page)) {
+		/*
+		 * We can't use page_count without pin a page
+		 * because another CPU can free compound page.
+		 * This check already skips compound tails of THP
+		 * because their page->_count is zero at all time.
+		 */
+		if (!atomic_read(&page->_count)) {
 			if (PageBuddy(page))
 				iter += (1 << page_order(page)) - 1;
 			continue;
 		}
+
 		if (!PageLRU(page))
 			found++;
 		/*
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] mm: clean up __count_immobile_pages
  2012-06-14  1:12 [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
@ 2012-06-14  1:12 ` Minchan Kim
  2012-06-14  1:15   ` Minchan Kim
  2012-06-14  1:14 ` [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
  2012-06-14  2:20 ` Kamezawa Hiroyuki
  2 siblings, 1 reply; 5+ messages in thread
From: Minchan Kim @ 2012-06-14  1:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Minchan Kim, Andrea Arcangeli, Mel Gorman,
	Michal Hocko

__count_immobile_pages naming is rather awkward.
This patch changes function name more clear and add comment.

* changelog from v1
  - write down page flag race in function comment
  - commit change log change

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c |   34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 543cc2d..dc7f8c5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5467,26 +5467,28 @@ void set_pageblock_flags_group(struct page *page, unsigned long flags,
 }
 
 /*
- * This is designed as sub function...plz see page_isolation.c also.
- * set/clear page block's type to be ISOLATE.
- * page allocater never alloc memory from ISOLATE block.
+ * This function checks whether pageblock includes unmovable pages or not.
+ * If @count is not zero, it is okay to include less @count unmovable pages
+ *
+ * PageLRU check wihtout isolation or lru_lock could race so that
+ * MIGRATE_MOVABLE block might include unmovable pages. It means you can't
+ * expect this function should be exact.
  */
-
-static int
-__count_immobile_pages(struct zone *zone, struct page *page, int count)
+static bool
+__has_unmovable_pages(struct zone *zone, struct page *page, int count)
 {
 	unsigned long pfn, iter, found;
 	int mt;
 
 	/*
 	 * For avoiding noise data, lru_add_drain_all() should be called
-	 * If ZONE_MOVABLE, the zone never contains immobile pages
+	 * If ZONE_MOVABLE, the zone never contains unmovable pages
 	 */
 	if (zone_idx(zone) == ZONE_MOVABLE)
-		return true;
+		return false;
 	mt = get_pageblock_migratetype(page);
 	if (mt == MIGRATE_MOVABLE || is_migrate_cma(mt))
-		return true;
+		return false;
 
 	pfn = page_to_pfn(page);
 	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
@@ -5524,9 +5526,9 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
 		 * page at boot.
 		 */
 		if (found > count)
-			return false;
+			return true;
 	}
-	return true;
+	return false;
 }
 
 bool is_pageblock_removable_nolock(struct page *page)
@@ -5550,7 +5552,7 @@ bool is_pageblock_removable_nolock(struct page *page)
 			zone->zone_start_pfn + zone->spanned_pages <= pfn)
 		return false;
 
-	return __count_immobile_pages(zone, page, 0);
+	return !__has_unmovable_pages(zone, page, 0);
 }
 
 int set_migratetype_isolate(struct page *page)
@@ -5589,12 +5591,12 @@ int set_migratetype_isolate(struct page *page)
 	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
 	 * We just check MOVABLE pages.
 	 */
-	if (__count_immobile_pages(zone, page, arg.pages_found))
+	if (!__has_unmovable_pages(zone, page, arg.pages_found))
 		ret = 0;
-
 	/*
-	 * immobile means "not-on-lru" paes. If immobile is larger than
-	 * removable-by-driver pages reported by notifier, we'll fail.
+	 * Unmovable means "not-on-lru" pages. If Unmovable pages are
+	 * larger than removable-by-driver pages reported by notifier,
+	 * we'll fail.
 	 */
 
 out:
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin
  2012-06-14  1:12 [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
  2012-06-14  1:12 ` [PATCH v2 2/2] mm: clean up __count_immobile_pages Minchan Kim
@ 2012-06-14  1:14 ` Minchan Kim
  2012-06-14  2:20 ` Kamezawa Hiroyuki
  2 siblings, 0 replies; 5+ messages in thread
From: Minchan Kim @ 2012-06-14  1:14 UTC (permalink / raw)
  Cc: Andrew Morton, linux-kernel, linux-mm, Andrea Arcangeli,
	Mel Gorman, Michal Hocko, KAMEZAWA Hiroyuki, Wanpeng Li,
	Bartlomiej Zolnierkiewicz

Missing Bartlomiej, Sorry!

On 06/14/2012 10:12 AM, Minchan Kim wrote:

> d179e84ba fixed the problem[1] in vmscan.c but same problem is here.
> Let's fix it.
> 
> [1] http://comments.gmane.org/gmane.linux.kernel.mm/65844
> 
> I copy and paste d179e84ba's contents for description.
> 
> "It is unsafe to run page_count during the physical pfn scan because
> compound_head could trip on a dangling pointer when reading
> page->first_page if the compound page is being freed by another CPU."
> 
> * changelog from v1
>   - Add comment about skip tail page of THP - Andrea
>   - fix typo - Wanpeng Li
>   - based on next-20120613
> 
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Wanpeng Li <liwp.linux@gmail.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_alloc.c |    9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 266f267..543cc2d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5496,11 +5496,18 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
>  			continue;
>  
>  		page = pfn_to_page(check);
> -		if (!page_count(page)) {
> +		/*
> +		 * We can't use page_count without pin a page
> +		 * because another CPU can free compound page.
> +		 * This check already skips compound tails of THP
> +		 * because their page->_count is zero at all time.
> +		 */
> +		if (!atomic_read(&page->_count)) {
>  			if (PageBuddy(page))
>  				iter += (1 << page_order(page)) - 1;
>  			continue;
>  		}
> +
>  		if (!PageLRU(page))
>  			found++;
>  		/*



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 2/2] mm: clean up __count_immobile_pages
  2012-06-14  1:12 ` [PATCH v2 2/2] mm: clean up __count_immobile_pages Minchan Kim
@ 2012-06-14  1:15   ` Minchan Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Minchan Kim @ 2012-06-14  1:15 UTC (permalink / raw)
  Cc: Andrew Morton, linux-kernel, linux-mm, Andrea Arcangeli,
	Mel Gorman, Michal Hocko, Bartlomiej Zolnierkiewicz

Missing Bartlomiej, Sorry!

On 06/14/2012 10:12 AM, Minchan Kim wrote:

> __count_immobile_pages naming is rather awkward.
> This patch changes function name more clear and add comment.
> 
> * changelog from v1
>   - write down page flag race in function comment
>   - commit change log change
> 
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michal Hocko <mhocko@suse.cz>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_alloc.c |   34 ++++++++++++++++++----------------
>  1 file changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 543cc2d..dc7f8c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5467,26 +5467,28 @@ void set_pageblock_flags_group(struct page *page, unsigned long flags,
>  }
>  
>  /*
> - * This is designed as sub function...plz see page_isolation.c also.
> - * set/clear page block's type to be ISOLATE.
> - * page allocater never alloc memory from ISOLATE block.
> + * This function checks whether pageblock includes unmovable pages or not.
> + * If @count is not zero, it is okay to include less @count unmovable pages
> + *
> + * PageLRU check wihtout isolation or lru_lock could race so that
> + * MIGRATE_MOVABLE block might include unmovable pages. It means you can't
> + * expect this function should be exact.
>   */
> -
> -static int
> -__count_immobile_pages(struct zone *zone, struct page *page, int count)
> +static bool
> +__has_unmovable_pages(struct zone *zone, struct page *page, int count)
>  {
>  	unsigned long pfn, iter, found;
>  	int mt;
>  
>  	/*
>  	 * For avoiding noise data, lru_add_drain_all() should be called
> -	 * If ZONE_MOVABLE, the zone never contains immobile pages
> +	 * If ZONE_MOVABLE, the zone never contains unmovable pages
>  	 */
>  	if (zone_idx(zone) == ZONE_MOVABLE)
> -		return true;
> +		return false;
>  	mt = get_pageblock_migratetype(page);
>  	if (mt == MIGRATE_MOVABLE || is_migrate_cma(mt))
> -		return true;
> +		return false;
>  
>  	pfn = page_to_pfn(page);
>  	for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> @@ -5524,9 +5526,9 @@ __count_immobile_pages(struct zone *zone, struct page *page, int count)
>  		 * page at boot.
>  		 */
>  		if (found > count)
> -			return false;
> +			return true;
>  	}
> -	return true;
> +	return false;
>  }
>  
>  bool is_pageblock_removable_nolock(struct page *page)
> @@ -5550,7 +5552,7 @@ bool is_pageblock_removable_nolock(struct page *page)
>  			zone->zone_start_pfn + zone->spanned_pages <= pfn)
>  		return false;
>  
> -	return __count_immobile_pages(zone, page, 0);
> +	return !__has_unmovable_pages(zone, page, 0);
>  }
>  
>  int set_migratetype_isolate(struct page *page)
> @@ -5589,12 +5591,12 @@ int set_migratetype_isolate(struct page *page)
>  	 * FIXME: Now, memory hotplug doesn't call shrink_slab() by itself.
>  	 * We just check MOVABLE pages.
>  	 */
> -	if (__count_immobile_pages(zone, page, arg.pages_found))
> +	if (!__has_unmovable_pages(zone, page, arg.pages_found))
>  		ret = 0;
> -
>  	/*
> -	 * immobile means "not-on-lru" paes. If immobile is larger than
> -	 * removable-by-driver pages reported by notifier, we'll fail.
> +	 * Unmovable means "not-on-lru" pages. If Unmovable pages are
> +	 * larger than removable-by-driver pages reported by notifier,
> +	 * we'll fail.
>  	 */
>  
>  out:



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin
  2012-06-14  1:12 [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
  2012-06-14  1:12 ` [PATCH v2 2/2] mm: clean up __count_immobile_pages Minchan Kim
  2012-06-14  1:14 ` [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
@ 2012-06-14  2:20 ` Kamezawa Hiroyuki
  2 siblings, 0 replies; 5+ messages in thread
From: Kamezawa Hiroyuki @ 2012-06-14  2:20 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-kernel, linux-mm, Andrea Arcangeli,
	Mel Gorman, Michal Hocko, Wanpeng Li

(2012/06/14 10:12), Minchan Kim wrote:
> d179e84ba fixed the problem[1] in vmscan.c but same problem is here.
> Let's fix it.
> 
> [1] http://comments.gmane.org/gmane.linux.kernel.mm/65844
> 
> I copy and paste d179e84ba's contents for description.
> 
> "It is unsafe to run page_count during the physical pfn scan because
> compound_head could trip on a dangling pointer when reading
> page->first_page if the compound page is being freed by another CPU."
> 
> * changelog from v1
>    - Add comment about skip tail page of THP - Andrea
>    - fix typo - Wanpeng Li
>    - based on next-20120613
> 
> Cc: Andrea Arcangeli<aarcange@redhat.com>
> Cc: Mel Gorman<mgorman@suse.de>
> Cc: Michal Hocko<mhocko@suse.cz>
> Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Wanpeng Li<liwp.linux@gmail.com>
> Signed-off-by: Minchan Kim<minchan@kernel.org>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-06-14  2:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-14  1:12 [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
2012-06-14  1:12 ` [PATCH v2 2/2] mm: clean up __count_immobile_pages Minchan Kim
2012-06-14  1:15   ` Minchan Kim
2012-06-14  1:14 ` [PATCH v2 1/2][BUGFIX] mm: do not use page_count without a page pin Minchan Kim
2012-06-14  2:20 ` Kamezawa Hiroyuki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).