All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix race in set_max_huge_pages for multiple updaters of nr_huge_pages
@ 2005-11-18 17:51 Eric Paris
  2005-11-19  3:09 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Paris @ 2005-11-18 17:51 UTC (permalink / raw)
  To: linux-kernel, akpm, wli

If there are multiple updaters to /proc/sys/vm/nr_hugepages
simultaneously it is possible for the nr_huge_pages variable to become
incorrect.  There is no locking in the set_max_huge_pages function
around alloc_fresh_huge_page which is able to update nr_huge_pages.  Two
callers to alloc_fresh_huge_page could race against each other as could
a call to alloc_fresh_huge_page and a call to update_and_free_page.
This patch just expands the area covered by the hugetlb_lock to cover
the call into alloc_fresh_huge_page.  I'm not sure how we could say that
a sysctl section is performance critical where more specific locking
would be needed.

My reproducer was to run a couple copies of the following script
simultaneously

while [ true ]; do
	echo 1000 > /proc/sys/vm/nr_hugepages
	echo 500 > /proc/sys/vm/nr_hugepages
	echo 750 > /proc/sys/vm/nr_hugepages
	echo 100 > /proc/sys/vm/nr_hugepages
	echo 0 > /proc/sys/vm/nr_hugepages
done

and then watch /proc/meminfo and eventually you will see things like

HugePages_Total:     100
HugePages_Free:      109

After applying the patch all seemed well.

Signed-off-by: Eric Paris <eparis@redhat.com>
----

 hugetlb.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

--- linux-2.6.14.2/mm/hugetlb.c.old
+++ linux-2.6.14.2/mm/hugetlb.c
@@ -22,6 +22,7 @@
 static struct list_head hugepage_freelists[MAX_NUMNODES];
 static unsigned int nr_huge_pages_node[MAX_NUMNODES];
 static unsigned int free_huge_pages_node[MAX_NUMNODES];
+/* This lock protects updates to hugepage_freelists, nr_huge_pages, and free_huge_pages */
 static DEFINE_SPINLOCK(hugetlb_lock);
 
 static void enqueue_huge_page(struct page *page)
@@ -172,10 +173,13 @@
 static unsigned long set_max_huge_pages(unsigned long count)
 {
 	while (count > nr_huge_pages) {
-		struct page *page = alloc_fresh_huge_page();
-		if (!page)
-			return nr_huge_pages;
+		struct page *page;
 		spin_lock(&hugetlb_lock);
+		page = alloc_fresh_huge_page();
+		if (!page) {
+			spin_unlock(&hugetlb_lock);
+			return nr_huge_pages;
+		}
 		enqueue_huge_page(page);
 		spin_unlock(&hugetlb_lock);
 	}


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Fix race in set_max_huge_pages for multiple updaters of nr_huge_pages
  2005-11-18 17:51 [PATCH] Fix race in set_max_huge_pages for multiple updaters of nr_huge_pages Eric Paris
@ 2005-11-19  3:09 ` Andrew Morton
  2005-11-19  3:24   ` William Lee Irwin III
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2005-11-19  3:09 UTC (permalink / raw)
  To: Eric Paris; +Cc: linux-kernel, wli

Eric Paris <eparis@redhat.com> wrote:
>
> If there are multiple updaters to /proc/sys/vm/nr_hugepages
> simultaneously it is possible for the nr_huge_pages variable to become
> incorrect.  There is no locking in the set_max_huge_pages function
> around alloc_fresh_huge_page which is able to update nr_huge_pages.  Two
> callers to alloc_fresh_huge_page could race against each other as could
> a call to alloc_fresh_huge_page and a call to update_and_free_page.
> This patch just expands the area covered by the hugetlb_lock to cover
> the call into alloc_fresh_huge_page.  I'm not sure how we could say that
> a sysctl section is performance critical where more specific locking
> would be needed.
> 
> My reproducer was to run a couple copies of the following script
> simultaneously
> 
> while [ true ]; do
> 	echo 1000 > /proc/sys/vm/nr_hugepages
> 	echo 500 > /proc/sys/vm/nr_hugepages
> 	echo 750 > /proc/sys/vm/nr_hugepages
> 	echo 100 > /proc/sys/vm/nr_hugepages
> 	echo 0 > /proc/sys/vm/nr_hugepages
> done
> 
> and then watch /proc/meminfo and eventually you will see things like
> 
> HugePages_Total:     100
> HugePages_Free:      109
> 
> After applying the patch all seemed well.
> 
> Signed-off-by: Eric Paris <eparis@redhat.com>
> ----
> 
>  hugetlb.c |   10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> 
> --- linux-2.6.14.2/mm/hugetlb.c.old
> +++ linux-2.6.14.2/mm/hugetlb.c
> @@ -22,6 +22,7 @@
>  static struct list_head hugepage_freelists[MAX_NUMNODES];
>  static unsigned int nr_huge_pages_node[MAX_NUMNODES];
>  static unsigned int free_huge_pages_node[MAX_NUMNODES];
> +/* This lock protects updates to hugepage_freelists, nr_huge_pages, and free_huge_pages */
>  static DEFINE_SPINLOCK(hugetlb_lock);
>  
>  static void enqueue_huge_page(struct page *page)
> @@ -172,10 +173,13 @@
>  static unsigned long set_max_huge_pages(unsigned long count)
>  {
>  	while (count > nr_huge_pages) {
> -		struct page *page = alloc_fresh_huge_page();
> -		if (!page)
> -			return nr_huge_pages;
> +		struct page *page;
>  		spin_lock(&hugetlb_lock);
> +		page = alloc_fresh_huge_page();
> +		if (!page) {
> +			spin_unlock(&hugetlb_lock);
> +			return nr_huge_pages;
> +		}
>  		enqueue_huge_page(page);
>  		spin_unlock(&hugetlb_lock);
>  	}

Nope, alloc_fresh_huge_page() does a GFP_HIGHUSER allocation, which can
sleep and may not be called inside spinlock.  You would have seen a spew of
might_sleep() warnings if this was tested with the appropriate kernel
debugging options.


How about this?

--- devel/mm/hugetlb.c~hugetlb-fix-race-in-set_max_huge_pages-for-multiple-updaters-of-nr_huge_pages	2005-11-18 19:04:10.000000000 -0800
+++ devel-akpm/mm/hugetlb.c	2005-11-18 19:07:26.000000000 -0800
@@ -22,6 +22,10 @@ unsigned long max_huge_pages;
 static struct list_head hugepage_freelists[MAX_NUMNODES];
 static unsigned int nr_huge_pages_node[MAX_NUMNODES];
 static unsigned int free_huge_pages_node[MAX_NUMNODES];
+
+/*
+ * Protects updates to hugepage_freelists, nr_huge_pages, and free_huge_pages
+ */
 static DEFINE_SPINLOCK(hugetlb_lock);
 
 static void enqueue_huge_page(struct page *page)
@@ -61,8 +65,10 @@ static struct page *alloc_fresh_huge_pag
 					HUGETLB_PAGE_ORDER);
 	nid = (nid + 1) % num_online_nodes();
 	if (page) {
+		spin_lock(&hugetlb_lock);
 		nr_huge_pages++;
 		nr_huge_pages_node[page_to_nid(page)]++;
+		spin_unlock(&hugetlb_lock);
 	}
 	return page;
 }
_


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Fix race in set_max_huge_pages for multiple updaters of nr_huge_pages
  2005-11-19  3:09 ` Andrew Morton
@ 2005-11-19  3:24   ` William Lee Irwin III
  0 siblings, 0 replies; 3+ messages in thread
From: William Lee Irwin III @ 2005-11-19  3:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Eric Paris, linux-kernel

On Fri, Nov 18, 2005 at 07:09:50PM -0800, Andrew Morton wrote:
> Nope, alloc_fresh_huge_page() does a GFP_HIGHUSER allocation, which can
> sleep and may not be called inside spinlock.  You would have seen a spew of
> might_sleep() warnings if this was tested with the appropriate kernel
> debugging options.
> How about this?

Looks good. My reply got stuck behind a queue of other things to do
since it needed a correction.

Acked-by: William Irwin <wli@holomorphy.com>


-- wli

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2005-11-19  3:25 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-11-18 17:51 [PATCH] Fix race in set_max_huge_pages for multiple updaters of nr_huge_pages Eric Paris
2005-11-19  3:09 ` Andrew Morton
2005-11-19  3:24   ` William Lee Irwin III

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.