[PATCH] spinlock in function hugetlb

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] spinlock in function hugetlb_fault could be deleted
@ 2007-07-23  8:18 Zhang, Yanmin
  2007-07-23  9:59 ` Zhang, Yanmin
  2007-07-23 14:27 ` Adam Litke
  0 siblings, 2 replies; 4+ messages in thread
From: Zhang, Yanmin @ 2007-07-23  8:18 UTC (permalink / raw)
  To: LKML; +Cc: agl

Function hugetlb_fault needn't hold spinlock mm->page_table_lock,
because when hugetlb_fault is called:
1) mm->mmap_sem is held already;
2) hugetlb_instantiation_mutex is held by hugetlb_fault, which prevents
other threads/processes from entering this critical area. It's impossible
for other threads/processes to change the page table now.

My patch against kenel 2.6.22 deletes the spinlock statements in the
related functions.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>

---

diff -Nraup linux-2.6.22/mm/hugetlb.c linux-2.6.22_hugetlb/mm/hugetlb.c
--- linux-2.6.22/mm/hugetlb.c	2007-07-09 07:32:17.000000000 +0800
+++ linux-2.6.22_hugetlb/mm/hugetlb.c	2007-07-23 15:51:49.000000000 +0800
@@ -434,12 +434,12 @@ void unmap_hugepage_range(struct vm_area
 }
 
 static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
-			unsigned long address, pte_t *ptep, pte_t pte)
+			unsigned long address, pte_t *ptep)
 {
 	struct page *old_page, *new_page;
 	int avoidcopy;
 
-	old_page = pte_page(pte);
+	old_page = pte_page(*ptep);
 
 	/* If no-one else is actually using this page, avoid the copy
 	 * and just make the page writable */
@@ -449,28 +449,16 @@ static int hugetlb_cow(struct mm_struct 
 		return VM_FAULT_MINOR;
 	}
 
-	page_cache_get(old_page);
 	new_page = alloc_huge_page(vma, address);
-
 	if (!new_page) {
 		page_cache_release(old_page);
 		return VM_FAULT_OOM;
 	}
 
-	spin_unlock(&mm->page_table_lock);
 	copy_huge_page(new_page, old_page, address, vma);
-	spin_lock(&mm->page_table_lock);
-
-	ptep = huge_pte_offset(mm, address & HPAGE_MASK);
-	if (likely(pte_same(*ptep, pte))) {
-		/* Break COW */
-		set_huge_pte_at(mm, address, ptep,
-				make_huge_pte(vma, new_page, 1));
-		/* Make the old page be freed below */
-		new_page = old_page;
-	}
-	page_cache_release(new_page);
+	set_huge_pte_at (mm, address, ptep, make_huge_pte (vma, new_page, 1));
 	page_cache_release(old_page);
+
 	return VM_FAULT_MINOR;
 }
 
@@ -523,7 +511,6 @@ retry:
 			lock_page(page);
 	}
 
-	spin_lock(&mm->page_table_lock);
 	size = i_size_read(mapping->host) >> HPAGE_SHIFT;
 	if (idx >= size)
 		goto backout;
@@ -538,16 +525,14 @@ retry:
 
 	if (write_access && !(vma->vm_flags & VM_SHARED)) {
 		/* Optimization, do the COW without a second fault */
-		ret = hugetlb_cow(mm, vma, address, ptep, new_pte);
+		ret = hugetlb_cow(mm, vma, address, ptep);
 	}
 
-	spin_unlock(&mm->page_table_lock);
 	unlock_page(page);
 out:
 	return ret;
 
 backout:
-	spin_unlock(&mm->page_table_lock);
 	hugetlb_put_quota(mapping);
 	unlock_page(page);
 	put_page(page);
@@ -580,13 +565,8 @@ int hugetlb_fault(struct mm_struct *mm, 
 	}
 
 	ret = VM_FAULT_MINOR;
-
-	spin_lock(&mm->page_table_lock);
-	/* Check for a racing update before calling hugetlb_cow */
-	if (likely(pte_same(entry, *ptep)))
-		if (write_access && !pte_write(entry))
-			ret = hugetlb_cow(mm, vma, address, ptep, entry);
-	spin_unlock(&mm->page_table_lock);
+	if (write_access && !pte_write(entry))
+		ret = hugetlb_cow(mm, vma, address, ptep);
 	mutex_unlock(&hugetlb_instantiation_mutex);
 
 	return ret;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] spinlock in function hugetlb_fault could be deleted
  2007-07-23  8:18 [PATCH] spinlock in function hugetlb_fault could be deleted Zhang, Yanmin
@ 2007-07-23  9:59 ` Zhang, Yanmin
  2007-07-23 14:27 ` Adam Litke
  1 sibling, 0 replies; 4+ messages in thread
From: Zhang, Yanmin @ 2007-07-23  9:59 UTC (permalink / raw)
  To: LKML; +Cc: agl

On Mon, 2007-07-23 at 16:18 +0800, Zhang, Yanmin wrote:
> Function hugetlb_fault needn't hold spinlock mm->page_table_lock,
> because when hugetlb_fault is called:
> 1) mm->mmap_sem is held already;
> 2) hugetlb_instantiation_mutex is held by hugetlb_fault, which prevents
> other threads/processes from entering this critical area. It's impossible
> for other threads/processes to change the page table now.
> 
> My patch against kenel 2.6.22 deletes the spinlock statements in the
> related functions.
> 
> Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
> 
> ---
> 
> diff -Nraup linux-2.6.22/mm/hugetlb.c linux-2.6.22_hugetlb/mm/hugetlb.c
> --- linux-2.6.22/mm/hugetlb.c	2007-07-09 07:32:17.000000000 +0800
> +++ linux-2.6.22_hugetlb/mm/hugetlb.c	2007-07-23 15:51:49.000000000 +0800
> @@ -434,12 +434,12 @@ void unmap_hugepage_range(struct vm_area
>  }
>  
>  static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
> -			unsigned long address, pte_t *ptep, pte_t pte)
> +			unsigned long address, pte_t *ptep)
>  {
>  	struct page *old_page, *new_page;
>  	int avoidcopy;
>  
> -	old_page = pte_page(pte);
> +	old_page = pte_page(*ptep);
>  
>  	/* If no-one else is actually using this page, avoid the copy
>  	 * and just make the page writable */
> @@ -449,28 +449,16 @@ static int hugetlb_cow(struct mm_struct 
>  		return VM_FAULT_MINOR;
>  	}
>  
> -	page_cache_get(old_page);
>  	new_page = alloc_huge_page(vma, address);
> -
>  	if (!new_page) {
>  		page_cache_release(old_page);
I'm really sorry. Here page_cache_release(old_page) should be deleted.
Below is the new patch.

---

--- linux-2.6.22/mm/hugetlb.c	2007-07-09 07:32:17.000000000 +0800
+++ linux-2.6.22_hugetlb/mm/hugetlb.c	2007-07-23 17:41:54.000000000 +0800
@@ -434,12 +434,12 @@ void unmap_hugepage_range(struct vm_area
 }
 
 static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
-			unsigned long address, pte_t *ptep, pte_t pte)
+			unsigned long address, pte_t *ptep)
 {
 	struct page *old_page, *new_page;
 	int avoidcopy;
 
-	old_page = pte_page(pte);
+	old_page = pte_page(*ptep);
 
 	/* If no-one else is actually using this page, avoid the copy
 	 * and just make the page writable */
@@ -449,28 +449,14 @@ static int hugetlb_cow(struct mm_struct 
 		return VM_FAULT_MINOR;
 	}
 
-	page_cache_get(old_page);
 	new_page = alloc_huge_page(vma, address);
-
-	if (!new_page) {
-		page_cache_release(old_page);
+	if (!new_page)
 		return VM_FAULT_OOM;
-	}
 
-	spin_unlock(&mm->page_table_lock);
 	copy_huge_page(new_page, old_page, address, vma);
-	spin_lock(&mm->page_table_lock);
-
-	ptep = huge_pte_offset(mm, address & HPAGE_MASK);
-	if (likely(pte_same(*ptep, pte))) {
-		/* Break COW */
-		set_huge_pte_at(mm, address, ptep,
-				make_huge_pte(vma, new_page, 1));
-		/* Make the old page be freed below */
-		new_page = old_page;
-	}
-	page_cache_release(new_page);
+	set_huge_pte_at (mm, address, ptep, make_huge_pte (vma, new_page, 1));
 	page_cache_release(old_page);
+
 	return VM_FAULT_MINOR;
 }
 
@@ -523,7 +509,6 @@ retry:
 			lock_page(page);
 	}
 
-	spin_lock(&mm->page_table_lock);
 	size = i_size_read(mapping->host) >> HPAGE_SHIFT;
 	if (idx >= size)
 		goto backout;
@@ -538,16 +523,14 @@ retry:
 
 	if (write_access && !(vma->vm_flags & VM_SHARED)) {
 		/* Optimization, do the COW without a second fault */
-		ret = hugetlb_cow(mm, vma, address, ptep, new_pte);
+		ret = hugetlb_cow(mm, vma, address, ptep);
 	}
 
-	spin_unlock(&mm->page_table_lock);
 	unlock_page(page);
 out:
 	return ret;
 
 backout:
-	spin_unlock(&mm->page_table_lock);
 	hugetlb_put_quota(mapping);
 	unlock_page(page);
 	put_page(page);
@@ -580,13 +563,8 @@ int hugetlb_fault(struct mm_struct *mm, 
 	}
 
 	ret = VM_FAULT_MINOR;
-
-	spin_lock(&mm->page_table_lock);
-	/* Check for a racing update before calling hugetlb_cow */
-	if (likely(pte_same(entry, *ptep)))
-		if (write_access && !pte_write(entry))
-			ret = hugetlb_cow(mm, vma, address, ptep, entry);
-	spin_unlock(&mm->page_table_lock);
+	if (write_access && !pte_write(entry))
+		ret = hugetlb_cow(mm, vma, address, ptep);
 	mutex_unlock(&hugetlb_instantiation_mutex);
 
 	return ret;

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] spinlock in function hugetlb_fault could be deleted
  2007-07-23  8:18 [PATCH] spinlock in function hugetlb_fault could be deleted Zhang, Yanmin
  2007-07-23  9:59 ` Zhang, Yanmin
@ 2007-07-23 14:27 ` Adam Litke
  2007-07-24  0:03   ` Zhang, Yanmin
  1 sibling, 1 reply; 4+ messages in thread
From: Adam Litke @ 2007-07-23 14:27 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: LKML, agl

Hello.  hugetlb_instantiation_mutex is an extremely heavy-weight lock
whose days are numbered (hopefully).  It exists primarily to arbitrate
a race condition where n (n > 1) threads of execution race to satisfy
the same page fault for a process.  Even though only one hugetlb page
is needed, if (n) are not available, the application can receive a
bogus VM_FAULT_OOM.

Anyway, the hugetlb_instantiation_mutex approach has few friends
around here, so rather than making the code rely more heavily upon it,
perhaps you could focus you efforts on helping us remove it.

On 7/23/07, Zhang, Yanmin <yanmin_zhang@linux.intel.com> wrote:
> Function hugetlb_fault needn't hold spinlock mm->page_table_lock,
> because when hugetlb_fault is called:
> 1) mm->mmap_sem is held already;
> 2) hugetlb_instantiation_mutex is held by hugetlb_fault, which prevents
> other threads/processes from entering this critical area. It's impossible
> for other threads/processes to change the page table now.

--
Adam Litke ( agl at us.ibm.com )
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] spinlock in function hugetlb_fault could be deleted
  2007-07-23 14:27 ` Adam Litke
@ 2007-07-24  0:03   ` Zhang, Yanmin
  0 siblings, 0 replies; 4+ messages in thread
From: Zhang, Yanmin @ 2007-07-24  0:03 UTC (permalink / raw)
  To: Adam Litke; +Cc: LKML, agl

On Mon, 2007-07-23 at 09:27 -0500, Adam Litke wrote:
> Hello.  hugetlb_instantiation_mutex is an extremely heavy-weight lock
> whose days are numbered (hopefully).  It exists primarily to arbitrate
> a race condition where n (n > 1) threads of execution race to satisfy
> the same page fault for a process.  Even though only one hugetlb page
> is needed, if (n) are not available, the application can receive a
> bogus VM_FAULT_OOM.
Thanks for your kind comments.

> 
> Anyway, the hugetlb_instantiation_mutex approach has few friends
> around here, so rather than making the code rely more heavily upon it,
> perhaps you could focus you efforts on helping us remove it.
That's the correct direction. I will check if the mutex could be removed. 

> 
> On 7/23/07, Zhang, Yanmin <yanmin_zhang@linux.intel.com> wrote:
> > Function hugetlb_fault needn't hold spinlock mm->page_table_lock,
> > because when hugetlb_fault is called:
> > 1) mm->mmap_sem is held already;
> > 2) hugetlb_instantiation_mutex is held by hugetlb_fault, which prevents
> > other threads/processes from entering this critical area. It's impossible
> > for other threads/processes to change the page table now.
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2007-07-24  0:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-23  8:18 [PATCH] spinlock in function hugetlb_fault could be deleted Zhang, Yanmin
2007-07-23  9:59 ` Zhang, Yanmin
2007-07-23 14:27 ` Adam Litke
2007-07-24  0:03   ` Zhang, Yanmin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox