linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-04-26 12:13 Zhang Yi
  2013-04-26 18:26 ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Zhang Yi @ 2013-04-26 12:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: 'Peter Zijlstra', 'Darren Hart',
	'Thomas Gleixner', 'Ingo Molnar',
	'Dave Hansen', zhang.yi20, wetpzy

Hi ,

At 2013-04-26 04:52:31,"Thomas Gleixner" <tglx@linutronix.de> wrote:
>
>Unfortunately this did not work out very well.
>
>1. Your patch now lacks a proper changelog which explains the change
>
>2. Your patch lacks any newline characters as you can see below
>

I am so sorry for my mistakes. : )




The futex-keys of processes share futex determined by page-offset, mapping-host, and
mapping-index of the user space address.
User appications using hugepage for futex may lead to futex-key conflict.
Assume there are two or more futexes in diffrent normal pages of the hugepage,
and each futex has the same offset in its normal page, causing all the futexes have the same futex-key.
In that case, futex may not work well.

This patch adds the normal page index in the compound page into the offset of futex-key.

Steps to reproduce the bug:
1. The 1st thread map a file of hugetlbfs, and use the return address as the 1st mutex's
address, and use the return address with PAGE_SIZE added as the 2nd mutex's address;
2. The 1st thread initialize the two mutexes with pshared attribute, and lock the two mutexes.
3. The 1st thread create the 2nd thread, and the 2nd thread block on the 1st mutex.
4. The 1st thread create the 3rd thread, and the 3rd thread block on the 2nd mutex.
5. The 1st thread unlock the 2nd mutex, the 3rd thread can not take the 2nd mutex, and
may block forever.


Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>


diff -uprN orig/linux3.9-rc7/include/linux/futex.h new/linux3.9-rc7/include/linux/futex.h
--- orig/linux3.9-rc7/include/linux/futex.h	2013-04-15 00:45:16.000000000 +0000
+++ new/linux3.9-rc7/include/linux/futex.h	2013-04-19 16:33:58.725880000 +0000
@@ -19,7 +19,7 @@ handle_futex_death(u32 __user *uaddr, st
  * The key type depends on whether it's a shared or private mapping.
  * Don't rearrange members without looking at hash_futex().
  *
- * offset is aligned to a multiple of sizeof(u32) (== 4) by definition.
+ * There are three components in offset:
  * We use the two low order bits of offset to tell what is the kind of key :
  *  00 : Private process futex (PTHREAD_PROCESS_PRIVATE)
  *       (no reference on an inode or mm)
@@ -27,6 +27,9 @@ handle_futex_death(u32 __user *uaddr, st
  *	mapped on a file (reference on the underlying inode)
  *  10 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       (but private mapping on an mm, and reference taken on it)
+ * Bits 2 to (PAGE_SHIFT-1) indicates the offset of futex in its page.
+ * The rest hign order bits indicates the index if the page is a
+ * subpage of a compound page.
 */

 #define FUT_OFF_INODE    1 /* We set bit 0 if key has a reference on inode */
@@ -36,17 +39,17 @@ union futex_key {
 	struct {
 		unsigned long pgoff;
 		struct inode *inode;
-		int offset;
+		long offset;
 	} shared;
 	struct {
 		unsigned long address;
 		struct mm_struct *mm;
-		int offset;
+		long offset;
 	} private;
 	struct {
 		unsigned long word;
 		void *ptr;
-		int offset;
+		long offset;
 	} both;
 };

diff -uprN orig/linux3.9-rc7/kernel/futex.c new/linux3.9-rc7/kernel/futex.c
--- orig/linux3.9-rc7/kernel/futex.c	2013-04-15 00:45:16.000000000 +0000
+++ new/linux3.9-rc7/kernel/futex.c	2013-04-19 16:24:05.629143000 +0000
@@ -215,6 +215,22 @@ static void drop_futex_key_refs(union fu
 	}
 }

+/*
+* Get subpage index in compound page, for futex_key.
+*/
+static inline int get_page_compound_index(struct page *page)
+{
+	struct page *head_page;
+	if (PageHead(page))
+		return 0;
+
+	head_page = compound_head(page);
+	if (compound_order(head_page) >= MAX_ORDER)
+		return page_to_pfn(page) - page_to_pfn(head_page);
+	else
+		return page - compound_head(page);
+}
+
 /**
  * get_futex_key() - Get parameters which are the keys for a futex
  * @uaddr:	virtual address of the futex
@@ -228,7 +244,8 @@ static void drop_futex_key_refs(union fu
  * The key words are stored in *key on success.
  *
  * For shared mappings, it's (page->index, file_inode(vma->vm_file),
- * offset_within_page).  For private mappings, it's (uaddr, current->mm).
+ * page_compound_index and offset_within_page).
+ * For private mappings, it's (uaddr, current->mm).
  * We can usually work out the index without swapping in the page.
  *
  * lock_page() might sleep, the caller should not hold a spinlock.
@@ -239,7 +256,7 @@ get_futex_key(u32 __user *uaddr, int fsh
 	unsigned long address = (unsigned long)uaddr;
 	struct mm_struct *mm = current->mm;
 	struct page *page, *page_head;
-	int err, ro = 0;
+	int err, ro = 0, comp_idx = 0;

 	/*
 	 * The futex address must be "naturally" aligned.
@@ -299,6 +316,7 @@ again:
 			 * freed from under us.
 			 */
 			if (page != page_head) {
+				comp_idx = get_page_compound_index(page);
 				get_page(page_head);
 				put_page(page);
 			}
@@ -311,6 +329,7 @@ again:
 #else
 	page_head = compound_head(page);
 	if (page != page_head) {
+		comp_idx = get_page_compound_index(page);
 		get_page(page_head);
 		put_page(page);
 	}
@@ -363,7 +382,9 @@ again:
 		key->private.mm = mm;
 		key->private.address = address;
 	} else {
-		key->both.offset |= FUT_OFF_INODE; /* inode-based key */
+		/* inode-based key */
+		key->both.offset |= ((long)comp_idx << PAGE_SHIFT)
+				   | FUT_OFF_INODE;
 		key->shared.inode = page_head->mapping->host;
 		key->shared.pgoff = page_head->index;
 	}



^ permalink raw reply	[flat|nested] 26+ messages in thread
* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-05-15 13:57 Zhang Yi
  2013-05-15 14:20 ` Mel Gorman
  0 siblings, 1 reply; 26+ messages in thread
From: Zhang Yi @ 2013-05-15 13:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: 'Thomas Gleixner', 'Darren Hart',
	'Peter Zijlstra', 'Ingo Molnar', mgorman,
	zhang.yi20

The futex-keys of processes share futex determined by page-offset,
mapping-host, and mapping-index of the user space address. User
appications using hugepage for futex may lead to futex-key conflict.

Assume there are two or more futexes in diffrent normal pages of the
hugepage, and each futex has the same offset in its normal page,
causing all the futexes have the same futex-key.

This patch adds the normal page index in the compound page into
the pgoff of futex-key.

Steps to reproduce the bug:
1. The 1st thread map a file of hugetlbfs, and use the return address
as the 1st mutex's address, and use the return address with PAGE_SIZE
added as the 2nd mutex's address.
2. The 1st thread initialize the two mutexes with pshared attribute,
and lock the two mutexes.
3. The 1st thread create the 2nd thread, and the 2nd thread block on
the 1st mutex.
4. The 1st thread create the 3rd thread, and the 3rd thread block on
the 2nd mutex.
5. The 1st thread unlock the 2nd mutex, the 3rd thread cannot take
the 2nd mutex, and may block forever.


Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>


diff -uprN linux-3.10-rc1.org/include/linux/hugetlb.h linux-3.10-rc1/include/linux/hugetlb.h
--- linux-3.10-rc1.org/include/linux/hugetlb.h	2013-05-12 00:14:08.000000000 +0000
+++ linux-3.10-rc1/include/linux/hugetlb.h	2013-05-14 10:15:49.849389000 +0000
@@ -358,6 +358,17 @@ static inline int hstate_index(struct hs
 	return h - hstates;
 }

+pgoff_t __basepage_index(struct page *page);
+
+/* Return page->index in PAGE_SIZE units */
+static inline pgoff_t basepage_index(struct page *page)
+{
+	if (!PageCompound(page))
+		return page->index;
+
+	return __basepage_index(page);
+}
+
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page_node(h, nid) NULL
@@ -378,6 +389,11 @@ static inline unsigned int pages_per_hug
 }
 #define hstate_index_to_shift(index) 0
 #define hstate_index(h) 0
+
+static inline pgoff_t basepage_index(struct page *page)
+{
+	return page->index;
+}
 #endif	/* CONFIG_HUGETLB_PAGE */

 #endif /* _LINUX_HUGETLB_H */
diff -uprN linux-3.10-rc1.org/kernel/futex.c linux-3.10-rc1/kernel/futex.c
--- linux-3.10-rc1.org/kernel/futex.c	2013-05-12 00:14:08.000000000 +0000
+++ linux-3.10-rc1/kernel/futex.c	2013-05-14 10:15:04.804350000 +0000
@@ -61,6 +61,7 @@
 #include <linux/nsproxy.h>
 #include <linux/ptrace.h>
 #include <linux/sched/rt.h>
+#include <linux/hugetlb.h>

 #include <asm/futex.h>

@@ -365,7 +366,7 @@ again:
 	} else {
 		key->both.offset |= FUT_OFF_INODE; /* inode-based key */
 		key->shared.inode = page_head->mapping->host;
-		key->shared.pgoff = page_head->index;
+		key->shared.pgoff = basepage_index(page);
 	}

 	get_futex_key_refs(key);
diff -uprN linux-3.10-rc1.org/mm/hugetlb.c linux-3.10-rc1/mm/hugetlb.c
--- linux-3.10-rc1.org/mm/hugetlb.c	2013-05-12 00:14:08.000000000 +0000
+++ linux-3.10-rc1/mm/hugetlb.c	2013-05-14 10:15:37.470175000 +0000
@@ -690,6 +690,23 @@ int PageHuge(struct page *page)
 }
 EXPORT_SYMBOL_GPL(PageHuge);

+pgoff_t __basepage_index(struct page *page)
+{
+	struct page *page_head = compound_head(page);
+	pgoff_t index = page_index(page_head);
+	int compound_idx;
+
+	if (!PageHuge(page_head))
+		return page_index(page);
+
+	if (compound_order(page_head) >= MAX_ORDER)
+		compound_idx = page_to_pfn(page) - page_to_pfn(page_head);
+	else
+		compound_idx = page - page_head;
+
+	return (index << compound_order(page_head)) + compound_idx;
+}
+
 static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
 {
 	struct page *page;



^ permalink raw reply	[flat|nested] 26+ messages in thread
* RE: [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-05-07 12:43 Zhang Yi
  0 siblings, 0 replies; 26+ messages in thread
From: Zhang Yi @ 2013-05-07 12:43 UTC (permalink / raw)
  To: 'Thomas Gleixner'
  Cc: linux-kernel, 'Peter Zijlstra', 'Darren Hart',
	'Ingo Molnar', 'Dave Hansen', zhang.yi20, wetpzy

Hi,

The futex-keys of processes share futex determined by page-offset,
mapping-host, and mapping-index of the user space address. User
appications using hugepage for futex may lead to futex-key conflict.

Assume there are two or more futexes in diffrent normal pages of the
hugepage, and each futex has the same offset in its normal page,
causing all the futexes have the same futex-key.


Steps to reproduce the bug:
1. The 1st thread map a file of hugetlbfs, and use the return address
as the 1st mutex's address, and use the return address with PAGE_SIZE
added as the 2nd mutex's address.
2. The 1st thread initialize the two mutexes with pshared attribute,
and lock the two mutexes.
3. The 1st thread create the 2nd thread, and the 2nd thread block on
the 1st mutex.
4. The 1st thread create the 3rd thread, and the 3rd thread block on
the 2nd mutex.
5. The 1st thread unlock the 2nd mutex, the 3rd thread cannot take
the 2nd mutex, and may block forever.


Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>


diff -uprN linux3.9-orig/include/linux/futex.h linux3.9/include/linux/futex.h
--- linux3.9-orig/include/linux/futex.h	2013-04-15 00:45:16.000000000 +0000
+++ linux3.9/include/linux/futex.h	2013-04-27 08:59:58.932078000 +0000
@@ -19,7 +19,7 @@ handle_futex_death(u32 __user *uaddr, st
  * The key type depends on whether it's a shared or private mapping.
  * Don't rearrange members without looking at hash_futex().
  *
- * offset is aligned to a multiple of sizeof(u32) (== 4) by definition.
+ * There are three cmponents in offset:
  * We use the two low order bits of offset to tell what is the kind of key :
  *  00 : Private process futex (PTHREAD_PROCESS_PRIVATE)
  *       (no reference on an inode or mm)
@@ -27,6 +27,9 @@ handle_futex_death(u32 __user *uaddr, st
  *	mapped on a file (reference on the underlying inode)
  *  10 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       (but private mapping on an mm, and reference taken on it)
+ * Bits 2 to (PAGE_SHIFT-1) indicates the offset of futex in its page.
+ * The rest hign order bits indicates the index if the page is a
+ * subpage of a compound page.
 */

 #define FUT_OFF_INODE    1 /* We set bit 0 if key has a reference on inode */
@@ -36,17 +39,17 @@ union futex_key {
 	struct {
 		unsigned long pgoff;
 		struct inode *inode;
-		int offset;
+		unsigned long offset;
 	} shared;
 	struct {
 		unsigned long address;
 		struct mm_struct *mm;
-		int offset;
+		unsigned long offset;
 	} private;
 	struct {
 		unsigned long word;
 		void *ptr;
-		int offset;
+		unsigned long offset;
 	} both;
 };

diff -uprN linux3.9-orig/kernel/futex.c linux3.9/kernel/futex.c
--- linux3.9-orig/kernel/futex.c	2013-04-15 00:45:16.000000000 +0000
+++ linux3.9/kernel/futex.c	2013-05-06 16:24:40.403525000 +0000
@@ -215,6 +215,22 @@ static void drop_futex_key_refs(union fu
 	}
 }

+/*
+* Get subpage index in compound page, and add it into futex_key.
+*/
+static void key_add_compound_idx(union futex_key *key,
+				 struct page *head_page, struct page *page)
+{
+	int compound_idx;
+
+	if (compound_order(head_page) >= MAX_ORDER)
+		compound_idx = page_to_pfn(page) - page_to_pfn(head_page);
+	else
+		compound_idx = page - head_page;
+
+	key->both.offset |= compound_idx << PAGE_SHIFT;
+}
+
 /**
  * get_futex_key() - Get parameters which are the keys for a futex
  * @uaddr:	virtual address of the futex
@@ -228,7 +244,8 @@ static void drop_futex_key_refs(union fu
  * The key words are stored in *key on success.
  *
  * For shared mappings, it's (page->index, file_inode(vma->vm_file),
- * offset_within_page).  For private mappings, it's (uaddr, current->mm).
+ * page_compound_index and offset_within_page).
+ * For private mappings, it's (uaddr, current->mm).
  * We can usually work out the index without swapping in the page.
  *
  * lock_page() might sleep, the caller should not hold a spinlock.
@@ -366,6 +383,8 @@ again:
 		key->both.offset |= FUT_OFF_INODE; /* inode-based key */
 		key->shared.inode = page_head->mapping->host;
 		key->shared.pgoff = page_head->index;
+		if (page != page_head)
+			key_add_compound_idx(key, page_head, page);
 	}

 	get_futex_key_refs(key);




^ permalink raw reply	[flat|nested] 26+ messages in thread
* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-04-24 14:27 Zhang Yi
  0 siblings, 0 replies; 26+ messages in thread
From: Zhang Yi @ 2013-04-24 14:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: 'Peter Zijlstra', 'Darren Hart',
	'Thomas Gleixner', 'Ingo Molnar',
	'Dave Hansen', zhang.yi20

Hi all,

I reworked the patch base on your advices.
For the line-wrapped bug before, I use this mailbox to send the mail .



Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>


diff -uprN orig/linux3.9-rc7/include/linux/futex.h new/linux3.9-rc7/include/linux/futex.h
--- orig/linux3.9-rc7/include/linux/futex.h	2013-04-15 00:45:16.000000000 +0000
+++ new/linux3.9-rc7/include/linux/futex.h	2013-04-19 16:33:58.725880000 +0000
@@ -19,7 +19,7 @@ handle_futex_death(u32 __user *uaddr, st
  * The key type depends on whether it's a shared or private mapping.
  * Don't rearrange members without looking at hash_futex().
  *
- * offset is aligned to a multiple of sizeof(u32) (== 4) by definition.
+ * There are three cmponents in offset:
  * We use the two low order bits of offset to tell what is the kind of key :
  *  00 : Private process futex (PTHREAD_PROCESS_PRIVATE)
  *       (no reference on an inode or mm)
@@ -27,6 +27,9 @@ handle_futex_death(u32 __user *uaddr, st
  *	mapped on a file (reference on the underlying inode)
  *  10 : Shared futex (PTHREAD_PROCESS_SHARED)
  *       (but private mapping on an mm, and reference taken on it)
+ * Bits 2 to (PAGE_SHIFT-1) indicates the offset of futex in its page.
+ * The rest hign order bits indicates the index if the page is a
+ * subpage of a compound page.
 */

 #define FUT_OFF_INODE    1 /* We set bit 0 if key has a reference on inode */
@@ -36,17 +39,17 @@ union futex_key {
 	struct {
 		unsigned long pgoff;
 		struct inode *inode;
-		int offset;
+		long offset;
 	} shared;
 	struct {
 		unsigned long address;
 		struct mm_struct *mm;
-		int offset;
+		long offset;
 	} private;
 	struct {
 		unsigned long word;
 		void *ptr;
-		int offset;
+		long offset;
 	} both;
 };

diff -uprN orig/linux3.9-rc7/kernel/futex.c new/linux3.9-rc7/kernel/futex.c
--- orig/linux3.9-rc7/kernel/futex.c	2013-04-15 00:45:16.000000000 +0000
+++ new/linux3.9-rc7/kernel/futex.c	2013-04-19 16:24:05.629143000 +0000
@@ -215,6 +215,22 @@ static void drop_futex_key_refs(union fu
 	}
 }

+/*
+* Get subpage index in compound page, for futex_key.
+*/
+static inline int get_page_compound_index(struct page *page)
+{
+	struct page *head_page;
+	if (PageHead(page))
+		return 0;
+
+	head_page = compound_head(page);
+	if (compound_order(head_page) >= MAX_ORDER)
+		return page_to_pfn(page) - page_to_pfn(head_page);
+	else
+		return page - compound_head(page);
+}
+
 /**
  * get_futex_key() - Get parameters which are the keys for a futex
  * @uaddr:	virtual address of the futex
@@ -228,7 +244,8 @@ static void drop_futex_key_refs(union fu
  * The key words are stored in *key on success.
  *
  * For shared mappings, it's (page->index, file_inode(vma->vm_file),
- * offset_within_page).  For private mappings, it's (uaddr, current->mm).
+ * page_compound_index and offset_within_page).
+ * For private mappings, it's (uaddr, current->mm).
  * We can usually work out the index without swapping in the page.
  *
  * lock_page() might sleep, the caller should not hold a spinlock.
@@ -239,7 +256,7 @@ get_futex_key(u32 __user *uaddr, int fsh
 	unsigned long address = (unsigned long)uaddr;
 	struct mm_struct *mm = current->mm;
 	struct page *page, *page_head;
-	int err, ro = 0;
+	int err, ro = 0, comp_idx = 0;

 	/*
 	 * The futex address must be "naturally" aligned.
@@ -299,6 +316,7 @@ again:
 			 * freed from under us.
 			 */
 			if (page != page_head) {
+				comp_idx = get_page_compound_index(page);
 				get_page(page_head);
 				put_page(page);
 			}
@@ -311,6 +329,7 @@ again:
 #else
 	page_head = compound_head(page);
 	if (page != page_head) {
+		comp_idx = get_page_compound_index(page);
 		get_page(page_head);
 		put_page(page);
 	}
@@ -363,7 +382,9 @@ again:
 		key->private.mm = mm;
 		key->private.address = address;
 	} else {
-		key->both.offset |= FUT_OFF_INODE; /* inode-based key */
+		/* inode-based key */
+		key->both.offset |= ((long)comp_idx << PAGE_SHIFT)
+				   | FUT_OFF_INODE;
 		key->shared.inode = page_head->mapping->host;
 		key->shared.pgoff = page_head->index;
 	}



^ permalink raw reply	[flat|nested] 26+ messages in thread
* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-04-24 13:58 Zhang Yi
  2013-04-25 20:52 ` Thomas Gleixner
  0 siblings, 1 reply; 26+ messages in thread
From: Zhang Yi @ 2013-04-24 13:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: dvhart, mingo, peterz, tglx, dave.hansen, zhang.yi20@zte.com.cn

Hi all,

I reworked the patch base on your advices。
For the line-wrapped bug before, I use this mailbox to send the mail .


Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>

diff -uprN orig/linux3.9-rc7/include/linux/futex.h new/linux3.9-rc7/include/linux/futex.h--- orig/linux3.9-rc7/include/linux/futex.h	2013-04-15 00:45:16.000000000 +0000+++ new/linux3.9-rc7/include/linux/futex.h	2013-04-19 16:33:58.725880000 +0000@@ -19,7 +19,7 @@ handle_futex_death(u32 __user *uaddr, st  * The key type depends on whether it's a shared or private mapping.  * Don't rearrange members without looking at hash_futex().  *- * offset is aligned to a multiple of sizeof(u32) (== 4) by definition.+ * There are three cmponents in offset:  * We use the two low order bits of offset to tell what is the kind of key :  *  00 : Private process futex (PTHREAD_PROCESS_PRIVATE)  *       (no reference on an inode or mm)@@ -27,6 +27,9 @@ handle_futex_death(u32 __user *uaddr, st  *	mapped on a file (reference on the underlying inode)  *  10 : Shared futex (PTHREAD_PROCESS_SHARED)  *       (but private mapping on an mm, and reference taken on it)+ * Bits 2 to (PAGE_SHIFT-1) indicates the offset of futex in its page.+ * The rest hign order bits indicates the index if the page is a+ * subpage of a compound page. */  #define FUT_OFF_INODE    1 /* We set bit 0 if key has a reference on inode */@@ -36,17 +39,17 @@ union futex_key { 	struct { 		unsigned long pgoff; 		struct inode *inode;-		int offset;+		long offset; 	} shared; 	struct { 		unsigned long address; 		struct mm_struct *mm;-		int offset;+		long offset; 	} private; 	struct { 		unsigned long word; 		void *ptr;-		int offset;+		long offset; 	} both; }; diff -uprN orig/linux3.9-rc7/kernel/futex.c new/linux3.9-rc7/kernel/futex.c--- orig/linux3.9-rc7/kernel/futex.c	2013-04-15 00:45:16.000000000 +0000+++ new/linux3.9-rc7/kernel/futex.c	2013-04-19 16:24:05.629143000 +0000@@ -215,6 +215,22 @@ static void drop_futex_key_refs(union fu 	} } +/*+* Get subpage index in compound page, for futex_key.+*/+static inline int get_page_compound_index(struct page *page)+{+	struct page *head_page;+	if (PageHead(page))+		return 0;++	head_page = compound_head(page);+	if (compound_order(head_page) >= MAX_ORDER)+		return page_to_pfn(page) - page_to_pfn(head_page);+	else+		return page - compound_head(page);+}+ /**  * get_futex_key() - Get parameters which are the keys for a futex  * @uaddr:	virtual address of the futex@@ -228,7 +244,8 @@ static void drop_futex_key_refs(union fu  * The key words are stored in *key on success.  *  * For shared mappings, it's (page->index, file_inode(vma->vm_file),- * offset_within_page).  For private mappings, it's (uaddr, current->mm).+ * page_compound_index and offset_within_page).+ * For private mappings, it's (uaddr, current->mm).  * We can usually work out the index without swapping in the page.  *  * lock_page() might sleep, the caller should not hold a spinlock.@@ -239,7 +256,7 @@ get_futex_key(u32 __user *uaddr, int fsh 	unsigned long address = (unsigned long)uaddr; 	struct mm_struct *mm = current->mm; 	struct page *page, *page_head;-	int err, ro = 0;+	int err, ro = 0, comp_idx = 0;  	/* 	 * The futex address must be "naturally" aligned.@@ -299,6 +316,7 @@ again: 			 * freed from under us. 			 */ 			if (page != page_head) {+				comp_idx = get_page_compound_index(page); 				get_page(page_head); 				put_page(page); 			}@@ -311,6 +329,7 @@ again: #else 	page_head = compound_head(page); 	if (page != page_head) {+		comp_idx = get_page_compound_index(page); 		get_page(page_head); 		put_page(page); 	}@@ -363,7 +382,9 @@ again: 		key->private.mm = mm; 		key->private.address = address; 	} else {-		key->both.offset |= FUT_OFF_INODE; /* inode-based key */+		/* inode-based key */+		key->both.offset |= ((long)comp_idx << PAGE_SHIFT)+				   | FUT_OFF_INODE; 		key->shared.inode = page_head->mapping->host; 		key->shared.pgoff = page_head->index; 	}

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Re: [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
  2013-04-16 17:57 ` Darren Hart
@ 2013-04-17  9:55 zhang.yi20
  2013-04-17 14:18 ` Darren Hart
  -1 siblings, 1 reply; 26+ messages in thread
From: zhang.yi20 @ 2013-04-17  9:55 UTC (permalink / raw)
  To: Darren Hart
  Cc: linux-kernel, linux-mm, Ingo Molnar, Peter Zijlstra,
	Thomas Gleixner

Darren Hart <dvhart@linux.intel.com> wrote on 2013/04/17 01:57:10:

> Again, a functional testcase in futextest would be a good idea. This
> helps validate the patch and also can be used to identify regressions in
> the future.

I will post the testcase code later.

> 
> What is the max value of comp_idx? Are we at risk of truncating it?
> Looks like not really from my initial look.
> 
> This also needs a comment in futex.h describing the usage of the offset
> field in union futex_key as well as above get_futex_key describing the
> key for shared mappings.
> 
> 

As far as I know , the max size of one hugepage is 1 GBytes for x86 cpu.
Can some other cpus support greater hugepage even more than 4 GBytes? If 
so, we can change the type of 'offset' from int to long to avoid 
truncating.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
@ 2013-04-16  3:37 zhang.yi20
  2013-04-16 17:57 ` Darren Hart
  2013-04-16 18:37 ` Dave Hansen
  0 siblings, 2 replies; 26+ messages in thread
From: zhang.yi20 @ 2013-04-16  3:37 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Peter Zijlstra, Darren Hart, Thomas Gleixner, Ingo Molnar

Hello,

The futex-keys of processes share futex determined by page-offset, 
mapping-host, and 
mapping-index of the user space address. 
User appications using hugepage for futex may lead to futex-key conflict. 
Assume there 
are two or more futexes in diffrent normal pages of the hugepage, and each 
futex has 
the same offset in its normal page, causing all the futexes have the same 
futex-key. 
In that case, futex may not work well. 

This patch adds the normal page index in the compound page into the offset 
of futex-key. 

Steps to reproduce the bug: 
1. The 1st thread map a file of hugetlbfs, and use the return address as 
the 1st mutex's 
address, and use the return address with PAGE_SIZE added as the 2nd 
mutex's address; 
2. The 1st thread initialize the two mutexes with pshared attribute, and 
lock the two mutexes. 
3. The 1st thread create the 2nd thread, and the 2nd thread block on the 
1st mutex. 
4. The 1st thread create the 3rd thread, and the 3rd thread block on the 
2nd mutex. 
5. The 1st thread unlock the 2nd mutex, the 3rd thread can not take the 
2nd mutex, and 
may block forever. 

Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>

diff -uprN orig/linux-3.9-rc7/include/linux/mm.h 
new/linux-3.9-rc7/include/linux/mm.h
--- orig/linux-3.9-rc7/include/linux/mm.h       2013-04-15 
00:45:16.000000000 +0000
+++ new/linux-3.9-rc7/include/linux/mm.h        2013-04-16 
11:21:59.573458000 +0000
@@ -502,6 +502,20 @@ static inline void set_compound_order(st
        page[1].lru.prev = (void *)order;
 }
 
+static inline void set_page_compound_index(struct page *page, int index)
+{
+       if (PageHead(page))
+               return;
+       page->index = index;
+}
+
+static inline int get_page_compound_index(struct page *page)
+{
+       if (PageHead(page))
+               return 0;
+       return page->index;
+}
+
 #ifdef CONFIG_MMU
 /*
  * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
diff -uprN orig/linux-3.9-rc7/kernel/futex.c 
new/linux-3.9-rc7/kernel/futex.c
--- orig/linux-3.9-rc7/kernel/futex.c   2013-04-15 00:45:16.000000000 
+0000
+++ new/linux-3.9-rc7/kernel/futex.c    2013-04-16 11:13:30.069887000 
+0000
@@ -239,7 +239,7 @@ get_futex_key(u32 __user *uaddr, int fsh
        unsigned long address = (unsigned long)uaddr;
        struct mm_struct *mm = current->mm;
        struct page *page, *page_head;
-       int err, ro = 0;
+       int err, ro = 0, comp_idx = 0;
 
        /*
         * The futex address must be "naturally" aligned.
@@ -299,6 +299,7 @@ again:
                         * freed from under us.
                         */
                        if (page != page_head) {
+                               comp_idx = get_page_compound_index(page);
                                get_page(page_head);
                                put_page(page);
                        }
@@ -311,6 +312,7 @@ again:
 #else
        page_head = compound_head(page);
        if (page != page_head) {
+               comp_idx = get_page_compound_index(page);
                get_page(page_head);
                put_page(page);
        }
@@ -363,7 +365,8 @@ again:
                key->private.mm = mm;
                key->private.address = address;
        } else {
-               key->both.offset |= FUT_OFF_INODE; /* inode-based key */
+               key->both.offset |= (comp_idx << PAGE_SHIFT)
+                                   | FUT_OFF_INODE; /* inode-based key */
                key->shared.inode = page_head->mapping->host;
                key->shared.pgoff = page_head->index;
        }
diff -uprN orig/linux-3.9-rc7/mm/hugetlb.c new/linux-3.9-rc7/mm/hugetlb.c
--- orig/linux-3.9-rc7/mm/hugetlb.c     2013-04-15 00:45:16.000000000 
+0000
+++ new/linux-3.9-rc7/mm/hugetlb.c      2013-04-16 10:23:02.658531000 
+0000
@@ -667,6 +667,7 @@ static void prep_compound_gigantic_page(
        for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
                __SetPageTail(p);
                set_page_count(p, 0);
+               set_page_compound_index(p, i);
                p->first_page = page;
        }
 }
diff -uprN orig/linux-3.9-rc7/mm/page_alloc.c 
new/linux-3.9-rc7/mm/page_alloc.c
--- orig/linux-3.9-rc7/mm/page_alloc.c  2013-04-15 00:45:16.000000000 
+0000
+++ new/linux-3.9-rc7/mm/page_alloc.c   2013-04-16 10:23:16.452393000 
+0000
@@ -361,6 +361,7 @@ void prep_compound_page(struct page *pag
                struct page *p = page + i;
                __SetPageTail(p);
                set_page_count(p, 0);
+               set_page_compound_index(p, i);
                p->first_page = page;
        }
 }



^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2013-05-16  1:30 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-26 12:13 [PATCH] futex: bugfix for futex-key conflict when futex use hugepage Zhang Yi
2013-04-26 18:26 ` Thomas Gleixner
2013-05-07 12:23   ` Zhang Yi
2013-05-07 15:20     ` Mel Gorman
2013-05-07 15:24       ` Thomas Gleixner
2013-05-07 15:54         ` Mel Gorman
2013-05-10  9:08       ` zhang.yi20
2013-05-10  9:42         ` Mel Gorman
2013-05-07 12:34   ` Zhang Yi
  -- strict thread matches above, loose matches on Subject: below --
2013-05-15 13:57 Zhang Yi
2013-05-15 14:20 ` Mel Gorman
2013-05-16  1:16   ` zhang.yi20
2013-05-16  1:30     ` Darren Hart
2013-05-07 12:43 Zhang Yi
2013-04-24 14:27 Zhang Yi
2013-04-24 13:58 Zhang Yi
2013-04-25 20:52 ` Thomas Gleixner
2013-04-17  9:55 zhang.yi20
2013-04-17 14:18 ` Darren Hart
2013-04-17 15:26   ` Dave Hansen
2013-04-17 15:51     ` Darren Hart
2013-04-18  8:05       ` zhang.yi20
2013-04-18 14:34         ` Darren Hart
2013-04-19  2:13           ` zhang.yi20
2013-04-19  2:42             ` Darren Hart
2013-04-19  2:45             ` Darren Hart
2013-04-16  3:37 zhang.yi20
2013-04-16 17:57 ` Darren Hart
2013-04-16 18:37 ` Dave Hansen
2013-04-16 18:47   ` Darren Hart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).