From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Zhang Yi <zhang.yi20@zte.com.cn>,
	Jiang Biao <jiang.biao2@zte.com.cn>,
	Ma Chenggong <ma.chenggong@zte.com.cn>,
	Mel Gorman <mgorman@suse.de>,
	Darren Hart <dvhart@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mike Galbraith <mgalbraith@suse.de>
Subject: [ 13/34] futex: Take hugepages into account when generating futex_key
Date: Sun, 18 Aug 2013 13:34:26 -0700
Message-ID: <20130818203300.595977408@linuxfoundation.org>
In-Reply-To: <20130818203259.653403173@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhang Yi <wetpzy@gmail.com>

commit 13d60f4b6ab5b702dc8d2ee20999f98a93728aec upstream.

The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in a unique identifier for each futex.

However, this is not true when futexes are located in different
subpages of a hugepage, because the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page->index.
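
The collision can be illustrated with a small user space model. This is
only a sketch and not kernel code: struct model_key, model_key_old() and
the MODEL_* sizes are hypothetical names, and the model assumes 4 KiB
base pages, 2 MiB huge pages and a head page index counted in huge page
units.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE  4096UL               /* assumed base page size */
#define MODEL_HPAGE_SIZE (2UL * 1024 * 1024)  /* assumed huge page size */

/* Simplified stand-in for the inode-based part of a futex_key. */
struct model_key {
	const void *inode;     /* mapping host */
	unsigned long pgoff;   /* mapping index */
	unsigned long offset;  /* offset within the base page */
};

/*
 * Model of the pre-patch behaviour: pgoff is taken from the head page,
 * so every subpage of one huge page ends up with the same value.
 */
static struct model_key model_key_old(const void *inode,
				      unsigned long file_off)
{
	struct model_key key = {
		.inode  = inode,
		.pgoff  = file_off / MODEL_HPAGE_SIZE,
		.offset = file_off % MODEL_PAGE_SIZE,
	};

	return key;
}

/* Two keys refer to the same futex only if all three fields match. */
static bool model_key_equal(struct model_key a, struct model_key b)
{
	return a.inode == b.inode && a.pgoff == b.pgoff &&
	       a.offset == b.offset;
}
```

In this model, file offsets 0 and MODEL_PAGE_SIZE of the same inode
compare equal, while offsets in different huge pages do not.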

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2.

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Although mutex2 is now unlocked, thread2 still
   blocks on it because the futex_key of mutex2 is identical to the
   futex_key of mutex1.
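
Steps 1 and 2 could be set up roughly as follows. This is a sketch
only: init_and_lock_two_mutexes() is a hypothetical helper, and mapping
the hugetlbfs file as well as creating the two waiter threads (steps 3
to 5) are left to the caller. The caller passes the base address of
the mapping and the base page size.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical helper: place two process-shared mutexes in a mapping,
 * one at offset 0 and one at offset page_size, then lock both.
 * Returns 0 on success or a pthread error number.
 */
static int init_and_lock_two_mutexes(void *base, size_t page_size,
				     pthread_mutex_t **m1,
				     pthread_mutex_t **m2)
{
	pthread_mutexattr_t attr;
	int err;

	*m1 = (pthread_mutex_t *)base;
	*m2 = (pthread_mutex_t *)((char *)base + page_size);

	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;

	/* Shared mutexes use inode-based futex keys when file backed. */
	err = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (!err)
		err = pthread_mutex_init(*m1, &attr);
	if (!err)
		err = pthread_mutex_init(*m2, &attr);
	pthread_mutexattr_destroy(&attr);
	if (err)
		return err;

	err = pthread_mutex_lock(*m1);
	if (!err)
		err = pthread_mutex_lock(*m2);
	return err;
}
```

The helper itself does not depend on hugetlbfs, so it can be tried on
any writable buffer of at least two pages; the bug only shows up when
base points into a hugetlbfs mapping on an unpatched kernel.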

To solve this issue we need to take the normal page index of the page
which contains the futex into account if the futex is in a hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.

Mappings which are not based on hugetlbfs are not affected and still
use page->index.
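
As a worked example of that calculation, the following user space
sketch mirrors the arithmetic. model_basepage_index() is a hypothetical
name, and the numbers in the comment assume 4 KiB base pages and 2 MiB
huge pages, which gives an order of 9 (512 base pages per huge page).

```c
#include <stddef.h>

/*
 * Model of the index calculation:
 *   head_index - index of the huge page in the mapping, in huge pages
 *   order      - log2 of the number of base pages per huge page
 *   subpage    - number of the base page inside the huge page
 *
 * With order 9, the first subpage of huge page 1 gets index 512 and
 * subpage 5 of huge page 3 gets index 3 * 512 + 5 = 1541.
 */
static unsigned long model_basepage_index(unsigned long head_index,
					  unsigned int order,
					  unsigned long subpage)
{
	return (head_index << order) + subpage;
}
```

With this scheme the futexes at offset 0 and offset PAGE_SIZE of the
same huge page get the indices 0 and 1, so their keys differ.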

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: 'Mel Gorman' <mgorman@suse.de>
Acked-by: 'Darren Hart' <dvhart@linux.intel.com>
Cc: 'Peter Zijlstra' <peterz@infradead.org>
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 include/linux/hugetlb.h |   16 ++++++++++++++++
 kernel/futex.c          |    3 ++-
 mm/hugetlb.c            |   17 +++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -293,6 +293,17 @@ static inline unsigned hstate_index_to_s
 	return hstates[index].order + PAGE_SHIFT;
 }
 
+pgoff_t __basepage_index(struct page *page);
+
+/* Return page->index in PAGE_SIZE units */
+static inline pgoff_t basepage_index(struct page *page)
+{
+	if (!PageCompound(page))
+		return page->index;
+
+	return __basepage_index(page);
+}
+
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 #define alloc_huge_page_node(h, nid) NULL
@@ -311,6 +322,11 @@ static inline unsigned int pages_per_hug
 	return 1;
 }
 #define hstate_index_to_shift(index) 0
+
+static inline pgoff_t basepage_index(struct page *page)
+{
+	return page->index;
+}
 #endif
 
 #endif /* _LINUX_HUGETLB_H */
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -60,6 +60,7 @@
 #include <linux/pid.h>
 #include <linux/nsproxy.h>
 #include <linux/ptrace.h>
+#include <linux/hugetlb.h>
 
 #include <asm/futex.h>
 
@@ -363,7 +364,7 @@ again:
 	} else {
 		key->both.offset |= FUT_OFF_INODE; /* inode-based key */
 		key->shared.inode = page_head->mapping->host;
-		key->shared.pgoff = page_head->index;
+		key->shared.pgoff = basepage_index(page);
 	}
 
 	get_futex_key_refs(key);
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -679,6 +679,23 @@ int PageHuge(struct page *page)
 }
 EXPORT_SYMBOL_GPL(PageHuge);
 
+pgoff_t __basepage_index(struct page *page)
+{
+	struct page *page_head = compound_head(page);
+	pgoff_t index = page_index(page_head);
+	unsigned long compound_idx;
+
+	if (!PageHuge(page_head))
+		return page_index(page);
+
+	if (compound_order(page_head) >= MAX_ORDER)
+		compound_idx = page_to_pfn(page) - page_to_pfn(page_head);
+	else
+		compound_idx = page - page_head;
+
+	return (index << compound_order(page_head)) + compound_idx;
+}
+
 static struct page *alloc_fresh_huge_page_node(struct hstate *h, int nid)
 {
 	struct page *page;


