All of lore.kernel.org
 help / color / mirror / Atom feed
From: Darren Hart <dvhart@linux.intel.com>
To: zhang.yi20@zte.com.cn
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
Date: Tue, 16 Apr 2013 10:57:10 -0700	[thread overview]
Message-ID: <516D90F6.3020603@linux.intel.com> (raw)
In-Reply-To: <OF000BBE68.EBB4E92E-ON48257B4F.0010C2E7-48257B4F.0013FB89@zte.com.cn>

On 04/15/2013 08:37 PM, zhang.yi20@zte.com.cn wrote:
> Hello,
>

Hi Zhang,

I've rewrapped your plain text here for legibility, please adjust your
mail client accordingly.

> The futex-keys of processes share futex determined by page-offset,
> mapping-host, and mapping-index of the user space address.  User appications
> using hugepage for futex may lead to futex-key conflict.  Assume there are two
> or more futexes in diffrent normal pages of the hugepage, and each futex has
> the same offset in its normal page, causing all the futexes have the same
> futex-key.  In that case, futex may not work well.
>
> This patch adds the normal page index in the compound page into the offset of
> futex-key.

It also modifies the mm prep_compound*page() routines to set the page
compound index. You didn't modify the structure itself, I'm curious why
this information wasn't set before? Something for the MM folks I guess..

>
> Steps to reproduce the bug:
> 1. The 1st thread map a file of hugetlbfs, and use the return address as the
>    1st mutex's address, and use the return address with PAGE_SIZE added as the
>    2nd mutex's address;
> 2. The 1st thread initialize the two mutexes with pshared attribute, and lock
>    the two mutexes.
> 3. The 1st thread create the 2nd thread, and the 2nd thread block on the 1st
>    mutex.
> 4. The 1st thread create the 3rd thread, and the 3rd thread block on the 2nd
>    mutex.
> 5. The 1st thread unlock the 2nd mutex, the 3rd thread can not take the 2nd
>    mutex, and may block forever.


Again, a functional testcase in futextest would be a good idea. This
helps validate the patch and also can be used to identify regressions in
the future.


> Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
> Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
> Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
> Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
> Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
>
> diff -uprN orig/linux-3.9-rc7/include/linux/mm.h
> new/linux-3.9-rc7/include/linux/mm.h
> --- orig/linux-3.9-rc7/include/linux/mm.h       2013-04-15
> 00:45:16.000000000 +0000
> +++ new/linux-3.9-rc7/include/linux/mm.h        2013-04-16
> 11:21:59.573458000 +0000
> @@ -502,6 +502,20 @@ static inline void set_compound_order(st
>         page[1].lru.prev = (void *)order;
>  }
>
> +static inline void set_page_compound_index(struct page *page, int index)
> +{
> +       if (PageHead(page))
> +               return;
> +       page->index = index;
> +}

I presume the spaces instead of tabs is a result of your mailer mangling
whitespace?

> +
> +static inline int get_page_compound_index(struct page *page)
> +{
> +       if (PageHead(page))
> +               return 0;
> +       return page->index;
> +}
> +
>  #ifdef CONFIG_MMU
>  /*
>   * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
> diff -uprN orig/linux-3.9-rc7/kernel/futex.c
> new/linux-3.9-rc7/kernel/futex.c
> --- orig/linux-3.9-rc7/kernel/futex.c   2013-04-15 00:45:16.000000000
> +0000
> +++ new/linux-3.9-rc7/kernel/futex.c    2013-04-16 11:13:30.069887000
> +0000
> @@ -239,7 +239,7 @@ get_futex_key(u32 __user *uaddr, int fsh
>         unsigned long address = (unsigned long)uaddr;
>         struct mm_struct *mm = current->mm;
>         struct page *page, *page_head;
> -       int err, ro = 0;
> +       int err, ro = 0, comp_idx = 0;
>
>         /*
>          * The futex address must be "naturally" aligned.
> @@ -299,6 +299,7 @@ again:
>                          * freed from under us.
>                          */
>                         if (page != page_head) {
> +                               comp_idx = get_page_compound_index(page);
>                                 get_page(page_head);
>                                 put_page(page);
>                         }
> @@ -311,6 +312,7 @@ again:
>  #else
>         page_head = compound_head(page);
>         if (page != page_head) {
> +               comp_idx = get_page_compound_index(page);
>                 get_page(page_head);
>                 put_page(page);
>         }
> @@ -363,7 +365,8 @@ again:
>                 key->private.mm = mm;
>                 key->private.address = address;
>         } else {
> -               key->both.offset |= FUT_OFF_INODE; /* inode-based key */
> +               key->both.offset |= (comp_idx << PAGE_SHIFT)
> +                                   | FUT_OFF_INODE; /* inode-based key */

Comments at the end of lines are bad form already, when moving to
multi-line, please move the comment just above the statements.

What is the max value of comp_idx? Are we at risk of truncating it?
Looks like not really from my initial look.

This also needs a comment in futex.h describing the usage of the offset
field in union futex_key as well as above get_futex_key describing the
key for shared mappings.


Thanks,

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Darren Hart <dvhart@linux.intel.com>
To: zhang.yi20@zte.com.cn
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] futex: bugfix for futex-key conflict when futex use hugepage
Date: Tue, 16 Apr 2013 10:57:10 -0700	[thread overview]
Message-ID: <516D90F6.3020603@linux.intel.com> (raw)
In-Reply-To: <OF000BBE68.EBB4E92E-ON48257B4F.0010C2E7-48257B4F.0013FB89@zte.com.cn>

On 04/15/2013 08:37 PM, zhang.yi20@zte.com.cn wrote:
> Hello,
>

Hi Zhang,

I've rewrapped your plain text here for legibility, please adjust your
mail client accordingly.

> The futex-keys of processes share futex determined by page-offset,
> mapping-host, and mapping-index of the user space address.  User appications
> using hugepage for futex may lead to futex-key conflict.  Assume there are two
> or more futexes in diffrent normal pages of the hugepage, and each futex has
> the same offset in its normal page, causing all the futexes have the same
> futex-key.  In that case, futex may not work well.
>
> This patch adds the normal page index in the compound page into the offset of
> futex-key.

It also modifies the mm prep_compound*page() routines to set the page
compound index. You didn't modify the structure itself, I'm curious why
this information wasn't set before? Something for the MM folks I guess..

>
> Steps to reproduce the bug:
> 1. The 1st thread map a file of hugetlbfs, and use the return address as the
>    1st mutex's address, and use the return address with PAGE_SIZE added as the
>    2nd mutex's address;
> 2. The 1st thread initialize the two mutexes with pshared attribute, and lock
>    the two mutexes.
> 3. The 1st thread create the 2nd thread, and the 2nd thread block on the 1st
>    mutex.
> 4. The 1st thread create the 3rd thread, and the 3rd thread block on the 2nd
>    mutex.
> 5. The 1st thread unlock the 2nd mutex, the 3rd thread can not take the 2nd
>    mutex, and may block forever.


Again, a functional testcase in futextest would be a good idea. This
helps validate the patch and also can be used to identify regressions in
the future.


> Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
> Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
> Reviewed-by: Liu Dong <liu.dong3@zte.com.cn>
> Reviewed-by: Cui Yunfeng <cui.yunfeng@zte.com.cn>
> Reviewed-by: Lu Zhongjun <lu.zhongjun@zte.com.cn>
> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
>
> diff -uprN orig/linux-3.9-rc7/include/linux/mm.h
> new/linux-3.9-rc7/include/linux/mm.h
> --- orig/linux-3.9-rc7/include/linux/mm.h       2013-04-15
> 00:45:16.000000000 +0000
> +++ new/linux-3.9-rc7/include/linux/mm.h        2013-04-16
> 11:21:59.573458000 +0000
> @@ -502,6 +502,20 @@ static inline void set_compound_order(st
>         page[1].lru.prev = (void *)order;
>  }
>
> +static inline void set_page_compound_index(struct page *page, int index)
> +{
> +       if (PageHead(page))
> +               return;
> +       page->index = index;
> +}

I presume the spaces instead of tabs is a result of your mailer mangling
whitespace?

> +
> +static inline int get_page_compound_index(struct page *page)
> +{
> +       if (PageHead(page))
> +               return 0;
> +       return page->index;
> +}
> +
>  #ifdef CONFIG_MMU
>  /*
>   * Do pte_mkwrite, but only if the vma says VM_WRITE.  We do this when
> diff -uprN orig/linux-3.9-rc7/kernel/futex.c
> new/linux-3.9-rc7/kernel/futex.c
> --- orig/linux-3.9-rc7/kernel/futex.c   2013-04-15 00:45:16.000000000
> +0000
> +++ new/linux-3.9-rc7/kernel/futex.c    2013-04-16 11:13:30.069887000
> +0000
> @@ -239,7 +239,7 @@ get_futex_key(u32 __user *uaddr, int fsh
>         unsigned long address = (unsigned long)uaddr;
>         struct mm_struct *mm = current->mm;
>         struct page *page, *page_head;
> -       int err, ro = 0;
> +       int err, ro = 0, comp_idx = 0;
>
>         /*
>          * The futex address must be "naturally" aligned.
> @@ -299,6 +299,7 @@ again:
>                          * freed from under us.
>                          */
>                         if (page != page_head) {
> +                               comp_idx = get_page_compound_index(page);
>                                 get_page(page_head);
>                                 put_page(page);
>                         }
> @@ -311,6 +312,7 @@ again:
>  #else
>         page_head = compound_head(page);
>         if (page != page_head) {
> +               comp_idx = get_page_compound_index(page);
>                 get_page(page_head);
>                 put_page(page);
>         }
> @@ -363,7 +365,8 @@ again:
>                 key->private.mm = mm;
>                 key->private.address = address;
>         } else {
> -               key->both.offset |= FUT_OFF_INODE; /* inode-based key */
> +               key->both.offset |= (comp_idx << PAGE_SHIFT)
> +                                   | FUT_OFF_INODE; /* inode-based key */

Comments at the end of lines are bad form already, when moving to
multi-line, please move the comment just above the statements.

What is the max value of comp_idx? Are we at risk of truncating it?
Looks like not really from my initial look.

This also needs a comment in futex.h describing the usage of the offset
field in union futex_key as well as above get_futex_key describing the
key for shared mappings.


Thanks,

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel

  reply	other threads:[~2013-04-16 17:57 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-16  3:37 [PATCH] futex: bugfix for futex-key conflict when futex use hugepage zhang.yi20
2013-04-16  3:37 ` zhang.yi20
2013-04-16 17:57 ` Darren Hart [this message]
2013-04-16 17:57   ` Darren Hart
2013-04-17  9:55   ` zhang.yi20
2013-04-17  9:55     ` zhang.yi20
2013-04-17 14:18     ` Darren Hart
2013-04-17 14:18       ` Darren Hart
2013-04-17 15:26       ` Dave Hansen
2013-04-17 15:26         ` Dave Hansen
2013-04-17 15:51         ` Darren Hart
2013-04-17 15:51           ` Darren Hart
2013-04-18  8:05           ` zhang.yi20
2013-04-18  8:05             ` zhang.yi20
2013-04-18 14:34             ` Darren Hart
2013-04-18 14:34               ` Darren Hart
2013-04-19  2:13               ` zhang.yi20
2013-04-19  2:13                 ` zhang.yi20
2013-04-19  2:42                 ` Darren Hart
2013-04-19  2:42                   ` Darren Hart
2013-04-19  2:45                 ` Darren Hart
2013-04-19  2:45                   ` Darren Hart
2013-04-19  7:03                   ` zhang.yi20
2013-04-19  7:03                     ` zhang.yi20
2013-04-18 10:14   ` 答复: " zhang.yi20
2013-04-16 18:37 ` Dave Hansen
2013-04-16 18:37   ` Dave Hansen
2013-04-16 18:47   ` Darren Hart
2013-04-16 18:47     ` Darren Hart
2013-04-17  7:25   ` 答复: " zhang.yi20
2013-04-17  7:47   ` zhang.yi20
2013-04-17  7:47     ` zhang.yi20
  -- strict thread matches above, loose matches on Subject: below --
2013-05-15 13:57 Zhang Yi
2013-05-15 14:20 ` Mel Gorman
2013-05-16  1:16   ` zhang.yi20
2013-05-16  1:30     ` Darren Hart
2013-05-07 12:43 Zhang Yi
2013-04-26 12:13 Zhang Yi
2013-04-26 18:26 ` Thomas Gleixner
2013-05-07 12:23   ` Zhang Yi
2013-05-07 15:20     ` Mel Gorman
2013-05-07 15:24       ` Thomas Gleixner
2013-05-07 15:54         ` Mel Gorman
2013-05-07 12:34   ` Zhang Yi
2013-04-24 14:27 Zhang Yi
2013-04-24 13:58 Zhang Yi
2013-04-25 20:52 ` Thomas Gleixner
2013-04-08  6:34 jiang.biao2

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=516D90F6.3020603@linux.intel.com \
    --to=dvhart@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=zhang.yi20@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.