From: Andrea Arcangeli <aarcange@redhat.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Dave Jones <davej@redhat.com>,
Darren Hart <dvhart@linux.intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>
Subject: Re: PATCH? introduce get_compound_page (Was: process 'stuck' at exit)
Date: Fri, 13 Dec 2013 16:10:35 +0100 [thread overview]
Message-ID: <20131213151035.GE5408@redhat.com> (raw)
In-Reply-To: <20131211191855.GA32485@redhat.com>
On Wed, Dec 11, 2013 at 08:18:55PM +0100, Oleg Nesterov wrote:
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -210,7 +210,7 @@ EXPORT_SYMBOL(put_page);
> * This function is exported but must not be called by anything other
> * than get_page(). It implements the slow path of get_page().
> */
> -bool __get_page_tail(struct page *page)
> +static struct page *__get_compound_head(struct page *page, bool and_tail)
> {
This is not necessarily inline (unlike __get_page_tail_foll which is
inline and optimizes away the build time known "false"/"true" params).
> /*
> * This takes care of get_page() if run on a tail page
> @@ -234,8 +234,9 @@ bool __get_page_tail(struct page *page)
> * cannot race here.
> */
> VM_BUG_ON(!PageHead(page_head));
> - __get_page_tail_foll(page, false);
> - return true;
> + if (and_tail)
> + get_huge_page_tail(page);
So you may be introducing a real branch here unless gcc decides to
self-inline the static func.
> + return page_head;
> } else {
> /*
> * __split_huge_page_refcount run
> @@ -260,17 +261,34 @@ bool __get_page_tail(struct page *page)
> flags = compound_lock_irqsave(page_head);
> /* here __split_huge_page_refcount won't run anymore */
> if (likely(PageTail(page))) {
> - __get_page_tail_foll(page, false);
> + if (and_tail)
> + get_huge_page_tail(page);
Same here.
> +bool __get_page_tail(struct page *page)
> +{
> + return !!__get_compound_head(page, true);
> }
I hope gcc would optimize away the !! overhead vs the old got =
true/false, once __get_compound_head is inline.
So I think you should add inline to __get_compound_head, there's no
benefit to keep the get_compound_head in the CPU icache.
get_huge_page_tail checks different invariants in the VM_BUG_ON and is
only used by gup.c not sure why to call that here.
But __get_page_tail has changed in -mm so the above will reject, in
-mm you will find __get_page_tail_foll called with "true" parameter
too in __get_page_tail for example (not equivalent to
get_huge_page_tail even ignoring the invariant checking in VM_BUG_ON)
and it defines a fast path for hugetlbfs and slab (faster than the one
upstream).
But here we have a "tail page" that has reference on both head and
tail, and you're taking one more reference on the head, just to
release it later by releasing the tail.
Your objective is to do:
page_head = get_compound_head(page);
if (page_head)
put_page(page);
else
page_head = page;
If we've to mess with the compound lock for this, considering
get_compound_head would have only this user anyway, we could do:
page_head = put_compound_tail(page);
if (!page_head)
page_head = page;
Implemented as:
put_compound_tail(page) {
page_head = compound_trans_head(page);
if (!__compound_tail_refcounted(page_head)) {
smp_rmb();
if (PageTail(page)) {
VM_BUG_ON(!PageHead(page_head));
VM_BUG_ON(atomic_read(&page_head->_count) <= 0);
VM_BUG_ON(page_mapcount(page) != 0);
return page_head;
} else
return NULL;
}
flags = compound_lock_irqsave(page_head);
if (!PageTail(page)) {
unlock
return NULL
}
VM_BUG_ON(page_head != page->first_page);
/* __split_huge_page_refcount will wait now */
VM_BUG_ON(page_mapcount(page) <= 0);
atomic_dec(&page->_mapcount);
VM_BUG_ON(atomic_read(&page_head->_count) <= 0);
VM_BUG_ON(atomic_read(&page->_count) != 0);
compound_unlock_irqrestore(page_head, flags);
return page_head;
}
Considering this is going incremental with -mm, and that the -mm
previous patches the above would depend on are not upstream yet, I'd
suggest to fix the bug in futex with Linus patch now (s/1/!ro/).
The strict obviously safe fix Linus posted, is not going to interfere
with these optimization later, so there's no downside to apply it now
and defer the optimizations to the -mm queue.
Thanks!
Andrea
next prev parent reply other threads:[~2013-12-13 15:11 UTC|newest]
Thread overview: 114+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-10 15:47 process 'stuck' at exit Dave Jones
2013-12-10 18:40 ` Linus Torvalds
2013-12-10 19:18 ` Thomas Gleixner
2013-12-10 19:55 ` Linus Torvalds
2013-12-10 20:27 ` Dave Jones
2013-12-10 20:34 ` Thomas Gleixner
2013-12-10 20:55 ` Dave Jones
2013-12-10 21:25 ` Darren Hart
2013-12-10 21:28 ` Thomas Gleixner
2013-12-10 21:39 ` Steven Rostedt
2013-12-10 20:33 ` Thomas Gleixner
2013-12-10 20:43 ` Linus Torvalds
2013-12-10 21:17 ` Thomas Gleixner
2013-12-10 20:35 ` Oleg Nesterov
2013-12-10 20:49 ` Dave Jones
2013-12-10 21:06 ` Darren Hart
2013-12-10 21:12 ` Dave Jones
2013-12-10 21:18 ` Linus Torvalds
2013-12-10 21:24 ` Linus Torvalds
2013-12-10 21:32 ` Dave Jones
2013-12-10 21:49 ` Linus Torvalds
2013-12-10 21:56 ` Dave Jones
2013-12-10 21:59 ` Linus Torvalds
2013-12-10 22:07 ` Dave Jones
2013-12-11 12:45 ` Ingo Molnar
2013-12-10 21:34 ` Oleg Nesterov
2013-12-10 21:41 ` Dave Jones
2013-12-10 21:57 ` Linus Torvalds
2013-12-10 22:02 ` Dave Jones
2013-12-10 22:09 ` Oleg Nesterov
2013-12-10 22:19 ` Linus Torvalds
2013-12-10 22:33 ` Linus Torvalds
2013-12-10 22:38 ` Darren Hart
2013-12-10 22:57 ` Thomas Gleixner
2013-12-10 23:05 ` Linus Torvalds
2013-12-10 23:28 ` Thomas Gleixner
2013-12-10 23:31 ` Al Viro
2013-12-11 17:08 ` Oleg Nesterov
2013-12-11 17:18 ` Thomas Gleixner
2013-12-11 17:56 ` Oleg Nesterov
2013-12-11 19:18 ` PATCH? introduce get_compound_page (Was: process 'stuck' at exit) Oleg Nesterov
2013-12-13 15:10 ` Andrea Arcangeli [this message]
2013-12-13 16:22 ` Oleg Nesterov
2013-12-13 17:34 ` Andrea Arcangeli
2013-12-16 18:36 ` Oleg Nesterov
2013-12-16 20:19 ` Andrea Arcangeli
2013-12-16 20:46 ` Oleg Nesterov
2013-12-17 16:53 ` Oleg Nesterov
2013-12-17 18:06 ` Andrea Arcangeli
2013-12-18 19:19 ` [PATCH -mm 0/7] (Was: introduce get_compound_page) Oleg Nesterov
2013-12-18 19:19 ` [PATCH -mm 1/7] mm: thp: introduce __put_nontail_page() Oleg Nesterov
2013-12-18 19:19 ` [PATCH -mm 2/7] mm: thp: change __get_page_tail() to use __put_nontail_page() Oleg Nesterov
2013-12-18 19:19 ` [PATCH -mm 3/7] mm: change release_pages() to use put_page() rather than put_compound_page() Oleg Nesterov
2013-12-18 19:19 ` [PATCH -mm 4/7] mm: thp: turn put_compound_page() into __put_page_tail() Oleg Nesterov
2013-12-18 19:36 ` Peter Zijlstra
2013-12-18 19:50 ` Oleg Nesterov
2013-12-18 19:20 ` [PATCH -mm 5/7] mm: thp: reorganize out_put_single code in __put_page_tail() Oleg Nesterov
2013-12-18 19:20 ` [PATCH -mm 6/7] mm: thp: introduce get_lock_thp_head() Oleg Nesterov
2013-12-18 21:37 ` Linus Torvalds
2013-12-19 16:29 ` Oleg Nesterov
2013-12-18 19:20 ` [PATCH -mm 7/7] mm: thp: introduce compound_head_put_tail(), change get_futex_key() to use it Oleg Nesterov
2013-12-18 19:28 ` Peter Zijlstra
2013-12-18 19:40 ` Oleg Nesterov
2013-12-19 19:08 ` [PATCH 0/1] mm: fix the theoretical compound_lock() vs prep_new_page() race Oleg Nesterov
2013-12-19 19:09 ` [PATCH 1/1] " Oleg Nesterov
2013-12-23 11:43 ` Andrea Arcangeli
2014-01-03 19:55 ` [PATCH v2 0/1] " Oleg Nesterov
2014-01-03 19:55 ` [PATCH v2 1/1] " Oleg Nesterov
2014-01-03 21:00 ` Andrew Morton
2014-01-04 16:43 ` Oleg Nesterov
2014-01-08 11:54 ` Mel Gorman
2014-01-08 13:14 ` Mel Gorman
2014-01-08 16:13 ` Oleg Nesterov
2014-01-08 18:02 ` Mel Gorman
2014-01-08 19:04 ` Oleg Nesterov
2014-01-09 11:27 ` Mel Gorman
2014-01-09 14:04 ` Oleg Nesterov
2014-01-09 18:52 ` Andrea Arcangeli
2014-01-09 19:43 ` Oleg Nesterov
2014-01-09 21:36 ` Andrea Arcangeli
2014-01-10 16:12 ` Oleg Nesterov
2014-01-10 16:50 ` Peter Zijlstra
2014-01-10 16:12 ` Mel Gorman
2013-12-20 14:19 ` [PATCH 0/1] " Martin Schwidefsky
2013-12-16 20:19 ` [PATCH 0/2] mm: thp: get_huge_page_tail() cleanups Oleg Nesterov
2013-12-16 20:19 ` [PATCH -mm 1/2] mm: thp: __get_page_tail_foll() can use get_huge_page_tail() Oleg Nesterov
2013-12-16 20:19 ` [PATCH -mm 2/2] mm: thp: turn compound_head() into BUG_ON(!PageTail) in get_huge_page_tail() Oleg Nesterov
2013-12-16 20:27 ` [PATCH 0/2] mm: thp: get_huge_page_tail() cleanups Andrea Arcangeli
2013-12-10 22:42 ` process 'stuck' at exit Thomas Gleixner
2013-12-10 22:48 ` Linus Torvalds
2013-12-10 22:58 ` Darren Hart
2013-12-10 23:01 ` Dave Jones
2013-12-10 23:00 ` Dave Jones
2013-12-11 0:05 ` Steven Rostedt
2013-12-11 0:23 ` Dave Jones
2013-12-11 0:55 ` Dave Jones
2013-12-14 20:17 ` Oleg Nesterov
2013-12-11 4:09 ` Dave Jones
2013-12-12 4:26 ` Dave Jones
2013-12-12 5:29 ` Darren Hart
2013-12-10 22:51 ` Darren Hart
2013-12-10 23:24 ` Al Viro
2013-12-11 16:31 ` Mel Gorman
2013-12-11 16:38 ` Thomas Gleixner
2013-12-11 17:57 ` Mel Gorman
2013-12-12 19:00 ` Andrea Arcangeli
2013-12-12 19:03 ` Linus Torvalds
2013-12-10 22:09 ` Steven Rostedt
2013-12-10 22:16 ` Dave Jones
2013-12-10 22:21 ` Steven Rostedt
2013-12-10 22:27 ` Dave Jones
2013-12-11 1:02 ` Mel Gorman
2013-12-10 20:57 ` Darren Hart
2013-12-10 21:09 ` Dave Jones
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131213151035.GE5408@redhat.com \
--to=aarcange@redhat.com \
--cc=davej@redhat.com \
--cc=dvhart@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).