From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Shaohua Li <shaohua.li@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
David Miller <davem@davemloft.net>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Russell King <rmk@arm.linux.org.uk>,
Paul Mundt <lethal@linux-sh.org>, Jeff Dike <jdike@addtoit.com>,
Richard Weinberger <richard@nod.at>,
"Luck, Tony" <tony.luck@intel.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
Namhyung Kim <namhyung@gmail.com>,
"Shi, Alex" <alex.shi@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: REGRESSION: Performance regressions from switching anon_vma->lock to mutex
Date: Fri, 17 Jun 2011 20:32:37 +0200 [thread overview]
Message-ID: <1308335557.12801.24.camel@laptop> (raw)
In-Reply-To: <1308334688.12801.19.camel@laptop>
On Fri, 2011-06-17 at 20:18 +0200, Peter Zijlstra wrote:
> On Fri, 2011-06-17 at 11:01 -0700, Linus Torvalds wrote:
>
> > So I do think that "page_referenced_anon()" should do a trylock, and
> > return "referenced" if the trylock fails. Comments?
>
> The only problem I can immediately see with that is when a single
> process' anon memory is dominant, then such an allocation will never
> succeed in freeing these pages because the one lock will pin pretty much
> all anon. Then again, there's always a few file pages to drop.
>
> That said, its rather unlikely, and iirc people were working on removing
> direct reclaim, or at least rely less on it.
>
>
something like so I guess, completely untested etc..
Also, there's a page_lock_anon_vma() user in split_huge_page(), which is
used in mm/swap_state.c:add_to_swap(), which is also in the reclaim
path, not quite sure what to do there.
Aside from the THP thing there's a user in memory-failure.c, which looks
to be broken as it is because its calling things under tasklist_lock
which isn't preemptible, but it looks like we can simply swap the
tasklist_lock vs page_lock_anon_vma.
---
kernel/Makefile | 1 +
mm/rmap.c | 39 +++++++++++++++++++++++++++++++++++++--
2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/kernel/Makefile b/kernel/Makefile
index 2d64cfc..f6d05de 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/
obj-$(CONFIG_TRACEPOINTS) += trace/
obj-$(CONFIG_SMP) += sched_cpupri.o
obj-$(CONFIG_IRQ_WORK) += irq_work.o
+obj-m += test.o
obj-$(CONFIG_PERF_EVENTS) += events/
diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463e..40cd399 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -400,6 +400,41 @@ out:
return anon_vma;
}
+struct anon_vma *page_trylock_anon_vma(struct page *page)
+{
+ struct anon_vma *anon_vma = NULL;
+ struct anon_vma *root_anon_vma;
+ unsigned long anon_mapping;
+
+ rcu_read_lock();
+ anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping);
+ if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
+ goto out;
+ if (!page_mapped(page))
+ goto out;
+
+ anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
+ root_anon_vma = ACCESS_ONCE(anon_vma->root);
+ if (!mutex_trylock(&root_anon_vma->mutex)) {
+ anon_vma = NULL;
+ goto out;
+ }
+
+ /*
+ * If the page is still mapped, then this anon_vma is still
+ * its anon_vma, and holding the mutex ensures that it will
+ * not go away, see anon_vma_free().
+ */
+ if (!page_mapped(page)) {
+ mutex_unlock(&root_anon_vma->mutex);
+ anon_vma = NULL;
+ }
+
+out:
+ rcu_read_unlock();
+ return anon_vma;
+}
+
/*
* Similar to page_get_anon_vma() except it locks the anon_vma.
*
@@ -694,7 +729,7 @@ static int page_referenced_anon(struct page *page,
struct anon_vma_chain *avc;
int referenced = 0;
- anon_vma = page_lock_anon_vma(page);
+ anon_vma = page_trylock_anon_vma(page);
if (!anon_vma)
return referenced;
@@ -1396,7 +1431,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
struct anon_vma_chain *avc;
int ret = SWAP_AGAIN;
- anon_vma = page_lock_anon_vma(page);
+ anon_vma = page_trylock_anon_vma(page);
if (!anon_vma)
return ret;
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Shaohua Li <shaohua.li@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
David Miller <davem@davemloft.net>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Russell King <rmk@arm.linux.org.uk>,
Paul Mundt <lethal@linux-sh.org>, Jeff Dike <jdike@addtoit.com>,
Richard Weinberger <richard@nod.at>,
"Luck, Tony" <tony.luck@intel.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
Namhyung Kim <namhyung@gmail.com>,
"Shi, Alex" <alex.shi@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: REGRESSION: Performance regressions from switching anon_vma->lock to mutex
Date: Fri, 17 Jun 2011 20:32:37 +0200 [thread overview]
Message-ID: <1308335557.12801.24.camel@laptop> (raw)
In-Reply-To: <1308334688.12801.19.camel@laptop>
On Fri, 2011-06-17 at 20:18 +0200, Peter Zijlstra wrote:
> On Fri, 2011-06-17 at 11:01 -0700, Linus Torvalds wrote:
>
> > So I do think that "page_referenced_anon()" should do a trylock, and
> > return "referenced" if the trylock fails. Comments?
>
> The only problem I can immediately see with that is when a single
> process' anon memory is dominant, then such an allocation will never
> succeed in freeing these pages because the one lock will pin pretty much
> all anon. Then again, there's always a few file pages to drop.
>
> That said, its rather unlikely, and iirc people were working on removing
> direct reclaim, or at least rely less on it.
>
>
something like so I guess, completely untested etc..
Also, there's a page_lock_anon_vma() user in split_huge_page(), which is
used in mm/swap_state.c:add_to_swap(), which is also in the reclaim
path, not quite sure what to do there.
Aside from the THP thing there's a user in memory-failure.c, which looks
to be broken as it is because its calling things under tasklist_lock
which isn't preemptible, but it looks like we can simply swap the
tasklist_lock vs page_lock_anon_vma.
---
kernel/Makefile | 1 +
mm/rmap.c | 39 +++++++++++++++++++++++++++++++++++++--
2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/kernel/Makefile b/kernel/Makefile
index 2d64cfc..f6d05de 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/
obj-$(CONFIG_TRACEPOINTS) += trace/
obj-$(CONFIG_SMP) += sched_cpupri.o
obj-$(CONFIG_IRQ_WORK) += irq_work.o
+obj-m += test.o
obj-$(CONFIG_PERF_EVENTS) += events/
diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463e..40cd399 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -400,6 +400,41 @@ out:
return anon_vma;
}
+struct anon_vma *page_trylock_anon_vma(struct page *page)
+{
+ struct anon_vma *anon_vma = NULL;
+ struct anon_vma *root_anon_vma;
+ unsigned long anon_mapping;
+
+ rcu_read_lock();
+ anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping);
+ if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
+ goto out;
+ if (!page_mapped(page))
+ goto out;
+
+ anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
+ root_anon_vma = ACCESS_ONCE(anon_vma->root);
+ if (!mutex_trylock(&root_anon_vma->mutex)) {
+ anon_vma = NULL;
+ goto out;
+ }
+
+ /*
+ * If the page is still mapped, then this anon_vma is still
+ * its anon_vma, and holding the mutex ensures that it will
+ * not go away, see anon_vma_free().
+ */
+ if (!page_mapped(page)) {
+ mutex_unlock(&root_anon_vma->mutex);
+ anon_vma = NULL;
+ }
+
+out:
+ rcu_read_unlock();
+ return anon_vma;
+}
+
/*
* Similar to page_get_anon_vma() except it locks the anon_vma.
*
@@ -694,7 +729,7 @@ static int page_referenced_anon(struct page *page,
struct anon_vma_chain *avc;
int referenced = 0;
- anon_vma = page_lock_anon_vma(page);
+ anon_vma = page_trylock_anon_vma(page);
if (!anon_vma)
return referenced;
@@ -1396,7 +1431,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
struct anon_vma_chain *avc;
int ret = SWAP_AGAIN;
- anon_vma = page_lock_anon_vma(page);
+ anon_vma = page_trylock_anon_vma(page);
if (!anon_vma)
return ret;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2011-06-17 18:28 UTC|newest]
Thread overview: 166+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-15 0:29 REGRESSION: Performance regressions from switching anon_vma->lock to mutex Tim Chen
2011-06-15 0:29 ` Tim Chen
2011-06-15 0:36 ` Andi Kleen
2011-06-15 0:36 ` Andi Kleen
2011-06-17 19:07 ` Ingo Molnar
2011-06-17 19:07 ` Ingo Molnar
2011-06-15 1:21 ` Linus Torvalds
2011-06-15 1:21 ` Linus Torvalds
2011-06-15 3:42 ` Linus Torvalds
2011-06-15 1:26 ` Shaohua Li
2011-06-15 1:26 ` Shaohua Li
2011-06-15 11:52 ` Peter Zijlstra
2011-06-15 11:52 ` Peter Zijlstra
2011-06-15 12:49 ` Peter Zijlstra
2011-06-15 12:49 ` Peter Zijlstra
2011-06-15 16:18 ` Andi Kleen
2011-06-15 16:18 ` Andi Kleen
2011-06-15 16:45 ` Peter Zijlstra
2011-06-15 16:45 ` Peter Zijlstra
2011-06-15 16:47 ` Andi Kleen
2011-06-15 16:47 ` Andi Kleen
2011-06-15 18:43 ` Tim Chen
2011-06-15 18:43 ` Tim Chen
2011-06-15 20:32 ` Peter Zijlstra
2011-06-15 20:32 ` Peter Zijlstra
2011-06-15 20:57 ` Andi Kleen
2011-06-15 20:57 ` Andi Kleen
2011-06-15 21:12 ` Tim Chen
2011-06-15 21:12 ` Tim Chen
2011-06-15 21:37 ` Peter Zijlstra
2011-06-15 21:37 ` Peter Zijlstra
2011-06-15 21:51 ` Linus Torvalds
2011-06-15 21:51 ` Linus Torvalds
2011-06-15 22:19 ` Andi Kleen
2011-06-15 22:19 ` Andi Kleen
2011-06-16 0:16 ` Linus Torvalds
2011-06-16 0:16 ` Linus Torvalds
2011-06-16 20:14 ` Andi Kleen
2011-06-16 20:14 ` Andi Kleen
2011-06-16 20:37 ` Linus Torvalds
2011-06-16 20:37 ` Linus Torvalds
2011-06-17 0:24 ` Andi Kleen
2011-06-17 9:13 ` Ingo Molnar
2011-06-17 9:13 ` Ingo Molnar
2011-06-15 22:15 ` Andi Kleen
2011-06-15 22:15 ` Andi Kleen
2011-06-16 1:08 ` Tim Chen
2011-06-16 1:08 ` Tim Chen
2011-06-16 1:50 ` Linus Torvalds
2011-06-16 1:50 ` Linus Torvalds
2011-06-16 20:26 ` Tim Chen
2011-06-16 20:26 ` Tim Chen
2011-06-16 20:47 ` Linus Torvalds
2011-06-16 20:47 ` Linus Torvalds
2011-06-16 21:05 ` Linus Torvalds
2011-06-16 21:05 ` Linus Torvalds
2011-06-16 21:06 ` Linus Torvalds
2011-06-16 21:26 ` Linus Torvalds
2011-06-16 21:26 ` Linus Torvalds
2011-06-17 3:58 ` Linus Torvalds
2011-06-17 11:28 ` Peter Zijlstra
2011-06-17 11:28 ` Peter Zijlstra
2011-06-17 11:54 ` Peter Zijlstra
2011-06-17 11:54 ` Peter Zijlstra
2011-06-17 16:36 ` Linus Torvalds
2011-06-17 16:36 ` Linus Torvalds
2011-06-17 17:41 ` Hugh Dickins
2011-06-17 17:41 ` Hugh Dickins
2011-06-17 17:55 ` Peter Zijlstra
2011-06-17 17:55 ` Peter Zijlstra
2011-06-17 18:01 ` Linus Torvalds
2011-06-17 18:01 ` Linus Torvalds
2011-06-17 18:18 ` Peter Zijlstra
2011-06-17 18:18 ` Peter Zijlstra
2011-06-17 18:32 ` Peter Zijlstra [this message]
2011-06-17 18:32 ` Peter Zijlstra
2011-06-17 18:39 ` Linus Torvalds
2011-06-17 18:41 ` Linus Torvalds
2011-06-17 18:41 ` Linus Torvalds
2011-06-17 20:19 ` Tim Chen
2011-06-17 20:19 ` Tim Chen
2011-06-17 22:20 ` Hugh Dickins
2011-06-17 22:20 ` Hugh Dickins
2011-06-18 4:47 ` Linus Torvalds
2011-06-18 4:47 ` Linus Torvalds
2011-06-17 19:53 ` [PATCH] mm, memory-failure: Fix spinlock vs mutex order Peter Zijlstra
2011-06-17 19:53 ` Peter Zijlstra
2011-06-17 20:04 ` Andi Kleen
2011-06-17 20:04 ` Andi Kleen
2011-06-17 16:46 ` REGRESSION: Performance regressions from switching anon_vma->lock to mutex Linus Torvalds
2011-06-17 16:46 ` Linus Torvalds
2011-06-17 17:28 ` Linus Torvalds
2011-06-17 19:40 ` Andi Kleen
2011-06-17 19:40 ` Andi Kleen
2011-06-18 8:08 ` Ingo Molnar
2011-06-18 8:08 ` Ingo Molnar
2011-06-17 18:22 ` Tim Chen
2011-06-17 18:22 ` Tim Chen
2011-06-17 19:05 ` Ray Lee
2011-06-17 19:05 ` Ray Lee
2011-06-16 22:00 ` Andi Kleen
2011-06-16 22:00 ` Andi Kleen
2011-06-15 10:36 ` Peter Zijlstra
2011-06-15 10:36 ` Peter Zijlstra
2011-06-15 10:58 ` Peter Zijlstra
2011-06-15 10:58 ` Peter Zijlstra
2011-06-15 11:41 ` Peter Zijlstra
2011-06-15 11:41 ` Peter Zijlstra
2011-06-15 19:11 ` Linus Torvalds
2011-06-15 19:11 ` Linus Torvalds
2011-06-15 19:24 ` Andrew Morton
2011-06-15 19:24 ` Andrew Morton
2011-06-15 20:16 ` Ingo Molnar
2011-06-15 20:16 ` Ingo Molnar
2011-06-15 20:55 ` Linus Torvalds
2011-06-15 20:55 ` Linus Torvalds
2011-06-15 20:12 ` [GIT PULL] " Ingo Molnar
2011-06-15 20:12 ` Ingo Molnar
2011-06-15 20:29 ` Paul E. McKenney
2011-06-15 20:29 ` Paul E. McKenney
2011-06-15 20:47 ` Linus Torvalds
2011-06-15 20:47 ` Linus Torvalds
2011-06-15 20:54 ` Paul E. McKenney
2011-06-15 20:54 ` Paul E. McKenney
2011-06-15 21:05 ` Linus Torvalds
2011-06-15 21:05 ` Linus Torvalds
2011-06-15 21:15 ` Paul E. McKenney
2011-06-15 21:15 ` Paul E. McKenney
2011-06-15 21:27 ` Linus Torvalds
2011-06-15 21:27 ` Linus Torvalds
2011-06-16 7:03 ` Ingo Molnar
2011-06-16 7:03 ` Ingo Molnar
2011-06-16 17:16 ` Paul E. McKenney
2011-06-16 17:16 ` Paul E. McKenney
2011-06-16 20:25 ` Ingo Molnar
2011-06-16 20:25 ` Ingo Molnar
2011-06-16 21:01 ` Frederic Weisbecker
2011-06-16 21:01 ` Frederic Weisbecker
2011-06-16 23:02 ` Ingo Molnar
2011-06-16 23:02 ` Ingo Molnar
2011-06-17 15:19 ` Frederic Weisbecker
2011-06-17 15:19 ` Frederic Weisbecker
2011-06-16 21:02 ` Andi Kleen
2011-06-16 21:02 ` Andi Kleen
2011-06-16 22:21 ` Benjamin Herrenschmidt
2011-06-16 22:21 ` Benjamin Herrenschmidt
2011-06-16 22:38 ` Ingo Molnar
2011-06-16 22:38 ` Ingo Molnar
2011-06-16 22:47 ` Andi Kleen
2011-06-16 22:47 ` Andi Kleen
2011-06-16 22:58 ` Ingo Molnar
2011-06-16 22:58 ` Ingo Molnar
2011-06-17 0:45 ` Paul E. McKenney
2011-06-17 0:45 ` Paul E. McKenney
2011-06-17 9:43 ` Ingo Molnar
2011-06-17 9:43 ` Ingo Molnar
2011-06-17 16:48 ` Paul E. McKenney
2011-06-17 16:48 ` Paul E. McKenney
2011-06-16 23:37 ` Paul E. McKenney
2011-06-16 23:37 ` Paul E. McKenney
2011-06-15 20:13 ` Tim Chen
2011-06-15 20:13 ` Tim Chen
2011-06-15 20:17 ` Ingo Molnar
2011-06-15 20:17 ` Ingo Molnar
2011-06-15 20:21 ` Tim Chen
2011-06-15 20:21 ` Tim Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1308335557.12801.24.camel@laptop \
--to=peterz@infradead.org \
--cc=ak@linux.intel.com \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@intel.com \
--cc=benh@kernel.crashing.org \
--cc=davem@davemloft.net \
--cc=hughd@google.com \
--cc=jdike@addtoit.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=lethal@linux-sh.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mel@csn.ul.ie \
--cc=namhyung@gmail.com \
--cc=npiggin@kernel.dk \
--cc=richard@nod.at \
--cc=rjw@sisk.pl \
--cc=rmk@arm.linux.org.uk \
--cc=schwidefsky@de.ibm.com \
--cc=shaohua.li@intel.com \
--cc=tim.c.chen@linux.intel.com \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.