From: Dave Chinner <david@fromorbit.com>
To: viro@ZenIV.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: [PATCH 03/14] vmscan: shrinker->nr updates race and go wrong
Date: Fri, 8 Jul 2011 14:14:35 +1000
Message-ID: <1310098486-6453-4-git-send-email-david@fromorbit.com>
In-Reply-To: <1310098486-6453-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
shrink_slab() allows shrinkers to be called in parallel so the
struct shrinker can be updated concurrently. It does not provide any
exclusion for such updates, so we can get the shrinker->nr value
increasing or decreasing incorrectly.
As a result, when a shrinker repeatedly returns a value of -1 (e.g.
a VFS shrinker called w/ GFP_NOFS), the shrinker->nr goes haywire,
sometimes updating with the scan count that wasn't used, sometimes
losing it altogether. Worse is when a shrinker does work and that
update is lost due to racy updates, which means the shrinker will do
the work again!
Fix this by making the total_scan calculations independent of
shrinker->nr, and making the shrinker->nr updates atomic w.r.t.
other updates via cmpxchg loops.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
mm/vmscan.c | 45 ++++++++++++++++++++++++++++++++-------------
1 files changed, 32 insertions(+), 13 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 646cb6c..666d811 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -251,17 +251,29 @@ unsigned long shrink_slab(struct shrink_control *shrink,
unsigned long total_scan;
unsigned long max_pass;
int shrink_ret = 0;
+ long nr;
+ long new_nr;
+ /*
+ * copy the current shrinker scan count into a local variable
+ * and zero it so that other concurrent shrinker invocations
+ * don't also do this scanning work.
+ */
+ do {
+ nr = shrinker->nr;
+ } while (cmpxchg(&shrinker->nr, nr, 0) != nr);
+
+ total_scan = nr;
max_pass = do_shrinker_shrink(shrinker, shrink, 0);
delta = (4 * nr_pages_scanned) / shrinker->seeks;
delta *= max_pass;
do_div(delta, lru_pages + 1);
- shrinker->nr += delta;
- if (shrinker->nr < 0) {
+ total_scan += delta;
+ if (total_scan < 0) {
printk(KERN_ERR "shrink_slab: %pF negative objects to "
"delete nr=%ld\n",
- shrinker->shrink, shrinker->nr);
- shrinker->nr = max_pass;
+ shrinker->shrink, total_scan);
+ total_scan = max_pass;
}
/*
@@ -269,13 +281,10 @@ unsigned long shrink_slab(struct shrink_control *shrink,
* never try to free more than twice the estimate number of
* freeable entries.
*/
- if (shrinker->nr > max_pass * 2)
- shrinker->nr = max_pass * 2;
-
- total_scan = shrinker->nr;
- shrinker->nr = 0;
+ if (total_scan > max_pass * 2)
+ total_scan = max_pass * 2;
- trace_mm_shrink_slab_start(shrinker, shrink, total_scan,
+ trace_mm_shrink_slab_start(shrinker, shrink, nr,
nr_pages_scanned, lru_pages,
max_pass, delta, total_scan);
@@ -296,9 +305,19 @@ unsigned long shrink_slab(struct shrink_control *shrink,
cond_resched();
}
- shrinker->nr += total_scan;
- trace_mm_shrink_slab_end(shrinker, shrink_ret, total_scan,
- shrinker->nr);
+ /*
+ * move the unused scan count back into the shrinker in a
+ * manner that handles concurrent updates. If we exhausted the
+ * scan, there is no need to do an update.
+ */
+ do {
+ nr = shrinker->nr;
+ new_nr = total_scan + nr;
+ if (total_scan <= 0)
+ break;
+ } while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
+
+ trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
}
up_read(&shrinker_rwsem);
out:
--
1.7.5.1
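
The two cmpxchg loops in the patch above follow a "claim, then hand back"
pattern. Below is a minimal userspace sketch of that pattern, assuming C11
<stdatomic.h> in place of the kernel's cmpxchg(). The names claim_deferred()
and return_unused() are hypothetical and do not appear in mm/vmscan.c; they
only illustrate the idea and are not the kernel implementation.

```c
#include <stdatomic.h>

/*
 * Illustration only: claim_deferred() and return_unused() are hypothetical
 * helpers, and the _Atomic long counter stands in for shrinker->nr.
 */

/* Take the whole deferred count and leave zero behind, so that no other
 * concurrent caller picks up the same scanning work. */
static long claim_deferred(_Atomic long *counter)
{
	long old = atomic_load(counter);

	/* On failure, 'old' is refreshed with the current value and we retry. */
	while (!atomic_compare_exchange_weak(counter, &old, 0))
		;
	return old;
}

/* Add unused scan work back without overwriting concurrent updates.
 * Returns the counter value after the call. */
static long return_unused(_Atomic long *counter, long unused)
{
	long old, new_val;

	/* Nothing left over: no update needed, as in the patch. */
	if (unused <= 0)
		return atomic_load(counter);

	old = atomic_load(counter);
	do {
		/* Recomputed on every retry because 'old' may have changed. */
		new_val = old + unused;
	} while (!atomic_compare_exchange_weak(counter, &old, new_val));

	return new_val;
}
```

For example, if the counter holds 100, claim_deferred() returns 100 and
leaves 0. If another caller then defers 50 while the first caller's shrinker
could not run, return_unused(&counter, 128) leaves 178 rather than
overwriting the 50.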