linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, xfs@oss.sgi.com
Subject: Re: [PATCH 02/12] vmscan: shrinker->nr updates race and go wrong
Date: Mon, 20 Jun 2011 11:25:31 +1000
Message-ID: <20110620012531.GN561@dastard>
In-Reply-To: <4DFE987E.1070900@jp.fujitsu.com>

On Mon, Jun 20, 2011 at 09:46:54AM +0900, KOSAKI Motohiro wrote:
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 48e3fbd..dce2767 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -251,17 +251,29 @@ unsigned long shrink_slab(struct shrink_control *shrink,
> >  		unsigned long total_scan;
> >  		unsigned long max_pass;
> >  		int shrink_ret = 0;
> > +		long nr;
> > +		long new_nr;
> >  
> > +		/*
> > +		 * copy the current shrinker scan count into a local variable
> > +		 * and zero it so that other concurrent shrinker invocations
> > +		 * don't also do this scanning work.
> > +		 */
> > +		do {
> > +			nr = shrinker->nr;
> > +		} while (cmpxchg(&shrinker->nr, nr, 0) != nr);
> > +
> > +		total_scan = nr;
> >  		max_pass = do_shrinker_shrink(shrinker, shrink, 0);
> >  		delta = (4 * nr_pages_scanned) / shrinker->seeks;
> >  		delta *= max_pass;
> >  		do_div(delta, lru_pages + 1);
> > -		shrinker->nr += delta;
> > -		if (shrinker->nr < 0) {
> > +		total_scan += delta;
> > +		if (total_scan < 0) {
> >  			printk(KERN_ERR "shrink_slab: %pF negative objects to "
> >  			       "delete nr=%ld\n",
> > -			       shrinker->shrink, shrinker->nr);
> > -			shrinker->nr = max_pass;
> > +			       shrinker->shrink, total_scan);
> > +			total_scan = max_pass;
> >  		}
> >  
> >  		/*
> > @@ -269,13 +281,11 @@ unsigned long shrink_slab(struct shrink_control *shrink,
> >  		 * never try to free more than twice the estimate number of
> >  		 * freeable entries.
> >  		 */
> > -		if (shrinker->nr > max_pass * 2)
> > -			shrinker->nr = max_pass * 2;
> > +		if (total_scan > max_pass * 2)
> > +			total_scan = max_pass * 2;
> >  
> > -		total_scan = shrinker->nr;
> > -		shrinker->nr = 0;
> >  
> > -		trace_mm_shrink_slab_start(shrinker, shrink, nr_pages_scanned,
> > +		trace_mm_shrink_slab_start(shrinker, shrink, nr, nr_pages_scanned,
> >  					lru_pages, max_pass, delta, total_scan);
> >  
> >  		while (total_scan >= SHRINK_BATCH) {
> > @@ -295,8 +305,19 @@ unsigned long shrink_slab(struct shrink_control *shrink,
> >  			cond_resched();
> >  		}
> >  
> > -		shrinker->nr += total_scan;
> > -		trace_mm_shrink_slab_end(shrinker, shrink_ret, total_scan);
> > +		/*
> > +		 * move the unused scan count back into the shrinker in a
> > +		 * manner that handles concurrent updates. If we exhausted the
> > +		 * scan, there is no need to do an update.
> > +		 */
> > +		do {
> > +			nr = shrinker->nr;
> > +			new_nr = total_scan + nr;
> > +			if (total_scan <= 0)
> > +				break;
> > +		} while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
> > +
> > +		trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
> >  	}
> >  	up_read(&shrinker_rwsem);
> >  out:
> 
> Looks like a great fix. Please remove the tracepoint change from this patch
> and send it to -stable. IOW, I expect I'll ack your next spin.

I don't believe such a change belongs in -stable. This code has been
buggy for many years and, as I mentioned, this change actually makes
existing bad shrinker behaviour worse. I don't test stable kernels, so
I have no idea what side effects it will have outside of this series.
I'm extremely hesitant to change VM behaviour in stable kernels
without having tested it first, so I'm not going to push it for stable
kernels.

If you want it in stable kernels, then you can always let
stable@kernel.org know once the commits are in the mainline tree and
you've tested them...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
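
The two cmpxchg loops in the quoted patch follow a claim-then-return pattern: take the whole pending count and leave zero behind, do the work, then add any unused remainder back without overwriting concurrent updates. Below is a minimal userspace sketch of that pattern, assuming C11 <stdatomic.h> in place of the kernel's cmpxchg(). The function names (claim_scan_count, return_unused_scan, scan_one_pass) and the delta and batch parameters are hypothetical and are not taken from mm/vmscan.c.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Take the whole pending count and leave zero behind, so a concurrent
 * caller cannot claim the same work. Mirrors the first loop in the patch.
 */
static long claim_scan_count(atomic_long *pending)
{
	long nr = atomic_load(pending);

	/* On failure, nr is refreshed with the current value and we retry. */
	while (!atomic_compare_exchange_weak(pending, &nr, 0))
		;
	return nr;
}

/*
 * Add unused work back on top of whatever other callers stored in the
 * meantime. Mirrors the second loop in the patch. Returns the resulting
 * pending count; if nothing is left over, no update is made.
 */
static long return_unused_scan(atomic_long *pending, long unused)
{
	long nr = atomic_load(pending);

	if (unused <= 0)
		return nr;
	/* nr + unused is recomputed on every retry from the refreshed nr. */
	while (!atomic_compare_exchange_weak(pending, &nr, nr + unused))
		;
	return nr + unused;
}

/*
 * Hypothetical driver: claim the deferred work, add this pass's delta,
 * consume it in whole batches, and hand the remainder back.
 */
static long scan_one_pass(atomic_long *pending, long delta, long batch)
{
	long total_scan;

	assert(batch > 0);
	total_scan = claim_scan_count(pending) + delta;
	while (total_scan >= batch)
		total_scan -= batch;	/* stand-in for scanning one batch */
	return return_unused_scan(pending, total_scan);
}
```

With 100 objects pending, a delta of 60 and a batch size of 128, scan_one_pass() consumes one batch and leaves 32 pending. An atomic exchange would be enough for the claim step on its own; the compare-and-swap loop is kept here only to stay close to the shape of the patch.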


