From: Wu Fengguang <fengguang.wu@intel.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jan Kara <jack@suse.cz>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	paulmck <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH 3/5] writeback: fix dirtied pages accounting on sub-page writes
Date: Tue, 22 Nov 2011 22:11:29 +0800
Message-ID: <20111122141129.GC10545@localhost>
In-Reply-To: <1321969999.14799.10.camel@twins>

On Tue, Nov 22, 2011 at 09:53:19PM +0800, Peter Zijlstra wrote:
> On Tue, 2011-11-22 at 21:41 +0800, Wu Fengguang wrote:
> > On Tue, Nov 22, 2011 at 09:07:50PM +0800, Wu Fengguang wrote:
> > > On Tue, Nov 22, 2011 at 08:57:42PM +0800, Peter Zijlstra wrote:
> > > > On Tue, 2011-11-22 at 13:21 +0100, Jan Kara wrote:
> > > > > > +             __get_cpu_var(bdp_ratelimits)++;
> > > > >   I think you need preempt_disable() and preempt_enable() pair around
> > > > > __get_cpu_var(). Otherwise a process could get rescheduled in the middle of
> > > > > read-modify-write cycle... 
> > > > 
> > > > there's of course the this_cpu_inc(bdp_ratelimits); thing.
> > > > 
> > > > On x86 that'll turn into a single insn, on others it will add the
> > > > required preempt_disable/enable bits.
> > > 
> > > It's good to know that. But what if we don't really care which CPU
> > > data it's increasing, and can accept losing some increases due to the
> > > resulted race condition?
> > 
> > I just added a comment for it, hope it helps :)
> > 
> >                 /*
> >                  * This is racy, however bdp_ratelimits merely serves as a
> >                  * gross safeguard. We don't really care the exact CPU it's
> >                  * charging to and the resulted inaccuracy is acceptable.
> >                  */
> >                 __get_cpu_var(bdp_ratelimits)++;
> 
> Thing is, I'm not sure how much update you can effectively wreck by
> interleaving the RmW cycles of two CPUs like this.

Yeah, there is also the side effect of cache line bouncing, which makes
it not a clear win... and a pure loss on x86, where this_cpu_inc() is a
single instruction anyway.

> Simply loosing a few increments would be fine, but what are the
> practical implications of actually relying on this behaviour and how do
> various architectures cope.

OK I'll give up the weird (mis-)use of the per-cpu data structure :)
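
For reference, here is a minimal user-space sketch of the lost-update
interleaving discussed above. It is only an illustration: struct
counter_slot and both helper functions are hypothetical names, not kernel
code, and the interleaving is written out by hand rather than produced by
real preemption or by a second CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-CPU counter slot. */
struct counter_slot {
	long value;
};

/*
 * Two contexts each do load / add / store on the same slot, with both
 * loads happening before either store. The second store overwrites the
 * first, so one of the two increments is lost.
 */
long interleaved_increments(struct counter_slot *slot)
{
	long reg_a, reg_b;

	assert(slot != NULL);
	reg_a = slot->value;	/* context A loads the old value */
	reg_b = slot->value;	/* context B loads the same old value */
	reg_a += 1;
	reg_b += 1;
	slot->value = reg_a;	/* A stores old + 1 */
	slot->value = reg_b;	/* B stores old + 1 again */
	return slot->value;
}

/* The same two increments with no interleaving: nothing is lost. */
long serialized_increments(struct counter_slot *slot)
{
	assert(slot != NULL);
	slot->value += 1;
	slot->value += 1;
	return slot->value;
}
```

Starting from zero, the interleaved variant ends at 1 and the serialized
one at 2. The sketch only models the lost increment; it says nothing
about the cache line bouncing cost or about how a particular
architecture implements the increment.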

Thanks,
Fengguang
---
Subject: writeback: fix dirtied pages accounting on sub-page writes
Date: Thu Apr 14 07:52:37 CST 2011

When dd writes in 512-byte chunks, generic_perform_write() calls
balance_dirty_pages_ratelimited() 8 times for the same 4KB page, but
the page is only dirtied once.

Fix it by accounting tsk->nr_dirtied and bdp_ratelimits at page dirty time.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 mm/page-writeback.c |   13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)
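
The over-counting described in the changelog can be sketched in user
space as below. This is a rough model under stated assumptions, not the
kernel code path: SIM_PAGE_SIZE and both function names are made up for
illustration, writes are sequential from offset 0, the chunk size divides
the page size, and no page is cleaned by writeback between writes.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u	/* assumed page size for the model */

/* Old scheme: count one dirtied page per write call, new page or not. */
unsigned long count_per_write_call(size_t total_bytes, size_t chunk)
{
	unsigned long counted = 0;
	size_t pos;

	assert(chunk > 0 && chunk <= SIM_PAGE_SIZE &&
	       SIM_PAGE_SIZE % chunk == 0);
	for (pos = 0; pos < total_bytes; pos += chunk)
		counted++;
	return counted;
}

/* New scheme: count only when a write touches a page not yet dirtied. */
unsigned long count_per_dirtied_page(size_t total_bytes, size_t chunk)
{
	unsigned long counted = 0;
	size_t last_dirty = (size_t)-1;	/* no page dirtied so far */
	size_t pos;

	assert(chunk > 0 && chunk <= SIM_PAGE_SIZE &&
	       SIM_PAGE_SIZE % chunk == 0);
	for (pos = 0; pos < total_bytes; pos += chunk) {
		size_t index = pos / SIM_PAGE_SIZE;

		if (index != last_dirty) {
			last_dirty = index;
			counted++;
		}
	}
	return counted;
}
```

With these assumptions, count_per_write_call(4096, 512) gives 8 while
count_per_dirtied_page(4096, 512) gives 1, which is the 8x discrepancy
the patch removes.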

--- linux-next.orig/mm/page-writeback.c	2011-11-22 22:01:56.000000000 +0800
+++ linux-next/mm/page-writeback.c	2011-11-22 22:02:32.000000000 +0800
@@ -1246,8 +1246,6 @@ void balance_dirty_pages_ratelimited_nr(
 	if (bdi->dirty_exceeded)
 		ratelimit = min(ratelimit, 32 >> (PAGE_SHIFT - 10));
 
-	current->nr_dirtied += nr_pages_dirtied;
-
 	preempt_disable();
 	/*
 	 * This prevents one CPU to accumulate too many dirtied pages without
@@ -1258,12 +1256,9 @@ void balance_dirty_pages_ratelimited_nr(
 	p =  &__get_cpu_var(bdp_ratelimits);
 	if (unlikely(current->nr_dirtied >= ratelimit))
 		*p = 0;
-	else {
-		*p += nr_pages_dirtied;
-		if (unlikely(*p >= ratelimit_pages)) {
-			*p = 0;
-			ratelimit = 0;
-		}
+	else if (unlikely(*p >= ratelimit_pages)) {
+		*p = 0;
+		ratelimit = 0;
 	}
 	/*
 	 * Pick up the dirtied pages by the exited tasks. This avoids lots of
@@ -1758,6 +1753,8 @@ void account_page_dirtied(struct page *p
 		__inc_bdi_stat(mapping->backing_dev_info, BDI_DIRTIED);
 		task_dirty_inc(current);
 		task_io_account_write(PAGE_CACHE_SIZE);
+		current->nr_dirtied++;
+		this_cpu_inc(bdp_ratelimits);
 	}
 }
 EXPORT_SYMBOL(account_page_dirtied);
