From mboxrd@z Thu Jan  1 00:00:00 1970
From: Simon Jeons <simon.jeons@gmail.com>
Subject: Re: [PATCH] writeback: fix writeback cache thrashing
Date: Sat, 05 Jan 2013 03:41:54 -0600
Message-ID: <1357378914.8716.3.camel@kernel.cn.ibm.com>
References: <1356847190-7986-1-git-send-email-linkinjeon@gmail.com>
	 <20121231113054.GC7564@quack.suse.cz>
	 <20130102134334.GB30633@quack.suse.cz>
	 <CAKYAXd8-sZo0XcdHuyOQ1qT_s3kJXyphXsjSS7e1-sJ1QaAOgg@mail.gmail.com>
	 <1357261151.5105.2.camel@kernel.cn.ibm.com>
	 <CAKYAXd-kcnxm6Do9VcbdyrCBvArrjz1iHOpxXHnyUyNcqP7Ofg@mail.gmail.com>
	 <1357346803.5273.10.camel@kernel.cn.ibm.com>
	 <20130105032642.GA8188@localhost>
	 <1357363603.5273.16.camel@kernel.cn.ibm.com>
	 <20130105073846.GA11811@localhost>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Cc: Namjae Jeon <linkinjeon@gmail.com>, Jan Kara <jack@suse.cz>, Wanpeng Li
 <liwanp@linux.vnet.ibm.com>, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org,  linux-kernel@vger.kernel.org, Namjae Jeon
 <namjae.jeon@samsung.com>, Vivek Trivedi <t.vivek@samsung.com>, Dave
 Chinner <dchinner@redhat.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Return-path: <owner-linux-mm@kvack.org>
In-Reply-To: <20130105073846.GA11811@localhost>
Sender: owner-linux-mm@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

On Sat, 2013-01-05 at 15:38 +0800, Fengguang Wu wrote:
> On Fri, Jan 04, 2013 at 11:26:43PM -0600, Simon Jeons wrote:
> > On Sat, 2013-01-05 at 11:26 +0800, Fengguang Wu wrote:
> > > > > > Hi Namjae,
> > > > > >
> > > > > > Why use bdi_stat_error here? What's the meaning of its comment "maximal
> > > > > > error of a stat counter"?
> > > > > Hi Simon,
> > > > >
> > > > > As you know bdi stats (BDI_RECLAIMABLE, BDI_WRITEBACK …) are kept in
> > > > > percpu counters.
> > > > > When these percpu counters are incremented/decremented simultaneously
> > > > > on multiple CPUs by small amount (individual cpu counter less than
> > > > > threshold BDI_STAT_BATCH),
> > > > > it is possible that we get approximate value (not exact value) of
> > > > > these percpu counters.
> > > > > In order, to handle these percpu counter error we have used
> > > > > bdi_stat_error. bdi_stat_error is the maximum error which can happen
> > > > > in percpu bdi stats accounting.
> > > > >
> > > > > bdi_stat(bdi, BDI_RECLAIMABLE);
> > > > >  -> This will give approximate value of BDI_RECLAIMABLE by reading
> > > > > previous value of percpu count.
> > > > >
> > > > > bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > > > >  ->This will give exact value of BDI_RECLAIMABLE. It will take lock
> > > > > and add current percpu count of individual CPUs.
> > > > >    It is not recommended to use it frequently as it is expensive. We
> > > > > can better use “bdi_stat” and work with approx value of bdi stats.
> > > > >
> > > >
> > > > Hi Namjae, thanks for your clarify.
> > > >
> > > > But why compare error stat count to bdi_bground_thresh? What's the
> > >
> > > It's not comparing bdi_stat_error to bdi_bground_thresh, but rather,
> > > in concept, comparing bdi_stat (with error bound adjustments) to
> > > bdi_bground_thresh.
> > >
> > > > relationship between them? I also see bdi_stat_error compare to
> > > > bdi_thresh/bdi_dirty in function balance_dirty_pages.
> > >
> >
> > Hi Fengguang,
> >
> > > Here, it's trying to use bdi_stat_sum(), the accurate (however more
> > > costly) version of bdi_stat(), if the error would possibly be large:
> >
> > Why error is large use bdi_stat_sum and error is few use bdi_stat?
>

Thanks for your response Fengguang! :)

> It's the opposite. Please check this per-cpu counter routine to get an idea:
>
> /*
>  * Add up all the per-cpu counts, return the result.  This is a more accurate
>  * but much slower version of percpu_counter_read_positive()
>  */
> s64 __percpu_counter_sum(struct percpu_counter *fbc)
>
> > >
> > >                 if (bdi_thresh < 2 * bdi_stat_error(bdi)) {
> > >                         bdi_reclaimable = bdi_stat_sum(bdi, BDI_RECLAIMABLE);
> > >                         //...
> > >                 } else {
> > >                         bdi_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
> > >                         //...
> > >                 }
> > >

The comment above this code says:

                 * In order to avoid the stacked BDI deadlock we need
                 * to ensure we accurately count the 'dirty' pages when
                 * the threshold is low.

Why does a low threshold mean the error is large?
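
My tentative reading, please correct me if it is wrong: the error bound does not shrink with the threshold, so the lower the threshold, the larger the error is relative to it. Below is a minimal user-space sketch of the quoted check, not the kernel code. The model_* names and the numbers are hypothetical, and I am assuming the bound is one batch of unflushed updates per CPU.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed worst-case error of the cheap read: every CPU may hold up to
 * one batch of updates that has not been folded into the shared count.
 * (Hypothetical stand-in for bdi_stat_error(); not the kernel function.)
 */
static unsigned long model_stat_error(unsigned int nr_cpus, unsigned long batch)
{
	assert(nr_cpus > 0);
	return (unsigned long)nr_cpus * batch;
}

/*
 * Mirrors the shape of the quoted test:
 *     if (bdi_thresh < 2 * bdi_stat_error(bdi))
 * Returns true when the threshold is so low that the fixed error bound
 * is comparable to it, so the exact (expensive) sum should be used.
 */
static bool model_needs_exact_sum(unsigned long thresh,
				  unsigned int nr_cpus, unsigned long batch)
{
	return thresh < 2 * model_stat_error(nr_cpus, batch);
}
```

With 4 CPUs and a batch of 8 the assumed bound is 32 pages. A threshold of 50 is below 2 * 32, so the cheap read could be off by more than half the threshold and the exact sum is used; for a threshold of 10000 the same 32 pages hardly matter and the cheap read is good enough.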


> > > Here the comment should have explained it well:
> > >
> > >                  * In theory 1 page is enough to keep the comsumer-producer
> > >                  * pipe going: the flusher cleans 1 page => the task dirties 1
> > >                  * more page. However bdi_dirty has accounting errors.  So use
> >
> > Why bdi_dirty has accounting errors?
>
> Because it typically uses bdi_stat() to get the rough sum of the per-cpu
> counters.
>
> Thanks,
> Fengguang
>
> > >                  * the larger and more IO friendly bdi_stat_error.
> > >                  */
> > >                 if (bdi_dirty <= bdi_stat_error(bdi))
> > >                         break;
> > >
> > >
> > > Thanks,
> > > Fengguang
> >
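
To check my understanding of why bdi_stat() is only a rough sum, here is a small user-space model of a batched per-cpu counter. It is only a sketch and not the kernel's percpu_counter: the model_* names, the CPU count and the batch size are all made up, and there is no locking or real per-cpu storage.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count */
#define MODEL_BATCH   8	/* hypothetical batch, stands in for BDI_STAT_BATCH */

struct model_counter {
	long count;			/* shared value, returned by the cheap read */
	long local[MODEL_NR_CPUS];	/* per-cpu deltas not yet folded in */
};

/* Update on one "cpu"; fold into the shared count only once a batch is reached. */
static void model_add(struct model_counter *c, int cpu, long amount)
{
	long v;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	v = c->local[cpu] + amount;
	if (v >= MODEL_BATCH || v <= -MODEL_BATCH) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap, approximate read (the role bdi_stat() plays in the discussion). */
static long model_read(const struct model_counter *c)
{
	return c->count;
}

/* Expensive, exact read (the role bdi_stat_sum() plays): walk every cpu. */
static long model_sum(const struct model_counter *c)
{
	long sum = c->count;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}

/* Safe upper bound on |model_sum() - model_read()|: one batch per cpu. */
static long model_error(void)
{
	return (long)MODEL_NR_CPUS * MODEL_BATCH;
}
```

If each of the 4 model CPUs adds 7 pages one at a time, nothing reaches the batch of 8, so model_read() still returns 0 while model_sum() returns 28, which is within the bound of 32.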

