linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Chris Mason <clm@fb.com>
Cc: "tj@kernel.org" <tj@kernel.org>,
	"vgoyal@redhat.com" <vgoyal@redhat.com>,
	"lizefan@huawei.com" <lizefan@huawei.com>,
	"jack@suse.cz" <jack@suse.cz>,
	"gnehzuil.liu@gmail.com" <gnehzuil.liu@gmail.com>,
	"tm@tao.ma" <tm@tao.ma>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [Lsf-pc] [LSF/MM ATTEND] Filesystems -- Btrfs, cgroups, Storage topics from Facebook
Date: Fri, 3 Jan 2014 07:39:12 +0100	[thread overview]
Message-ID: <20140103063912.GB31086@quack.suse.cz> (raw)
In-Reply-To: <1388689935.24668.20.camel@ret.masoncoding.com>

On Thu 02-01-14 19:11:44, Chris Mason wrote:
> On Thu, 2014-01-02 at 12:10 -0500, tj@kernel.org wrote:
> > Hey, Vivek.
> > 
> > On Thu, Jan 02, 2014 at 12:06:37PM -0500, Vivek Goyal wrote:
> > > So is this a separate configuration which can be done per bdi as opposed
> > > to per device? IOW, throttling offered per cgroup per bdi. This will
> > > help with the case of throttling over NFS too, which some people have
> > > been asking for.
> > 
> > Hah? No, bdi just being split per-cgroup on each device so that it can
> > properly propagate congestion upwards per-blkcg, just like how we
> > split request allocation per-cgroup in the block layer proper.
> > 
> 
> I'm not entirely sure how well this will fit with the filesystems (which
> already expect a single BDI), but it's worth trying.  I'm definitely
> worried about having too many blkcgs and over-committing the dirty memory
> limits.  The BDI-per-device setup we have now already has that problem, but
> I'm not sure it's a good idea to make it worse.
  Well, we already have BDIs not tied to any particular device - e.g. in
NFS or in btrfs (I guess I don't have to tell you ;). In more complex
storage setups we rather seem to follow the rule of one BDI per filesystem
instance, as that's what makes sense for writeback and most of the other
state in the BDI. And we definitely want to split this "high-level" BDI,
because that's the only level at which splitting makes sense for writeback
(you cannot really tell whether e.g. a dirty inode belongs to sda or sdb in
a RAID0 setting).
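The per-filesystem BDI split discussed above, with writeback state tracked per cgroup rather than per device, might be sketched roughly like this. This is a toy userspace illustration, not kernel code; all names (toy_bdi, toy_wb, etc.) are made up for the example and do not correspond to real kernel APIs:

```c
/* Toy sketch: one BDI per filesystem instance (e.g. a btrfs RAID0
 * over sda+sdb), with writeback state split per cgroup so congestion
 * can be reported per blkcg. Illustrative only; not kernel code. */
#include <assert.h>
#include <string.h>

#define MAX_CGROUPS 4

/* Per-(BDI, cgroup) writeback state: a dirty page count and a
 * congestion flag that can be propagated upward per cgroup. */
struct toy_wb {
	unsigned long nr_dirty;
	int congested;
};

/* One BDI per filesystem instance; individual member devices of the
 * filesystem do not appear here at all. */
struct toy_bdi {
	struct toy_wb wb[MAX_CGROUPS];
	unsigned long thresh_per_cgroup;
};

static void toy_bdi_init(struct toy_bdi *bdi, unsigned long thresh)
{
	memset(bdi, 0, sizeof(*bdi));
	bdi->thresh_per_cgroup = thresh;
}

/* Account a dirtied page against the dirtying cgroup only; crossing
 * the threshold congests that cgroup's writeback state, not the
 * whole BDI. */
static void toy_account_dirty(struct toy_bdi *bdi, int cg)
{
	struct toy_wb *wb = &bdi->wb[cg];

	if (++wb->nr_dirty > bdi->thresh_per_cgroup)
		wb->congested = 1;
}

static int toy_bdi_congested(const struct toy_bdi *bdi, int cg)
{
	return bdi->wb[cg].congested;
}
```

The point of the sketch is the indexing: dirty state and congestion live at (filesystem BDI, cgroup), so a heavy dirtier in one cgroup congests only its own slot, and the question "which device does this dirty inode belong to" never has to be answered.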

> > > So it sounds like re-implementing throttling infrastructure at bdi level
> > > now (Similar to what has been done at device level)? Of course use as
> > > much code as possible. But IIUC, the proposal is that effectively there
> > > can be two throttling controllers: one operating at the bdi level and
> > > one operating below it at the device level?
> > 
> > Not at all.  I was arguing explicitly against that.
> 
> Keep in mind that we do already have throttling at the BDI level.  I was
> definitely hoping we could consolidate some of that since it has grown
> some bits of an elevator already.  I'm not saying what we have in the
> bdi throttling isn't reasonable, but it is definitely replicating some
> of the infrastructure down below.
  What exactly are you talking about? Dirty throttling or something else?
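If the BDI-level throttling in question is dirty throttling, that kind of ratelimiting reduces to pausing a dirtier in proportion to how far its dirty count sits above a setpoint. A crude sketch of the shape of such a control, with the linear ramp and all names being illustrative assumptions rather than the actual kernel algorithm:

```c
/* Crude userspace sketch of setpoint-based dirty throttling: a
 * dirtier is paused for longer the further its dirty count is above
 * a setpoint, saturating at a hard limit. Illustrative only. */
#include <assert.h>

/* Return a pause in milliseconds: 0 at or below the setpoint,
 * ramping linearly up to max_pause_ms at the hard limit. */
static unsigned int throttle_pause_ms(unsigned long nr_dirty,
				      unsigned long setpoint,
				      unsigned long limit,
				      unsigned int max_pause_ms)
{
	if (nr_dirty <= setpoint)
		return 0;
	if (nr_dirty >= limit)
		return max_pause_ms;
	return (unsigned int)((nr_dirty - setpoint) * max_pause_ms /
			      (limit - setpoint));
}
```

The feedback loop this implies (measure dirty pages, compute a pause, sleep the dirtier) is the piece that starts to resemble an elevator's pacing logic once it grows per-task rate estimation, which is presumably the duplication being pointed at.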

> We (Josef, me, perhaps others at FB) will do some experiments before
> LSF.  The current plan is to throw some code at the wall and see what
> sticks.
  That would be great. People have been talking about this on and off for at
least three years; some patches were posted, but so far no one has been
persistent enough to get anything merged (to be fair, we have at least
explored some dead ends where we decided it wasn't worth the hassle after
we'd seen the code / considered the API).

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


Thread overview: 21+ messages
2013-12-30 21:36 [LSF/MM ATTEND] Filesystems -- Btrfs, cgroups, Storage topics from Facebook Chris Mason
2013-12-31  8:49 ` Zheng Liu
2013-12-31  9:36   ` Jeff Liu
2013-12-31 12:45   ` [Lsf-pc] " Jan Kara
2013-12-31 13:19     ` Chris Mason
2013-12-31 14:22       ` Tao Ma
2013-12-31 15:34         ` Chris Mason
2014-01-02  6:46           ` Jan Kara
2014-01-02 15:21             ` Chris Mason
2014-01-02 16:01               ` tj
2014-01-02 16:14                 ` tj
2014-01-03  6:03                   ` Jan Kara
2014-01-02 17:06                 ` Vivek Goyal
2014-01-02 17:10                   ` tj
2014-01-02 19:11                     ` Chris Mason
2014-01-03  6:39                       ` Jan Kara [this message]
2014-01-02 18:27                 ` James Bottomley
2014-01-02 18:36                   ` tj
2014-01-03  7:44                     ` James Bottomley
2014-01-08 15:04       ` Mel Gorman
2014-01-08 16:14         ` Chris Mason
