From: Vivek Goyal <vgoyal@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Chris Mason <chris.mason@oracle.com>,
Chad Talbott <ctalbott@google.com>,
James Bottomley <james.bottomley@hansenpartnership.com>,
lsf <lsf@lists.linux-foundation.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: IO less throttling and cgroup aware writeback (Was: Re: [Lsf] Preliminary Agenda and Activities for LSF)
Date: Thu, 31 Mar 2011 21:34:24 -0400
Message-ID: <20110401013424.GA17928@redhat.com>
In-Reply-To: <20110331221425.GB2904@dastard>

On Fri, Apr 01, 2011 at 09:14:25AM +1100, Dave Chinner wrote:
[..]
> > An fsync has two basic parts
> >
> > 1) write the file data pages
> > 2a) flush data=ordered in reiserfs/ext34
> > 2b) do the real transaction commit
> >
> >
> > We can do part one in parallel across any number of writers. For part
> > two, there is only one running transaction. If the FS is smart, the
> > commit will only force down the transaction that last modified the
> > file. 50 procs running fsync may only need to trigger one commit.
>
> Right. However the real issue here, I think, is that the IO comes
> from a thread not associated with writeback nor is in any way cgroup
> aware. IOWs, getting the right context to each block being written
> back will be complex and filesystem specific.
>
> The other thing that concerns me is how metadata IO is accounted and
> throttled. Doing stuff like creating lots of small files will
> generate as much or more metadata IO than data IO, and none of that
> will be associated with a cgroup. Indeed, in XFS metadata doesn't
> even use the pagecache anymore, and it's written back by a thread
> (soon to be a workqueue) deep inside XFS's journalling subsystem, so
> it's pretty much impossible to associate that IO with any specific
> cgroup.
>
> What happens to that IO? Blocking it arbitrarily can have the same
> effect as blocking transaction completion - it can cause the
> filesystem to completely stop....

Dave,

As of today, the cgroup/context of an IO is determined from the context
of the submitting thread. So any IO submitted by kernel threads (flusher,
kjournald, workqueue threads) goes to the root group, which should remain
unthrottled. (It is not a good idea to put throttling rules on the root
group.)

Any metadata operation happening in the context of a process will still
be subject to throttling (are there any?). If that's a concern, can the
filesystem mark such a bio (REQ_META?) so that the throttling logic can
let it pass through?
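
A minimal sketch of the policy described above, written as a standalone C
model rather than kernel code. All names here (model_task, model_bio,
model_bio_group, model_should_throttle, IO_GROUP_ROOT) are hypothetical,
and the is_meta field only stands in for a REQ_META-style marking; whether
the throttling logic should honor such a flag is the open question, not
settled behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the kernel's structures. */
enum { IO_GROUP_ROOT = 0 };

struct model_task {
    bool is_kernel_thread;  /* flusher, kjournald, workqueue worker, ... */
    int  cgroup_id;         /* group of a user process; ignored for kthreads */
};

struct model_bio {
    const struct model_task *submitter;
    bool is_meta;           /* stands in for a REQ_META-style flag */
};

/* The group is taken from the submitting thread's context.
 * Kernel threads always map to the root group. */
static int model_bio_group(const struct model_bio *bio)
{
    assert(bio != NULL && bio->submitter != NULL);
    if (bio->submitter->is_kernel_thread)
        return IO_GROUP_ROOT;
    return bio->submitter->cgroup_id;
}

/* Decide whether a bio is subject to throttling under this model. */
static bool model_should_throttle(const struct model_bio *bio)
{
    if (model_bio_group(bio) == IO_GROUP_ROOT)
        return false;       /* root group stays unthrottled */
    if (bio->is_meta)
        return false;       /* assumed bypass for marked metadata IO */
    return true;
}
```

Under this model, a data write submitted directly by a process in a
non-root group is throttled, while the same pages written back later by
the flusher thread are not, which is exactly the gap for buffered writes.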

Determining the cgroup/context from the submitting process has the
problem that writeback IO is not throttled at all, and we are looking
for a way to control buffered writes as well. If we start determining
the cgroup from some information stored in page_cgroup, then we are more
likely to run into priority inversion issues (a filesystem in ordered
mode flushing data first before committing metadata changes). So should
we throttle buffered writes when the page cache is being dirtied, and
not when these writes are being written back to the device?

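
To illustrate what throttling at dirty time could look like, here is a
small standalone C model. It is only a sketch under assumptions: the
structure model_dirty_limit, the function model_dirty_delay_ns and the
simple rate-based pacing are all hypothetical and are not taken from any
posted patch. The point is that the delay is paid by the task dirtying
the pages, in its own context, so the flusher or journal thread never
waits on a throttled group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cgroup state; NOT a kernel structure. */
struct model_dirty_limit {
    uint64_t bytes_per_sec; /* allowed dirtying rate; 0 means unlimited */
    uint64_t next_free_ns;  /* earliest time the next dirtying may start */
};

/*
 * Return how many nanoseconds the dirtying task should sleep before it
 * dirties `bytes` of page cache at time `now_ns`, and charge the group
 * for that work. The multiplication can overflow for very large `bytes`;
 * this sketch ignores that.
 */
static uint64_t model_dirty_delay_ns(struct model_dirty_limit *lim,
                                     uint64_t now_ns, uint64_t bytes)
{
    assert(lim != NULL);
    if (lim->bytes_per_sec == 0)
        return 0;           /* unlimited group (e.g. root) never waits */

    /* Work starts when earlier charges have drained, or now. */
    uint64_t start = lim->next_free_ns > now_ns ? lim->next_free_ns : now_ns;
    uint64_t cost_ns = bytes * 1000000000ULL / lim->bytes_per_sec;

    lim->next_free_ns = start + cost_ns;
    return start - now_ns;
}
```

A caller would invoke model_dirty_delay_ns() from the write path before
dirtying pages and sleep for the returned time; writeback itself is left
untouched in this model.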
Thanks
Vivek