linux-fsdevel.vger.kernel.org archive mirror
From: Vivek Goyal <vgoyal@redhat.com>
To: Chad Talbott <ctalbott@google.com>
Cc: lsf-pc@lists.linuxfoundation.org, linux-fsdevel@vger.kernel.org
Subject: Re: [LSF/FS TOPIC] I/O performance isolation for shared storage
Date: Thu, 3 Feb 2011 21:31:44 -0500	[thread overview]
Message-ID: <20110204023144.GA30087@redhat.com> (raw)
In-Reply-To: <AANLkTincmnUWsev8VQpP0-On6TG-cLiJJ9DfnnjoqVPZ@mail.gmail.com>

On Thu, Feb 03, 2011 at 05:50:00PM -0800, Chad Talbott wrote:
> I/O performance is the bottleneck in many systems, from phones to
> servers. Knowing which request to schedule at any moment is crucial to
> systems that support interactive latencies and high throughput.  When
> you're watching a video on your desktop, you don't want it to skip
> when you build a kernel.
> 
> To address this in our environment Google has now deployed the
> blk-cgroup code worldwide, and I'd like to share some of our
> experiences. We've made modifications for our purposes, and are in the
> process of proposing those upstream:
> 
>   - Page tracking for buffered writes
>   - Fairness-preserving preemption across cgroups

Chad,

This is definitely of interest to me (though I will not be around, I
would like to read the LWN summary of the discussion later. :-)). I would
like to know more about how Google has deployed this and is using this
infrastructure. I would also like to see all the missing pieces pushed
upstream (especially the buffered WRITE support and the page tracking
stuff).

One thing I am curious about is how you get service differentiation
while maintaining high throughput. Idling on a group for fairness is
more or less reasonable on a single SATA disk, but it can very well kill
performance (especially with random IO) on a storage array or on fast
SSDs.

I have been thinking of disabling idling altogether and instead changing
the position of a group in the service tree based on its weight when new
IO comes in (CFQ already does something similar for cfqq with the
slice_offset() logic). Doing something similar while calculating the
vdisktime of a group when it gets enqueued might give us some service
differentiation while achieving better throughput.

You also mentioned controlling latencies very tightly, which probably
means driving shallower queue depths (maybe even 1) so that preemption is
somewhat effective and latencies are better. But then again, driving a
lower queue depth can reduce performance, so I am curious how you deal
with that.

Also curious to know whether the per-memory-cgroup dirty ratio stuff got
in, and how we dealt with the issue of selecting which inode to dispatch
writes from based on the cgroup it belongs to.

> 
> There is further work to do along the lines of fine-grained accounting
> and isolation. For example, many file servers in a Google cluster will
> do IO on behalf of hundreds, even thousands of clients. Each client
> has different service requirements, and it's inefficient to map them
> to (cgroup, task) pairs.

So is it ioprio based isolation or something else?

Thanks
Vivek


Thread overview: 8+ messages
2011-02-04  1:50 [LSF/FS TOPIC] I/O performance isolation for shared storage Chad Talbott
2011-02-04  2:31 ` Vivek Goyal [this message]
2011-02-04 23:07   ` Chad Talbott
2011-02-07 18:06     ` Vivek Goyal
2011-02-07 19:40       ` Chad Talbott
2011-02-07 20:38         ` Vivek Goyal
2011-02-15 12:54     ` Jan Kara
2011-02-15 23:15       ` Chad Talbott
