From mboxrd@z Thu Jan 1 00:00:00 1970
From: Vivek Goyal
Subject: Re: [Lsf] IO less throttling and cgroup aware writeback (Was: Re: Preliminary Agenda and Activities for LSF)
Date: Mon, 25 Apr 2011 14:19:54 -0400
Message-ID: <20110425181954.GD10469@redhat.com>
References: <20110419153106.GF31712@redhat.com>
 <20110419165838.GA2134@localhost>
 <20110419170543.GK31712@redhat.com>
 <20110420011638.GA4421@localhost>
 <20110420184433.GH29872@redhat.com>
 <20110421150618.GA22436@localhost>
 <20110421172040.GG8192@redhat.com>
 <20110422042123.GD6199@localhost>
 <20110422152531.GA8255@redhat.com>
 <20110422162829.GX5611@random.random>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Wu Fengguang, James Bottomley,
 "lsf@lists.linux-foundation.org", Dave Chinner,
 "linux-fsdevel@vger.kernel.org", Jens Axboe
To: Andrea Arcangeli
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:50765 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757722Ab1DYSUl
 (ORCPT ); Mon, 25 Apr 2011 14:20:41 -0400
Content-Disposition: inline
In-Reply-To: <20110422162829.GX5611@random.random>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Fri, Apr 22, 2011 at 06:28:29PM +0200, Andrea Arcangeli wrote:

[..]

> > Also, it is only CFQ which gives READS so much preference over
> > WRITES. deadline and noop, which we typically use on faster storage,
> > do not. There we might take a bigger hit on READ latencies, depending
> > on what the storage is and how affected it is by a burst of WRITES.
> >
> > I guess it boils down to better system control and better
> > predictability.
>
> I tend to think that to get even better read latency and
> predictability, the IO scheduler could dynamically and temporarily
> reduce the max sector size of the write DMA (and also ensure any read
> readahead is reduced to the dynamically reduced sector size, or it'd
> be detrimental to the number of read DMAs issued for each userland
> read).
>
> Maybe with tagged queuing things are better and the DMA size doesn't
> make a difference anymore, I don't know. Surely Jens knows this best
> and can tell me if I'm wrong.
>
> Anyway, it should be really easy to test: just a two-liner reducing
> the max sector size in scsi_lib and the max readahead should let you
> see how fast firefox starts with CFQ while dd if=/dev/zero is running,
> and whether there's any difference at all.

I did some quick runs.

- Default queue depth is 31 on my SATA disk. Reducing the queue depth
  to 1 helps a bit. In CFQ we already try to reduce the queue depth of
  WRITES if READS are going on.

- I reduced /sys/block/sda/queue/max_sectors_kb to 16. That seemed to
  help with firefox launch time.

There are a couple of interesting observations, though.

- Even after I reduced max_sectors_kb to 16, I saw requests of 1024
  sectors coming from flusher threads.

- Firefox launch time went down after reducing max_sectors_kb, but it
  did not help much when I tried to load the first website, "lwn.net".
  It still took a little more than 1 minute to select lwn.net from the
  cached entries and then actually load and display the page. I will
  spend more time figuring out what's happening here.

But in general, reducing the max request size dynamically sounds
interesting. I am not sure how the upper layers (dm etc.) are impacted
by this.

Thanks
Vivek
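For reference, a minimal sketch of the sysfs tuning used in the runs above. This is not from the original mail: the device name (sda), queue depth, and request-size cap are assumptions to adjust per system, and the script only prints the commands so it can be inspected before being run as root.

```shell
#!/bin/sh
# Print the sysfs writes for the experiment described above:
# lowering the NCQ queue depth and capping the max request size.
tune_cmds() {
    dev=$1; depth=$2; maxkb=$3
    # Device queue depth (31 was the default on the SATA disk above).
    echo "echo $depth > /sys/block/$dev/device/queue_depth"
    # Maximum request size issued to the device, in KB.
    echo "echo $maxkb > /sys/block/$dev/queue/max_sectors_kb"
}

# Emit the commands rather than execute them; pipe to sh as root
# to actually apply, e.g.: tune_cmds sda 1 16 | sh
tune_cmds sda 1 16
```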