linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vivek Goyal <vgoyal@redhat.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Myklebust, Trond" <Trond.Myklebust@netapp.com>,
	linux-fsdevel@vger.kernel.org,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: write-behind on streaming writes
Date: Tue, 5 Jun 2012 14:48:53 -0400	[thread overview]
Message-ID: <20120605184853.GD28556@redhat.com> (raw)
In-Reply-To: <20120605174157.GC28556@redhat.com>

On Tue, Jun 05, 2012 at 01:41:57PM -0400, Vivek Goyal wrote:
> On Tue, Jun 05, 2012 at 01:23:02PM -0400, Vivek Goyal wrote:
> > On Wed, May 30, 2012 at 11:21:29AM +0800, Fengguang Wu wrote:
> > 
> > [..]
> > > (2) comes from the use of _WAIT_ flags in
> > > 
> > >         sync_file_range(..., SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE|SYNC_FILE_RANGE_WAIT_AFTER);
> > > 
> > > Each sync_file_range() syscall will submit 8MB write IO and wait for
> > > completion. That means the async write IO queue constantly swing
> > > between 0 and 8MB fillness at the frequency (100MBps / 8MB = 12.5ms).
> > > So on every 12.5ms, the async IO queue runs empty, which gives any
> > > pending read IO (from firefox etc.) a chance to be serviced. Nice
> > > and sweet breaks!
> > 
> > I doubt that async IO queue is empty for 12.5ms. We wait for previous
> > range to finish (index-1) and have already started the IO on next 8MB
> > of pages. So effectively that should keep 8MB of async IO in
> > queue (until and unless there are delays from user space side). So reason
> > for latency improvement might be something else and not because async
> > IO queue is empty for some time.
> 
> With sync_file_range() test, we can have 8MB of IO in flight. Without that
> I think we can have more at times and that might be the reason for latency
> improvement.
> 
> I see that CFQ has code to allow deeper NCQ depth if there is only a single
> writer. So once a reader comes along it might find tons of async IO
> already in flight. sync_file_range() will limit that in flight IO hence
> the latency improvement. So if we have multiple dd doing sync_file_range()
> then probably this latency improvement should go away.
> 
> I will run some tests to verify if my understanding about deeper queue
> depths in case of single writer is correct or not.

So I did run some tests and can confirm that on an average there seem to
be more in flight requests *without* sync_file_range() and that's probably
the reason that why sync_file_range() test is showing better latency. 

I can see that with "dd if=/dev/zero of=zerofile bs=1M count=1024", we are
driving deeper queue depths (upto 32) and in later stages in flight
requests are constantly high.

With sync_file_range(), in flight requests number of requests fluctuate a
lot between 1 and 32. Many a times it is just 1 or up to 16 and few times
went up to 32.

So sync_file_range() test keeps less in flight requests on on average
hence better latencies. It might not produce throughput drop on SATA
disks but might have some effect on storage array luns. Will give it
a try.

Thanks
Vivek

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2012-06-05 18:48 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20120528114124.GA6813@localhost>
     [not found] ` <CA+55aFxHt8q8+jQDuoaK=hObX+73iSBTa4bBWodCX3s-y4Q1GQ@mail.gmail.com>
2012-05-29 15:57   ` write-behind on streaming writes Fengguang Wu
2012-05-29 17:35     ` Linus Torvalds
2012-05-30  3:21       ` Fengguang Wu
2012-06-05  1:01         ` Dave Chinner
2012-06-05 17:18           ` Vivek Goyal
2012-06-05 17:23         ` Vivek Goyal
2012-06-05 17:41           ` Vivek Goyal
2012-06-05 18:48             ` Vivek Goyal [this message]
2012-06-05 20:10               ` Vivek Goyal
2012-06-06  2:57                 ` Vivek Goyal
2012-06-06  3:14                   ` Linus Torvalds
2012-06-06 12:14                     ` Vivek Goyal
2012-06-06 14:00                       ` Fengguang Wu
2012-06-06 17:04                         ` Vivek Goyal
2012-06-07  9:45                           ` Jan Kara
2012-06-07 19:06                             ` Vivek Goyal
2012-06-06 16:15                       ` Vivek Goyal
2012-06-06 14:08                   ` Fengguang Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120605184853.GD28556@redhat.com \
    --to=vgoyal@redhat.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=fengguang.wu@intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).