Re: write-behind on streaming writes

From: Vivek Goyal <vgoyal@redhat.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Myklebust, Trond" <Trond.Myklebust@netapp.com>,
	linux-fsdevel@vger.kernel.org,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: write-behind on streaming writes
Date: Tue, 5 Jun 2012 13:41:57 -0400
Message-ID: <20120605174157.GC28556@redhat.com>
In-Reply-To: <20120605172302.GB28556@redhat.com>

On Tue, Jun 05, 2012 at 01:23:02PM -0400, Vivek Goyal wrote:
> On Wed, May 30, 2012 at 11:21:29AM +0800, Fengguang Wu wrote:
> 
> [..]
> > (2) comes from the use of _WAIT_ flags in
> > 
> >         sync_file_range(..., SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE|SYNC_FILE_RANGE_WAIT_AFTER);
> > 
> > Each sync_file_range() syscall will submit 8MB write IO and wait for
> > completion. That means the async write IO queue constantly swing
> > between 0 and 8MB fillness at the frequency (100MBps / 8MB = 12.5ms).
> > So on every 12.5ms, the async IO queue runs empty, which gives any
> > pending read IO (from firefox etc.) a chance to be serviced. Nice
> > and sweet breaks!
> 
> I doubt that async IO queue is empty for 12.5ms. We wait for previous
> range to finish (index-1) and have already started the IO on next 8MB
> of pages. So effectively that should keep 8MB of async IO in
> queue (until and unless there are delays from user space side). So reason
> for latency improvement might be something else and not because async
> IO queue is empty for some time.

With the sync_file_range() test, we can have 8MB of IO in flight. Without
it, I think we can have more at times, and that might be the reason for the
latency improvement.
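
For reference, the write-behind pattern described above (start IO on the
current 8MB, then wait for the previous range, index-1) can be sketched
roughly as follows. This is a minimal sketch, not the actual test program
from this thread: write_behind(), prev_chunk_offset() and CHUNK are
hypothetical names, and the 8MB chunk size is taken from the discussion.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* 8MB window, as in the test discussed in this thread (assumed here). */
#define CHUNK (8 * 1024 * 1024)

/* Offset of the chunk before `index`, or -1 if there is none. */
off_t prev_chunk_offset(long index)
{
    assert(index >= 0);
    return index > 0 ? (off_t)(index - 1) * CHUNK : -1;
}

/*
 * Write `nchunks` copies of `buf` (CHUNK bytes) to `fd` with write-behind.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int write_behind(int fd, const char *buf, long nchunks)
{
    for (long i = 0; i < nchunks; i++) {
        off_t off = (off_t)i * CHUNK;
        size_t done = 0;

        /* Dirty the current chunk, retrying on short writes. */
        while (done < CHUNK) {
            ssize_t n = pwrite(fd, buf + done, CHUNK - done, off + done);
            if (n <= 0)
                return -1;
            done += (size_t)n;
        }

        /* Start writeback of the current chunk without waiting for it. */
        if (sync_file_range(fd, off, CHUNK, SYNC_FILE_RANGE_WRITE) < 0)
            return -1;

        /* Wait for the previous chunk (index-1) to reach the disk. */
        off_t prev = prev_chunk_offset(i);
        if (prev >= 0 &&
            sync_file_range(fd, prev, CHUNK,
                            SYNC_FILE_RANGE_WAIT_BEFORE |
                            SYNC_FILE_RANGE_WRITE |
                            SYNC_FILE_RANGE_WAIT_AFTER) < 0)
            return -1;
    }
    return 0;
}
```

Because IO on the current chunk is started before the wait on the previous
one, the queue should not run empty between chunks unless user space is
slow to issue the next write.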

I see that CFQ has code to allow a deeper NCQ depth if there is only a
single writer. So once a reader comes along, it might find tons of async IO
already in flight. sync_file_range() will limit that in-flight IO, hence
the latency improvement. So if we have multiple dd processes doing
sync_file_range(), this latency improvement should probably go away.
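
As a back-of-envelope model of the argument above (an assumption for
illustration, not a measurement and not how CFQ actually accounts for
requests): if each writer keeps at most a fixed amount of async IO in
flight, a reader arriving behind that IO waits roughly as long as it takes
to drain it. The 100MB/s figure is the disk throughput used earlier in this
thread; the function names are hypothetical.

```c
#include <assert.h>

/* Upper bound on queued async write data (MB) for `writers` writers,
 * each capped at `cap_mb` MB in flight. */
double inflight_mb(int writers, double cap_mb)
{
    assert(writers >= 0 && cap_mb >= 0.0);
    return writers * cap_mb;
}

/* Time (ms) to drain `mb` MB of queued IO at `disk_mbps` MB/s. */
double drain_ms(double mb, double disk_mbps)
{
    assert(disk_mbps > 0.0);
    return mb * 1000.0 / disk_mbps;
}
```

With one writer capped at 8MB and a 100MB/s disk, drain_ms(inflight_mb(1,
8.0), 100.0) gives 80ms (that is, 12.5 chunks per second); with four such
writers the same model gives 320ms.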

I will run some tests to verify whether my understanding of deeper queue
depths in the single-writer case is correct.

Thanks
Vivek
