Re: [PATCH 5/5] xfs: don't chain ioends during writepage submission

From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 5/5] xfs: don't chain ioends during writepage submission
Date: Thu, 11 Feb 2016 07:24:13 -0500
Message-ID: <20160211122413.GA4156@bfoster.bfoster>
In-Reply-To: <20160210210926.GL14668@dastard>

On Thu, Feb 11, 2016 at 08:09:26AM +1100, Dave Chinner wrote:
> On Wed, Feb 10, 2016 at 08:18:17AM -0500, Brian Foster wrote:
> > On Wed, Feb 10, 2016 at 08:59:00AM +1100, Dave Chinner wrote:
> > > On Tue, Feb 09, 2016 at 09:23:55AM -0500, Brian Foster wrote:
> > > > On Mon, Feb 08, 2016 at 04:44:18PM +1100, Dave Chinner wrote:
> > > > > @@ -738,29 +726,22 @@ xfs_writepage_submit(
> > > > >  	struct writeback_control *wbc,
> > > > >  	int			status)
> > > > >  {
> > > > > -	struct blk_plug		plug;
> > > > > -
> > > > > -	/* Reserve log space if we might write beyond the on-disk inode size. */
> > > > > -	if (!status && wpc->ioend && wpc->ioend->io_type != XFS_IO_UNWRITTEN &&
> > > > > -	    xfs_ioend_is_append(wpc->ioend))
> > > > > -		status = xfs_setfilesize_trans_alloc(wpc->ioend);
> > > > > -
> > > > > -	if (wpc->iohead) {
> > > > > -		blk_start_plug(&plug);
> > > > > -		xfs_submit_ioend(wbc, wpc->iohead, status);
> > > > > -		blk_finish_plug(&plug);
> > > > > -	}
> > > > 
> > > > We've dropped our plug here but I don't see anything added in
> > > > xfs_vm_writepages(). Shouldn't we have one there now that ioends are
> > > > submitted as we go? generic_writepages() uses one around its
> > > > write_cache_pages() call..
> > > 
> > > It's not really necessary, as we now have higher level plugging in
> > > the writeback code, which will get flushed on context switch, and if we don't
> > > have a high level plug (e.g. fsync triggered writeback), then we
> > > submit the IO immediately, just like flushing the plug here would do
> > > anyway....
> > > 
> > 
> > Ok, I'm digging around the wb code a bit and I see plugs in/around
> > wb_writeback(), so I assume that's what you're referring to in the first
> > case. I'm not quite following the fsync case though...
> > 
> > In the current upstream code, fsync() leads to the following call chain:
> > 
> >   filemap_write_and_wait_range()
> >     __filemap_fdatawrite_range()
> >       do_writepages()
> >         xfs_vm_writepages()
> >           generic_writepages()
> >             blk_start_plug()
> >             write_cache_pages()
> >             blk_finish_plug()
> > 
> > After this series, we have the following:
> > 
> >   filemap_write_and_wait_range()
> >     __filemap_fdatawrite_range()
> >       do_writepages()
> >         xfs_vm_writepages() 
> >           write_cache_pages()
> > 
> > ... with no plug that I can see. What am I missing?
> 
> fsync tends to be a latency sensitive operation, not a bandwidth
> maximising operation. Plugging trades off IO submission latency for
> maximising IO bandwidth. For fsync and other single inode operations
> that block waiting for the IO to complete, maximising bandwidth is
> not necessarily the right thing to do.
> 

Ok.

> For single inode IO commands (such as through
> __filemap_fdatawrite_range), block plugging will only improve
> performance if the filesystem does not form large bios to begin
> with. XFS always builds maximally sized bios if it can, so plugging
> cannot improve the IO throughput from such writeback behaviour
> because the bios it builds cannot be further merged.  Such bios are
> better served being pushed straight into the IO scheduler queues.
> 
> IOWs, plugging only makes a difference when the IO being formed is
> small but is mergeable in the IO scheduler. This is what happens with
> small file delayed allocation in writeback in XFS, and nowadays we
> have a high level plug for this (i.e. in writeback_inodes_wb() and
> wb_writeback()). Hence those one-bio-per-inode-but-all-sequential IOs
> will be merged in the plug before dispatch, thereby improving write
> bandwidth under such small file writeback workloads. (See the
> numbers in commit d353d75 ("writeback: plug writeback at a high
> level").)
> 

Makes sense.
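
To picture the distinction above, here is a rough userspace toy model, not
kernel code. It assumes a plug can be approximated as a list that back-merges
a newly queued bio into the previous one when the two are contiguous and the
result still fits in one request. All names (toy_bio, toy_plug,
toy_requests_for, the max_sectors limit) are hypothetical and exist only for
this sketch; the real block layer merging logic is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a "bio" is a contiguous run of sectors. */
struct toy_bio {
	unsigned long long	sector;	/* first sector */
	unsigned int		nr;	/* length in sectors */
};

#define TOY_PLUG_MAX	64

/* Hypothetical stand-in for a plug: holds bios until it is finished. */
struct toy_plug {
	struct toy_bio	list[TOY_PLUG_MAX];
	size_t		count;		/* bios currently held */
	unsigned int	max_sectors;	/* assumed per-request size limit */
	unsigned int	dispatched;	/* requests sent to the "device" */
};

static void toy_plug_start(struct toy_plug *plug, unsigned int max_sectors)
{
	plug->count = 0;
	plug->max_sectors = max_sectors;
	plug->dispatched = 0;
}

/* Queue a bio; back-merge into the previous one if contiguous and it fits. */
static void toy_plug_submit(struct toy_plug *plug, struct toy_bio bio)
{
	assert(bio.nr > 0 && bio.nr <= plug->max_sectors);

	if (plug->count > 0) {
		struct toy_bio *last = &plug->list[plug->count - 1];

		if (last->sector + last->nr == bio.sector &&
		    last->nr + bio.nr <= plug->max_sectors) {
			last->nr += bio.nr;
			return;
		}
	}

	/* The list is full: dispatch what we hold before queueing more. */
	if (plug->count == TOY_PLUG_MAX) {
		plug->dispatched += plug->count;
		plug->count = 0;
	}
	plug->list[plug->count++] = bio;
}

/* Dispatch everything still held and return the total request count. */
static unsigned int toy_plug_finish(struct toy_plug *plug)
{
	plug->dispatched += plug->count;
	plug->count = 0;
	return plug->dispatched;
}

/* Submit n sequential bios of nr sectors each under one plug. */
static unsigned int toy_requests_for(unsigned int n, unsigned int nr,
				     unsigned int max_sectors)
{
	struct toy_plug plug;
	unsigned long long sector = 0;
	unsigned int i;

	toy_plug_start(&plug, max_sectors);
	for (i = 0; i < n; i++) {
		struct toy_bio bio = { sector, nr };

		toy_plug_submit(&plug, bio);
		sector += nr;
	}
	return toy_plug_finish(&plug);
}
```

In this model, toy_requests_for(16, 8, 1024) collapses sixteen small
sequential bios into a single request, whereas toy_requests_for(16, 1024, 1024)
still dispatches sixteen requests because each bio is already at the assumed
size limit. That is the sense in which a plug has nothing left to merge when
the filesystem builds maximally sized bios itself.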

> IOWs, block plugging is not a magical "make everything go faster"
> knob. Different filesystems have different IO dispatch methods, and
> so require different plugging strategies to optimise their IO
> patterns.  It may be that plugging in xfs_vm_writepages is
> advantageous in some workloads for fsync, but I haven't been able to
> measure them.
> 

I don't think I suggested it was magical in any way. ;) My initial
feedback was simply based on the fact that it looked like the behavior
changed without notice, so it wasn't clear if that was intentional. You
pointed out the higher level wb plug but at the same time implied not
having a plug in the fsync case, which we previously did have (granted,
only for the mapping). Perhaps I read into that wrong. It would be nice
if the commit log made a quick mention about the plug situation (one
context has a separate plug, for the other it is unnecessary), but
otherwise the explanation addresses my concerns. Thanks!

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs