All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 14/16] xfs: align writepages to large block sizes
Date: Mon, 19 Nov 2018 12:14:55 +1100	[thread overview]
Message-ID: <20181119011455.GI19305@dastard> (raw)
In-Reply-To: <20181116132922.GA31603@bfoster>

On Fri, Nov 16, 2018 at 08:29:23AM -0500, Brian Foster wrote:
> On Fri, Nov 16, 2018 at 05:19:36PM +1100, Dave Chinner wrote:
> > i.e. we round WB_SYNC_NONE to try to get whole blocks written so we
> > don't end up with partially written blocks on disk for extended
> > periods of time (e.g. between background writeback periods). It
> > doesn't matter for WB_SYNC_ALL, but it will reduce the potential for
> > stale data beign exposed when crashes occur and random pages in
> > blocks have't been written back. i.e. it's to help iprevent
> > re-exposing the problematic cases that we added the "NULL files
> > after crash" workarounds for.
> > 
> 
> Ok. I see that there are earlier patches to do zero-around on sub-block
> writes, so the idea makes a bit more sense with that in mind. That said,
> I still don't grok how messing with nr_to_write is effective.

Because background writeback is range_cyclic = 1, and that means we
always start at offset zero, and then if nr_to_write expires we
stash the page index we are up to in:

	mapping->writeback_index = done_index

And the next background writeback will start again from there.

Hence if nr_to_write is always rounding to the number of pages per
block, background writeback will /tend/ towards writing full blocks
because the writeback_index will always end up a multiple of pages
per block. Hence cyclic writeback will tend towards writing aligned,
full blocks when nr_to_write is rounded.

That's the fundamental concept here - write-in does "zero-around" to
initialise full blocks, writeback does "write-around" to push full
blocks to disk. WB_SYNC_ALL needs ranges to be rounded to do full
block writeback, WB_SYNC_NONE background writeback needs it's range
cyclic behaviour to round to writing full blocks (and that's what
rounding nr_to_write is doing in this patch).

> For background writeback (WB_SYNC_NONE), the range fields are clamped
> out (0-LONG_MAX) since the location of pages to write is not really a
> concern. In that case, ->nr_to_write is set based on some bandwidth
> heuristic and the only change we make here is to round it. If we
> consider the fact that any mapping itself may consist of some
> combination of zeroed-around delalloc blocks (covered by an aligned
> number of dirty pages) and already allocated/overwrite blocks (covered
> by any number of dirty pages), how does a rounded ->nr_to_write actually
> help us at all? Can't the magic ->nr_to_write value that prevents
> stopping at a partially written sub-block page be unaligned to the block
> size?

Yup, I never intended for this RFC prototype to deal with all these
problems. That doesn't mean I'm not aware of them, nor that I don't
have a plan to deal with them.

> Given the above, I don't see how tweaking ->nr_to_write really helps at
> all even as an optimization. Unless I'm missing something else in the
> earlier patches that facilitate this, ISTM that something more explicit
> is required if you want to increase the odds that zeroed-around blocks
> are written together.

Which I've always intended as future work. I've spent about 14 hours
on this patch set so far - it's a functional prototype, not a
finished, completed piece of work.

I'm fully aware that this going to need a lot more work before it is
ready for merging This is an early prototype I'm putting out there
for architectural/design review. i.e. don't bother nitpicking
unimplemented details or bugs, look for big picture things I've got
wrong. Look for showstoppers and fundamental problems, not things
that just require a little bit of time and coding to implement....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-11-19 11:52 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-07  6:31 [RFC PATCH 00/16] xfs: Block size > PAGE_SIZE support Dave Chinner
2018-11-07  6:31 ` [PATCH 01/16] xfs: drop ->writepage completely Dave Chinner
2018-11-09 15:12   ` Christoph Hellwig
2018-11-12 21:08     ` Dave Chinner
2021-02-02 20:51       ` Darrick J. Wong
2018-11-07  6:31 ` [PATCH 02/16] xfs: move writepage context warnings to writepages Dave Chinner
2018-11-07  6:31 ` [PATCH 03/16] xfs: finobt AG reserves don't consider last AG can be a runt Dave Chinner
2018-11-07 16:55   ` Darrick J. Wong
2018-11-09  0:21     ` Dave Chinner
2018-11-07  6:31 ` [PATCH 04/16] xfs: extent shifting doesn't fully invalidate page cache Dave Chinner
2018-11-07  6:31 ` [PATCH 05/16] iomap: sub-block dio needs to zeroout beyond EOF Dave Chinner
2018-11-09 15:15   ` Christoph Hellwig
2018-11-07  6:31 ` [PATCH 06/16] iomap: support block size > page size for direct IO Dave Chinner
2018-11-08 11:28   ` Nikolay Borisov
2018-11-09 15:18   ` Christoph Hellwig
2018-11-11  1:12     ` Dave Chinner
2018-11-07  6:31 ` [PATCH 07/16] iomap: prepare buffered IO paths for block size > page size Dave Chinner
2018-11-09 15:19   ` Christoph Hellwig
2018-11-11  1:15     ` Dave Chinner
2018-11-07  6:31 ` [PATCH 08/16] iomap: mode iomap_zero_range and friends Dave Chinner
2018-11-09 15:19   ` Christoph Hellwig
2018-11-07  6:31 ` [PATCH 09/16] iomap: introduce zero-around functionality Dave Chinner
2018-11-07  6:31 ` [PATCH 10/16] iomap: enable zero-around for iomap_zero_range() Dave Chinner
2018-11-07  6:31 ` [PATCH 11/16] iomap: Don't mark partial pages zeroing uptodate for zero-around Dave Chinner
2018-11-07  6:31 ` [PATCH 12/16] iomap: zero-around in iomap_page_mkwrite Dave Chinner
2018-11-07  6:31 ` [PATCH 13/16] xfs: add zero-around controls to iomap Dave Chinner
2018-11-07  6:31 ` [PATCH 14/16] xfs: align writepages to large block sizes Dave Chinner
2018-11-09 15:22   ` Christoph Hellwig
2018-11-11  1:20     ` Dave Chinner
2018-11-11 16:32       ` Christoph Hellwig
2018-11-14 14:19   ` Brian Foster
2018-11-14 21:18     ` Dave Chinner
2018-11-15 12:55       ` Brian Foster
2018-11-16  6:19         ` Dave Chinner
2018-11-16 13:29           ` Brian Foster
2018-11-19  1:14             ` Dave Chinner [this message]
2018-11-07  6:31 ` [PATCH 15/16] xfs: expose block size in stat Dave Chinner
2018-11-07  6:31 ` [PATCH 16/16] xfs: enable block size larger than page size support Dave Chinner
2018-11-07 17:14 ` [RFC PATCH 00/16] xfs: Block size > PAGE_SIZE support Darrick J. Wong
2018-11-07 22:04   ` Dave Chinner
2018-11-08  1:38     ` Darrick J. Wong
2018-11-08  9:04       ` Dave Chinner
2018-11-08 22:17         ` Darrick J. Wong
2018-11-08 22:22           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181119011455.GI19305@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.