From: Christoph Hellwig <hch@infradead.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org,
	Zhengyuan Liu <liuzhengyuang521@gmail.com>,
	yukuai3@huawei.com, Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	David Howells <dhowells@redhat.com>,
	linux-xfs@vger.kernel.org
Subject: Re: Dirty bits and sync writes
Date: Mon, 9 Aug 2021 15:48:56 +0100
Message-ID: <YRFAWPdMHp8Wpds/@infradead.org>
In-Reply-To: <YQlgjh2R8OzJkFoB@casper.infradead.org>

On Tue, Aug 03, 2021 at 04:28:14PM +0100, Matthew Wilcox wrote:
> Solution 1: Add an array of dirty bits to the iomap_page
> data structure.  This patch already exists; would need
> to be adjusted slightly to apply to the current tree.
> https://lore.kernel.org/linux-xfs/7fb4bb5a-adc7-5914-3aae-179dd8f3adb1@huawei.com/

> Solution 2a: Replace the array of uptodate bits with an array of
> dirty bits.  It is not often useful to know which parts of the page are
> uptodate; usually the entire page is uptodate.  We can actually use the
> dirty bits for the same purpose as uptodate bits; if a block is dirty, it
> is definitely uptodate.  If a block is !dirty, and the page is !uptodate,
> the block may or may not be uptodate, but it can be safely re-read from
> storage without losing any data.

1 or 2a seem like something we should do once we have large folio
support.
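
As an illustration of the rule quoted for solution 2a, here is a minimal
userspace sketch. It is not the kernel's iomap_page; the struct and
function names are hypothetical and the model is capped at 64 blocks per
page. It only encodes the two statements from the quote: a dirty block
is definitely uptodate, and a clean block in a !uptodate page can be
re-read from storage without losing data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model, not the kernel's struct iomap_page. */
struct page_model {
	bool     uptodate;   /* whole-page uptodate flag */
	uint64_t dirty;      /* bit i set means block i is dirty */
	unsigned nr_blocks;  /* blocks per page, at most 64 in this model */
};

static bool block_is_dirty(const struct page_model *p, unsigned blk)
{
	assert(p->nr_blocks <= 64 && blk < p->nr_blocks);
	return (p->dirty >> blk) & 1;
}

static void mark_block_dirty(struct page_model *p, unsigned blk)
{
	assert(p->nr_blocks <= 64 && blk < p->nr_blocks);
	p->dirty |= UINT64_C(1) << blk;
}

/*
 * The in-memory contents of a block are known to be valid if the whole
 * page is uptodate or the block itself is dirty.
 */
static bool block_known_uptodate(const struct page_model *p, unsigned blk)
{
	return p->uptodate || block_is_dirty(p, blk);
}

/*
 * A clean block may or may not be uptodate, but reading it again from
 * storage cannot discard any modification.
 */
static bool block_safe_to_reread(const struct page_model *p, unsigned blk)
{
	return !block_is_dirty(p, blk);
}
```

In this model the dirty bitmap answers both questions, which is why the
separate per-block uptodate array becomes unnecessary.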


> Solution 2b: Lose the concept of partially uptodate pages.  If we're
> going to write to a partial page, just bring the entire page uptodate
> first, then write to it.  It's not clear to me that partially-uptodate
> pages are really useful.  I don't know of any network filesystems that
> support partially-uptodate pages, for example.  It seems to have been
> something we did for buffer_head based filesystems "because we could"
> rather than finding a workload that actually cares.

The uptodate bit is important for the use case of a smaller than page
size buffered write into a page that hasn't been read in already, which
is fairly common for things like log writes.  So I'd hate to lose this
optimization.
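
To make the optimization concrete, here is a rough userspace sketch of
which blocks would have to be read before a buffered write into a page
that has no uptodate blocks. The function name and the 64-block limit
are assumptions made for the example, and it ignores EOF and holes:
only blocks that the write covers partially need a read, while fully
overwritten blocks and untouched blocks need none.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return a bitmap of the blocks that must be read
 * from storage before writing [off, off + len) into a page in which no
 * block is uptodate yet.
 */
static uint64_t blocks_to_read(unsigned page_size, unsigned block_size,
			       unsigned off, unsigned len)
{
	assert(block_size > 0 && page_size % block_size == 0);
	assert(page_size / block_size <= 64);
	assert(len > 0 && off < page_size && len <= page_size - off);

	unsigned first = off / block_size;
	unsigned last = (off + len - 1) / block_size;
	uint64_t need = 0;

	/* Write starts in the middle of a block: keep that block's head. */
	if (off % block_size)
		need |= UINT64_C(1) << first;
	/* Write ends in the middle of a block: keep that block's tail. */
	if ((off + len) % block_size)
		need |= UINT64_C(1) << last;

	return need;
}
```

A block-aligned 512-byte log write into a 4096-byte page yields an empty
bitmap here, so no read is issued at all. Without per-block uptodate
tracking the whole page would have to be brought uptodate first.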

> (it occurs to me that solution 3 actually allows us to do IOs at storage
> block size instead of filesystem block size, potentially reducing write
> amplification even more, although we will need to be a bit careful if
> we're doing a CoW.)

Number 3 might be a nice optimization.  An even better version would
be a disk format change to just log those updates in the log and
otherwise use the normal dirty mechanism.  I once had a crude prototype
for that.
