From: Christoph Hellwig <hch@lst.de>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: chris.mason@oracle.com, jack@suse.cz, tytso@mit.edu,
	adilger@sun.com, swhiteho@redhat.com,
	konishi.ryusuke@lab.ntt.co.jp, mfasheh@suse.com,
	joel.becker@oracle.com
Subject: Re: [PATCH] notes on volatile write caches vs fdatasync
Date: Thu, 27 Aug 2009 03:19:43 +0200
Message-ID: <20090827011942.GA10541@lst.de>
In-Reply-To: <20090827011624.GA10405@lst.de>

Not actually a patch, sorry ;-)

On Thu, Aug 27, 2009 at 03:16:24AM +0200, Christoph Hellwig wrote:
> There are two related issues when dealing with volatile write caches.
> The popular and beaten-to-death one is write barriers, which guarantee
> write ordering and stable storage for log writes.  For this post
> I naively assume this works perfectly for all filesystems supporting it.
> 
> The second issue is plain cache flushes.  Yes, they happen to be the
> base for the barrier implementation on all common disks in Linux, but
> there are cases where we need to issue them even without a log barrier.
> 
> Think about a plain write into a file that is already fully allocated,
> or the O_DIRECT version of the same.  If we do an fdatasync after these
> we do expect the write to really be on disk, not just in the disk
> cache, right?  The same is true for O_SYNC, but I ignore it for this
> write-up: with Jan's patch series O_SYNC writes will be implemented
> by a range-fdatasync after the actual write, so once that is in, this
> discussion covers O_SYNC, too.
> 
> It appears the following Linux filesystems implement barrier support:
> 
>  - btrfs
>  - ext3
>  - ext4
>  - gfs2
>  - nilfs2
>  - ocfs2
>  - reiserfs
>  - xfs
> 
> Interestingly, of those only ext4, reiserfs and xfs contain direct
> calls to blkdev_issue_flush.  And unless a filesystem really creates
> a transaction for every write and forces that out on fdatasync it seems
> like all others do not actually have a chance to guarantee a cache
> flush on fdatasync.
> 
> I have tested btrfs, ext3, ext4, reiserfs, and xfs with a simple test
> program that just does a buffered write into a file, and then calls
> fdatasync.  All of the above filesystems issue a barrier request
> when the file blocks aren't allocated yet (for ext3 and reiserfs
> only when barriers are explicitly enabled, of course).
> 
> That's no longer the case when all blocks are already allocated.
> As expected from the grep results above, reiserfs and xfs still issue a
> barrier in that case.  Btrfs also performs a cache flush in every
> case, which at first seems unexpected given the lack of any
> blkdev_issue_flush call, but since btrfs is a COW filesystem
> it actually has to allocate blocks even for an overwrite.
> Ext3 expectedly does not issue a cache flush in that case, but ext4
> unexpectedly does not issue a cache flush either.  The reason is
> that it issues the cache flush only if the inode was dirty, and
> not at all otherwise.
---end quoted text---
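
For reference, a minimal sketch of the kind of test program described in
the quoted text: a buffered write into a file followed by fdatasync.  The
original test program was not posted, so the helper name
write_and_fdatasync, the fill pattern and the sizes are all assumptions
made for illustration.  The file is opened without O_TRUNC so that a
second call overwrites blocks that already exist.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original post): write `len` bytes at
 * offset 0 of `path` through the page cache, then call fdatasync().
 * Returns 0 on success, -1 on any failure.
 */
static int write_and_fdatasync(const char *path, size_t len)
{
	assert(path != NULL && len > 0);

	char *buf = malloc(len);
	if (!buf)
		return -1;
	memset(buf, 0xab, len);

	/* No O_TRUNC: a repeated call overwrites already allocated blocks. */
	int fd = open(path, O_WRONLY | O_CREAT, 0644);
	if (fd < 0) {
		free(buf);
		return -1;
	}

	int ret = 0;
	size_t done = 0;
	while (done < len) {
		/* Plain buffered write; short writes are simply continued. */
		ssize_t n = pwrite(fd, buf + done, len - done, (off_t)done);
		if (n <= 0) {
			ret = -1;
			break;
		}
		done += (size_t)n;
	}

	/* The call under test: is the data on stable media once it returns? */
	if (ret == 0 && fdatasync(fd) != 0)
		ret = -1;

	if (close(fd) != 0)
		ret = -1;
	free(buf);
	return ret;
}
```

Calling the helper twice on the same path exercises both cases from the
quoted text: the first call hits unallocated blocks, the second is a pure
overwrite (unless the filesystem allocates lazily or is COW).  The sketch
only issues the syscalls; whether a barrier or cache flush actually
reaches the device has to be observed separately at the block layer.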
