linux-xfs.vger.kernel.org archive mirror
From: Gionatan Danti <g.danti@assyoma.it>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>,
	Amir Goldstein <amir73il@gmail.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	g.danti@assyoma.it
Subject: Re: Reflink (cow) copy of busy files
Date: Wed, 28 Feb 2018 19:27:57 +0100
Message-ID: <358aa0b224b6a7017f1c8af845a3b9bf@assyoma.it>
In-Reply-To: <20180228170737.GL19312@magnolia>

On 28-02-2018 18:07, Darrick J. Wong wrote:
> reflink performs (more or less) a fdatasync of the source and dest file
> before it starts so that any dirty pages backed by delayed allocation
> reservation will be allocated and written to disk, but it doesn't do the
> "force all dirty metadata out to log" action that distinguishes
> fdatasync from fsync.  That is a deliberate design decision because:
> 
> 1) fsync is fairly heavyweight,
> 2) customers might have disposable environments where it is preferable
>    to lose srcfile and destfile over paying performance penalties
>    all the time, and
> 3) if you need srcfile to be completely stable on disk, you needed to
>    call fsync anyway, and nothing prevents you from doing so before
>    calling copy_file_range/clone_file_range if that is part of your
>    operational requirements.
> 
> In other words, if at a certain point you can't afford to lose the
> source file due to a host crash, you have to call fsync, as has been the
> case for ages.  reflink does not itself call fsync, nor does it increase
> the chances of losing any file contents that weren't fsync'd before the
> host went down.

OK, this is exactly what I expected.

To add some context: Qemu/KVM added safe barrier/fsync passing years 
ago, so when a guest issues an fsync+barrier operation (i.e. after key 
operations, such as a journal update or a COMMIT), it is immediately 
passed to the host, which issues a real fsync+barrier on the backing 
file. In other words, the host's writeback cache plays the role of a 
disk's volatile DRAM cache (which needs to be flushed at specific 
points). See: 
https://www.static.linuxfound.org/jp_uploads/JLS2009/jls09_hellwig.pdf

Back to the original question: are guest/user-initiated fsyncs+barriers 
honored even *during a cp --reflink copy*? If so, I can't see any 
shortcoming in using reflink to hot-copy a busy file. Sure, I risk 
losing async writes (those still in the host's writeback cache *or* in 
the disk's unflushed volatile DRAM cache), but this is nothing more (or 
less) than a normal, interrupted copy. Am I right in saying that?

Maybe wrapping the reflink copy between two fsync calls would be a 
good idea?
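
A minimal sketch of that "fsync, reflink, fsync" sequence, assuming Linux 
and Python 3. The function name reflink_between_fsyncs is made up for 
illustration, and the fallback to a plain byte copy on filesystems 
without reflink support is an added assumption, not something discussed 
in this thread. The FICLONE value is the one defined in <linux/fs.h>. 
The sketch does not fsync the parent directory, so it says nothing 
about the durability of the new directory entry.

```python
import errno
import fcntl
import os
import shutil

# ioctl request number from <linux/fs.h>: FICLONE = _IOW(0x94, 9, int)
FICLONE = 0x40049409

# errno values treated here as "reflink not available" (an assumption
# made for this sketch; other errors are re-raised)
_NO_REFLINK = (errno.EOPNOTSUPP, errno.EINVAL, errno.EXDEV,
               errno.ENOTTY, errno.ENOSYS)


def reflink_between_fsyncs(src_path, dst_path):
    """fsync the source, clone it into dst, then fsync the destination.

    Returns True if the data was shared via FICLONE, False if the
    sketch fell back to an ordinary byte copy.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # First fsync: make the source's data and metadata stable
        # before sharing its extents.
        os.fsync(src.fileno())
        try:
            # Ask the filesystem to share src's extents with dst.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            cloned = True
        except OSError as exc:
            if exc.errno not in _NO_REFLINK:
                raise
            # Hypothetical fallback: plain copy when reflink is unsupported.
            shutil.copyfileobj(src, dst)
            cloned = False
        # Second fsync: push the destination's data and metadata to disk.
        dst.flush()
        os.fsync(dst.fileno())
    return cloned
```

Calling reflink_between_fsyncs("vm.img", "vm-copy.img") (both paths are 
placeholders) returns True only when the underlying filesystem actually 
shared the extents.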

Thanks.

-- 
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
