linux-xfs.vger.kernel.org archive mirror
From: Gionatan Danti <g.danti@assyoma.it>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Amir Goldstein <amir73il@gmail.com>,
	Dave Chinner <david@fromorbit.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	g.danti@assyoma.it
Subject: Re: Reflink (cow) copy of busy files
Date: Mon, 26 Feb 2018 22:23:45 +0100
Message-ID: <2f48d103c0a6eff6c0a1136057e828b6@assyoma.it>
In-Reply-To: <20180226172601.GC19312@magnolia>

On 26-02-2018 18:26 Darrick J. Wong wrote:
> The way reflink is supposed to work wrt consistency is:
> 
> 1. lock out all new io/fallocate activity on both inodes (iolock/mmaplock)
> 2. wait for all directio to complete
> 3. fsync both files (write all the dirty pagecache to disk)
> 4. lock both inodes (ilock)
> 5. clone each extent atomically
> 6. unlock ilock
> 7. unlock iolock/mmaplock
> 
> So at least in theory the cloned file will match whatever the host saw
> on disk and page cache at the time the reflink call was initiated.
> I say 'in theory' because there could be bugs.

Great! CoW will be a great addition to XFS once it is considered 
stable.
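
For readers following along: from userspace, the whole sequence quoted above is triggered by a single ioctl on the destination file. A minimal sketch, assuming Linux and a filesystem with reflink support; the helper name `reflink` is hypothetical, and the `FICLONE` value is the one defined in <linux/fs.h>:

```python
import fcntl
import os

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int).
FICLONE = 0x40049409


def reflink(src_path, dst_path):
    """Clone src_path into dst_path by sharing extents (hypothetical helper)."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # One call from userspace; the locking, flushing and extent
            # cloning described in the quoted steps happen inside the kernel.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError:
            # The filesystem cannot reflink (or the files are on different
            # filesystems): do not leave an empty destination behind.
            os.unlink(dst_path)
            raise
```

This is roughly what `cp --reflink=always` asks the kernel to do; the sketch raises OSError instead of falling back to a regular copy.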

> Whatever dirty state is in the guest VM stays in that VM, which means
> that if you only cp --reflink on the host, the clone you get will
> reflect the virtual disk state as if you'd kill -9'd the VM, cloned the
> VM disk, and restarted the VM.  Upon restart the log recovers whatever
> metadata made it out of the VM.

Sure, that is what I mean by "crash-consistent".

> However, if you tell the guest to freeze the fs before cloning (as Dave
> suggested earlier) the guest will flush all its state to the upper level
> (the host) and the host will push all that out to disk before cloning.
> The snapshot you create should be cleaner because you're effectively
> prepaying the recovery costs by flushing everything before taking the
> snapshot.

True, and this is "application-level consistency" (which requires a 
guest agent and possibly even an application-specific agent).
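
The freeze-then-clone ordering can be sketched as below. This assumes a libvirt-managed guest with a guest agent installed, so that `virsh domfsfreeze` and `virsh domfsthaw` work; the function name `snapshot_with_freeze` and the injectable `run` parameter are hypothetical and only there to make the sketch testable:

```python
import subprocess


def snapshot_with_freeze(domain, src, dst, run=subprocess.check_call):
    """Freeze the guest filesystems, reflink the disk image, then thaw.

    Assumes a libvirt guest with a working guest agent (hypothetical helper).
    """
    # Ask the guest to flush its dirty state down to the host and block writes.
    run(["virsh", "domfsfreeze", domain])
    try:
        # Clone on the host while the guest is quiescent.
        run(["cp", "--reflink=always", src, dst])
    finally:
        # Always thaw, even if the clone failed, so the guest is not left frozen.
        run(["virsh", "domfsthaw", domain])
```

The thaw sits in a `finally` clause because a guest left frozen is worse than a failed snapshot.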

> Also note that if the host goes down before returning from the syscall,
> the log will continue on with whichever extent was being cloned at the
> time in order to preserve metadata integrity, but the destination file
> will reflect a partial copy.

Thanks for pointing that out, and for your extremely clear explanation!
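
One way to keep such a partial copy from being mistaken for a complete snapshot is to clone into a temporary name and rename it only once the clone call has returned. This pattern is an assumption of mine, not something proposed in the thread; the names `reflink_clone`, `publish_clone` and the `.partial` suffix are hypothetical:

```python
import fcntl
import os

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int).
FICLONE = 0x40049409


def reflink_clone(src_path, dst_path):
    """Reflink src_path into dst_path; raises OSError if unsupported."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())


def publish_clone(src_path, dst_path, clone=reflink_clone):
    """Clone into a temporary name, then rename it into place."""
    tmp_path = dst_path + ".partial"  # hypothetical naming convention
    try:
        clone(src_path, tmp_path)
        # Make the finished clone durable before exposing it.
        fd = os.open(tmp_path, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
        # Atomic on POSIX: dst_path is either absent/old or the full clone.
        os.replace(tmp_path, dst_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Persist the rename itself by syncing the containing directory.
    dir_fd = os.open(os.path.dirname(os.path.abspath(dst_path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If the host goes down mid-clone, only the `.partial` file can be incomplete, and a cleanup job can delete any leftovers with that suffix after reboot.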


-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@assyoma.it - info@assyoma.it
GPG public key ID: FF5F32A8
