From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Gionatan Danti <g.danti@assyoma.it>
Cc: Amir Goldstein <amir73il@gmail.com>,
Dave Chinner <david@fromorbit.com>,
linux-xfs <linux-xfs@vger.kernel.org>
Subject: Re: Reflink (cow) copy of busy files
Date: Mon, 26 Feb 2018 13:31:02 -0800
Message-ID: <20180226213102.GD19312@magnolia>
In-Reply-To: <2f48d103c0a6eff6c0a1136057e828b6@assyoma.it>

On Mon, Feb 26, 2018 at 10:23:45PM +0100, Gionatan Danti wrote:
> On 26-02-2018 18:26, Darrick J. Wong wrote:
> >The way reflink is supposed to work wrt consistency is:
> >
> >1. lock out all new io/fallocate activity on both inodes (iolock/mmaplock)
> >2. wait for all directio to complete
> >3. fsync both files (write all the dirty pagecache to disk)
> >4. lock both inodes (ilock)
> >5. clone each extent atomically
> >6. unlock ilock
> >7. unlock iolock/mmaplock
> >
> >So at least in theory the cloned file will match whatever the host saw
> >on disk and page cache at the time the reflink call was initiated.
> >I say 'in theory' because there could be bugs.
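
From userspace, the whole locking and flushing sequence quoted above is
triggered by a single clone request. A minimal Python sketch of issuing
one, assuming Linux and the FICLONE ioctl from <linux/fs.h>; the helper
name reflink_or_copy and the fallback to a plain copy are illustrative
assumptions, not something from this thread:

```python
import errno
import fcntl
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int)
FICLONE = 0x40049409

# errno values that mean "this filesystem or kernel cannot reflink here"
_UNSUPPORTED = (errno.EOPNOTSUPP, errno.ENOTTY, errno.EINVAL,
                errno.EXDEV, errno.ENOSYS)


def reflink_or_copy(src, dst):
    """Clone src to dst with FICLONE, falling back to a plain copy.

    Returns True if the extents were shared, False if data was copied.
    (Hypothetical helper, for illustration only.)
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # The kernel does the locking, flushing and extent cloning
            # inside this one call.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError as err:
            if err.errno not in _UNSUPPORTED:
                raise
        # No reflink support: copy the bytes instead.
        shutil.copyfileobj(fsrc, fdst)
        return False
```

The caller never sees the individual steps; it only learns whether the
clone succeeded.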
>
> Great! CoW will be a great addition to XFS once it is considered
> stable.
>
> >Whatever dirty state is in the guest VM stays in that VM, which means
> >that if you only cp --reflink on the host, the clone you get will
> >reflect the virtual disk state as if you'd kill -9'd the VM, cloned the
> >VM disk, and restarted the VM. Upon restart the log recovers whatever
> >metadata made it out of the VM.
>
> Sure, that is what I mean by "crash-consistent".
>
> >However, if you tell the guest to freeze the fs before cloning (as Dave
> >suggested earlier) the guest will flush all its state to the upper level
> >(the host) and the host will push all that out to disk before cloning.
> >The snapshot you create should be cleaner because you're effectively
> >prepaying the recovery costs by flushing everything before taking the
> >snapshot.
>
> True, and this is "application-level consistency" (which requires a guest
> agent and possibly even an application-specific agent).

I believe qemu-ga takes care of guest fs freeze inside the guest,
and you can invoke it from the host via 'virsh domfsfreeze' or the
--quiesce argument to snapshot-create... but you ought to confirm that
for yourself.
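
A rough sketch of that freeze, clone, thaw sequence driven from the host,
in Python. It assumes qemu-ga is running in the guest and that virsh and
GNU cp are on the host's PATH; the function name, the domain name and the
image paths are hypothetical, and the run parameter exists only so the
command runner can be swapped out:

```python
import subprocess


def quiesced_clone(domain, disk, clone, run=subprocess.run):
    """Freeze the guest filesystems, reflink the disk image, then thaw.

    domain, disk and clone are placeholders supplied by the caller.
    """
    run(["virsh", "domfsfreeze", domain], check=True)
    try:
        # --reflink=always makes cp fail instead of silently doing a
        # full data copy when the filesystem cannot share extents.
        run(["cp", "--reflink=always", disk, clone], check=True)
    finally:
        # Thaw even if the copy failed, so the guest is not left frozen.
        run(["virsh", "domfsthaw", domain], check=True)
```

Something like quiesced_clone("guest1", "/images/guest1.img",
"/images/guest1-clone.img") would then be the host-side call, with those
names replaced by whatever the setup actually uses.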

--D

> >Also note that if the host goes down before returning from the syscall,
> >the log will continue on with whichever extent was being cloned at the
> >time in order to preserve metadata integrity, but the destination file
> >will reflect a partial copy.
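
One way to keep such a partial clone from being mistaken for a finished
one is to clone under a temporary name and rename it into place only
after the clone call has returned. A minimal Python sketch, assuming the
source and destination live on the same filesystem; publish_clone, the
".partial" suffix and the clone callable are all hypothetical:

```python
import os


def _fsync_path(path):
    """fsync a file or directory by path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def publish_clone(src, dst, clone):
    """Clone src under a temporary name, then rename it to dst.

    clone is any callable clone(src, tmp) that creates the copy, for
    example a wrapper around cp --reflink.
    """
    tmp = dst + ".partial"
    clone(src, tmp)
    # Make sure the clone is on disk before it gets its final name.
    _fsync_path(tmp)
    os.replace(tmp, dst)  # atomic within a single filesystem
    # Persist the rename itself.
    _fsync_path(os.path.dirname(os.path.abspath(dst)))
```

After a crash, anything still named "*.partial" can simply be deleted and
the clone retried.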
>
> Thanks for pointing that out, and for your extremely clear explanation!
>
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.danti@assyoma.it - info@assyoma.it
> GPG public key ID: FF5F32A8