From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org, linux-doc@vger.kernel.org, corbet@lwn.net
Subject: [PATCH 05/22] docs: add XFS shared data block chapter to DS&A book
Date: Wed, 03 Oct 2018 20:25:49 -0700
Message-ID: <153862354950.27883.8722770456101190838.stgit@magnolia>
In-Reply-To: <153862350727.27883.9408120819569795102.stgit@magnolia>
From: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
.../filesystems/xfs-data-structures/overview.rst | 1
.../filesystems/xfs-data-structures/reflink.rst | 43 ++++++++++++++++++++
2 files changed, 44 insertions(+)
create mode 100644 Documentation/filesystems/xfs-data-structures/reflink.rst
diff --git a/Documentation/filesystems/xfs-data-structures/overview.rst b/Documentation/filesystems/xfs-data-structures/overview.rst
index 457e81c0eb40..d8d668ec6097 100644
--- a/Documentation/filesystems/xfs-data-structures/overview.rst
+++ b/Documentation/filesystems/xfs-data-structures/overview.rst
@@ -45,3 +45,4 @@ latency.
.. include:: self_describing_metadata.rst
.. include:: delayed_logging.rst
+.. include:: reflink.rst
diff --git a/Documentation/filesystems/xfs-data-structures/reflink.rst b/Documentation/filesystems/xfs-data-structures/reflink.rst
new file mode 100644
index 000000000000..653b3def7e6e
--- /dev/null
+++ b/Documentation/filesystems/xfs-data-structures/reflink.rst
@@ -0,0 +1,43 @@
+.. SPDX-License-Identifier: CC-BY-SA-4.0
+
+Sharing Data Blocks
+-------------------
+
+On a traditional filesystem, there is a 1:1 mapping between a logical block
+offset in a file and a physical block on disk, which is to say that physical
+blocks are not shared. However, there are several use cases for sharing
+blocks between files: deduplicating files saves space on archival systems;
+creating space-efficient clones of disk images for virtual machines and
+containers makes datacenters more efficient; and deferring the allocation
+cost of a filesystem tree copy for as long as possible makes regular work
+faster. In all of these cases, a write to one of the shared
+copies **must** not affect the other shared copies, which means that writes to
+shared blocks must employ a copy-on-write strategy. Sharing blocks in this
+manner is commonly referred to as "reflinking".
+
+XFS implements block sharing in a fairly straightforward manner. All existing
+data fork structures remain unchanged, save for the addition of a
+per-allocation group `reference count B+tree <#reference-count-b-tree>`__. This
+data structure tracks reference counts for all shared physical blocks, with a
+few rules to maintain compatibility with existing code: If a block is free, it
+will be tracked in the free space B+trees. If a block is owned by a single
+file, it appears in neither the free space nor the reference count B+trees. If
+a block is shared, it will appear in the reference count B+tree with a
+reference count >= 2. The first two cases are established precedent in XFS, so
+the third case is the only behavioral change.
+
+When a filesystem block is shared, the block mapping in the destination file
+is updated to point to that filesystem block and the reference count B+tree
+records are updated to reflect the increased reference count. If a shared
+block is written, a new block will be allocated, the dirty data written to
+this new block, and the file’s block mapping updated to point to the new
+block. If a shared block is unmapped, the reference count records are updated
+to reflect the decreased reference count and the block is also freed if its
+reference count becomes zero. This enables users to create space-efficient
+clones of disk images and to copy filesystem subtrees quickly using standard
+Linux coreutils tools (for example, ``cp --reflink``).
+
+Deduplication employs the same mechanism to share blocks and copy them at
+write time. However, the kernel confirms that the contents of both files are
+identical before updating the destination file’s mapping. This enables XFS to
+be used by userspace deduplication programs such as duperemove.
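
Not part of the patch: the three block-state rules and the share, write,
unmap, and dedupe transitions described in the chapter can be illustrated
with a toy in-memory model. This is only a sketch under stated assumptions,
not XFS code. The class ToyAG and all of its method and field names are
hypothetical; a Python set stands in for the free space B+trees and a dict
for the reference count B+tree. The model also assumes that a copy-on-write
drops the writer's reference on the old block, which the text implies but
does not spell out.

```python
from collections import defaultdict


class ToyAG:
    """Toy model of one allocation group (hypothetical, not XFS code)."""

    def __init__(self, nblocks):
        self.free = set(range(nblocks))  # stand-in for the free space B+trees
        self.refcount = {}               # stand-in for the refcount B+tree (counts >= 2 only)
        self.files = defaultdict(dict)   # file name -> {logical offset: physical block}
        self.data = {}                   # physical block -> contents

    def _refs(self, pblk):
        # Free: 0 owners. In neither structure: exactly 1 owner. Otherwise: shared.
        if pblk in self.free:
            return 0
        return self.refcount.get(pblk, 1)

    def _set_refs(self, pblk, n):
        # Record the block in exactly one place according to its owner count.
        self.refcount.pop(pblk, None)
        self.free.discard(pblk)
        if n == 0:
            self.free.add(pblk)
            self.data.pop(pblk, None)
        elif n >= 2:
            self.refcount[pblk] = n

    def write(self, name, off, payload):
        old = self.files[name].get(off)
        if old is not None and self._refs(old) == 1:
            self.data[old] = payload     # sole owner: overwrite in place
            return
        new = min(self.free)             # allocate; running out of space is not modeled
        self._set_refs(new, 1)
        self.data[new] = payload
        self.files[name][off] = new      # remap the file to the new block
        if old is not None:              # copy-on-write: drop our reference on the old block
            self._set_refs(old, self._refs(old) - 1)

    def unmap(self, name, off):
        pblk = self.files[name].pop(off)
        self._set_refs(pblk, self._refs(pblk) - 1)   # freed when the count reaches zero

    def reflink(self, src, dst, off):
        pblk = self.files[src][off]
        if self.files[dst].get(off) == pblk:
            return                       # already sharing this block
        if off in self.files[dst]:
            self.unmap(dst, off)
        self.files[dst][off] = pblk
        self._set_refs(pblk, self._refs(pblk) + 1)

    def dedupe(self, src, dst, off):
        a, b = self.files[src][off], self.files[dst][off]
        if a != b and self.data[a] != self.data[b]:
            return False                 # contents differ: leave the mapping alone
        self.reflink(src, dst, off)
        return True


ag = ToyAG(8)
ag.write("a", 0, b"hello")
ag.reflink("a", "b", 0)
assert ag.refcount == {0: 2}             # shared block appears with count 2
ag.write("b", 0, b"world")               # copy-on-write to a new block
assert ag.data[ag.files["a"][0]] == b"hello"
assert ag.refcount == {}                 # both blocks are singly owned again
```

The point of the model is that each block is recorded in exactly one place
(the free set, the refcount dict, or neither), which mirrors the rule that
only the shared case is new behavior.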