linux-fsdevel.vger.kernel.org archive mirror
* [RFCv3 00/58] xfs: add reverse-mapping, reflink, and dedupe support
@ 2015-10-07  4:54 Darrick J. Wong
From: Darrick J. Wong @ 2015-10-07  4:54 UTC (permalink / raw)
  To: david, darrick.wong; +Cc: linux-fsdevel, xfs

Hi all,

This is the third revision of an RFC for adding to XFS kernel support
for tracking reverse-mappings of physical blocks to file and metadata;
and support for mapping multiple file logical blocks to the same
physical block, more commonly known as reflinking.  Given the
significant amount of re-engineering required to make the initial rmap
implementation compatible with reflink, I decided to publish both
features as an integrated patchset off of upstream.  This means that
rmap and reflink are now compatible with each other.

Dave Chinner's initial rmap implementation featured a simple b+tree
containing (_physical block_, blockcount, owner) records and enough
code to stuff the rmap btree (rmapbt) whenever a block was allocated
or freed.  However, a generic reflink implementation requires the
ability to map a block to any logical block offset in any file.
Therefore it is necessary to expand the rmapbt record definition to be
(_physical block_, _owner_, _offset_, blockcount) to maintain uniquely
identifiable records.  The upper two bits of the offset field are used
to flag attr fork records and bmbt block records, respectively.  The
highest bit of the blockcount is used to indicate an unwritten extent.
The intent is that the rmapbt will eventually be used to reconstruct
a corrupt block map btree (bmbt).
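
As a rough illustration of the record layout described above, here is
a minimal sketch in C.  The struct name, the field widths (64-bit
offset, 32-bit blockcount), and the macro and helper names are all
assumptions made for this example; they are not taken from the patches
and do not describe the on-disk format.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical in-memory rmap record.  Field widths are assumed for
 * illustration only; this is not the on-disk format.
 */
struct rmap_rec {
	uint64_t startblock;	/* first physical block of the extent */
	uint64_t owner;		/* inode number or metadata owner */
	uint64_t offset;	/* logical offset; top two bits are flags */
	uint32_t blockcount;	/* extent length; top bit flags unwritten */
};

/* Upper two bits of the offset: attr fork and bmbt block, respectively. */
#define RMAP_OFF_ATTR_FORK	(UINT64_C(1) << 63)
#define RMAP_OFF_BMBT_BLOCK	(UINT64_C(1) << 62)
#define RMAP_OFF_MASK		(~(RMAP_OFF_ATTR_FORK | RMAP_OFF_BMBT_BLOCK))

/* Highest bit of the blockcount: unwritten extent. */
#define RMAP_LEN_UNWRITTEN	(UINT32_C(1) << 31)
#define RMAP_LEN_MASK		(~RMAP_LEN_UNWRITTEN)

/* Logical offset with the flag bits stripped. */
static inline uint64_t rmap_offset(const struct rmap_rec *r)
{
	return r->offset & RMAP_OFF_MASK;
}

/* Extent length with the unwritten flag stripped. */
static inline uint32_t rmap_blockcount(const struct rmap_rec *r)
{
	return r->blockcount & RMAP_LEN_MASK;
}

static inline bool rmap_is_attr_fork(const struct rmap_rec *r)
{
	return (r->offset & RMAP_OFF_ATTR_FORK) != 0;
}

static inline bool rmap_is_bmbt_block(const struct rmap_rec *r)
{
	return (r->offset & RMAP_OFF_BMBT_BLOCK) != 0;
}

static inline bool rmap_is_unwritten(const struct rmap_rec *r)
{
	return (r->blockcount & RMAP_LEN_UNWRITTEN) != 0;
}
```

Packing the flags into spare high bits keeps the record the same size
while still letting the (physical block, owner, offset) triple
identify a record.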

The reflink implementation features a simple b+tree containing
(_physical block_, blockcount, refcount) records to track the
reference counts of extents of physical blocks.  There's also support
code to provide the desired copy-on-write behavior, userland
interfaces to reflink files and query their reflink status, and a new
fallocate mode to un-reflink parts of files.
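
To show how such records could drive the copy-on-write decision, here
is a minimal sketch that looks up a block in a sorted array standing
in for the refcount btree.  The struct, the function names, and the
convention that a block with no record has a refcount of 1 are
assumptions of this example, not details taken from the patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical refcount record; widths are illustrative only. */
struct refc_rec {
	uint64_t startblock;	/* first physical block of the extent */
	uint32_t blockcount;	/* extent length in blocks */
	uint32_t refcount;	/* number of owners sharing the extent */
};

/*
 * Return the refcount of physical block bno.  recs[] must be sorted by
 * startblock and non-overlapping.  Assumption: a block that no record
 * covers has exactly one owner.
 */
static uint32_t refc_lookup(const struct refc_rec *recs, size_t n,
			    uint64_t bno)
{
	size_t lo = 0, hi = n;

	/* Binary search for the first record that ends after bno. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].startblock + recs[mid].blockcount <= bno)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo < n && recs[lo].startblock <= bno)
		return recs[lo].refcount;
	return 1;
}

/* A write to a shared block must be redirected to a new block. */
static bool needs_cow(const struct refc_rec *recs, size_t n, uint64_t bno)
{
	return refc_lookup(recs, n, bno) > 1;
}
```

A write path built on this would call something like needs_cow() for
each block it is about to overwrite and allocate a fresh block when it
returns true.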

For single-owner blocks (i.e. metadata) the rmapbt records are still
managed at alloc/free time.  To enable reflink and rmap at the same
time, however, it becomes necessary to manage rmapbt records for file
extents at map/unmap time.  In the current implementation, file extent
records exactly mirror bmbt contents.  It should be easy to merge
file extent rmaps on non-reflink filesystems, but that is not yet
written.  In theory merging can happen for file extent rmaps on
reflink filesystems too, but that could involve a lot of searching
through the tree since records are not indexed on the last physical
block of the extent.
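
For the unwritten merging step, one plausible merge test is sketched
below.  The record layout and the rule itself (same owner, same flags,
physically and logically contiguous) are my assumptions about what
"merge" would mean here; nothing like this exists in the patchset yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical file-extent rmap record; names and widths are illustrative. */
struct rmap_ext {
	uint64_t startblock;	/* first physical block */
	uint64_t owner;		/* owning inode */
	uint64_t offset;	/* first logical block in the file */
	uint32_t blockcount;	/* length in blocks */
	uint32_t flags;		/* attr fork / bmbt / unwritten flags */
};

/*
 * Two records can be merged into one if the right record continues the
 * left one both on disk and in the file, for the same owner and with
 * the same flags.
 */
static bool rmap_can_merge(const struct rmap_ext *left,
			   const struct rmap_ext *right)
{
	return left->owner == right->owner &&
	       left->flags == right->flags &&
	       left->startblock + left->blockcount == right->startblock &&
	       left->offset + left->blockcount == right->offset;
}
```

The test itself is cheap; the cost on reflink filesystems is in
finding the left-hand candidate, since a record that ends just before
a given block cannot be looked up directly by its last block.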

The ioctl interface to XFS reflink looks surprisingly like the btrfs
ioctl interface -- you can reflink a file, reflink subranges
of a file, or dedupe subranges of files.  To un-reflink a file, I'm
proposing a new fallocate flag which will (try to) fork all shared
blocks within a certain file range.  xfs_fsr is a better candidate
for de-reflinking a file since it also defragments the file; the
extent swap ioctl has also been upgraded (crappily) to support
updating the rmapbt as needed.
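
For anyone who wants to poke at whole-file reflink from userland,
here is a minimal sketch.  It assumes the clone ioctl keeps the btrfs
ioctl number, which mainline kernels later exposed as FICLONE in
<linux/fs.h>; the fallback define covers older headers.  The
clone_file() helper is a name made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

#ifndef FICLONE
/* Same number as BTRFS_IOC_CLONE; assumed to be shared by XFS. */
#define FICLONE _IOW(0x94, 9, int)
#endif

/*
 * Make dst share all of src's data blocks.  Returns 0 on success or a
 * negative errno value on failure.
 */
static int clone_file(const char *src, const char *dst)
{
	int ret = 0;
	int sfd, dfd;

	sfd = open(src, O_RDONLY);
	if (sfd < 0)
		return -errno;

	dfd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (dfd < 0) {
		ret = -errno;
		close(sfd);
		return ret;
	}

	/* The ioctl is issued on the destination; the argument is the source fd. */
	if (ioctl(dfd, FICLONE, sfd) < 0)
		ret = -errno;

	close(dfd);
	close(sfd);
	return ret;
}
```

On a filesystem without reflink support the ioctl fails with an error
such as EOPNOTSUPP or ENOTTY, so callers should be ready to fall back
to a regular copy.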

The patch set is based on the current (4.3-rc4) upstream kernel.
There are plenty of bugs in this code; in particular the copy-on-write
code is still terrible and prone to all sorts of amusing crashes.
There are too many patches to discuss individually, but they are
grouped by subject area:

0. Cleanups
1. rmapbt support
2. Re-engineering rmapbt to support reflink
3. refcntbt support
4. Implement the data block sharing pieces of reflink

Issues:

 * The toy CoW implementation exists as a single-threaded workqueue(!).
In talking with Dave Chinner, I get the sense that he sees CoW as a
natural extension of a reworked XFS write path that doesn't use buffer
heads.  That work hasn't landed, so I've only put enough effort into
fixing the CoW so that it can (barely) pass the associated xfstests.
In the future, a CoW block being written would simply become a
delalloc extent and the process of allocating the delalloc extent
would merely have to know to unmap whatever's there first.

 * The extent swapping ioctl now allocates a bigger fixed-size
transaction.  That's most likely a stupid thing to do, so getting a
better grip on how the journalling code works and auditing all the new
transaction users will have to happen.  Right now it mostly gets
lucky.

 * Don't hit ENOSPC; running out of space is not handled well yet.
This should get fixed up once we start using delalloc.

 * We'll want to hook into copy_file_range once it appears in a
kernel release.

If you're going to start using this mess, you probably ought to just
pull from my github trees for kernel[1], xfsprogs[2], and xfstests[3].

This is an extraordinary way to eat your data.  Enjoy!

Comments and questions are, as always, welcome.

--D

[1] https://github.com/djwong/linux-xfs-dev/commits/master
[2] https://github.com/djwong/xfsprogs/commits/for-next
[3] https://github.com/djwong/xfstests/commits/master

Thread overview: 67+ messages
2015-10-07  4:54 [RFCv3 00/58] xfs: add reverse-mapping, reflink, and dedupe support Darrick J. Wong
2015-10-07  4:54 ` [PATCH 01/58] libxfs: make xfs_alloc_fix_freelist non-static Darrick J. Wong
2015-10-07  4:54 ` [PATCH 02/58] xfs: fix log ticket type printing Darrick J. Wong
2015-10-07  4:55 ` [PATCH 03/58] xfs: introduce rmap btree definitions Darrick J. Wong
2015-10-07  4:55 ` [PATCH 04/58] xfs: add rmap btree stats infrastructure Darrick J. Wong
2015-10-07  4:55 ` [PATCH 05/58] xfs: rmap btree add more reserved blocks Darrick J. Wong
2015-10-07  4:55 ` [PATCH 06/58] xfs: add owner field to extent allocation and freeing Darrick J. Wong
2015-10-07  4:55 ` [PATCH 07/58] xfs: add extended " Darrick J. Wong
2015-10-07  4:55 ` [PATCH 08/58] xfs: introduce rmap extent operation stubs Darrick J. Wong
2015-10-07  4:55 ` [PATCH 09/58] xfs: extend rmap extent operation stubs to take full owner info Darrick J. Wong
2015-10-07  4:55 ` [PATCH 10/58] xfs: define the on-disk rmap btree format Darrick J. Wong
2015-10-07  4:55 ` [PATCH 11/58] xfs: enhance " Darrick J. Wong
2015-10-07  4:56 ` [PATCH 12/58] xfs: add rmap btree growfs support Darrick J. Wong
2015-10-07  4:56 ` [PATCH 13/58] xfs: enhance " Darrick J. Wong
2015-10-07  4:56 ` [PATCH 14/58] xfs: rmap btree transaction reservations Darrick J. Wong
2015-10-07  4:56 ` [PATCH 15/58] xfs: rmap btree requires more reserved free space Darrick J. Wong
2015-10-07  4:56 ` [PATCH 16/58] libxfs: fix min freelist length calculation Darrick J. Wong
2015-10-07  4:56 ` [PATCH 17/58] xfs: add rmap btree operations Darrick J. Wong
2015-10-07  4:57 ` [PATCH 18/58] xfs: enhance " Darrick J. Wong
2015-10-07  4:57 ` [PATCH 19/58] xfs: add an extent to the rmap btree Darrick J. Wong
2015-10-07  4:57 ` [PATCH 20/58] xfs: add tracepoints for the rmap-mirrors-bmbt functions Darrick J. Wong
2015-10-07  4:57 ` [PATCH 21/58] xfs: teach rmap_alloc how to deal with our larger rmap btree Darrick J. Wong
2015-10-07  4:57 ` [PATCH 22/58] xfs: remove an extent from the " Darrick J. Wong
2015-10-07  4:57 ` [PATCH 23/58] xfs: enhanced " Darrick J. Wong
2015-10-07  4:57 ` [PATCH 24/58] xfs: add rmap btree insert and delete helpers Darrick J. Wong
2015-10-07  4:57 ` [PATCH 25/58] xfs: bmap btree changes should update rmap btree Darrick J. Wong
2015-10-21 21:39   ` Darrick J. Wong
2015-10-07  4:57 ` [PATCH 26/58] xfs: add rmap btree geometry feature flag Darrick J. Wong
2015-10-07  4:58 ` [PATCH 27/58] xfs: add rmap btree block detection to log recovery Darrick J. Wong
2015-10-07  4:58 ` [PATCH 28/58] xfs: enable the rmap btree functionality Darrick J. Wong
2015-10-07  4:58 ` [PATCH 29/58] xfs: disable XFS_IOC_SWAPEXT when rmap btree is enabled Darrick J. Wong
2015-10-07  4:58 ` [PATCH 30/58] xfs: implement " Darrick J. Wong
2015-10-07  4:58 ` [PATCH 31/58] libxfs: refactor short btree block verification Darrick J. Wong
2015-10-07  4:58 ` [PATCH 32/58] xfs: don't update rmapbt when fixing agfl Darrick J. Wong
2015-10-07  4:58 ` [PATCH 33/58] xfs: introduce refcount btree definitions Darrick J. Wong
2015-10-07  4:58 ` [PATCH 34/58] xfs: add refcount btree stats infrastructure Darrick J. Wong
2015-10-07  4:58 ` [PATCH 35/58] xfs: refcount btree add more reserved blocks Darrick J. Wong
2015-10-07  4:59 ` [PATCH 36/58] xfs: define the on-disk refcount btree format Darrick J. Wong
2015-10-07  4:59 ` [PATCH 37/58] xfs: define tracepoints for refcount/reflink activities Darrick J. Wong
2015-10-07  4:59 ` [PATCH 38/58] xfs: add refcount btree support to growfs Darrick J. Wong
2015-10-07  4:59 ` [PATCH 39/58] xfs: add refcount btree operations Darrick J. Wong
2015-10-07  4:59 ` [PATCH 40/58] libxfs: adjust refcount of an extent of blocks in refcount btree Darrick J. Wong
2015-10-27 19:05   ` Darrick J. Wong
2015-10-30 20:56     ` Darrick J. Wong
2015-10-07  4:59 ` [PATCH 41/58] libxfs: adjust refcount when unmapping file blocks Darrick J. Wong
2015-10-07  4:59 ` [PATCH 42/58] xfs: add refcount btree block detection to log recovery Darrick J. Wong
2015-10-07  4:59 ` [PATCH 43/58] xfs: map an inode's offset to an exact physical block Darrick J. Wong
2015-10-07  4:59 ` [PATCH 44/58] xfs: add reflink feature flag to geometry Darrick J. Wong
2015-10-07  5:00 ` [PATCH 45/58] xfs: create a separate workqueue for copy-on-write activities Darrick J. Wong
2015-10-07  5:00 ` [PATCH 46/58] xfs: implement copy-on-write for reflinked blocks Darrick J. Wong
2015-10-07  5:00 ` [PATCH 47/58] xfs: handle directio " Darrick J. Wong
2015-10-07  5:00 ` [PATCH 48/58] xfs: copy-on-write reflinked blocks when zeroing ranges of blocks Darrick J. Wong
2015-10-21 21:17   ` Darrick J. Wong
2015-10-07  5:00 ` [PATCH 49/58] xfs: clear inode reflink flag when freeing blocks Darrick J. Wong
2015-10-07  5:00 ` [PATCH 50/58] xfs: reflink extents from one file to another Darrick J. Wong
2015-10-07  5:12   ` kbuild test robot
2015-10-07  5:00 ` [PATCH 51/58] xfs: add clone file and clone range ioctls Darrick J. Wong
2015-10-07  5:13   ` kbuild test robot
2015-10-07  6:46   ` kbuild test robot
2015-10-07  7:35   ` kbuild test robot
2015-10-07  5:00 ` [PATCH 52/58] xfs: emulate the btrfs dedupe extent same ioctl Darrick J. Wong
2015-10-07  5:00 ` [PATCH 53/58] xfs: teach fiemap about reflink'd extents Darrick J. Wong
2015-10-07  5:01 ` [PATCH 54/58] xfs: swap inode reflink flags when swapping inode extents Darrick J. Wong
2015-10-07  5:01 ` [PATCH 55/58] vfs: add a FALLOC_FL_UNSHARE mode to fallocate to unshare a range of blocks Darrick J. Wong
2015-10-07  5:01 ` [PATCH 56/58] xfs: unshare a range of blocks via fallocate Darrick J. Wong
2015-10-07  5:01 ` [PATCH 57/58] xfs: support XFS_XFLAG_REFLINK (and FS_NOCOW_FL) on reflink filesystems Darrick J. Wong
2015-10-07  5:01 ` [PATCH 58/58] xfs: recognize the reflink feature bit Darrick J. Wong
