* [RFC PATCH v9 00/16] Online(inband) data deduplication
@ 2014-04-09  7:08 Liu Bo
From: Liu Bo @ 2014-04-09  7:08 UTC
  To: linux-btrfs
  Cc: Marcel Ritter, Christian Robert, alanqk, Konstantinos Skarlatos,
	David Sterba, Martin Steigerwald, Josef Bacik, Chris Mason

Hello,

This is the 9th attempt at in-band data dedupe.

Data deduplication is a specialized data compression technique for eliminating
duplicate copies of repeating data.[1]

This patch set is also related to "Content based storage" in project ideas[2].
It introduces inband data deduplication for btrfs; dedup/dedupe is used for short.
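
To make the idea concrete, here is a minimal userspace sketch of hash-indexed,
in-band dedup. It is not code from this patch set: the names toy_store, toy_hash
and toy_write, the 8-byte block size and the FNV-1a hash are all assumptions
chosen for illustration. Each incoming block is hashed before it is stored; on a
hash hit the existing copy gains a reference instead of a new block being written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8	/* toy value, far smaller than any real dedup unit */
#define MAX_BLOCKS 16	/* capacity of the toy store */

/* Hypothetical in-memory store: unique blocks, their hashes and refcounts. */
struct toy_store {
	uint8_t  data[MAX_BLOCKS][BLOCK_SIZE];
	uint64_t hash[MAX_BLOCKS];
	unsigned refs[MAX_BLOCKS];
	int      nr_blocks;
};

/* 64-bit FNV-1a, standing in for whatever hash a real implementation uses. */
static uint64_t toy_hash(const uint8_t *buf, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Write one BLOCK_SIZE block "in-band": look the hash up before storing.
 * Returns the index of the block that holds the data, or -1 if the store
 * is full. A match is decided on the hash alone (an assumption of this toy).
 */
static int toy_write(struct toy_store *s, const uint8_t *buf)
{
	uint64_t h;

	assert(s && buf);
	h = toy_hash(buf, BLOCK_SIZE);

	for (int i = 0; i < s->nr_blocks; i++) {
		if (s->hash[i] == h) {
			s->refs[i]++;	/* duplicate: share the existing block */
			return i;
		}
	}
	if (s->nr_blocks == MAX_BLOCKS)
		return -1;

	/* First time this content is seen: store it and index its hash. */
	memcpy(s->data[s->nr_blocks], buf, BLOCK_SIZE);
	s->hash[s->nr_blocks] = h;
	s->refs[s->nr_blocks] = 1;
	return s->nr_blocks++;
}
```

Writing the same block twice through toy_write() returns the same index and
leaves nr_blocks unchanged; only the reference count grows.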

* PATCH 1 is a speed-up concerning dedup and quota.

* PATCH 2-5 are the preparation work for the dedup implementation.

* PATCH 6 shows how we implement the dedup feature.

* PATCH 7 fixes a backref walking bug with dedup.

* PATCH 8 fixes a free space bug of dedup extents on error handling.

* PATCH 9 adds the ioctl to control the dedup feature.

* PATCH 10 targets the delayed refs' scalability problem of deleting refs, which
  is exposed by the dedup feature.

* PATCH 11-16 fix dedupe bugs, including a race, a deadlock, an abnormal
  transaction abort and a crash.

* The btrfs-progs patch (PATCH 17) gives all the details about how to control
  the dedup feature on the progs side.

I've tested this with xfstests by adding an inline dedup 'enable & on' to
xfstests' mount and scratch_mount.


***NOTE***
Known bugs:
* Mounting with the "flushoncommit" option and enabling the dedupe feature will
  end up in a _deadlock_.


TODO:
* a bit-to-bit comparison callback.
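
As a rough illustration of what such a callback could look like, the sketch
below accepts a hash match only after an optional byte-for-byte check. This is
a userspace toy under stated assumptions, not the planned kernel interface: the
names toy_verify_fn, toy_verify_bytes and toy_can_share are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns true if both buffers hold equal data. */
typedef bool (*toy_verify_fn)(const void *candidate, const void *existing,
			      size_t len);

/* Byte-for-byte comparison, the simplest possible verifier. */
static bool toy_verify_bytes(const void *candidate, const void *existing,
			     size_t len)
{
	return memcmp(candidate, existing, len) == 0;
}

/*
 * Decide whether new data may share an existing extent. Different hashes
 * never share. Equal hashes share only if the verifier agrees; with no
 * verifier (NULL) the hash match alone is trusted.
 */
static bool toy_can_share(unsigned long long h_new, unsigned long long h_old,
			  const void *new_data, const void *old_data,
			  size_t len, toy_verify_fn verify)
{
	assert(new_data && old_data);

	if (h_new != h_old)
		return false;
	return verify ? verify(new_data, old_data, len) : true;
}
```

Passing NULL as the verifier gives hash-only matching, while passing
toy_verify_bytes rejects two different buffers even when their hashes are equal.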

All comments are welcome!


[1]: http://en.wikipedia.org/wiki/Data_deduplication
[2]: https://btrfs.wiki.kernel.org/index.php/Project_ideas#Content_based_storage

v9:
- fix a deadlock and a crash reported by users.
- fix the metadata ENOSPC problem with dedup again.

v8:
- fix the race crash of dedup ref again.
- fix the metadata ENOSPC problem with dedup.

v7:
- rebase onto the latest btrfs
- break a big patch into smaller ones to make reviewers happy.
- kill mount options of dedup and use ioctl method instead.
- fix two crashes due to the special dedup ref

For former patch sets:
v6: http://thread.gmane.org/gmane.comp.file-systems.btrfs/27512
v5: http://thread.gmane.org/gmane.comp.file-systems.btrfs/27257
v4: http://thread.gmane.org/gmane.comp.file-systems.btrfs/25751
v3: http://comments.gmane.org/gmane.comp.file-systems.btrfs/25433
v2: http://comments.gmane.org/gmane.comp.file-systems.btrfs/24959

Liu Bo (16):
  Btrfs: disable qgroups accounting when quata_enable is 0
  Btrfs: introduce dedup tree and relatives
  Btrfs: introduce dedup tree operations
  Btrfs: introduce dedup state
  Btrfs: make ordered extent aware of dedup
  Btrfs: online(inband) data dedup
  Btrfs: skip dedup reference during backref walking
  Btrfs: don't return space for dedup extent
  Btrfs: add ioctl of dedup control
  Btrfs: improve the delayed refs process in rm case
  Btrfs: fix a crash of dedup ref
  Btrfs: fix deadlock of dedup work
  Btrfs: fix transactin abortion in __btrfs_free_extent
  Btrfs: fix wrong pinned bytes in __btrfs_free_extent
  Btrfs: use total_bytes instead of bytes_used for global_rsv
  Btrfs: fix dedup enospc problem

 fs/btrfs/backref.c           |   9 +
 fs/btrfs/ctree.c             |   2 +-
 fs/btrfs/ctree.h             |  86 ++++++
 fs/btrfs/delayed-ref.c       |  26 +-
 fs/btrfs/delayed-ref.h       |   3 +
 fs/btrfs/disk-io.c           |  37 +++
 fs/btrfs/extent-tree.c       | 235 +++++++++++++---
 fs/btrfs/extent_io.c         |  22 +-
 fs/btrfs/extent_io.h         |  16 ++
 fs/btrfs/file-item.c         | 244 +++++++++++++++++
 fs/btrfs/inode.c             | 635 ++++++++++++++++++++++++++++++++++++++-----
 fs/btrfs/ioctl.c             | 167 ++++++++++++
 fs/btrfs/ordered-data.c      |  44 ++-
 fs/btrfs/ordered-data.h      |  13 +-
 fs/btrfs/qgroup.c            |   3 +
 fs/btrfs/relocation.c        |   3 +
 fs/btrfs/transaction.c       |  41 +++
 fs/btrfs/transaction.h       |   1 +
 include/trace/events/btrfs.h |   3 +-
 include/uapi/linux/btrfs.h   |  11 +
 20 files changed, 1470 insertions(+), 131 deletions(-)

-- 
1.8.2.1

* [RFC PATCH v10 00/16] Online(inband) data deduplication
@ 2014-04-10  3:48 Liu Bo
From: Liu Bo @ 2014-04-10  3:48 UTC
  To: linux-btrfs
  Cc: Marcel Ritter, Christian Robert, alanqk, Konstantinos Skarlatos,
	David Sterba, Martin Steigerwald, Josef Bacik, Chris Mason

Hello,

This is the 10th attempt at in-band data dedupe, based on the Linux _3.14_ kernel.

Data deduplication is a specialized data compression technique for eliminating
duplicate copies of repeating data.[1]

This patch set is also related to "Content based storage" in project ideas[2].
It introduces inband data deduplication for btrfs; dedup/dedupe is used for short.

* PATCH 1 is a speed-up concerning dedup and quota.

* PATCH 2-5 are the preparation work for the dedup implementation.

* PATCH 6 shows how we implement the dedup feature.

* PATCH 7 fixes a backref walking bug with dedup.

* PATCH 8 fixes a free space bug of dedup extents on error handling.

* PATCH 9 adds the ioctl to control the dedup feature.

* PATCH 10 targets the delayed refs' scalability problem of deleting refs, which
  is exposed by the dedup feature.

* PATCH 11-16 fix dedupe bugs, including a race, a deadlock, an abnormal
  transaction abort and a crash.

* The btrfs-progs patch (PATCH 17) gives all the details about how to control
  the dedup feature on the progs side.

I've tested this with xfstests by adding an inline dedup 'enable & on' to
xfstests' mount and scratch_mount.


***NOTE***
Known bugs:
* Mounting with the "flushoncommit" option and enabling the dedupe feature will
  end up in a _deadlock_.


TODO:
* a bit-to-bit comparison callback.

All comments are welcome!


[1]: http://en.wikipedia.org/wiki/Data_deduplication
[2]: https://btrfs.wiki.kernel.org/index.php/Project_ideas#Content_based_storage

v10:
- fix a typo in the subject line.
- update struct 'btrfs_ioctl_dedup_args' on the kernel side to fix
  'Inappropriate ioctl for device'.

v9:
- fix a deadlock and a crash reported by users.
- fix the metadata ENOSPC problem with dedup again.

v8:
- fix the race crash of dedup ref again.
- fix the metadata ENOSPC problem with dedup.

v7:
- rebase onto the latest btrfs
- break a big patch into smaller ones to make reviewers happy.
- kill mount options of dedup and use ioctl method instead.
- fix two crashes due to the special dedup ref

For former patch sets:
v6: http://thread.gmane.org/gmane.comp.file-systems.btrfs/27512
v5: http://thread.gmane.org/gmane.comp.file-systems.btrfs/27257
v4: http://thread.gmane.org/gmane.comp.file-systems.btrfs/25751
v3: http://comments.gmane.org/gmane.comp.file-systems.btrfs/25433
v2: http://comments.gmane.org/gmane.comp.file-systems.btrfs/24959

Liu Bo (16):
  Btrfs: disable qgroups accounting when quota_enable is 0
  Btrfs: introduce dedup tree and relatives
  Btrfs: introduce dedup tree operations
  Btrfs: introduce dedup state
  Btrfs: make ordered extent aware of dedup
  Btrfs: online(inband) data dedup
  Btrfs: skip dedup reference during backref walking
  Btrfs: don't return space for dedup extent
  Btrfs: add ioctl of dedup control
  Btrfs: improve the delayed refs process in rm case
  Btrfs: fix a crash of dedup ref
  Btrfs: fix deadlock of dedup work
  Btrfs: fix transactin abortion in __btrfs_free_extent
  Btrfs: fix wrong pinned bytes in __btrfs_free_extent
  Btrfs: use total_bytes instead of bytes_used for global_rsv
  Btrfs: fix dedup enospc problem

 fs/btrfs/backref.c           |   9 +
 fs/btrfs/ctree.c             |   2 +-
 fs/btrfs/ctree.h             |  86 ++++++
 fs/btrfs/delayed-ref.c       |  26 +-
 fs/btrfs/delayed-ref.h       |   3 +
 fs/btrfs/disk-io.c           |  37 +++
 fs/btrfs/extent-tree.c       | 235 +++++++++++++---
 fs/btrfs/extent_io.c         |  22 +-
 fs/btrfs/extent_io.h         |  16 ++
 fs/btrfs/file-item.c         | 244 +++++++++++++++++
 fs/btrfs/inode.c             | 635 ++++++++++++++++++++++++++++++++++++++-----
 fs/btrfs/ioctl.c             | 167 ++++++++++++
 fs/btrfs/ordered-data.c      |  44 ++-
 fs/btrfs/ordered-data.h      |  13 +-
 fs/btrfs/qgroup.c            |   3 +
 fs/btrfs/relocation.c        |   3 +
 fs/btrfs/transaction.c       |  41 +++
 fs/btrfs/transaction.h       |   1 +
 include/trace/events/btrfs.h |   3 +-
 include/uapi/linux/btrfs.h   |  12 +
 20 files changed, 1471 insertions(+), 131 deletions(-)

-- 
1.8.2.1


