* [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
@ 2014-04-28 22:03 Vasily Tarasov
  2014-04-29  6:23 ` Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Vasily Tarasov @ 2014-04-28 22:03 UTC
  To: dm-devel; +Cc: Christoph Hellwig, Philip Shilane, Sonam Mandal, Erez Zadok

This is a request for comments for Dmdedup.

Dmdedup is a device-mapper deduplication target.  Every write coming to the
Dmdedup instance is deduplicated against previously written data.  For
datasets that contain many duplicates scattered across the disk (e.g.,
collections of virtual machine disk images and backups), deduplication
provides significant space savings.

To quickly identify duplicates, Dmdedup maintains an index of hashes for all
written blocks.  A block is a user-configurable unit of deduplication with a
default block size of 4KB.  Dmdedup's index, along with other deduplication
metadata, resides on a separate block device, which we refer to as a
metadata device.  Although the metadata device can be any block device,
e.g., an HDD or a partition of its own, we recommend storing metadata on an
SSD for higher performance.

Dmdedup is designed to support pluggable metadata backends.  A metadata
backend is responsible for storing metadata: LBN-to-PBN and HASH-to-PBN
mappings, allocation maps, and reference counters.  (LBN: Logical Block
Number, PBN: Physical Block Number).  We have currently implemented two
backends: "cowbtree" and "inram".  The cowbtree backend uses device-mapper's
persistent-data API to store metadata.  The inram backend stores all
metadata in RAM as a hash table.
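
The metadata described above (LBN-to-PBN and HASH-to-PBN mappings plus
reference counters) can be illustrated with a small userspace model.  This is
only a sketch and not the dm-dedup code: the class and method names are
hypothetical, SHA-256 is an assumed hash, and physical blocks are never
reclaimed.

```python
import hashlib

BLOCK_SIZE = 4096  # default Dmdedup block size per the cover letter


class DedupModel:
    """Toy userspace model of the metadata a Dmdedup backend stores."""

    def __init__(self):
        self.lbn_to_pbn = {}    # logical block number -> physical block number
        self.hash_to_pbn = {}   # block hash -> physical block number
        self.refcount = {}      # physical block number -> reference count
        self.data = []          # stand-in for the data device; index is the PBN

    def _put(self, pbn):
        """Drop one reference; forget the block's hash when none remain."""
        self.refcount[pbn] -= 1
        if self.refcount[pbn] == 0:
            digest = hashlib.sha256(self.data[pbn]).digest()
            del self.hash_to_pbn[digest]
            del self.refcount[pbn]

    def write(self, lbn, block):
        assert len(block) == BLOCK_SIZE
        digest = hashlib.sha256(block).digest()  # assumed hash function
        pbn = self.hash_to_pbn.get(digest)
        if pbn is None:
            # New content: allocate the next physical block and index it.
            pbn = len(self.data)
            self.data.append(block)
            self.hash_to_pbn[digest] = pbn
            self.refcount[pbn] = 0
        old = self.lbn_to_pbn.get(lbn)
        if old == pbn:
            return pbn  # same content rewritten in place; nothing changes
        self.refcount[pbn] += 1
        self.lbn_to_pbn[lbn] = pbn
        if old is not None:
            self._put(old)
        return pbn

    def read(self, lbn):
        # Unmapped logical blocks read back as zeroes in this model.
        pbn = self.lbn_to_pbn.get(lbn)
        return self.data[pbn] if pbn is not None else bytes(BLOCK_SIZE)
```

Writing the same 4KB content to two different LBNs stores one physical block
with a reference count of 2; overwriting one of those LBNs with new content
allocates a second block and drops the count on the first.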

Our preliminary experiments on real traces (FIU traces from
http://iotta.snia.org/tracetypes/3) demonstrate that Dmdedup can even exceed
the performance of a disk drive running ext4.  The reasons are that (1)
deduplication reduces I/O traffic to the data device, and (2) Dmdedup
effectively sequentializes random writes to the data device.
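
Both effects can be seen in a toy model.  The sketch below assumes an
append-only allocator (new content always goes to the next free PBN) and
SHA-256 as the hash; neither detail is taken from the patches, and the
function name is hypothetical.  Short byte strings stand in for blocks.

```python
import hashlib


def data_device_writes(writes):
    """Return the (pbn, lbn) pairs that reach the data device, in issue order.

    `writes` is an iterable of (lbn, block_bytes).  Duplicate content causes
    no data-device write, and new content goes to the next free PBN (an
    append-only allocator, which is an assumption of this sketch).
    """
    seen = {}      # content hash -> pbn
    issued = []
    for lbn, block in writes:
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            continue                 # duplicate: metadata-only update
        pbn = len(seen)              # next free physical block
        seen[digest] = pbn
        issued.append((pbn, lbn))
    return issued


# Five writes to scattered LBNs, two of which repeat earlier content.
random_writes = [(907, b"A"), (12, b"B"), (455, b"A"), (3, b"C"), (640, b"B")]
issued = data_device_writes(random_writes)
print(issued)
```

Only three of the five writes reach the data device, and they land on
consecutive PBNs regardless of how scattered the LBNs are.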

Dmdedup is developed by a joint group of researchers from Stony Brook
University, Harvey Mudd College, and EMC.  See the documentation patch for
more details.

Vasily Tarasov (10):
  dm-dedup: main data structures
  dm-dedup: core deduplication logic
  dm-dedup: hash computation
  dm-dedup: implementation of the read-on-write procedure
  dm-dedup: COW B-tree backend
  dm-dedup: inram backend
  dm-dedup: Makefile changes
  dm-dedup: Kconfig changes
  dm-dedup: status function
  dm-dedup: documentation

 Documentation/device-mapper/dm-dedup.txt |   51 ++
 drivers/md/Kconfig                       |    8 +
 drivers/md/Makefile                      |    2 +
 drivers/md/dm-dedup-backend.h            |  114 +++++
 drivers/md/dm-dedup-cbt.c                |  724 ++++++++++++++++++++++++++++
 drivers/md/dm-dedup-cbt.h                |   44 ++
 drivers/md/dm-dedup-hash.c               |  148 ++++++
 drivers/md/dm-dedup-hash.h               |   30 ++
 drivers/md/dm-dedup-kvstore.h            |   51 ++
 drivers/md/dm-dedup-ram.c                |  585 +++++++++++++++++++++++
 drivers/md/dm-dedup-ram.h                |   43 ++
 drivers/md/dm-dedup-rw.c                 |  248 ++++++++++
 drivers/md/dm-dedup-rw.h                 |   19 +
 drivers/md/dm-dedup-target.c             |  760 ++++++++++++++++++++++++++++++
 drivers/md/dm-dedup-target.h             |  100 ++++
 15 files changed, 2927 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/device-mapper/dm-dedup.txt
 create mode 100644 drivers/md/dm-dedup-backend.h
 create mode 100644 drivers/md/dm-dedup-cbt.c
 create mode 100644 drivers/md/dm-dedup-cbt.h
 create mode 100644 drivers/md/dm-dedup-hash.c
 create mode 100644 drivers/md/dm-dedup-hash.h
 create mode 100644 drivers/md/dm-dedup-kvstore.h
 create mode 100644 drivers/md/dm-dedup-ram.c
 create mode 100644 drivers/md/dm-dedup-ram.h
 create mode 100644 drivers/md/dm-dedup-rw.c
 create mode 100644 drivers/md/dm-dedup-rw.h
 create mode 100644 drivers/md/dm-dedup-target.c
 create mode 100644 drivers/md/dm-dedup-target.h

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-04-28 22:03 [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target Vasily Tarasov
@ 2014-04-29  6:23 ` Bart Van Assche
  2014-04-29 13:26   ` Vasily Tarasov
  2014-05-05 18:24 ` Mike Snitzer
  2015-03-06 18:37 ` Vivek Goyal
  2 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2014-04-29  6:23 UTC
  To: device-mapper development
  Cc: Christoph Hellwig, Philip Shilane, Sonam Mandal, Erez Zadok

On 04/29/14 00:03, Vasily Tarasov wrote:
> See the documentation patch for more details.

Regarding that documentation: shouldn't the on-disk data structures be
documented?  Shouldn't it be documented how dm-dedup recovers from a
power failure?  Since different storage devices are used for data and
metadata, recovery from a power failure is nontrivial.  For example, if a
data block has been made persistent (e.g. via REQ_FUA) and the refcount
for that data block has been increased, how is it guaranteed that neither
the data nor the metadata for that block is lost if a power failure
occurs?

Bart.

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-04-29  6:23 ` Bart Van Assche
@ 2014-04-29 13:26   ` Vasily Tarasov
  0 siblings, 0 replies; 13+ messages in thread
From: Vasily Tarasov @ 2014-04-29 13:26 UTC
  To: device-mapper development
  Cc: Christoph Hellwig, Philip Shilane, Sonam Mandal, Erez Zadok

Yes, you're right, the documentation should be more detailed.  Alasdair
also pointed me to the format used in the cache.txt and verity.txt
files.  I'll update dm-dedup.txt to follow that format and include
information about the on-disk structures and the behavior in case of a
power failure.  I'll send an updated documentation patch later.

The short answer to your question is that we use device-mapper's
persistent-data library for storing metadata. The library uses COW
B-trees to provide atomicity, consistency, and durability. On REQ_FUA
we commit transactions. The allocation of data blocks on a data device
ensures that no old blocks are overwritten within a transaction. So,
during a write operation, data blocks go straight to the disk, but
they become visible only after the transaction is committed (i.e., if
power fails in the middle of the transaction, one sees the old image
of the device).
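
The visibility rule described above can be modelled in a few lines.  This is
a sketch under simplifying assumptions: plain dictionaries stand in for the
persistent-data library's COW B-trees, blocks allocated by an aborted
transaction are not reclaimed, and all names are hypothetical.

```python
class TransactionalMap:
    """Toy model: LBN->PBN updates become visible only at commit."""

    def __init__(self):
        self.committed = {}   # mapping as of the last commit (survives a crash)
        self.pending = {}     # updates in the open transaction (lost on a crash)
        self.next_pbn = 0     # this model never reuses a physical block

    def write(self, lbn, fua=False):
        # Data goes to a fresh physical block, so no committed block is
        # overwritten while the transaction is open.
        pbn = self.next_pbn
        self.next_pbn += 1
        self.pending[lbn] = pbn
        if fua:
            self.commit()     # the reply above says REQ_FUA commits
        return pbn

    def commit(self):
        # Stand-in for an atomic metadata commit.
        self.committed = {**self.committed, **self.pending}
        self.pending = {}

    def crash(self):
        # Power failure: uncommitted mapping updates are discarded.
        self.pending = {}

    def lookup(self, lbn):
        return self.pending.get(lbn, self.committed.get(lbn))
```

After a crash, `lookup()` returns the mapping from the last commit, which
corresponds to seeing the old image of the device.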

I'll explain this in more detail in the documentation patch.

Thanks,
Vasily


* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-04-28 22:03 [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target Vasily Tarasov
  2014-04-29  6:23 ` Bart Van Assche
@ 2014-05-05 18:24 ` Mike Snitzer
  2014-05-06 13:43   ` Vasily Tarasov
  2015-03-06 18:37 ` Vivek Goyal
  2 siblings, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2014-05-05 18:24 UTC
  To: Vasily Tarasov
  Cc: Christoph Hellwig, dm-devel, Philip Shilane, Sonam Mandal,
	Erez Zadok

Seems dm-devel is missing patches 4, 5 and 7.  Do you happen to have
the entire series staged in a public git tree somewhere?

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-05-05 18:24 ` Mike Snitzer
@ 2014-05-06 13:43   ` Vasily Tarasov
  2014-05-06 14:23     ` Mike Snitzer
  2014-07-18  2:43     ` Mike Snitzer
  0 siblings, 2 replies; 13+ messages in thread
From: Vasily Tarasov @ 2014-05-06 13:43 UTC
  To: Mike Snitzer
  Cc: Christoph Hellwig, dm-devel, Philip Shilane, Sonam Mandal,
	Erez Zadok

Interestingly, I can see 4, 5, and 7 in dm-devel's archive:

https://www.redhat.com/archives/dm-devel/2014-April/author.html

In any case, you can pull the patches from:

git://git.fsl.cs.sunysb.edu/linux-dmdedup.git

Branch: rfc-v1.1

Thanks for looking into this.

Vasily

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-05-06 13:43   ` Vasily Tarasov
@ 2014-05-06 14:23     ` Mike Snitzer
  2014-07-18  2:43     ` Mike Snitzer
  1 sibling, 0 replies; 13+ messages in thread
From: Mike Snitzer @ 2014-05-06 14:23 UTC
  To: Vasily Tarasov
  Cc: Christoph Hellwig, dm-devel, Philip Shilane, Sonam Mandal,
	Erez Zadok

On Tue, May 06 2014 at  9:43am -0400,
Vasily Tarasov <tarasov@vasily.name> wrote:

> Interestingly, I can see 4, 5, and 7 in dm-devel's archive:
> 
> https://www.redhat.com/archives/dm-devel/2014-April/author.html

Seems the redhat mail servers have been misbehaving.
 
> In any case, you can pull the patches from:
> 
> git://git.fsl.cs.sunysb.edu/linux-dmdedup.git
> 
> Branch: rfc-v1.1

OK, thanks.

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-05-06 13:43   ` Vasily Tarasov
  2014-05-06 14:23     ` Mike Snitzer
@ 2014-07-18  2:43     ` Mike Snitzer
  2014-07-18 11:59       ` Vasily Tarasov
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2014-07-18  2:43 UTC
  To: Vasily Tarasov
  Cc: ejt, Christoph Hellwig, dm-devel, Philip Shilane, Sonam Mandal,
	Erez Zadok, Mikulas Patocka

On Tue, May 06 2014 at  9:43am -0400,
Vasily Tarasov <tarasov@vasily.name> wrote:

> Interestingly, I can see 4, 5, and 7 in dm-devel's archive:
> 
> https://www.redhat.com/archives/dm-devel/2014-April/author.html
> 
> In any case, you can pull the patches from:
> 
> git://git.fsl.cs.sunysb.edu/linux-dmdedup.git
> 
> Branch: rfc-v1.1
> 
> Thanks for looking into this.

Hi,

I haven't been able to get to _really_ reviewing dm-dedup.  It isn't
anything against you guys.. I've just been quite busy with other tasks.

I did start in on dm-dedup a month or so ago by staging a baseline of
your work in a branch here:
http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=dm-dedup

I found a few things that didn't look right, but they are more
DM-specific mechanics and not anything to do with your approach for
accomplishing dedup, see the FIXMEs I added to the documentation file in
this commit:
http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=dm-dedup&id=fed855928fba624c7a494db7519c37dcc7c9492d

The reconstruct= param isn't needed.  In both dm-thinp and dm-cache we
use __superblock_all_zeroes to check whether the metadata device's
superblock is all zeros.  Ideally dm-dedup would do something
comparable.
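
The idea behind that check can be sketched in userspace as follows.  Only the
name __superblock_all_zeroes comes from the message above; the Python
functions, the 4096-byte superblock size, and the "format"/"open" return
values are assumptions made for illustration.

```python
import io

SUPERBLOCK_SIZE = 4096  # assumed size, for illustration only


def superblock_all_zeroes(dev):
    """Return True if the first SUPERBLOCK_SIZE bytes of `dev` are all zero."""
    dev.seek(0)
    block = dev.read(SUPERBLOCK_SIZE)
    if len(block) != SUPERBLOCK_SIZE:
        raise ValueError("metadata device is smaller than one superblock")
    return not any(block)


def open_or_format(dev):
    """Choose between formatting fresh metadata and opening existing metadata."""
    return "format" if superblock_all_zeroes(dev) else "open"


# A zeroed device is treated as new; any non-zero byte means existing metadata.
blank = io.BytesIO(bytes(2 * SUPERBLOCK_SIZE))
used = io.BytesIO(b"\x01" + bytes(2 * SUPERBLOCK_SIZE - 1))
```

With this approach the target decides on its own whether to create or reuse
metadata, so no explicit table parameter is needed for that choice.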

I'm going to be on paternity leave until Sept. 8.  It'd be great if Joe
and/or Mikulas took some time to review dm-dedup but I'm not sure if
they'll be able to.  I do hope to be around to respond to emails
periodically but my availability is TBD at this point.

When I get back from leave I'll definitely make dm-dedup a priority if
others don't beat me to it.

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-07-18  2:43     ` Mike Snitzer
@ 2014-07-18 11:59       ` Vasily Tarasov
  2014-07-18 13:29         ` Joe Thornber
  2014-07-18 14:44         ` Mike Snitzer
  0 siblings, 2 replies; 13+ messages in thread
From: Vasily Tarasov @ 2014-07-18 11:59 UTC
  To: Mike Snitzer
  Cc: ejt, Christoph Hellwig, device-mapper development, Philip Shilane,
	Sonam Mandal, Erez Zadok, Mikulas Patocka

Hi Mike,

Thanks for your initial reviewing steps! Here is what we plan to do:

1) Address your current comments

2) A few people mentioned that our documentation needs to be formatted
better and be more detailed. We'll address this.

3) There are several things in dm-dedup that we would like to improve.
We will work on those too.

4) We will then post a dm-dedup v2 patchset and prepare a fresh git
branch. It will definitely happen before you're available in
September.

5) If Joe and/or Mikulas can take a look at the code meanwhile - that
would be great!

Thank you,
Vasily

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-07-18 11:59       ` Vasily Tarasov
@ 2014-07-18 13:29         ` Joe Thornber
  2014-07-18 14:44         ` Mike Snitzer
  1 sibling, 0 replies; 13+ messages in thread
From: Joe Thornber @ 2014-07-18 13:29 UTC
  To: device-mapper development
  Cc: Mike Snitzer, Christoph Hellwig, ejt, Philip Shilane,
	Sonam Mandal, Erez Zadok, Mikulas Patocka

On Fri, Jul 18, 2014 at 07:59:55AM -0400, Vasily Tarasov wrote:
> 5) If Joe and/or Mikulas can take a look at the code meanwhile - that
> would be great!

I'll do it over the next week.

- Joe

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-07-18 11:59       ` Vasily Tarasov
  2014-07-18 13:29         ` Joe Thornber
@ 2014-07-18 14:44         ` Mike Snitzer
  1 sibling, 0 replies; 13+ messages in thread
From: Mike Snitzer @ 2014-07-18 14:44 UTC
  To: Vasily Tarasov
  Cc: device-mapper development, Christoph Hellwig, ejt, Philip Shilane,
	Sonam Mandal, Erez Zadok, Mikulas Patocka

On Fri, Jul 18 2014 at  7:59am -0400,
Vasily Tarasov <tarasov@vasily.name> wrote:

> Hi Mike,
> 
> Thanks for your initial reviewing steps! Here is what we plan to do:
> 
> 1) Address your current comments

Thanks, try to look at how the other DM targets have tables with
arguments that have a fixed position (less flexible than name=value but
tools like lvm2 benefit from the rigidity).
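
The difference between the two table styles can be sketched as below.  The
five-field layout, the field names, and the example values (including "md5")
are hypothetical and chosen only for this illustration; they are not the
dm-dedup table format.

```python
from collections import namedtuple

# Hypothetical positional layout, chosen only for this illustration.
DedupArgs = namedtuple("DedupArgs",
                       "meta_dev data_dev block_size hash_algo backend")


def parse_positional(params):
    """Fixed-position style: the meaning of a field is its index."""
    fields = params.split()
    if len(fields) != len(DedupArgs._fields):
        raise ValueError("expected %d arguments, got %d"
                         % (len(DedupArgs._fields), len(fields)))
    args = DedupArgs(*fields)
    return args._replace(block_size=int(args.block_size))


def parse_name_value(params):
    """name=value style: any order is accepted, so output is less uniform."""
    pairs = dict(item.split("=", 1) for item in params.split())
    missing = set(DedupArgs._fields) - set(pairs)
    if missing:
        raise ValueError("missing arguments: %s" % ", ".join(sorted(missing)))
    args = DedupArgs(**{name: pairs[name] for name in DedupArgs._fields})
    return args._replace(block_size=int(args.block_size))


positional = parse_positional("/dev/sdb1 /dev/sdc1 4096 md5 cowbtree")
named = parse_name_value("backend=cowbtree data_dev=/dev/sdc1 "
                         "meta_dev=/dev/sdb1 hash_algo=md5 block_size=4096")
```

Both calls yield the same tuple, but the positional string has exactly one
valid spelling, which is the rigidity a tool that generates and compares
tables can rely on.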
 
> 2) A few people mentioned that our documentation needs to be formatted
> better and be more detailed. We'll address this.

Yes, it could be more comprehensive.  And it would be nice if it were
updated to have the same kind of sections/flow that other targets'
Documentation/device-mapper/ files have (e.g. cache.txt and
thin-provisioning.txt).

But overall the plan sounds great.

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2014-04-28 22:03 [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target Vasily Tarasov
  2014-04-29  6:23 ` Bart Van Assche
  2014-05-05 18:24 ` Mike Snitzer
@ 2015-03-06 18:37 ` Vivek Goyal
  2015-03-06 23:31   ` Akira Hayakawa
  2 siblings, 1 reply; 13+ messages in thread
From: Vivek Goyal @ 2015-03-06 18:37 UTC
  To: Vasily Tarasov
  Cc: Christoph Hellwig, dm-devel, Philip Shilane, Sonam Mandal,
	Erez Zadok

On Mon, Apr 28, 2014 at 06:03:06PM -0400, Vasily Tarasov wrote:
> This is a request for comments for Dmdedup.

mkfs.xfs crashes the dm-dedup driver.  mkfs.ext4 worked fine for me, though.

Thanks
Vivek

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2015-03-06 18:37 ` Vivek Goyal
@ 2015-03-06 23:31   ` Akira Hayakawa
  2015-03-07 21:31     ` Vasily Tarasov
  0 siblings, 1 reply; 13+ messages in thread
From: Akira Hayakawa @ 2015-03-06 23:31 UTC
  To: device-mapper development, Vasily Tarasov
  Cc: Christoph Hellwig, Philip Shilane, Sonam Mandal, Erez Zadok

Vasily,

I recommend adding tests to dmts:
https://github.com/jthornber/device-mapper-test-suite

This testing framework can catch such trivial bugs.
Tests that are available to other developers are
valuable too.
It's written in Ruby and Joe designed it carefully, so it's
really easy to add tests.

- Akira

* Re: [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target
  2015-03-06 23:31   ` Akira Hayakawa
@ 2015-03-07 21:31     ` Vasily Tarasov
  0 siblings, 0 replies; 13+ messages in thread
From: Vasily Tarasov @ 2015-03-07 21:31 UTC
  To: Akira Hayakawa
  Cc: Christoph Hellwig, device-mapper development, Philip Shilane,
	Sonam Mandal, Erez Zadok

@Vivek, thanks, we'll take a look at it.

@Akira, thanks, we'll add dm test-suite to our testbed.

Vasily

Thread overview: 13+ messages
2014-04-28 22:03 [PATCH RFC 00/10] dm-dedup: device-mapper deduplication target Vasily Tarasov
2014-04-29  6:23 ` Bart Van Assche
2014-04-29 13:26   ` Vasily Tarasov
2014-05-05 18:24 ` Mike Snitzer
2014-05-06 13:43   ` Vasily Tarasov
2014-05-06 14:23     ` Mike Snitzer
2014-07-18  2:43     ` Mike Snitzer
2014-07-18 11:59       ` Vasily Tarasov
2014-07-18 13:29         ` Joe Thornber
2014-07-18 14:44         ` Mike Snitzer
2015-03-06 18:37 ` Vivek Goyal
2015-03-06 23:31   ` Akira Hayakawa
2015-03-07 21:31     ` Vasily Tarasov