linux-btrfs.vger.kernel.org archive mirror
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
Cc: James Pharaoh <james@wellbehavedsoftware.com>,
	Mark Fasheh <mfasheh@versity.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: Announcing btrfs-dedupe
Date: Tue, 15 Nov 2016 12:52:01 -0500
Message-ID: <20161115175201.GL21290@hungrycats.org>
In-Reply-To: <0b8cd1f0-fca4-b7df-2f41-13c40aee493d@gmail.com>

On Tue, Nov 15, 2016 at 07:26:53AM -0500, Austin S. Hemmelgarn wrote:
> On 2016-11-14 16:10, Zygo Blaxell wrote:
> >Why is deduplicating thousands of blocks of data crazy?  I already
> >deduplicate four orders of magnitude more than that per week.
> You missed the 'tiny' quantifier.  I'm talking really small blocks, on the
> order of less than 64k (so, IOW, stuff that's not much bigger than a few
> filesystem blocks), and that is somewhat crazy because it ends up not only
> taking _really_ long to do compared to larger chunks (because you're running
> more independent hashes than with bigger blocks), but also because it will
> often split extents unnecessarily and contribute to fragmentation, which
> will lead to all kinds of other performance problems on the FS.

Like I said, millions of extents per week...

64K is an enormous dedup block size, especially if it comes with a 64K
alignment constraint as well.
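
To illustrate why the alignment constraint matters, here is a minimal
sketch (not taken from any real dedup tool) of how many bytes of a
duplicate range a fixed-block deduper could cover if it only handles
whole 64K-aligned blocks.  The function name aligned_coverage and the
example offsets are hypothetical.

```python
BLOCK = 64 * 1024  # assumed fixed dedup block size and alignment


def aligned_coverage(offset, length, block=BLOCK):
    """Bytes of [offset, offset + length) covered by whole aligned blocks.

    Hypothetical helper: models a deduper that can only act on blocks
    that start on a multiple of `block` and are exactly `block` long.
    """
    start = -(-offset // block) * block        # round start up to a boundary
    end = (offset + length) // block * block   # round end down to a boundary
    return max(0, end - start)


if __name__ == "__main__":
    # (file offset, duplicate length) pairs chosen for illustration only
    for offset, length in [(0, 131072), (4096, 131072), (0, 12288), (4096, 65536)]:
        covered = aligned_coverage(offset, length)
        print(f"offset={offset:>6} length={length:>6} covered={covered:>6}")
```

Under these assumptions a 128K duplicate that starts 4K into a file is
only half covered, and a 12K duplicate is not covered at all.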

These are the top ten duplicate block sizes from a sample of 95251
dedup ops on a medium-sized production server with 4TB of filesystem
(about one machine-day of data):

        total bytes     extent count    dup size
        2750808064      20987           131072
        803733504       1533            524288
        123801600       975             126976
        103575552       8429            12288
        97443840        793             122880
        82051072        10016           8192
        77492224        18919           4096
        71331840        645             110592
        64143360        540             118784
        63897600        650             98304

        all bytes       all extents     average dup size
        6129995776      95251           64356

128K and 512K are the most common sizes due to btrfs compression (it
limits the block size to 128K for compressed extents and seems to limit
uncompressed extents to 512K for some reason).  12K is #4, and 3 of the
top ten sizes are below 16K.  The average size is just a little below 64K.
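
The arithmetic behind the table can be rechecked with a short script.
This is only a sketch: the rows are copied from the table above, and the
variable and function names are made up for the example.

```python
# Rows copied from the table above: (total bytes, extent count, dup size).
TOP_TEN = [
    (2750808064, 20987, 131072),
    (803733504, 1533, 524288),
    (123801600, 975, 126976),
    (103575552, 8429, 12288),
    (97443840, 793, 122880),
    (82051072, 10016, 8192),
    (77492224, 18919, 4096),
    (71331840, 645, 110592),
    (64143360, 540, 118784),
    (63897600, 650, 98304),
]
ALL_BYTES = 6129995776
ALL_EXTENTS = 95251


def row_is_consistent(total_bytes, extent_count, dup_size):
    """A row is consistent when total bytes == extent count * dup size."""
    return total_bytes == extent_count * dup_size


def average_dup_size(total_bytes, extent_count):
    """Average duplicate size in bytes, truncated to an integer."""
    return total_bytes // extent_count


def sizes_below(rows, limit):
    """Dup sizes in `rows` that are strictly smaller than `limit` bytes."""
    return [size for _, _, size in rows if size < limit]


if __name__ == "__main__":
    print(all(row_is_consistent(*row) for row in TOP_TEN))
    print(average_dup_size(ALL_BYTES, ALL_EXTENTS))
    print(sorted(sizes_below(TOP_TEN, 16 * 1024)))
```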

These are the duplicates with block sizes of 64K or smaller:

        total bytes     extent count    dup size
        41615360        635             65536
        46264320        753             61440
        45817856        799             57344
        41267200        775             53248 
        45760512        931             49152
        46948352        1042            45056
        43417600        1060            40960
        47296512        1283            36864
        59277312        1809            32768
        49029120        1710            28672
        43745280        1780            24576
        53616640        2618            20480
        43466752        2653            16384
        103575552       8429            12288
        82051072        10016           8192 
        77492224        18919           4096 

        all bytes <=64K extents <=64K   average dup size <=64K
        870641664       55212           15769

14% of my duplicate bytes are in blocks smaller than 64K or blocks not
aligned to a 64K boundary within a file.  It's too large a space saving
to ignore on machines that have constrained storage.
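
For reference, the 14% figure follows from the two totals lines above.
A minimal sketch of that calculation, with names invented for the
example:

```python
# Totals copied from the two summary lines above.
ALL_BYTES = 6129995776       # all duplicate bytes in the sample
ALL_EXTENTS = 95251          # all dedup ops in the sample
SMALL_BYTES = 870641664      # duplicate bytes in blocks <= 64K
SMALL_EXTENTS = 55212        # dedup ops on blocks <= 64K


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"bytes in <=64K blocks:   {percent(SMALL_BYTES, ALL_BYTES):.1f}%")
    print(f"extents in <=64K blocks: {percent(SMALL_EXTENTS, ALL_EXTENTS):.1f}%")
    print(f"average dup size <=64K:  {SMALL_BYTES // SMALL_EXTENTS}")
```

The small blocks are a majority of the dedup ops but only about a
seventh of the bytes.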

It may be worthwhile skipping 4K and 8K dedups--at 250 ms per dedup,
they're 30% of the total run time and only 2.6% of the total dedup bytes.
On the other hand, this machine is already deduping everything fast enough
to keep up with new data, so there's no performance problem to solve here.
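
The 30% and 2.6% figures can be reproduced as follows.  This sketch
assumes every dedup costs a flat 250 ms regardless of size, which is
what makes the share of run time equal to the share of dedup ops; the
names are made up for the example.

```python
SECONDS_PER_DEDUP = 0.25     # 250 ms per dedup, assumed constant for all sizes

# Counts and byte totals copied from the tables above.
ALL_BYTES = 6129995776
ALL_EXTENTS = 95251
SMALL = {
    4096: (77492224, 18919),   # dup size: (total bytes, extent count)
    8192: (82051072, 10016),
}


def skip_tradeoff(small, all_bytes, all_extents, seconds_per_dedup):
    """Return (percent of run time, percent of bytes, seconds) for `small`."""
    small_bytes = sum(total for total, _ in small.values())
    small_extents = sum(count for _, count in small.values())
    time_pct = 100.0 * small_extents / all_extents
    bytes_pct = 100.0 * small_bytes / all_bytes
    return time_pct, bytes_pct, small_extents * seconds_per_dedup


if __name__ == "__main__":
    time_pct, bytes_pct, seconds = skip_tradeoff(
        SMALL, ALL_BYTES, ALL_EXTENTS, SECONDS_PER_DEDUP)
    total_hours = ALL_EXTENTS * SECONDS_PER_DEDUP / 3600
    print(f"4K+8K dedups: {time_pct:.1f}% of run time, {bytes_pct:.1f}% of bytes")
    print(f"{seconds / 3600:.1f} of {total_hours:.1f} hours per machine-day")
```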

