From: Matthias Kaehlcke <mka@chromium.org>
To: Matthew Sakai <msakai@redhat.com>
Cc: dm-devel@lists.linux.dev, Brian Geffon <bgeffon@google.com>
Subject: Re: [PATCH v5 01/40] dm: add documentation for dm-vdo target
Date: Mon, 8 Jan 2024 15:52:30 +0000
Message-ID: <ZZwaPjBECFvs8F4i@google.com>
In-Reply-To: <8207e4fb-0ef0-50e8-5954-363a3723ffa6@redhat.com>

Hi Matthew,

Thanks for your reply!

On Thu, Jan 04, 2024 at 09:07:07PM -0500, Matthew Sakai wrote:
>
>
> On 12/28/23 14:16, Matthias Kaehlcke wrote:
> > Hi,
> >
> > On Fri, Nov 17, 2023 at 03:59:18PM -0500, Matthew Sakai wrote:
> > > This adds the admin-guide documentation for dm-vdo.
> > >
> > > vdo.rst is the guide to using dm-vdo. vdo-design is an overview of the
> > > design of dm-vdo.
> > >
> > > Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net>
> > > Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net>
> > > Signed-off-by: Matthew Sakai <msakai@redhat.com>
> > > ---
> > > .../admin-guide/device-mapper/vdo-design.rst | 415 ++++++++++++++++++
> > > .../admin-guide/device-mapper/vdo.rst | 388 ++++++++++++++++
> > > 2 files changed, 803 insertions(+)
> > > create mode 100644 Documentation/admin-guide/device-mapper/vdo-design.rst
> > > create mode 100644 Documentation/admin-guide/device-mapper/vdo.rst
> > >
> > > diff --git a/Documentation/admin-guide/device-mapper/vdo-design.rst b/Documentation/admin-guide/device-mapper/vdo-design.rst
> > > new file mode 100644
> > > index 000000000000..c82d51071c7d
> > > --- /dev/null
> > > +++ b/Documentation/admin-guide/device-mapper/vdo-design.rst
> > > @@ -0,0 +1,415 @@
> > > +.. SPDX-License-Identifier: GPL-2.0-only
> > > +
> > > +================
> > > +Design of dm-vdo
> > > +================
> > > +
> > > +The dm-vdo (virtual data optimizer) target provides inline deduplication,
> > > +compression, zero-block elimination, and thin provisioning. A dm-vdo target
> > > +can be backed by up to 256TB of storage, and can present a logical size of
> > > +up to 4PB.
>
> [snip]
>
> > > + block map cache size:
> > > + The size of the block map cache, as a number of 4096-byte
> > > + blocks. The minimum and recommended value is 32768 blocks.
> > > + If the logical thread count is non-zero, the cache size
> > > + must be at least 4096 blocks per logical thread.
> >
> > If I understand correctly, the minimum of 32768 blocks results in the
> > 128 MB metadata cache mentioned in 'Tuning', which allows access to up
> > to 100 GB of logical space.
> >
> > Is there a strict reason for this minimum? I'm evaluating the use of vdo
> > on systems with a relatively small vdo volume (say 4GB) and 'only' 4-8 GB
> > of RAM. The 128 MB of metadata cache would be a sizeable chunk of that,
> > which could make the use of vdo infeasible.
>
> The short answer is that VDO can often use a smaller cache than the default,
> but it likely won't help in the way you want it to.
>
> > > +Examples:
> > > +
> > > +Start a previously-formatted vdo volume with 1 GB logical space and 1 GB
> > > +physical space, storing to /dev/dm-1 which has more than 1 GB of space.
> > > +
> > > +::
> > > +
> > > + dmsetup create vdo0 --table \
> > > + "0 2097152 vdo V4 /dev/dm-1 262144 4096 32768 16380"
> >
> > IIUC the backing device needs to be previously formatted. The formatting
> > fails when the size of the backing device is < 5GB:
> >
> > vdoformat /dev/loop8
> > Minimum required size for VDO volume: 5063921664 bytes
> > vdoformat: formatVDO failed on '/dev/loop8': VDO Status: Out of space
> >
> > That was with 'vdoformat' from https://github.com/dm-vdo/vdo/
> >
> > It would be great if somewhat smaller devices could be supported.
>
> VDO was designed to handle the challenge of data deduplication in very large
> storage pools. It generally is not very useful for very small pools. The
> first question to ask is whether VDO can actually provide any value in the
> sort of environment you're using. VDO generally takes the strategy of saving
> storage space by using extra RAM and CPU cycles. In addition, VDO needs to
> track a certain amount of metadata, which reduces the amount of storage
> available for actual user data.
>
> For vdoformat, the biggest consideration is the deduplication index and
> other metadata, which are basically a fixed cost of about 3.5GB. In order
> for VDO to be useful, VDO would have to find enough deduplication to make up
> for the storage lost to VDO's metadata, so the minimum useful size of a VDO
> volume is in the 8-12GB range.
>
> For the block map cache, decreasing the cache size may increase the
> frequency of metadata writes, which generally decreases the write throughput
> of the VDO device. So the tradeoff is between RAM and write speed.
>
> Nothing about the generic structure of VDO would prevent us from producing a
> smaller VDO (and in fact we do for some testing purposes), but in a scenario
> where you can only expect to save a few gigabytes through deduplication, VDO
> is generally more expensive than it is worth.
>
> If you still think this might be worth pursuing, let me know and we can try
> to work out a configuration which might suit your goals.
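
As a quick sanity check on the figures quoted above, here is a small
Python sketch of the arithmetic. The function names are made up for
illustration, and the table field positions are my reading of the example
line (sectors for the logical size, 4096-byte blocks for the physical size
and the cache), not an authoritative description of the table format.

```python
# Size arithmetic for the numbers quoted in this thread.
# Names are hypothetical; only the constants come from the thread.

VDO_BLOCK_SIZE = 4096   # bytes per block, per the quoted documentation
SECTOR_SIZE = 512       # bytes per device-mapper sector

GIB = 2**30
MIB = 2**20


def block_map_cache_bytes(cache_blocks: int) -> int:
    """RAM used by a block map cache of the given number of blocks."""
    return cache_blocks * VDO_BLOCK_SIZE


def parse_example_table(table: str) -> dict:
    """Label the sizes in the example table line, in bytes.

    Assumption: field positions are inferred from the quoted example
    ("1 GB logical space and 1 GB physical space").
    """
    fields = table.split()
    return {
        "logical_bytes": int(fields[1]) * SECTOR_SIZE,
        "physical_bytes": int(fields[5]) * VDO_BLOCK_SIZE,
        "cache_bytes": block_map_cache_bytes(int(fields[7])),
    }


if __name__ == "__main__":
    table = "0 2097152 vdo V4 /dev/dm-1 262144 4096 32768 16380"
    for name, size in parse_example_table(table).items():
        print(f"{name}: {size / MIB:.0f} MiB")

    # Share of RAM taken by the minimum cache on a 4 GiB and 8 GiB system.
    cache = block_map_cache_bytes(32768)
    for ram_gib in (4, 8):
        print(f"{ram_gib} GiB RAM: {100 * cache / (ram_gib * GIB):.2f}% for cache")

    # Minimum device size reported by vdoformat, in GiB.
    print(f"vdoformat minimum: {5063921664 / GIB:.2f} GiB")
```

The minimum cache is a few percent of RAM on the systems I have in mind,
so the on-disk overhead looks like the bigger obstacle.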

Some more context about my use case:

I'm evaluating VDO for storing a hibernate image. The goal is to reduce
hibernate resume time by loading less data from potentially slow storage,
which is why the volume is relatively small. The image is only written
once per hibernate cycle, generally after the system has been idle for a
longer time, so the lower write throughput due to a smaller cache size
probably wouldn't be a major concern. The systems might not have huge
amounts of free disk space, though, so an overhead of ~3.5GB for the
deduplication index would probably rule out the use of VDO.

In the context of this use case the compression part of VDO seems more
interesting than the deduplication. In the documentation of VDO I noticed
a parameter to disable deduplication. With that in mind, I wonder if it
would be feasible/reasonable to add an option to vdoformat to omit the
deduplication index.

Do you think VDO might be (made) suitable for this scenario, or is it
just not the right tool?

Thanks

Matthias