From: Joe Thornber <thornber@redhat.com>
To: Eric Wheeler <dm-devel@lists.ewheeler.net>
Cc: dm-devel@redhat.com
Subject: Re: [RFC] dm-thin: Heuristic early chunk copy before COW
Date: Thu, 9 Mar 2017 11:51:43 +0000
Message-ID: <20170309115142.GA17308@nim>
In-Reply-To: <alpine.LRH.2.11.1703081005001.19383@mail.ewheeler.net>
Hi Eric,
On Wed, Mar 08, 2017 at 10:17:51AM -0800, Eric Wheeler wrote:
> Hello all,
>
> For dm-thin volumes that are snapshotted often, there is a performance
> penalty for writes because of COW overhead since the modified chunk needs
> to be copied into a freshly allocated chunk.
>
> What if we were to implement some sort of LRU for COW operations on
> chunks? We could then queue chunks that are commonly COWed within the
> inter-snapshot interval to be background copied immediately after the next
> snapshot. This would hide the latency and increase effective throughput
> when the thin device is written by its user since only the meta data would
> need an update because the chunk has already been copied.
>
> I can imagine a simple algorithm where the COW increments the chunk LRU by
> 2, and decrements the LRU by 1 for all stored LRUs when the volume is
> snapshotted. After the snapshot, any LRU>0 would be queued for early copy.
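
The counting rule quoted above can be sketched in a few lines. This is only
an illustration under assumptions: the class and method names are made up,
a plain dict stands in for the red/black tree mentioned below, and only
chunks that have actually been COWed get an entry.

```python
class CowHeat:
    """Hypothetical in-memory COW counters, one per chunk that was COWed."""

    def __init__(self):
        # chunk index -> counter; chunks never COWed are simply absent
        self.heat = {}

    def on_cow(self, chunk):
        # A COW on this chunk bumps its counter by 2.
        self.heat[chunk] = self.heat.get(chunk, 0) + 2

    def on_snapshot(self):
        # Taking a snapshot decrements every stored counter by 1.
        # Entries that would reach 0 are dropped to bound memory use.
        self.heat = {c: n - 1 for c, n in self.heat.items() if n > 1}
        # Anything still > 0 is a candidate for early copy.
        return sorted(self.heat)


heat = CowHeat()
heat.on_cow(7)
heat.on_cow(7)
heat.on_cow(3)
candidates = heat.on_snapshot()   # chunk 7 has count 3, chunk 3 has count 1
```

A chunk COWed in every inter-snapshot interval gains 1 net per interval, while
one that stops being written decays back to 0 and falls out of the table.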
>
> The LRU would be in memory only, probably stored in a red/black tree.
> Pre-copied chunks would not update on-disk meta data unless a write occurs
> to that chunk. The allocator would need to be updated to ignore chunks
> that are in the LRU list which have been pre-copied (perhaps except in the
> case of pool free space exhaustion).
>
> Does this sound viable?
Yes, I can see that it would benefit some people, and presumably we'd
only turn it on for those people. Random thoughts:
- I'm doing a lot of background work in the latest version of dm-cache
  in idle periods and it certainly pays off.

- There can be a *lot* of chunks, so holding a counter for all chunks in
  memory is not on.  (See the hassle I had squeezing stuff into memory
  of dm-cache).

- Commonly cloned blocks can be gleaned from the metadata, e.g. by
  walking the metadata for two snapshots and taking the common ones.
  It might be possible to come up with a 'commonly used set' once, and
  then keep using it for all future snaps.

- Doing speculative work like this makes it harder to predict
  performance.  At the moment any expense (i.e. copy) is incurred
  immediately as the triggering write comes in.

- Could this be done from userland?  Metadata snapshots let userland see
  the mappings; alternatively dm-era lets userland track where I/O has
  gone.  A simple read then write of a block would trigger the sharing
  to be broken.
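
As a rough sketch of the read-then-write idea in the last point, something
like the following could run from userland.  The function name and its
parameters are hypothetical; block_size is assumed to be the pool's data
block size in bytes, and the list of blocks to touch would have to come from
elsewhere (e.g. the metadata snapshot or dm-era).  It is not safe against a
concurrent writer to the same block, since that write could be overwritten
with the stale data read here.

```python
import os


def break_sharing(dev_path, block_index, block_size):
    """Read one block and write the same bytes back (hypothetical helper).

    On a thin device the write is what would trigger the sharing to be
    broken; on any other file it is a no-op rewrite of identical data.
    """
    fd = os.open(dev_path, os.O_RDWR)
    try:
        offset = block_index * block_size
        data = os.pread(fd, block_size, offset)
        if len(data) != block_size:
            raise OSError("short read at block %d" % block_index)
        written = os.pwrite(fd, data, offset)
        if written != block_size:
            raise OSError("short write at block %d" % block_index)
        # Push the rewritten block out of the page cache to the device.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Opening the device with O_DIRECT and an aligned buffer would avoid the page
cache altogether, at the cost of a little more code.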
- Joe