From: Florian Haas <florian.haas@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] [PATCH 2/2] expand section on throughput tuning to highlight prime usecase of external metadata
Date: Fri, 08 Jul 2011 16:17:17 +0200	[thread overview]
Message-ID: <4E17116D.4080305@linbit.com> (raw)
In-Reply-To: <1310053461-15060-2-git-send-email-mrten+drbd@ii.nl>

Maarten,

many thanks for the contribution. A few questions/comments on this one:

On 07/07/2011 05:44 PM, Mrten wrote:
> ---
>  users-guide/throughput.txt |   33 +++++++++++++++++++++++++++++++++
>  1 files changed, 33 insertions(+), 0 deletions(-)
> 
> diff --git a/users-guide/throughput.txt b/users-guide/throughput.txt
> index f033db5..584d0bf 100644
> --- a/users-guide/throughput.txt
> +++ b/users-guide/throughput.txt
> @@ -48,6 +48,7 @@ important to consider the following natural limitations:
>  
>  * DRBD throughput is limited by that of the raw I/O subsystem.
>  * DRBD throughput is limited by the available network bandwidth.
> +* DRBD throughput can be limited by head seeks with 'meta-disk internal'
>  
>  The _minimum_ between the two establishes the theoretical throughput
>  _maximum_ available to DRBD. DRBD then reduces that throughput maximum

You're adding a third item to the enumeration, so it would be nice if
you could also rephrase the next paragraph, which talks about "the
minimum between the two".
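
For reference, the rule that paragraph states can be sketched in a few lines. This is only an illustration: the function name is made up, the 3 percent overhead is the upper bound the guide quotes ("less than 3 percent"), and the 100 MB/s and 110 MB/s inputs are the figures used elsewhere in this thread.

```python
def max_drbd_throughput(io_mb_s, network_mb_s, overhead=0.03):
    """Estimate the theoretical DRBD throughput ceiling in MB/s.

    The slower of the two limits is the bottleneck; DRBD's own overhead
    (assumed here to be the guide's upper bound of 3 percent) is then
    subtracted from it.
    """
    bottleneck = min(io_mb_s, network_mb_s)
    return bottleneck * (1.0 - overhead)


# I/O subsystem at 100 MB/s, replication link at 110 MB/s:
# the I/O subsystem is the bottleneck, giving roughly 97 MB/s.
print(round(max_drbd_throughput(100, 110)))
```

Adding a third limiting factor to the list means `min()` would take three arguments, which is why "the minimum between the two" no longer fits.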

> @@ -64,6 +65,16 @@ less than 3 percent.
>  * By contrast, if the I/O subsystem is capable of only 100 MB/s for
>    sustained writes, then it constitutes the bottleneck, and you would
>    be able to expect only 97 MB/s maximum DRBD throughput.
> +  
> +* In case of meta-disk internal without a hardware write cache (which 
> +  should be battery backupped!),

You're talking about a battery backup of a cache that is not there. Does
not compute. :)

> +  DRBD metadata updates necessary to guarantee
> +  data-completeness in case of failure can slow down 
> +  write throughput significantly. If a raw device is normally capable of
> +  250 MB/s write throughput it is not an anomaly to see writes as slow as 
> +  70 MB/s with DRBD enabled (numbers are for rotational disks). This is 
> +  purely caused by head seeks; 4MB data updates have to be followed by metadata updates
> +  and the data-writes can only continue after the metadata has been reached the 
> +  platters (caching and write reordering does not help).

I'm afraid you're missing some context here. DRBD performs the
synchronous metadata updates you are referring to only when an Activity
Log (AL) extent goes hot or cold. It doesn't do so randomly or, as your
paragraph seems to imply to a casual reader, every time it has written
4 MB of data.

And it is definitely _not_ normal to see 250 MB/s write bandwidth drop to
70 MB/s. 110 MB/s would be entirely normal if you are replicating over
Gigabit Ethernet, but that is determined by the bandwidth of the
replication link; it doesn't have much to do with AL updates.
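
As a rough sanity check on the Gigabit Ethernet figure, here is a back-of-the-envelope sketch. It assumes a standard 1500-byte MTU, IPv4 and TCP headers without options, and 1 MB = 10^6 bytes; the function name is invented for illustration. It gives a ceiling for TCP payload, not a measurement, so real-world numbers somewhat below it (such as 110 MB/s) are plausible.

```python
def tcp_payload_ceiling_mb_s(link_mbit_s=1000, mtu=1500):
    """Upper bound on TCP payload throughput over Ethernet, in MB/s."""
    # Per-frame bytes on the wire that are not part of the MTU:
    # preamble + start delimiter (8), Ethernet header (14),
    # frame check sequence (4), inter-frame gap (12).
    wire_overhead = 8 + 14 + 4 + 12
    # IPv4 header (20) + TCP header (20), assuming no options.
    header_bytes = 20 + 20
    payload = mtu - header_bytes
    on_wire = mtu + wire_overhead
    raw_mb_s = link_mbit_s / 8  # 1000 Mbit/s is 125 MB/s before overhead
    return raw_mb_s * payload / on_wire


print(f"{tcp_payload_ceiling_mb_s():.1f} MB/s")
```

Either way, the limit comes from the replication link, well below the 250 MB/s the raw device can deliver.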

And I don't understand at all what you mean by "caching and write
reordering does not help". Can you elaborate, please?

>  
>  [[s-throughput-tuning]]
>  === Tuning recommendations
> @@ -204,3 +215,25 @@ resource <resource> {
>    ...
>  }
>  ----------------------------
> +
> +[[s-tune-external-metadata]]
> +==== Moving meta-disk to external device
> +
> +WARNING: The recommended configuration is running with internal meta-disk. 
> +With external metadata, when underlying storage dies the metadata does not 
> +die with it, so special care should be taken. See <<s-external-meta-data,external meta data>>.
> +
> +With a software raid (md) of rotational media it is often faster to move the metadata to a 
> +dedicated set of platters. 
> +
> +[source,drbd]
> +----------------------------
> +resource <resource> {
> +  disk {
> +    disk /dev/md3;
> +    flexible-meta-disk /dev/md4;
> +    ...
> +  }
> +  ...
> +}
> +----------------------------

This section would be ok, but it's still missing the steps to dump the
existing metadata and restore it onto the new metadata device. Can you
add that and repost the patch please?
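
To make the request concrete, the following is a minimal dry-run sketch of the kind of sequence meant here. It only builds and prints a checklist; nothing is executed. The order is modeled on the dump/restore sequence the DRBD guide describes for offline resizing, and applying it to an internal-to-external move is an assumption that should be verified against the DRBD version in use. The resource name `r0`, the dump path, and the function name are hypothetical.

```python
def plan_metadata_move(resource, dump_file="/tmp/metadata"):
    """Return, in order, the shell commands for moving a resource's
    metadata to an external device. Dry run only: nothing is executed."""
    return [
        # The resource must be down before its metadata can be dumped.
        f"drbdadm down {resource}",
        f"drbdadm dump-md {resource} > {dump_file}",
        # Manual step: switch the configuration to the external meta-disk.
        "# edit the resource configuration to use the external meta-disk",
        # Initialize the new metadata area, then restore the dump onto it.
        f"drbdadm create-md {resource}",
        f"drbdmeta_cmd=$(drbdadm -d dump-md {resource})",
        f"${{drbdmeta_cmd/dump-md/restore-md}} {dump_file}",
        f"drbdadm up {resource}",
    ]


for step in plan_metadata_move("r0"):
    print(step)
```

The same sequence would have to be carried out on each node, and the documentation should say so explicitly.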

Thanks,
Florian



