* [Drbd-dev] [PATCH 1/2] add extra paragraph about manpages/ directory
@ 2011-07-07 15:44 Mrten
2011-07-07 15:44 ` [Drbd-dev] [PATCH 2/2] expand section on throughput tuning to highlight prime usecase of external metadata Mrten
2011-07-08 14:01 ` [Drbd-dev] [PATCH 1/2] add extra paragraph about manpages/ directory Florian Haas
0 siblings, 2 replies; 6+ messages in thread
From: Mrten @ 2011-07-07 15:44 UTC (permalink / raw)
To: drbd-dev
---
README.txt | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/README.txt b/README.txt
index 20e664b..3dc4a18 100644
--- a/README.txt
+++ b/README.txt
@@ -72,6 +72,19 @@ of your choice, and the git version control system. The documentation
syntax is AsciiDoc; see http://www.methods.co.nz/asciidoc/ for details
on this format.
+Manpages
+--------
+
+To edit the manpages you also need to check out the relevant drbd repository in the same directory as
+the drbd-documentation (adjust version number here):
+
+-----------------------------------
+git clone git://git.drbd.org/drbd-8.3.git
+-----------------------------------
+
+Then adjust the Makefile in drbd-documentation/manpages to point to the right
+drbd version and 'make' there will work (create symlinks).
+
Submitting documentation patches
--------------------------------
--
1.7.0.4
* [Drbd-dev] [PATCH 2/2] expand section on throughput tuning to highlight prime usecase of external metadata
From: Mrten @ 2011-07-07 15:44 UTC (permalink / raw)
To: drbd-dev
---
users-guide/throughput.txt | 33 +++++++++++++++++++++++++++++++++
1 files changed, 33 insertions(+), 0 deletions(-)
diff --git a/users-guide/throughput.txt b/users-guide/throughput.txt
index f033db5..584d0bf 100644
--- a/users-guide/throughput.txt
+++ b/users-guide/throughput.txt
@@ -48,6 +48,7 @@ important to consider the following natural limitations:
* DRBD throughput is limited by that of the raw I/O subsystem.
* DRBD throughput is limited by the available network bandwidth.
+* DRBD throughput can be limited by head seeks with 'meta-disk internal'
The _minimum_ between the two establishes the theoretical throughput
_maximum_ available to DRBD. DRBD then reduces that throughput maximum
@@ -64,6 +65,16 @@ less than 3 percent.
* By contrast, if the I/O subsystem is capable of only 100 MB/s for
sustained writes, then it constitutes the bottleneck, and you would
be able to expect only 97 MB/s maximum DRBD throughput.
+
+* In case of meta-disk internal without a hardware write cache (which
+ should be battery-backed!), DRBD metadata updates necessary to guarantee
+ data-completeness in case of failure can slow down
+ write throughput significantly. If a raw device is normally capable of
+ 250 MB/s write throughput it is not an anomaly to see writes as slow as
+ 70 MB/s with DRBD enabled (numbers are for rotational disks). This is
+ purely caused by head seeks; 4MB data updates have to be followed by metadata updates
+ and the data-writes can only continue after the metadata has reached the
+ platters (caching and write reordering does not help).
[[s-throughput-tuning]]
=== Tuning recommendations
@@ -204,3 +215,25 @@ resource <resource> {
...
}
----------------------------
+
+[[s-tune-external-metadata]]
+==== Moving meta-disk to external device
+
+WARNING: The recommended configuration is running with internal meta-disk.
+With external metadata, when underlying storage dies the metadata does not
+die with it, so special care should be taken. See <<s-external-meta-data,external meta data>>.
+
+With a software raid (md) of rotational media it is often faster to move the metadata to a
+dedicated set of platters.
+
+[source,drbd]
+----------------------------
+resource <resource> {
+ disk {
+ disk /dev/md3;
+ flexible-meta-disk /dev/md4;
+ ...
+ }
+ ...
+}
+----------------------------
--
1.7.0.4
* Re: [Drbd-dev] [PATCH 1/2] add extra paragraph about manpages/ directory
From: Florian Haas @ 2011-07-08 14:01 UTC (permalink / raw)
To: drbd-dev
Erm, NACK. :) That way to build the man pages is now obsolete. Thanks a
lot for the patch nonetheless, it highlighted an important omission.
I've fixed the README for the right way to do this now.
And as an aside, when a make variable is defined with "?=" as opposed to
":=", then it's meant to be overridden from the environment. Thus, even
when building the old way, you could have just done
make <target> DRBD=/path/to/drbd/checkout
... rather than hacking the Makefile.
Cheers,
Florian
On 07/07/2011 05:44 PM, Mrten wrote:
> ---
> README.txt | 13 +++++++++++++
> 1 files changed, 13 insertions(+), 0 deletions(-)
* Re: [Drbd-dev] [PATCH 2/2] expand section on throughput tuning to highlight prime usecase of external metadata
From: Florian Haas @ 2011-07-08 14:17 UTC (permalink / raw)
To: drbd-dev
Maarten,
many thanks for the contribution. A few questions/comments on this one:
On 07/07/2011 05:44 PM, Mrten wrote:
> ---
> users-guide/throughput.txt | 33 +++++++++++++++++++++++++++++++++
> 1 files changed, 33 insertions(+), 0 deletions(-)
>
> diff --git a/users-guide/throughput.txt b/users-guide/throughput.txt
> index f033db5..584d0bf 100644
> --- a/users-guide/throughput.txt
> +++ b/users-guide/throughput.txt
> @@ -48,6 +48,7 @@ important to consider the following natural limitations:
>
> * DRBD throughput is limited by that of the raw I/O subsystem.
> * DRBD throughput is limited by the available network bandwidth.
> +* DRBD throughput can be limited by head seeks with 'meta-disk internal'
>
> The _minimum_ between the two establishes the theoretical throughput
> _maximum_ available to DRBD. DRBD then reduces that throughput maximum
You're adding a third item to the enumeration; so it would be nice if
you could also rephrase the next paragraph which talks about "the
minimum between the two".
> @@ -64,6 +65,16 @@ less than 3 percent.
> * By contrast, if the I/O subsystem is capable of only 100 MB/s for
> sustained writes, then it constitutes the bottleneck, and you would
> be able to expect only 97 MB/s maximum DRBD throughput.
> +
> +* In case of meta-disk internal without a hardware write cache (which
> + should be battery-backed!),
You're talking about a battery backup of a cache that is not there. Does
not compute. :)
> + DRBD metadata updates necessary to guarantee
> + data-completeness in case of failure can slow down
> + write throughput significantly. If a raw device is normally capable of
> + 250 MB/s write throughput it is not an anomaly to see writes as slow as
> + 70 MB/s with DRBD enabled (numbers are for rotational disks). This is
> + purely caused by head seeks; 4MB data updates have to be followed by metadata updates
> + and the data-writes can only continue after the metadata has reached the
> + platters (caching and write reordering does not help).
I'm afraid you're missing some context here. DRBD performs the
synchronous meta data updates you are referring to only when an AL
extent goes hot or cold. It doesn't do so randomly or, as your paragraph
seems to imply to a casual reader, every time it has written 4M of data.
And it is definitely _not_ normal to see 250MB/s write bandwidth drop to
70 MB/s. 110 MB/s would be entirely normal if you are replicating over
Gigabit Ethernet, but that is determined by the bandwidth of the
replication link, it doesn't have much to do with AL updates.
And what you mean by "caching and write reordering does not help" I
don't understand at all, can you elaborate please?
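To make the bottleneck arithmetic concrete, here is a minimal shell sketch. The function name `drbd_max_throughput` is made up for illustration, and the flat 3 percent overhead is an assumption taken from the guide's "less than 3 percent" wording, so treat the results as rough ceilings rather than predictions.

```shell
#!/bin/sh
# Rough estimate of the theoretical DRBD throughput ceiling in MB/s.
# Assumption: a flat 3 percent DRBD overhead (the guide says "less than 3 percent").
drbd_max_throughput() {
    io_mbs=$1    # sustained write throughput of the raw I/O subsystem
    net_mbs=$2   # usable bandwidth of the replication link

    # The slower of the two is the bottleneck.
    if [ "$io_mbs" -lt "$net_mbs" ]; then
        limit=$io_mbs
    else
        limit=$net_mbs
    fi

    # Integer arithmetic, rounded to the nearest MB/s.
    echo $(( (limit * 97 + 50) / 100 ))
}

drbd_max_throughput 100 110   # I/O-bound example from the guide
drbd_max_throughput 250 110   # 250 MB/s disks behind a single Gigabit link
```

Neither case comes anywhere near 70 MB/s, which is why that figure points at something other than these two limits.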
>
> [[s-throughput-tuning]]
> === Tuning recommendations
> @@ -204,3 +215,25 @@ resource <resource> {
> ...
> }
> ----------------------------
> +
> +[[s-tune-external-metadata]]
> +==== Moving meta-disk to external device
> +
> +WARNING: The recommended configuration is running with internal meta-disk.
> +With external metadata, when underlying storage dies the metadata does not
> +die with it, so special care should be taken. See <<s-external-meta-data,external meta data>>.
> +
> +With a software raid (md) of rotational media it is often faster to move the metadata to a
> +dedicated set of platters.
> +
> +[source,drbd]
> +----------------------------
> +resource <resource> {
> + disk {
> + disk /dev/md3;
> + flexible-meta-disk /dev/md4;
> + ...
> + }
> + ...
> +}
> +----------------------------
This section would be ok, but it's still missing the steps to dump the
existing metadata and restore it onto the new metadata device. Can you
add that and repost the patch please?
Thanks,
Florian
* Re: [Drbd-dev] [PATCH 1/2] add extra paragraph about manpages/ directory
From: Florian Haas @ 2011-07-08 14:45 UTC (permalink / raw)
To: drbd-dev
On 07/08/2011 04:01 PM, Florian Haas wrote:
> Erm, NACK. :) That way to build the man pages is now obsolete. Thanks a
> lot for the patch nonetheless, it highlighted an important omission.
> I've fixed the README for the right way to do this now.
>
> And as an aside, when a make variable is defined with "?=" as opposed to
> ":=", then it's meant to be overridden from the environment.
Lars just corrected me on this one (you learn something every day): even
a := variable can be overridden from the command line. Anyhow, that's
what you could have done, rather than edit the Makefile. :)
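The difference between the two assignment operators can be checked with a throwaway Makefile. This is only a sketch: it assumes GNU make is installed, and the variable names `SOFT` and `HARD` are hypothetical stand-ins for something like the `DRBD` variable mentioned above.

```shell
#!/bin/sh
# Sketch: how "?=" and ":=" react to environment and command-line overrides.
# SOFT and HARD are hypothetical variable names used only for this demo.
make_override_demo() {
    demo_mk=$(mktemp) || return 1
    printf 'SOFT ?= default\nHARD := default\nshow:\n\t@echo "SOFT=$(SOFT) HARD=$(HARD)"\n' > "$demo_mk"

    # No overrides: both defaults apply.
    make -s -f "$demo_mk" show
    # Environment: only the "?=" variable picks up the outside value.
    SOFT=env HARD=env make -s -f "$demo_mk" show
    # Command line: both variables are overridden.
    make -s -f "$demo_mk" show SOFT=cli HARD=cli

    rm -f "$demo_mk"
}

make_override_demo
```

So `?=` only matters when the value comes from the environment; a `VAR=value` argument on the make command line wins either way.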
Cheers,
Florian
* Re: [Drbd-dev] [PATCH 2/2] expand section on throughput tuning to highlight prime usecase of external metadata
From: Mrten @ 2011-07-08 18:37 UTC (permalink / raw)
To: drbd-dev
On 08-07-2011 16:17:17, Florian Haas wrote:
> You're adding a third item to the enumeration; so it would be nice if
> you could also rephrase the next paragraph which talks about "the
> minimum between the two".
Will do.
> You're talking about a battery backup of a cache that is not there.
> Does not compute. :)
So true, will fix ;)
>> DRBD metadata updates necessary to guarantee data-completeness
>> in case of failure can slow down write throughput significantly.
>> If a raw device is normally capable of 250 MB/s write throughput
>> it is not an anomaly to see writes as slow as 70 MB/s with DRBD
>> enabled (numbers are for rotational disks). This is purely
>> caused by head seeks; 4MB data updates have to be followed by
>> metadata updates and the data-writes can only continue after the
>> metadata has reached the platters (caching and write
>> reordering does not help).
>
> I'm afraid you're missing some context here. DRBD performs the
> synchronous meta data updates you are referring to only when an AL
> extent goes hot or cold. It doesn't do so randomly or, as your
> paragraph seems to imply to a casual reader, every time it has
> written 4M of data.
>
> And it is definitely _not_ normal to see 250MB/s write bandwidth drop
> to 70 MB/s. 110 MB/s would be entirely normal if you are replicating
> over Gigabit Ethernet, but that is determined by the bandwidth of the
> replication link, it doesn't have much to do with AL updates.
I think I should explain what I was trying to convey, or rather, my mental
image of what happened while I was benchmarking (and saw that huge
performance drop).
My backing device for DRBD is a software raid-0 (two disks), with
'meta-disk internal'. Benchmarking was done by dd'ing a few gigs from
/dev/zero. All this dd-writing makes a lot of new extents hot (one for
every 4MB written?), which has to be remembered in the metadata, with
synchronous writes. Since my backing device is raid-0 and the default
chunk size for that is rather large these days, the (small) metadata
updates aren't spread over the raid-0 disks but are concentrated on one
device, which becomes the bottleneck for the benchmark because it has to
seek all the time.
This is not a cause for concern when you have a hardware battery-backed
cache, as the raid-controller can then delay writing the metadata, but I
don't have that.
I've blktrace-d, blkparse-d and seekwatcher-ed the hell out of this and
the images show exactly that happening, so I dared to write it up like this
without having read the source ;). Lots of linear writes, regularly
interrupted by a seek to synchronously write the metadata.
The slowdown wasn't caused by the interconnection between primary and
secondary, the 70MB/s was measured both in StandAlone and UpToDate (I
bonded 3 GE interfaces for nice syncing bandwidth).
And it was pure benchmarking, no other things happening on the server so
I'd expect that only the benchmark made extents hot.
I of course do not know the exact criteria that mark extents hot, if
what I described above is not an accurate description of what happens,
please correct me.
But the reason I think this should be in the docs is that I reckon that
lots of people would like to combine raid-0 with "network raid-1" on
relatively cheap hardware, do the simplest of benchmarks and get confused
by the slowdown. Googling this, I saw the subject come up on the mailing
list a couple of times.
> And what you mean by "caching and write reordering does not help" I
> don't understand at all, can you elaborate please?
The synchronous (barrier?) writes for the metadata, as far as I
understand it from a mailing list post from Lars, *must* have reached the
platters before the linear dd-writing can continue. So no enabling of
write caches, NCQ or tuning of elevators is going to help.
However, if you think that the paragraph now implies that *every* write
randomly makes extents hot then I should do some polishing ;)
>> +[[s-tune-external-metadata]]
[...]
> This section would be ok, but it's still missing the steps to dump
> the existing metadata and restore it onto the new metadata device.
> Can you add that and repost the patch please?
Ah, I hadn't thought of that scenario (am using a raid-1 for the
metadata). Is this along the lines of:
drbdadm down [resource]
drbdadm dump-md [resource] > savefile
[change meta-disk]
drbdmeta /dev/drbdX v08 [metadevice] 0 restore-md savefile
?
Is the index 0 correct usage when using flexible-meta-disk?
Maarten.