From: Paul Clements <paul.clements@steeleye.com>
To: Jan Engelhardt <jengelh@computergmbh.de>,
david@lang.hm, Al Boldi <a1426z@gawab.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
netdev@vger.kernel.org, linux-raid@v
Subject: Re: [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])
Date: Sun, 12 Aug 2007 21:41:15 -0400
Message-ID: <46BFB6BB.80406@steeleye.com>
In-Reply-To: <20070812174549.GA2915@teal.hq.k1024.org>
Iustin Pop wrote:
> On Sun, Aug 12, 2007 at 07:03:44PM +0200, Jan Engelhardt wrote:
>> On Aug 12 2007 09:39, david@lang.hm wrote:
>>> now, I am not an expert on either option, but there are a couple things that I
>>> would question about the DRBD+MD option
>>>
>>> 1. when the remote machine is down, how does MD deal with it for reads and
>>> writes?
>> I suppose it kicks the drive and you'd have to re-add it by hand unless done by
>> a cronjob.
Yes, and with a bitmap configured on the raid1, you just resync the
blocks that have been written while the connection was down.
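To illustrate the idea of resyncing only what changed, here is a minimal Python sketch of a write-intent bitmap. It is a toy model, not MD's actual on-disk bitmap format or code; the class name, the chunk granularity, and the `copy_chunk` callback are all assumptions made for the example.

```python
class WriteIntentBitmap:
    """Toy model of a write-intent bitmap: one dirty flag per fixed-size chunk."""

    def __init__(self, device_size, chunk_size):
        self.chunk_size = chunk_size
        self.nchunks = -(-device_size // chunk_size)  # ceiling division
        self.dirty = set()  # indices of chunks written while the leg was down

    def mark_write(self, offset, length):
        """Record that bytes [offset, offset + length) were written."""
        if length <= 0:
            return
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        self.dirty.update(range(first, last + 1))

    def resync(self, copy_chunk):
        """Copy only the dirty chunks to the returning leg, then clear them.

        copy_chunk(offset, length) is a hypothetical callback that would
        copy that byte range from the healthy leg to the re-added one.
        """
        copied = 0
        for chunk in sorted(self.dirty):
            copy_chunk(chunk * self.chunk_size, self.chunk_size)
            copied += 1
        self.dirty.clear()
        return copied
```

The point of the sketch is that the resync cost scales with the number of chunks touched during the outage, not with the size of the device.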
> From my tests, since NBD doesn't have a timeout option, MD hangs in the
> write to that mirror indefinitely, somewhat like when dealing with a
> broken IDE driver/chipset/disk.
Well, if people would like to see a timeout option, I actually coded up
a patch a couple of years ago to do just that, but I never got it into
mainline because you can do almost as well by doing a check at
user-level (I basically ping the nbd connection periodically and if it
fails, I kill -9 the nbd-client).
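A user-level check of this kind could look roughly like the Python sketch below. The message does not say how the connection is "pinged", so the TCP connect probe here is an assumption, as are the function names, the failure threshold, and the polling interval. Note also that opening a fresh TCP connection only shows the server is reachable; it does not exercise the existing NBD session.

```python
import os
import signal
import socket
import time


def tcp_probe(host, port, timeout=5.0):
    """Return True if a TCP connection to the NBD server opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def kill_client(pid):
    """Equivalent of `kill -9 <pid>` on the nbd-client process."""
    os.kill(pid, signal.SIGKILL)


def watch(probe, kill, max_failures=3, interval=10.0, sleep=time.sleep):
    """Poll `probe`; after `max_failures` consecutive failures call `kill` and return."""
    failures = 0
    while True:
        if probe():
            failures = 0  # any success resets the counter
        else:
            failures += 1
            if failures >= max_failures:
                kill()
                return failures
        sleep(interval)


# Hypothetical wiring; server_host, server_port and nbd_client_pid are placeholders:
# watch(lambda: tcp_probe(server_host, server_port),
#       lambda: kill_client(nbd_client_pid))
```

Requiring several consecutive failures before killing the client is a design choice of the sketch, meant to avoid breaking the mirror on a single dropped probe.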
>>> 2. MD over local drive will alternate reads between mirrors (or so I've been
>>> told), doing so over the network is wrong.
>> Certainly. In which case you set "write_mostly" (or even write_only, not sure
>> of its name) on the raid component that is nbd.
>>
>>> 3. when writing, will MD wait for the network I/O to get the data saved on the
>>> backup before returning from the syscall? or can it sync the data out lazily
>> Can't answer this one - ask Neil :)
>
> MD has the write-mostly/write-behind options - which help in this case
> but only up to a certain amount.
You can configure write_behind (aka, asynchronous writes) to buffer as
much data as you have RAM to hold. At a certain point, presumably, you'd
want to just break the mirror and take the hit of doing a resync once
your network leg falls too far behind.
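The trade-off described here can be sketched as a bounded backlog: writes to the network leg are queued up to a limit, and once the limit is hit the leg is dropped and everything unsent is left for a later resync. This is a toy Python model, not MD's write-behind implementation; the class, its fields, and the block-number granularity are assumptions for illustration.

```python
from collections import deque


class WriteBehindLeg:
    """Toy model of a mirror leg that is written asynchronously, up to a limit."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.pending = deque()  # writes acknowledged locally, not yet on the remote
        self.failed = False     # True once the mirror has been broken
        self.dirty = set()      # blocks to resync after the leg is re-added

    def write(self, block):
        """Queue a write; break the mirror if the backlog is already at the limit."""
        if not self.failed and len(self.pending) >= self.max_outstanding:
            # The leg fell too far behind: stop queueing and remember
            # every block that never reached the remote.
            self.failed = True
            self.dirty.update(self.pending)
            self.pending.clear()
        if self.failed:
            self.dirty.add(block)
        else:
            self.pending.append(block)

    def remote_ack(self):
        """The remote finished the oldest queued write."""
        if self.pending:
            self.pending.popleft()
```

In this model `max_outstanding` plays the role of the RAM budget: a larger value tolerates longer network stalls, at the cost of more data that exists only on the local leg.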
--
Paul