From: Paul Clements <paul.clements@steeleye.com>
To: Jan Engelhardt <jengelh@computergmbh.de>,
	david@lang.hm, Al Boldi <a1426z@gawab.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	netdev@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])
Date: Sun, 12 Aug 2007 21:41:15 -0400
Message-ID: <46BFB6BB.80406@steeleye.com>
In-Reply-To: <20070812174549.GA2915@teal.hq.k1024.org>

Iustin Pop wrote:
> On Sun, Aug 12, 2007 at 07:03:44PM +0200, Jan Engelhardt wrote:
>> On Aug 12 2007 09:39, david@lang.hm wrote:
>>> now, I am not an expert on either option, but there are a couple things that I
>>> would question about the DRBD+MD option
>>>
>>> 1. when the remote machine is down, how does MD deal with it for reads and
>>> writes?
>> I suppose it kicks the drive and you'd have to re-add it by hand unless done by
>> a cronjob.

Yes, and with a bitmap configured on the raid1, you just resync the 
blocks that have been written while the connection was down.
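
A toy model may make the bitmap behaviour concrete. This is only a sketch of the idea, not MD's implementation: the class name, the chunk granularity and the in-memory "legs" are all assumptions made for illustration.

```python
class BitmapMirror:
    """Toy two-leg RAID1 with a write-intent bitmap (illustrative only)."""

    def __init__(self, num_chunks):
        self.local = [0] * num_chunks    # local leg, always available
        self.remote = [0] * num_chunks   # network leg (e.g. an nbd device)
        self.remote_up = True
        self.dirty = set()               # chunks written while the remote leg was down

    def write(self, chunk, value):
        self.local[chunk] = value
        if self.remote_up:
            self.remote[chunk] = value
        else:
            self.dirty.add(chunk)        # remember what the remote leg missed

    def fail_remote(self):
        self.remote_up = False

    def readd_remote(self):
        """Re-add the remote leg, copying only the dirty chunks; return the count."""
        copied = 0
        for chunk in sorted(self.dirty):
            self.remote[chunk] = self.local[chunk]
            copied += 1
        self.dirty.clear()
        self.remote_up = True
        return copied


mirror = BitmapMirror(num_chunks=1024)
mirror.write(1, 11)          # reaches both legs
mirror.fail_remote()         # connection drops
mirror.write(5, 55)
mirror.write(7, 77)
mirror.write(5, 56)          # same chunk again, still one dirty bit
copied = mirror.readd_remote()
print(copied, mirror.local == mirror.remote)   # prints: 2 True
```

Only 2 of the 1024 chunks are copied on re-add, which is the point of the bitmap: resync cost follows the amount written during the outage, not the size of the device.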


> From my tests, since NBD doesn't have a timeout option, MD hangs in the
> write to that mirror indefinitely, somewhat like when dealing with a
> broken IDE driver/chipset/disk.

Well, if people would like to see a timeout option, I actually coded up 
a patch a couple of years ago to do just that, but I never got it into 
mainline because you can do almost as well by doing a check at 
user-level (I basically ping the nbd connection periodically and if it 
fails, I kill -9 the nbd-client).
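
A user-level check of this kind could look roughly like the sketch below. The message does not say how the connection is "pinged", so the plain TCP connect used here is an assumption, as are the function names, the failure threshold and the intervals.

```python
import os
import signal
import socket
import time


def server_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.
    (One possible probe; the original check may have worked differently.)"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def kill_nbd_client(pid):
    """Equivalent of `kill -9 <pid>` for the nbd-client process."""
    os.kill(pid, signal.SIGKILL)


def watchdog(check, kill, interval=10.0, max_failures=3, sleep=time.sleep):
    """Run check() every `interval` seconds. After `max_failures`
    consecutive failures, call kill() once and return."""
    failures = 0
    while True:
        if check():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                kill()
                return
        sleep(interval)


# Hypothetical usage (host, port and pid are placeholders, not from the thread):
# watchdog(lambda: server_reachable("nbd-server.example", 10809),
#          lambda: kill_nbd_client(nbd_client_pid))
```

Requiring several consecutive failures before killing the client is a design choice in this sketch, meant to avoid tearing down the device on a single dropped probe.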


>>> 2. MD over local drive will alternate reads between mirrors (or so I've been
>>> told), doing so over the network is wrong.
>> Certainly. In which case you set "write_mostly" (or even write_only, not sure
>> of its name) on the raid component that is nbd.
>>
>>> 3. when writing, will MD wait for the network I/O to get the data saved on the
>>> backup before returning from the syscall? or can it sync the data out lazily
>> Can't answer this one - ask Neil :)
> 
> MD has the write-mostly/write-behind options - which help in this case
> but only up to a certain amount.

You can configure write_behind (aka, asynchronous writes) to buffer as 
much data as you have RAM to hold. At a certain point, presumably, you'd 
want to just break the mirror and take the hit of doing a resync once 
your network leg falls too far behind.
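
The trade-off can be sketched with a small model: buffer writes for the slow leg up to some budget, and once the backlog would exceed it, break the mirror and rely on a later resync. The class, the byte-based budget and the method names are assumptions for illustration only, not MD's code.

```python
from collections import deque


class WriteBehindLeg:
    """Toy model of a slow mirror leg fed by write-behind (illustrative only)."""

    def __init__(self, max_pending_bytes):
        self.max_pending_bytes = max_pending_bytes  # hypothetical buffering budget
        self.pending = deque()                      # sizes of unacknowledged writes
        self.pending_bytes = 0
        self.in_sync = True                         # False once the mirror is broken

    def submit(self, nbytes):
        """Queue a write for the slow leg; return True if it was buffered."""
        if not self.in_sync:
            return False
        if self.pending_bytes + nbytes > self.max_pending_bytes:
            # Too far behind: drop the backlog, break the mirror, resync later.
            self.pending.clear()
            self.pending_bytes = 0
            self.in_sync = False
            return False
        self.pending.append(nbytes)
        self.pending_bytes += nbytes
        return True

    def complete_one(self):
        """The slow leg acknowledged its oldest outstanding write."""
        if self.pending:
            self.pending_bytes -= self.pending.popleft()


leg = WriteBehindLeg(max_pending_bytes=8192)
first = leg.submit(4096)
second = leg.submit(4096)
leg.complete_one()               # the network leg catches up a little
third = leg.submit(4096)
overflow = leg.submit(8192)      # backlog would exceed the budget
print(first, second, third, overflow, leg.in_sync)
# prints: True True True False False
```

As far as I know, mdadm expresses this limit as a count of outstanding writes (`--write-behind=N`, on a device flagged `--write-mostly`, with a bitmap required) rather than as a byte budget, so the byte accounting above is purely a modelling convenience.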

--
Paul

Thread overview: 16+ messages
2007-08-12 10:35 [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid]) Al Boldi
2007-08-12 11:28 ` Jan Engelhardt
2007-08-12 16:39   ` david
2007-08-12 17:03     ` Jan Engelhardt
2007-08-12 17:45       ` Iustin Pop
2007-08-12 17:45         ` Iustin Pop
2007-08-13  1:41         ` Paul Clements [this message]
2007-08-13  1:41           ` Paul Clements
2007-08-13  3:21           ` david
2007-08-13  8:03             ` David Greaves
2007-08-13  8:31               ` david
2007-08-13 12:43                 ` David Greaves
2007-08-13  9:02             ` Jan Engelhardt
2007-08-13  7:51           ` David Greaves
2007-08-12 11:51 ` Evgeniy Polyakov
2007-08-12 15:28   ` Al Boldi
