linux-raid.vger.kernel.org archive mirror
* Is it possible to be "delay tolerant" or have "slow dropout" of unavailable components?
From: Ty! Boyack @ 2008-04-08 16:56 UTC
  To: linux-raid

I'm curious if there is a way to have a raid set (raid5 in my case, but 
this could apply to any raid level) that could tolerate a component 
device being unavailable for a period of time.

My reason for this is that I am building raid5 sets of iscsi disks, 
attempting to provide some redundancy in case one of our iscsi 
arrays/controllers goes down.  We have had our iscsi arrays reboot a 
couple of times, and the result is that the raid5 set will see the iscsi 
array go down and take it out of the raid5 set.  In a couple of minutes, 
the iscsi disk returns, but it is too late.  I would love to have the 
raid5 set buffer the writes for about 5 minutes while the iscsi device 
reboots, and then pump all the writes to it.  This does not have to be 
time-based.  If we could buffer the writes up to the point where we 
filled the overflow buffer that would solve the problem too, so long as 
we could make a really big buffer.

I realize that this could incur significant memory costs, but that is 
far cheaper for us than having the raid5 array come apart.

The other modification for this would be that read operations would need 
to either pull from the buffer or reconstruct the data from the parity, 
but would need to NOT initiate a device failure.
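
To make the idea concrete, here is a toy model of the "slow dropout" behaviour described above. It is only a sketch of the proposal, not something md does today: the class name, the byte limit and the dictionary standing in for the disk are all invented for illustration.

```python
from collections import deque


class SlowDropoutMember:
    """Toy model of one RAID member that tolerates a temporary outage.

    Hypothetical illustration only; nothing here is md's real behaviour.
    """

    def __init__(self, max_buffered_bytes):
        self.max_buffered_bytes = max_buffered_bytes
        self.buffered_bytes = 0
        self.pending = deque()  # (offset, data) writes waiting for the device
        self.online = True
        self.failed = False
        self.blocks = {}        # offset -> data, stands in for the real disk

    def go_offline(self):
        self.online = False

    def write(self, offset, data):
        """Return True if the write was stored or queued, False if the member is failed."""
        if self.failed:
            return False
        if self.online:
            self.blocks[offset] = data
            return True
        # Device is away: queue the write instead of failing the member.
        if self.buffered_bytes + len(data) > self.max_buffered_bytes:
            # Overflow: only now give up and declare the member failed.
            self.failed = True
            self.pending.clear()
            self.buffered_bytes = 0
            return False
        self.pending.append((offset, data))
        self.buffered_bytes += len(data)
        return True

    def come_back(self):
        """Replay queued writes in order; return how many were replayed."""
        if self.failed:
            return 0
        self.online = True
        replayed = 0
        while self.pending:
            offset, data = self.pending.popleft()
            self.blocks[offset] = data
            replayed += 1
        self.buffered_bytes = 0
        return replayed
```

Reads are not modelled here; as noted above, they would have to be served from the queue or rebuilt from parity while the member is away.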

Have I missed something that already takes care of this?

Is there already a feature that takes care of this on a much smaller 
scale (microseconds?) that might be able to be increased to several 
minutes? 

Does anyone else think this would be a good idea?

-Ty!

-- 
-===========================-
  Ty! Boyack
  NREL Unix Network Manager
  ty@nrel.colostate.edu
  (970) 491-1186
-===========================-



* Re: Is it possible to be "delay tolerant" or have "slow dropout" of unavailable components?
From: Markus Hochholdinger @ 2008-04-08 17:48 UTC
  To: linux-raid


hi,

On Tuesday, 8 April 2008 18:56, Ty! Boyack wrote:
> I'm curious if there is a way to have a raid set (raid5 in my case, but
> this could apply to any raid level) that could tolerate a component
> device being unavailable for a period of time.

For RAID1 there are "--write-mostly" and "--write-behind=". I don't know whether 
these are already available for RAID5.

There is also the "--bitmap=" option, which can speed up a resync after 
temporarily disconnecting one device.


-- 
greetings

eMHa



* Re: Is it possible to be "delay tolerant" or have "slow dropout" of unavailable components?
From: Ty! Boyack @ 2008-04-08 18:24 UTC
  To: Markus Hochholdinger; +Cc: linux-raid

Markus Hochholdinger wrote:
> hi,
>
> On Tuesday, 8 April 2008 18:56, Ty! Boyack wrote:
>
>> I'm curious if there is a way to have a raid set (raid5 in my case, but
>> this could apply to any raid level) that could tolerate a component
>> device being unavailable for a period of time.
>
> for RAID1 there is "--write-mostly" and "--write-behind=". Don't know if this 
> is already available for RAID5.
>
> There's also the option "--bitmap=" which can speed up a resync when 
> temporarily disconnecting one device.

Thanks - I was looking at those options, but it seems that the 
'write-behind' option would need to be applied to ALL devices.  It seems 
to indicate a difference in the devices - one is fast, one is slow, and 
the slow one is indicated with write-behind.  In my case, I think all 
are fast except in the case of a failure, in which case I'd like to have 
some delay before it gets declared bad to see if it comes back.

As for the 'bitmap' option - I think this has a lot of potential, and 
might work IF there were a way to automatically re-add a failed device.  
With the bitmap I see the following sequence taking place:

1) Device in a raid5 goes away for some reason (iscsi reboot, network 
   glitch, etc.) but the component is really still good.
2) raid5 marks the device as bad and starts tracking changes in the bitmap.
3) Device comes back online.
   <Right now this is as far as I can get without manual intervention, 
   but if there is some sort of auto re-add this sequence could continue 
   unabated.>
4) Device is re-added to the raid5.
5) Resync occurs fast because of the bitmap.

So... Perhaps I'm asking for the wrong thing.  Is there a way to detect 
a recovery after a failure, and have it automatically repair the raid set?
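
As a rough sketch of what such automation might look like (not an existing mdadm feature, and with no claim that it is safe to run unattended): a watcher could read /proc/mdstat, pick out members flagged "(F)", and build the manage-mode commands to remove and then re-add them. The function names are made up for this example, and it assumes the device comes back under the same name, which may not hold for iscsi.

```python
import re


def failed_members(mdstat_text):
    """Map each md array name to the members flagged (F) in /proc/mdstat text."""
    failed = {}
    for line in mdstat_text.splitlines():
        match = re.match(r"^(md\S+)\s*:\s*(.*)$", line)
        if not match:
            continue
        array, rest = match.groups()
        names = []
        # Members look like "sdd[3](F)"; flags such as (W) or (S) may also appear.
        for member in re.finditer(r"(\S+?)\[\d+\]((?:\(\w\))*)", rest):
            if "(F)" in member.group(2):
                names.append(member.group(1))
        if names:
            failed[array] = names
    return failed


def readd_commands(failed):
    """Build (but do not run) mdadm commands to remove and re-add each member.

    Assumption: the component reappears under the same /dev name.
    """
    commands = []
    for array, names in failed.items():
        for name in names:
            commands.append(["mdadm", f"/dev/{array}", "--remove", f"/dev/{name}"])
            commands.append(["mdadm", f"/dev/{array}", "--re-add", f"/dev/{name}"])
    return commands
```

The commands are only built, not executed; a real script would also need to confirm that the iscsi device is actually reachable again before handing them to something like subprocess.run.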

Right now, without the automation, it is possible, and likely, that an 
operator cannot respond in time to avoid having the bitmap fill up, and 
then we are into a long resync.  More critically, we would be running 
with a degraded array from the point of failure until an operator can 
fix it and the resync finishes, which is frightening.

-Ty!





* Re: Is it possible to be "delay tolerant" or have "slow dropout" of unavailable components?
From: Peter Rabbitson @ 2008-04-08 18:54 UTC
  To: Ty! Boyack; +Cc: Markus Hochholdinger, linux-raid

Ty! Boyack wrote:
>
> <snip>   
> 
> Right now, without the automation, it is possible, and likely, that an 
> operator cannot respond in time to avoid having the bitmap fill up, and 
> then we are into a long resync.

The bitmap cannot "fill up" - this is by design. It is created by simply 
subdividing the _entire_ array into a number of equally sized regions (the 
exact number depends on the bitmap size, and the size of every region depends 
on the md device size). Then any time a write operation touches the disk, the 
corresponding region is marked as dirty. There might be one write or a 
thousand writes - as long as they all fall within the same region, that 
region is the only one which will be resynced.
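
A simplified model of that idea, for illustration only (this is not md's actual bitmap code; the class name is invented, and the region size is passed in directly rather than derived from the bitmap and device sizes):

```python
class WriteIntentBitmap:
    """Toy write-intent bitmap: one dirty flag per fixed-size region."""

    def __init__(self, array_size, region_size):
        if array_size <= 0 or region_size <= 0:
            raise ValueError("sizes must be positive")
        self.array_size = array_size
        self.region_size = region_size
        # Ceiling division: the last region may be only partly used.
        self.num_regions = -(-array_size // region_size)
        self.dirty = set()

    def record_write(self, offset, length):
        """Mark every region touched by a write as dirty."""
        if offset < 0 or length <= 0 or offset + length > self.array_size:
            raise ValueError("write outside the array")
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        self.dirty.update(range(first, last + 1))

    def regions_to_resync(self):
        """Regions that would need copying when the device is re-added."""
        return sorted(self.dirty)
```

However many writes arrive, the dirty set can never hold more than num_regions entries, so the worst case is a full resync rather than an overflow.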

HTH

Peter

