* [Lustre-devel] Replacing a dead OST (fixed subject line)
       [not found] <C46BFD6B.5720%peter.braam@sun.com>
@ 2008-06-04 17:22 ` Nathaniel Rutman
  2008-06-07 14:03   ` Peter Braam
  0 siblings, 1 reply; 2+ messages in thread
From: Nathaniel Rutman @ 2008-06-04 17:22 UTC
  To: lustre-devel

Peter Braam wrote:
> There is tremendous value in fixing this bug (15345), because it turns an unusual
> usage of our tools for recovery into something that is done more routinely.
>
> When I listened to this group, my impression was that it was not so hard to
> rebuild the OSS, but it does require scanning the primary MDS, finding the
> pathnames for affected files (with objects on the failed OSS), and using
> that list of files to re-write on the cluster where the OSS was lost.
>
> Nathan - this is a special case of the recovery mechanisms we are talking
> about (with the log being constructed in a different way). I think you
> should design the solution for this problem.
>   
I am taking this to mean we should design the general case of 
"dead/missing OST" into the HSM/migration architecture, and not 
something to do with recovery per se.   That's actually really 
interesting - you could deactivate an OST, and yet still read the files 
from it transparently.


Should I make a "lustre-hsm" mail alias, or should we put it on lustre-devel?

* [Lustre-devel] Replacing a dead OST (fixed subject line)
  2008-06-04 17:22 ` [Lustre-devel] Replacing a dead OST (fixed subject line) Nathaniel Rutman
@ 2008-06-07 14:03   ` Peter Braam
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Braam @ 2008-06-07 14:03 UTC
  To: lustre-devel




On 6/4/08 10:22 AM, "Nathaniel Rutman" <Nathan.Rutman@Sun.COM> wrote:

> Peter Braam wrote:
>> There is tremendous value in fixing this bug (15345), because it turns an
>> unusual usage of our tools for recovery into something that is done more
>> routinely.
>> 
>> When I listened to this group, my impression was that it was not so hard to
>> rebuild the OSS, but it does require scanning the primary MDS, finding the
>> pathnames for affected files (with objects on the failed OSS), and using
>> that list of files to re-write on the cluster where the OSS was lost.
>> 
>> Nathan - this is a special case of the recovery mechanisms we are talking
>> about (with the log being constructed in a different way). I think you
>> should design the solution for this problem.
>>   
> I am taking this to mean we should design the general case of
> "dead/missing OST" into the HSM/migration architecture,

No - into the replication architecture.  You feed a list of files into your
scripts and re-create the objects.
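
A minimal sketch of the file-list step described above, not the actual scripts. It assumes the MDS scan has already produced a mapping from pathname to the OST indices holding each file's objects. The function name, the mapping, and the example paths are all hypothetical; no Lustre API is called here.

```python
from typing import Dict, Iterable, List


def files_on_failed_ost(layouts: Dict[str, Iterable[int]], failed_ost: int) -> List[str]:
    """Return the sorted paths whose layout includes an object on failed_ost."""
    return sorted(
        path for path, osts in layouts.items() if failed_ost in set(osts)
    )


# Hypothetical scan result: path -> OST indices that hold the file's objects.
example_layouts = {
    "/mnt/lustre/a.dat": [0, 1],
    "/mnt/lustre/b.dat": [2],
    "/mnt/lustre/c.dat": [1, 3],
}

# Files that would need their objects re-created if OST index 1 were lost.
affected = files_on_failed_ost(example_layouts, failed_ost=1)
```

The resulting list is what would be fed to whatever re-creates the objects; how that is done is outside this sketch.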

> and not 
> something to do with recovery per se.   That's actually really
> interesting - you could deactivate an OST, and yet still read the files
> from it transparently.

No, you can only read them when the OST has been restored; no cache misses
(yet).

> 
> 
> Should I make a "lustre-hsm" mail alias, or should we put it on lustre-devel?
> 
> 
