Subject: [Lustre-devel] Replacing a dead OST (fixed subject line)
From: Nathaniel Rutman
Date: 2008-06-04 17:22 UTC
To: lustre-devel
In-Reply-To: <C46BFD6B.5720%peter.braam@sun.com> (not found)

Peter Braam wrote:
> There is tremendous value in fixing this bug (15345), because it turns an un-usual
> usage of our tools for recovery into something that is done more routinely.
>
> When I listened to this group, my impression was that it was not so hard to
> rebuild the OSS, but it does require scanning the primary MDS, finding the
> pathnames for affected files (with objects on the failed OSS), and using
> that list of files to re-write on the cluster where the OSS was lost.
>
> Nathan - this is a special case of the recovery mechanisms we are talking
> about (with the log being constructed in a different way). I think you
> should design the solution for this problem.
>
I am taking this to mean we should design the general case of
"dead/missing OST" into the HSM/migration architecture, and not
something to do with recovery per se. That's actually really
interesting - you could deactivate an OST, and yet still read the files
from it transparently.

Should I make a "luste-hsm" mail alias, or should we put it on lustre-devel?

Subject: [Lustre-devel] Replacing a dead OST (fixed subject line)
From: Peter Braam
Date: 2008-06-07 14:03 UTC
To: lustre-devel

On 6/4/08 10:22 AM, "Nathaniel Rutman" <Nathan.Rutman@Sun.COM> wrote:

> Peter Braam wrote:
>> There is tremendous value in fixing this bug (15345), because it turns an
>> un-usual usage of our tools for recovery into something that is done more
>> routinely.
>>
>> When I listened to this group, my impression was that it was not so hard to
>> rebuild the OSS, but it does require scanning the primary MDS, finding the
>> pathnames for affected files (with objects on the failed OSS), and using
>> that list of files to re-write on the cluster where the OSS was lost.
>>
>> Nathan - this is a special case of the recovery mechanisms we are talking
>> about (with the log being constructed in a different way). I think you
>> should design the solution for this problem.
>>
> I am taking this to mean we should design the general case of
> "dead/missing OST" into the HSM/migration architecture,

No - into the replication architecture. You feed a list of files into your
scripts and re-create the objects.

> and not
> something to do with recovery per se. That's actually really
> interesting - you could deactivate an OST, and yet still read the files
> from it transparently.

No, you can only read them when the OST has been restored; no cache misses
(yet).

>
> Should I make a "luste-hsm" mail alias, or should we put it on lustre-devel?
>
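
The selection step discussed in this thread (find the pathnames of files with objects on the failed OST, then feed that list to the scripts that re-create the objects) could be sketched roughly as below. This is only an illustration under stated assumptions: the `layouts` mapping, the function name `files_on_failed_ost`, and the sample paths are hypothetical, and the sketch assumes the MDS scan has already produced a pathname-to-OST-index mapping. It does not call any real Lustre tool or API.

```python
from typing import Dict, Iterable, List


def files_on_failed_ost(layouts: Dict[str, Iterable[int]], failed_ost: int) -> List[str]:
    """Return sorted pathnames that have at least one object on failed_ost.

    `layouts` is a hypothetical mapping of pathname -> OST indices holding
    that file's objects, assumed to come from a scan of the MDS.
    """
    return sorted(
        path for path, osts in layouts.items() if failed_ost in set(osts)
    )


# Hypothetical sample data standing in for the result of an MDS scan.
layouts = {
    "/lustre/a.dat": [0, 1],
    "/lustre/b.dat": [2],
    "/lustre/c.dat": [1, 3],
}

# Files that would need their objects re-created if OST index 1 were lost.
affected = files_on_failed_ost(layouts, failed_ost=1)
```

The resulting list is what would be handed to whatever script re-writes the files; how that re-write is done is outside this sketch.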