From: Rick Matthews <Richard.Matthews@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Lustre HSM HLD draft
Date: Fri, 08 Feb 2008 05:52:12 -0600
Message-ID: <47AC426C.8060902@Sun.COM>
In-Reply-To: <C892277475E7FB4F90BC212EEE14CA8365B674@U-SANTORIN.dif.dam.intra.cea.fr>

JC.LAFOUCRIERE at CEA.FR wrote:

Thanks for allowing me to participate.
> Hello
>
> Thank you for your review. I have added some comments below.
>
> Page 1, 1, Define coordinator (space coordinator?),
>         define agent, (condense Part II intro, page 14)
>         (for me, MDT, MGS and OST)
> These are defined in the arch wiki pages
>   
Thank you, I still haven't got to them yet...but plan to.
> Page 10, 
>         4.2, 2) Implies only one copy per "version"...bad idea
> Different versions correspond to different files in the external storage. We take the most recent one.
> I am not sure I understand your remark.
>   
A basic mantra of SAM-QFS and other data retention systems is that a
single image of the data is vulnerable (a tape breaks or is
overwritten). While the archival system can be responsible for making
multiple identical images, it can still represent a single point of
failure. Note: I am using "version" to mean a point-in-time image of
the file's data, and "copy" to mean an image of that version. (See
LOCKSS for additional references on copies.)
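To make the version/copy terminology concrete, here is a minimal sketch
of how the two could be modelled. The class and field names (Copy,
Version, media_id, offset) are hypothetical and are not taken from the
HLD or from SAM-QFS; the check simply flags versions that do not have
images on at least two distinct pieces of media.

```python
from dataclasses import dataclass, field


@dataclass
class Copy:
    """One physical image of a version (hypothetical fields)."""
    media_id: str   # e.g. a tape or disk volume label
    offset: int     # position of the image on that media


@dataclass
class Version:
    """A point-in-time image of a file's data."""
    file_id: str
    version_no: int
    copies: list = field(default_factory=list)


def vulnerable_versions(versions, min_copies=2):
    """Return the versions that have images on fewer than
    min_copies distinct pieces of media."""
    result = []
    for v in versions:
        distinct_media = {c.media_id for c in v.copies}
        if len(distinct_media) < min_copies:
            result.append(v)
    return result
```

Two copies on the same tape still count as vulnerable here, since one
broken tape would lose both.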
> Page 13, Lustre object mtime may not be good enough. There are several
>         mechanisms (like touch) to manipulate mtime, which makes it
>         unusable as a last written time.
> If a user does a touch with a time in the past, this changes the mtime and can hide previous writes.
> If we want to keep the real write time, we need to add a new time field in the Lustre backend
> (maybe ZFS has it).
>   
What the archival system needs to know is that the previously made copy
is no longer current (or that a first copy needs to be made), which
seems to be triggered by a user write operation (not a write by the
archive or by some other operation, such as restore).
> Page 19, Special Path, does this boil down to invisible I/O?
> The path is /mnt_mount/.lustre/fid/FID_NUMBER. When a file is opened through this path, a
> flag is carried to the OSS to avoid triggering a copy-in (this is used to fill the file).
>
> Page 23, 2.3 and 2.4, I'm assuming that lists of tuples can be processed
>         in any order.
> yes
>
> Issues:
>         The Space manager is likely the most important piece. There is no
>         detail on it. This is where archive and other policy is enforced.
> The space manager is based on the Lustre changelogs/feed feature, which is very new (the draft HLD has just been
> published). This is why it is not described at this time.
>   
OK. Also consider using changelogs as the trigger for needing a new
archive version (not copy). That alleviates the mtime issue above.
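A minimal sketch of that idea, assuming a hypothetical change record
with a sequence number, a file id and a record kind. The record kinds
below are made up for illustration and are not the Lustre changelog
format. Only user writes newer than the last archived point mark a file
as needing a new version, so a touch that moves mtime backwards, or I/O
done by the archive or a restore, does not matter.

```python
from dataclasses import dataclass

# Hypothetical record kinds, for illustration only.
USER_WRITE = "user_write"
ARCHIVE_IO = "archive"
RESTORE_IO = "restore"
TOUCH = "setattr_mtime"


@dataclass
class ChangeRecord:
    seq: int        # monotonically increasing record number
    file_id: str
    kind: str


def files_needing_new_version(records, archived_seq):
    """Return the ids of files with a user write after their last archive.

    archived_seq maps file_id -> sequence number of the last record
    covered by the most recent archived version; files absent from the
    map have never been archived.
    """
    stale = set()
    for rec in records:
        if rec.kind != USER_WRITE:
            continue  # touch, archive and restore I/O never trigger
        if rec.seq > archived_seq.get(rec.file_id, -1):
            stale.add(rec.file_id)
    return stale
```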
>         The described HSM seems to follow the "copy out" when space needed,
>         then purge, model. This function (a Space Manager function) is contrary
>         to SAM, and a shortfall of many HSMs.
> No, the space manager does pre-migration, and when free space is needed, it only has to make punc
>   
OK, so who schedules the pre-migration to the archive system?
>         Coordination between agents seems important. For example,
>         if agents requested new copy-outs on objects striped on
>         10 different stores, ordering them on tape seems difficult.
> Tape access optimization has to be done by the archival system. We try to put as little external-storage knowledge
> as possible in Lustre, to stay independent of the external storage.
>   
The isolation between archive system and file system is (to me) a good
idea. I'd just like you to consider that the recall (stage-in) events
can be optimized. At the least, make sure the archive system is allowed
to reorder requests as needed (hence my question above about whether
lists of tuples can be processed asynchronously, in any order). Think
of any other association of files with live storage as 1) a pre-stage
operation, or 2) a disk cache pre-fetch operation. I hope I'm using
understandable words ;>)
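As an illustration of the kind of reordering meant here, a minimal
sketch, assuming the archive system has a catalog lookup that maps a
file id to a (volume, position) pair. The function name order_recalls
and the locate callback are hypothetical. Requests arrive in any order
and are grouped per volume, then sorted by position, so each tape is
mounted once and read front to back.

```python
from collections import defaultdict


def order_recalls(requests, locate):
    """Group recall requests by volume and sort them by position.

    requests: iterable of file ids, in any order.
    locate:   callable file_id -> (volume, position); a stand-in for
              the archive system's own catalog lookup.
    Returns a list of (volume, [file ids in position order]).
    """
    by_volume = defaultdict(list)
    for fid in requests:
        volume, position = locate(fid)
        by_volume[volume].append((position, fid))

    plan = []
    for volume in sorted(by_volume):
        ordered = [fid for _, fid in sorted(by_volume[volume])]
        plan.append((volume, ordered))
    return plan
```

This only works if the file system side does not insist on completing
the requests in the order it issued them.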
>         What is the backup story for Lustre? How does that play with
>         the HSM?
> The HSM does not back up the namespace. That has to be done with a separate tool such as an MDT scanner.
> The copy tool can use the FID2PATH() function to save the object pathname with the file.
>
>   
One point here is that an HSM + namespace/metadata backup + capture of
unarchived data can serve as a nearly continuous backup operation with
a relatively tiny backup window.

-- 
---------------------------------------------------------------------
Rick Matthews                           email: Rick.Matthews at sun.com
Sun Microsystems, Inc.                  phone:+1(651) 554-1518
1270 Eagan Industrial Road              phone(internal): 54418
Suite 160                               fax:  +1(651) 554-1540
Eagan, MN 55121-1231 USA                main: +1(651) 554-1500		
---------------------------------------------------------------------
