From: Colin Ngam <Colin.Ngam@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] SAM-QFS, ADM, and Lustre HSM
Date: Mon, 02 Feb 2009 08:56:15 -0600
Message-ID: <4987098F.402@Sun.COM>
In-Reply-To: <8585251D-41D7-4B42-99F9-BDBFA2CF88C1@Sun.COM>

Harriet G. Coverston wrote:

Hi,
> Nathan,
>
> On Jan 30, 2009, at 6:21 PM, Nathaniel Rutman wrote:
>
>> LEIBOVICI Thomas wrote:
>>> At CEA, we are using our own copytool that directly uses HPSS API. 
>>> It already exists and has been in production for years.
>>> I think only a few modifications will be needed to adapt it for
>>> Lustre-HSM
>>> (basically, add fid <-> HSM id mapping and backup of attributes, 
>>> path, stripe...)
>> So then the QFS copytool will indeed be a new tool, and should be 
>> scheduled accordingly.
>> Features:
>> 1. "cp --preserve" like functionality (include metadata attributes in 
>> cp)
>> 2. add EAs (create mini-tarball)
>> 3. implement FID hash to subdivide namespace
>> 4. periodic status reporting (via ioctl on file)
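
Feature 3 in the list above (a FID hash to subdivide the namespace) could be sketched roughly as follows. This is only an illustration, not the copytool's actual design: the function name `fid_to_archive_path`, the `/archive` root, the choice of SHA-1, and the two-level, two-hex-character layout are all assumptions, and the FID is treated as an opaque string.

```python
import hashlib
from pathlib import PurePosixPath


def fid_to_archive_path(fid: str, root: str = "/archive",
                        levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a FID string to a path inside a hashed directory tree.

    Hypothetical layout: the first `levels * width` hex characters of a
    SHA-1 digest of the FID pick the subdirectories, and the FID itself
    is used as the file name in the leaf directory.
    """
    digest = hashlib.sha1(fid.encode("ascii")).hexdigest()
    # Slice the digest into `levels` directory names of `width` hex chars.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, fid)


if __name__ == "__main__":
    # The FID value below is a made-up example.
    print(fid_to_archive_path("0x200000400:0x1:0x0"))
```

With these assumed defaults the tree has 256 * 256 = 65,536 leaf directories, so files spread across many directories instead of accumulating in one. Because the path depends only on the FID, renames on the Lustre side never move the archived copy.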
>>
>>
>> Harriet G. Coverston wrote:
>>>> There is a mechanism to get the current full pathname for a given 
>>>> fid from userspace, so an HSM-specific copytool could find it out, 
>>>> but a central tenet of the design here is that as far as the HSM is 
>>>> concerned, the entire Lustre FS is a flat namespace of FIDs.
>>>
>>> Be careful here. We are a file system. We don't have a limit on # of 
>>> files in one directory, but we don't recommend more than 500,000 
>>> files in one single directory or you will start to see some 
>>> performance problems. You will have to create a tree, not use a flat 
>>> namespace.
>> Yes, a tree based on a hash of the fid.
>> The other option is to use the actual filename for storage, but from 
>> Lustre's point of view this gets extremely tricky.  For example:
>> Send /foo/bar to archive.  Client A opens /foo/bar.  Client B renames 
>> /foo/bar to /abc/xyz, but this change hasn't propagated to the 
>> archive yet.  Client A now tries to read its open file handle, which 
>> tells Lustre to read the offline file FID 123, which it translates to 
>> /abc/xyz currently, which the archive doesn't know about yet.  Not 
>> just xyz, but renames on any ancestor path element cause similar 
>> misses.  Since the FID remains constant throughout the life of a 
>> file, we don't have to worry about any namespace changes (file or 
>> parents).  If there was an alternate way of bypassing the archive's 
>> namespace to directly access a file, we could conceivably store e.g. 
>> an archive-specific identifier within the Lustre stripe EA, and pass 
>> this down to the copytool when reading an offline file, but this 
>> presupposes that such a thing exists, is of reasonable size, has a 
>> userspace method to access it, etc.
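
The rename window described in the quoted paragraph above can be reproduced with a toy model. This is a sketch under stated assumptions, not Lustre or SAM-QFS code: the dictionaries stand in for the Lustre namespace and for two hypothetical archives (one keyed by path, one keyed by FID), and `fid-123` is a placeholder identifier.

```python
def simulate_rename_race():
    """Toy model of the archive lookup after an unpropagated rename."""
    # Lustre-side view: FID -> current path.
    namespace = {"fid-123": "/foo/bar"}
    payload = b"file contents"

    # Send /foo/bar to the archive, once keyed by path, once keyed by FID.
    archive_by_path = {namespace["fid-123"]: payload}
    archive_by_fid = {"fid-123": payload}

    # Client B renames the file; the archive has not been told yet.
    namespace["fid-123"] = "/abc/xyz"

    # Client A reads through its open handle: Lustre resolves FID 123
    # to the current path and asks each archive for the data.
    current_path = namespace["fid-123"]
    path_hit = archive_by_path.get(current_path)   # archive only knows /foo/bar
    fid_hit = archive_by_fid.get("fid-123")        # FID is unchanged by rename
    return path_hit, fid_hit


if __name__ == "__main__":
    by_path, by_fid = simulate_rename_race()
    print("path-keyed lookup found data:", by_path is not None)
    print("FID-keyed lookup found data:", by_fid is not None)
```

The path-keyed lookup misses because the archive still holds the old name, while the FID-keyed lookup is unaffected by the rename.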
>
> Yes, we have a FID like concept in SAM-QFS. It is called the file ID. 
> It is 64 bits and consists of the inode/generation number. It is 
> unique. You can store it. You can issue an ioctl to open the ID. You
> can issue an ioctl to do an ID stat, etc. It is much more efficient 
> than using the filename (expensive lookup). This means if you store 
> and use the ID, you can cover the rename window and still be 
> guaranteed that you will get the right file. Note, we don't rearchive 
> on a rename.
I believe this facility exists only on the metadata server node and not
on the Linux/Solaris clients. Am I correct?

Thanks.

colin
>
> I really think a replicated namespace will be much more intuitive and 
> solves restore. If you prefer to build a tar container, that is OK, 
> too. The tar file can have a suffix and then you know it is tar and 
> you can tar it back.
>>
>>
>>>
>>>> You can get a full pathname if you want to for catastrophe 
>>>> recovery, but Lustre itself will only speak to the HSM with FIDs.
>>>> As I said in the other email, although SAM-QFS can do name-based 
>>>> policies, the "name" as far as QFS is concerned is just the FID, 
>>>> so name-based policies at the copytool level are worthless,
>>>> unless we a.) add the path/filename back to the file (EA, or use a 
>>>> tarball wrapper), and b.) modify the SAM policy engine to use the 
>>>> "real" path/filename instead of the FID.
>>>
>>> Currently, we don't support policy using EA (extended attributes are 
>>> in 5.0). We have had lots of requests for this, especially from our 
>>> digital preservation customers.
>> Ah, policy based on EAs would be the general case, yes.
> Yes, this would be a nice feature for us.
>
>    - Harriet
>
> Harriet G. Coverston
> Solaris, Storage Software             |  Email: harriet.coverston at sun.com
> Sun Microsystems, Inc.                          |  AT&T:  651-554-1515
> 1270 Eagan Industrial Rd., Suite 160       |  Fax:   651-554-1540
> Eagan, MN 55121-1231
>
>
>
>
