Lustre-devel archive on lore.kernel.org
From: Colin Ngam <Colin.Ngam@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] SAM-QFS, ADM, and Lustre HSM
Date: Mon, 02 Feb 2009 08:56:15 -0600
Message-ID: <4987098F.402@Sun.COM>
In-Reply-To: <8585251D-41D7-4B42-99F9-BDBFA2CF88C1@Sun.COM>

Harriet G. Coverston wrote:

Hi,
> Nathan,
>
> On Jan 30, 2009, at 6:21 PM, Nathaniel Rutman wrote:
>
>> LEIBOVICI Thomas wrote:
>>> At CEA, we are using our own copytool that directly uses the HPSS API. 
>>> This already exists and has been in production for years.
>>> I think there will be few modifications to adapt it to Lustre-HSM 
>>> purpose
>>> (basically, add fid <-> HSM id mapping and backup of attributes, 
>>> path, stripe...)
>> So then the QFS copytool will indeed be a new tool, and should be 
>> scheduled accordingly.
>> Features:
>> 1. "cp --preserve" like functionality (include metadata attributes in 
>> cp)
>> 2. add EA's (create mini-tarball)
>> 3. implement FID hash to subdivide namespace
>> 4. periodic status reporting (via ioctl on file)
>>
>>
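Feature 3 in the list above (a FID hash to subdivide the namespace) could look roughly like the sketch below. This is only an illustration, not the planned copytool code: the function name, the archive root, the choice of SHA-1, the two-level fan-out, and the sample FID string are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def fid_to_archive_path(fid: str, root: str = "/archive",
                        levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a FID string to a path in a hashed directory tree.

    Hypothetical helper: each of the `levels` directory components is
    `width` hex characters taken from a hash of the FID, so files are
    spread over 16 ** (levels * width) leaf directories.
    """
    digest = hashlib.sha1(fid.encode("ascii")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, fid)


# The FID string below is made up for illustration.
example = fid_to_archive_path("0x200000400:0x1:0x0")
print(example)
```

With two levels of two hex characters there are 65,536 leaf directories, so each leaf stays far below a few hundred thousand entries even for billions of archived files. The mapping depends only on the FID, so no lookup table is needed to find a file again.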
>> Harriet G. Coverston wrote:
>>>> There is a mechanism to get the current full pathname for a given 
>>>> fid from userspace, so an HSM-specific copytool could find it out, 
>>>> but a central tenet of the design here is that as far as the HSM is 
>>>> concerned, the entire Lustre FS is a flat namespace of FIDs.
>>>
>>> Be careful here. We are a file system. We don't have a limit on # of 
>>> files in one directory, but we don't recommend more than 500,000 
>>> files in one single directory or you will start to see some 
>>> performance problems. You will have to create a tree, not use a flat 
>>> namespace.
>> Yes, a tree based on a hash of the fid.
>> The other option is to use the actual filename for storage, but from 
>> Lustre's point of view this gets extremely tricky.  For example:
>> Send /foo/bar to archive.  Client A opens /foo/bar.  Client B renames 
>> /foo/bar to /abc/xyz, but this change hasn't propagated to the 
>> archive yet.  Client A now tries to read its open file handle, which 
>> tells Lustre to read the offline file FID 123, which it translates to 
>> /abc/xyz currently, which the archive doesn't know about yet.  Not 
>> just xyz, but renames on any ancestor path element cause similar 
>> misses.  Since the FID remains constant throughout the life of a 
>> file, we don't have to worry about any namespace changes (file or 
>> parents).  If there was an alternate way of bypassing the archive's 
>> namespace to directly access a file, we could conceivably store e.g. 
>> an archive-specific identifier within the Lustre stripe EA, and pass 
>> this down to the copytool when reading an offline file, but this 
>> presupposes that such a thing exists, is of reasonable size, has a 
>> userspace method to access it, etc.
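The rename window described above can be shown with a toy model. The sketch below uses plain dictionaries as stand-ins for the Lustre namespace and for two hypothetical archives, one keyed by path and one keyed by FID; none of the names correspond to real Lustre or SAM-QFS interfaces.

```python
def rename_race_demo():
    """Replay the /foo/bar -> /abc/xyz example with toy data structures."""
    fid = 123
    namespace = {fid: "/foo/bar"}   # Lustre's view: FID -> current path
    archive_by_path = {}            # archive that stores objects by filename
    archive_by_fid = {}             # archive that stores objects by FID

    # Send /foo/bar to the archive.
    data = b"file contents"
    archive_by_path[namespace[fid]] = data
    archive_by_fid[fid] = data

    # Client B renames the file; the archives have not heard about it yet.
    namespace[fid] = "/abc/xyz"

    # Client A reads its open handle: Lustre knows the FID and translates
    # it to the *current* path, which the path-keyed archive has never seen.
    path_result = archive_by_path.get(namespace[fid])
    fid_result = archive_by_fid.get(fid)
    return path_result, fid_result


path_result, fid_result = rename_race_demo()
print("path-keyed restore:", path_result)
print("fid-keyed restore: ", fid_result)
```

The path-keyed lookup misses while the FID-keyed lookup still finds the data, which is the reason for keying the archive by something that does not change over the life of the file.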
>
> Yes, we have a FID like concept in SAM-QFS. It is called the file ID. 
> It is 64 bits and consists of the inode/generation number. It is 
> unique. You can store it. You can issue an ioctl to open the ID. You
> can issue an ioctl to do an ID stat, etc. It is much more efficient 
> than using the filename (expensive lookup). This means if you store 
> and use the ID, you can cover the rename window and still be 
> guaranteed that you will get the right file. Note, we don't rearchive 
> on a rename.
I believe this facility exists only on the metadata server node and not 
on the Linux/Solaris clients.  Am I correct?
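For anyone unfamiliar with inode/generation identifiers, here is a minimal sketch of the idea. The thread only says the ID is 64 bits and consists of the inode and generation number; the 32/32 split, the field order, and the function names below are assumptions for illustration, not the actual SAM-QFS layout.

```python
INO_BITS = 32   # assumed width of the inode number field
GEN_BITS = 32   # assumed width of the generation number field


def pack_file_id(ino: int, gen: int) -> int:
    """Combine an inode number and a generation number into one 64-bit ID."""
    if not (0 <= ino < 1 << INO_BITS and 0 <= gen < 1 << GEN_BITS):
        raise ValueError("inode or generation out of range")
    return (ino << GEN_BITS) | gen


def unpack_file_id(file_id: int) -> tuple[int, int]:
    """Split a 64-bit ID back into (inode, generation)."""
    return file_id >> GEN_BITS, file_id & ((1 << GEN_BITS) - 1)


# A reused inode gets a new generation, so the stored ID of the old file
# does not match the new file that now occupies the same inode.
old_id = pack_file_id(42, 1)
new_id = pack_file_id(42, 2)
print(hex(old_id), hex(new_id), old_id != new_id)
```

The generation number is what makes the ID safe to store: an ID saved before a file was deleted cannot silently resolve to a different file that later reuses the inode.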

Thanks.

colin
>
> I really think a replicated namespace will be much more intuitive and 
> solves restore. If you prefer
> to build a tar container, that is OK, too. The tar file can have a 
> suffix and then you know it is tar and
> you can tar it back.
>>
>>
>>>
>>>> You can get a full pathname if you want to for catastrophe 
>>>> recovery, but Lustre itself will only speak to the HSM with FIDs.
>>>> As I said in the other email, although SAM-QFS can do name-based 
>>>> policies, the "name" as far as QFS is concerned is just the FID, 
>>>> so  name-based policies at the copytool level are worthless.   
>>>> Unless we a.) add the path/filename back to the file (EA, or use a 
>>>> tarball wrapper), and b.) modify the SAM policy engine to use the 
>>>> "real" path/filename instead of the FID.
>>>
>>> Currently, we don't support policy using EA (extended attributes are 
>>> in 5.0). We have had lots of requests for this, especially from our 
>>> digital preservation customers.
>> Ah, policy based on EAs would be the general case, yes.
> Yes, this would be a nice feature for us.
>
>    - Harriet
>
> Harriet G. Coverston
> Solaris, Storage Software             |  Email: harriet.coverston at sun.com
> Sun Microsystems, Inc.                          |  AT&T:  651-554-1515
> 1270 Eagan Industrial Rd., Suite 160       |  Fax:   651-554-1540
> Eagan, MN 55121-1231
>
>
>
>
