From: Colin Ngam <Colin.Ngam@Sun.COM>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] SAM-QFS, ADM, and Lustre HSM
Date: Mon, 02 Feb 2009 08:56:15 -0600
Message-ID: <4987098F.402@Sun.COM>
In-Reply-To: <8585251D-41D7-4B42-99F9-BDBFA2CF88C1@Sun.COM>
Harriet G. Coverston wrote:
Hi,
> Nathan,
>
> On Jan 30, 2009, at 6:21 PM, Nathaniel Rutman wrote:
>
>> LEIBOVICI Thomas wrote:
>>> At CEA, we are using our own copytool that directly uses HPSS API.
>>> This already exists and is in production for years.
>>> I think there will be few modifications to adapt it to Lustre-HSM
>>> purpose
>>> (basically, add fid <-> HSM id mapping and backup of attributes,
>>> path, stripe...)
>> So then the QFS copytool will indeed be a new tool, and should be
>> scheduled accordingly.
>> Features:
>> 1. "cp --preserve" like functionality (include metadata attributes in
>> cp)
>> 2. add EA's (create mini-tarball)
>> 3. implement FID hash to subdivide namespace
>> 4. periodic status reporting (via ioctl on file)
>>
>>
>> Harriet G. Coverston wrote:
>>>> There is a mechanism to get the current full pathname for a given
>>>> fid from userspace, so an HSM-specific copytool could find it out,
>>>> but a central tenet of the design here is that as far as the HSM is
>>>> concerned, the entire Lustre FS is a flat namespace of FIDs.
>>>
>>> Be careful here. We are a file system. We don't have a limit on # of
>>> files in one directory, but we don't recommend more than 500,000
>>> files in one single directory or you will start to see some
>>> performance problems. You will have to create a tree, not use a flat
>>> namespace.
>> Yes, a tree based on a hash of the fid.
>> The other option is to use the actual filename for storage, but from
>> Lustre's point of view this gets extremely tricky. For example:
>> Send /foo/bar to archive. Client A opens /foo/bar. Client B renames
>> /foo/bar to /abc/xyz, but this change hasn't propagated to the
>> archive yet. Client A now tries to read its open file handle, which
>> tells Lustre to read the offline file FID 123, which it translates to
>> /abc/xyz currently, which the archive doesn't know about yet. Not
>> just xyz, but renames on any ancestor path element cause similar
>> misses. Since the FID remains constant throughout the life of a
>> file, we don't have to worry about any namespace changes (file or
>> parents). If there was an alternate way of bypassing the archive's
>> namespace to directly access a file, we could conceivably store e.g.
>> an archive-specific identifier within the Lustre stripe EA, and pass
>> this down to the copytool when reading an offline file, but this
>> presupposes that such a thing exists, is of reasonable size, has a
>> userspace method to access it, etc.
>
> Yes, we have a FID like concept in SAM-QFS. It is called the file ID.
> It is 64 bits and consists of the inode/generation number. It is
> unique. You can store it. You can issue an ioctl to open the ID. You
> can issue an ioctl to do an ID stat, etc. It is much more efficient
> than using the filename (expensive lookup). This means if you store
> and use the ID, you can cover the rename window and still be
> guaranteed that you will get the right file. Note, we don't rearchive
> on a rename.
I believe this facility exists only on the metadata server node and not
on the Linux/Solaris clients. Am I correct?
Thanks.
colin
>
> I really think a replicated namespace will be much more intuitive and
> solves restore. If you prefer
> to build a tar container, that is OK, too. The tar file can have a
> suffix and then you know it is tar and
> you can tar it back.
>>
>>
>>>
>>>> You can get a full pathname if you want to for catastrophe
>>>> recovery, but Lustre itself will only speak to the HSM with FIDs.
>>>> As I said in the other email, although SAM-QFS can do name-based
>>>> policies, the "name" as far as QFS is concerned is just the FID,
>>>> so name-based policies at the copytool level are worthless.
>>>> Unless we a.) add the path/filename back to the file (EA, or use a
>>>> tarball wrapper), and b.) modify the SAM policy engine to use the
>>>> "real" path/filename instead of the FID.
>>>
>>> Currently, we don't support policy using EA (extended attributes are
>>> in 5.0). We have had lots of requests for this, especially from our
>>> digital preservation customers.
>> Ah, policy based on EAs would be the general case, yes.
> Yes, this would be a nice feature for us.
>
> - Harriet
>
> Harriet G. Coverston
> Solaris, Storage Software | Email: harriet.coverston at sun.com
> Sun Microsystems, Inc. | AT&T: 651-554-1515
> 1270 Eagan Industrial Rd., Suite 160 | Fax: 651-554-1540
> Eagan, MN 55121-1231
>
>
>
>
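
The hashed FID tree discussed above (feature 3 in the list, and "a tree
based on a hash of the fid") could look roughly like the Python sketch
below. Nothing in it comes from the thread or from any Lustre or SAM-QFS
tool: the function name fid_archive_path, the /archive root, the choice
of SHA-1, and the two-level, two-hex-digit fan-out are all assumptions
made for illustration, and the FID is treated as an opaque string.

```python
import hashlib
from pathlib import PurePosixPath


def fid_archive_path(fid: str, root: str = "/archive",
                     levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a FID string to an archive path fanned out by a hash of the FID.

    Hypothetical helper: the layout below is an assumption, not the
    scheme used by any real copytool.
    """
    # Hash the FID so files spread evenly across subdirectories.
    digest = hashlib.sha1(fid.encode("ascii")).hexdigest()
    # Take `levels` chunks of `width` hex digits as directory names.
    subdirs = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # The leaf name is the FID itself (brackets stripped), so the archive
    # path never depends on the file's current Lustre pathname.
    leaf = fid.strip("[]")
    return PurePosixPath(root, *subdirs, leaf)


if __name__ == "__main__":
    # Example FID string; the bracketed hex form is assumed here.
    print(fid_archive_path("[0x200000400:0x1:0x0]"))
```

Because the path is derived only from the FID, a rename of the file or of
any ancestor directory in Lustre leaves the archive location unchanged.
With the assumed defaults there are 256 * 256 = 65,536 leaf directories,
which keeps each one far below the 500,000-file guideline for a
correspondingly larger total file count.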