From: Malcolm Haak <malcolm@sgi.com>
To: Gregory Farnum <greg@inktank.com>, John Spray <john.spray@inktank.com>
Cc: Sage Weil <sage@inktank.com>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: HSM
Date: Tue, 12 Nov 2013 10:57:55 +1000
Message-ID: <52817D13.1070004@sgi.com>
In-Reply-To: <CAPYLRzjuhBAgTT2SPo+s5RkY32zT6ue-na_LzGY-9WKhscaL_w@mail.gmail.com>
Hi Gregory,
On 12/11/13 10:13, Gregory Farnum wrote:
> On Mon, Nov 11, 2013 at 3:04 AM, John Spray <john.spray@inktank.com> wrote:
>> This is a really useful summary from Malcolm.
>>
>> In addition to the coordinator/copytool interface, there is the question of
>> where the policy engine gets its data from. Lustre has the MDS changelog,
>> which Robinhood uses to replicate metadata into its MySQL database with all
>> the indices that it wants.
>
>> On Sun, Nov 10, 2013 at 11:17 PM, Malcolm Haak <malcolm@sgi.com> wrote:
>>> So there aren't really any hooks, in that exports are triggered by the policy engine after a scan of the metadata, and the recalls are triggered when caps are requested on offline files
>
> Wait, is the HSM using a changelog or is it just scanning the full
> filesystem tree? Scanning the whole tree seems awfully expensive.
I can't speak at length about Lustre HSM; it may just use incremental
updates to its SQL database via metadata logs. I do know that full
filesystem scans are done regularly in other HSM solutions. I also
know that the scan is multi-threaded and, when backed by decent disks,
does not take an excessive amount of time.
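
As a rough illustration of the multi-threaded scan described above, here is a minimal Python sketch. It is not taken from Lustre HSM, Robinhood or any other HSM product: the function names `scan_directory` and `scan_tree`, the worker count, and the choice to collect only path, size and atime are all assumptions made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def scan_directory(path):
    """Return (subdirectories, file records) for one directory.

    Each file record is (path, size_in_bytes, atime), the kind of
    metadata a policy engine would evaluate its rules against.
    """
    subdirs, records = [], []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    st = entry.stat(follow_symlinks=False)
                    records.append((entry.path, st.st_size, st.st_atime))
    except OSError:
        pass  # unreadable directory: skipped in this sketch
    return subdirs, records


def scan_tree(root, workers=8):
    """Walk the tree level by level, scanning each level in parallel."""
    records = []
    pending = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            next_level = []
            # Every directory of the current level is handed to the pool.
            for subdirs, files in pool.map(scan_directory, pending):
                next_level.extend(subdirs)
                records.extend(files)
            pending = next_level
    return records
```

Threads help here because the work is dominated by metadata I/O rather than CPU, which is also why the backing disks matter so much for total scan time.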
>
>> I don't know if CephFS MDS currently has a similar interface.
> Well, the MDSes each have their journal of course, but more than that
> we can stick whatever we want into the metadata and expose it via
> virtual xattrs or whatever else.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>>
>> John
>>
>>
>> On Sun, Nov 10, 2013 at 11:17 PM, Malcolm Haak <malcolm@sgi.com> wrote:
>>>
>>> Hi All,
>>>
>>> If you are talking specifically about Lustre HSM, it's really an interface to add HSM functionality by leveraging existing HSMs (DMF, for example).
>>>
>>> So with Lustre HSM you have a policy engine that triggers the migrations out of the filesystem. Rules are based around size, last accessed and target state (online, dual and offline).
>>>
>>> There is a 'coordinator' process involved here as well; it (from what I understand) runs on MDS nodes. It handles the interaction with the copytool. The copytool is provided by the HSM solution you are actually using.
>>>
>>> For recalls, when caps are acquired on the MDS for an exported file, the responsible MDS contacts the coordinator, which in turn uses the copytool to pull the required file out of the HSM.
>>>
>>> In the Lustre HSM, the objects that make up a file are all recalled and the file, not the objects, is handed to the HSM.
>>>
>>> For Lustre, all it needs to keep track of is the current state of the file and the correct ID to request from the HSM. This is done inside the normal metadata storage.
>>>
>>> So there aren't really any hooks, in that exports are triggered by the policy engine after a scan of the metadata, and the recalls are triggered when caps are requested on offline files. Then it's just standard POSIX blocking until the file is available.
>>>
>>> Most of the state and ID stuff could be stored as xattrs in CephFS. I'm not as sure how to do it for other things, but as long as you could store some kind of extended metadata about whole objects, it could use the same interfaces as well.
>>>
>>> Hope that was actually helpful and not just an obvious rehash...
>>>
>>> Regards
>>>
>>> Malcolm Haak
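
To make the quoted description concrete, here is a minimal Python sketch of the two pieces it mentions: policy rules based on size, last access and target state, and per-file HSM state plus archive ID kept as extended attributes. Everything named here is hypothetical: the xattr names `user.hsm.state` and `user.hsm.id`, the `Rule` fields, and the in-memory `XattrStore` are assumptions for illustration, not the Lustre or CephFS interfaces. On Linux a real backend could use Python's `os.setxattr` and `os.getxattr` in place of the dictionary.

```python
import time
from dataclasses import dataclass

# Hypothetical xattr names; neither Lustre nor CephFS defines these.
STATE_XATTR = "user.hsm.state"
ID_XATTR = "user.hsm.id"

ONLINE, DUAL, OFFLINE = "online", "dual", "offline"
# Ordering used so a rule only ever moves a file further toward offline.
RANK = {ONLINE: 0, DUAL: 1, OFFLINE: 2}


@dataclass
class Rule:
    min_size: int            # bytes
    min_idle_seconds: float  # time since last access
    target_state: str        # DUAL or OFFLINE


def wanted_state(size, atime, current_state, rules, now=None):
    """Return the target state of the first matching rule, or None."""
    now = time.time() if now is None else now
    for rule in rules:
        if (size >= rule.min_size
                and now - atime >= rule.min_idle_seconds
                and RANK[rule.target_state] > RANK[current_state]):
            return rule.target_state
    return None


class XattrStore:
    """In-memory stand-in for per-file xattrs, keyed by (path, name)."""

    def __init__(self):
        self._data = {}

    def set(self, path, name, value):
        self._data[(path, name)] = value

    def get(self, path, name, default=None):
        return self._data.get((path, name), default)


def export_file(path, store, archive_id, target_state):
    """Record the ID to request from the HSM and the file's new state."""
    store.set(path, ID_XATTR, archive_id)
    store.set(path, STATE_XATTR, target_state)


def needs_recall(path, store):
    """True if opening this file would have to block on a recall."""
    return store.get(path, STATE_XATTR, ONLINE) == OFFLINE
```

In this sketch the policy engine would call `wanted_state` on each record produced by a metadata scan, and the MDS-side check on a caps request reduces to `needs_recall`. The coordinator and copytool interaction is deliberately left out.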