From: Paul Nowoczynski <pauln@psc.edu>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Queries regarding LDLM_ENQUEUE
Date: Wed, 20 Oct 2010 10:51:06 -0400
Message-ID: <4CBF01DA.3090505@psc.edu>
In-Reply-To: <00d001cb705a$fd64cb80$f82e6280$@com>

Eric Barton wrote:
> I do like the idea of a collective open, but I'm wondering if it can be
> implemented simply enough to be worth the effort.  True, it avoids the O(n)
> load on the server of all the clients (re)populating their namespace
> caches, but it's only useful for parallel jobs - a scale-out NAS style
> workload can't benefit.  Ultimately the O(n) will have to be replaced with
> something that scales O(log n) (e.g. with a fat tree of caching proxy
> servers).
Eric makes a good point in that only parallel jobs really need this
feature. Unfortunately, at scale the system (both clients and servers)
*really does* need something like this, especially if we continue pushing
users to perform N-1 (shared-file) I/O instead of 'file per process'. I
also agree that some sort of capability mechanism is the best approach. I
wonder if this is something that could be done outside of POSIX and
supported through a parallel I/O library? Perhaps a single application
thread could make a special open call (/proc magic perhaps?) and obtain
a blob of opaque bytes which is then broadcast to the rest of the
clients via MPI. Traversing the namespace would be avoided on all but one
client. In such a scenario I don't feel that enforcing Unix permissions
at every level of the path is needed or sensible; the operation should
be treated as a simple logical open. The question to the Lustre experts:
can enough state be packed into an opaque object such that the
receiving client can construct the necessary cache state?
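
To make the idea concrete, here is a rough sketch (Python, purely
illustrative, not Lustre code) of how the opaque bytes could carry a
FID plus a server-generated signature. Everything in it is an
assumption on my part: the names make_handle() and open_by_handle(),
the field layout, the expiry field, and the use of HMAC-SHA256 with a
key known only to the server. It covers only the "is this handle
genuine?" side of the problem, not how the receiving client rebuilds
its cache state.

```python
import hashlib
import hmac
import struct
import time

# Hypothetical handle layout (network byte order, no padding):
# 64-bit sequence, 32-bit object id, 32-bit version,
# 32-bit open flags, 64-bit expiry in seconds since the epoch.
_HANDLE_FMT = "!QIIIQ"
_HANDLE_LEN = struct.calcsize(_HANDLE_FMT)
_SIG_LEN = hashlib.sha256().digest_size


def make_handle(server_key, fid, flags, expiry):
    """Server side: issue a handle once a normal, fully checked open succeeded."""
    seq, oid, ver = fid
    body = struct.pack(_HANDLE_FMT, seq, oid, ver, flags, expiry)
    sig = hmac.new(server_key, body, hashlib.sha256).digest()
    # To the application this is just opaque bytes to broadcast.
    return body + sig


def open_by_handle(server_key, blob, now=None):
    """Server side: validate a handle presented by any peer client."""
    if len(blob) != _HANDLE_LEN + _SIG_LEN:
        raise PermissionError("malformed handle")
    body, sig = blob[:_HANDLE_LEN], blob[_HANDLE_LEN:]
    expected = hmac.new(server_key, body, hashlib.sha256).digest()
    # Constant-time comparison; rejects forged or modified handles.
    if not hmac.compare_digest(sig, expected):
        raise PermissionError("bad signature")
    seq, oid, ver, flags, expiry = struct.unpack(_HANDLE_FMT, body)
    if (time.time() if now is None else now) > expiry:
        raise PermissionError("handle expired")
    return (seq, oid, ver), flags
```

In this sketch one rank would call the special open, receive the bytes
returned by make_handle(), and broadcast them (e.g. with MPI_Bcast);
every other rank would present the same bytes and the server would run
open_by_handle() without any path traversal.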

>
>> On 10/20/10 12:24 PM, Andreas Dilger wrote:
>>> I'm reluctant to expose the whole FID namespace to applications, 
>
> ??? It can just be opaque bytes to the app.
>
>>> since this completely bypasses all directory permissions and allows
>>> opening files only based on their inode permissions.  If we require a
>>> name_to_handle() syscall to succeed first, before allowing
>>> open_by_handle() to work, then at least we know that one of the
>>> involved processes was able to do a full path traversal.
>
> I think this defeats the scalability objective - we're trying to avoid
> having to pull the namespace into every client, aren't we?
>
>> yes, this is a good point. can be solved if you use FID +
>> capability/signature ?
>
> Yes, I think capabilities are the only way collective open can be made
> secure "properly".  And given the way we believe capabilities have to be
> implemented for scalability (i.e. to keep the capability cache down to a
> reasonable size on the server) any open by one node in a given client
> cluster may well have to confer the right to use the FID by any of its
> peers.
>
>>>> another idea was to do whole path traversal on MDS within a single
>>>> RPC.  but that'd require a number of changes to llite and/or VFS and
>>>> keep MDS a bottleneck.
>
> That's an optimization rather than a scalability feature.  How much does
> it complicate the code?  I'd hate to see something new tricky and fragile
> complicate further development.
>
>           Cheers,
>                    Eric
