From: Josh Durgin <josh.durgin@inktank.com>
To: Noah Watkins <jayhawk@cs.ucsc.edu>
Cc: ceph-devel <ceph-devel@vger.kernel.org>, Sage Weil <sage@inktank.com>
Subject: Re: libcephfs create file with layout and replication
Date: Sat, 17 Nov 2012 13:35:08 -0800
Message-ID: <50A8030C.2010003@inktank.com>
In-Reply-To: <CAPrxi5-eQ=unNbUMO5b_dJpFERySAuWD9GXratMEo9XGPpkrkA@mail.gmail.com>

On 11/17/2012 12:13 PM, Noah Watkins wrote:
> The Hadoop VFS layer assumes that block size and replication can be
> set on a per-file basis, which is important to users for file
> layout/workload optimizations.
>
> The libcephfs interface doesn't make this entirely easy. Here is one
> approach, but it isn't thread safe as the default values are global
> variables in the client.
>
>    orig_obj_size = ceph_get_default_object_size() //save
>    set_default_object_size(new size)
>    open(path, O_CREAT)
>    set_default_object_size(orig_obj_size) //reset
>
> Something more convenient might be:
>
>    ceph_open_layout(path, flags, mode, layout, replication)

I think this makes the most sense, since a file's layout can't be
changed after it's been created, and this interface makes that
clearest. It also avoids maintaining extra state in libcephfs
between calls.

Since replication count is a per-pool setting, I think the Hadoop
bindings would have to translate a VFS request into a pool
with the requested replication level. So something like this,
where layout is a struct containing stripe unit, stripe count,
and object size (the subset of struct ceph_file_layout related to
objects that's currently useful):

     ceph_open_layout(path, flags, mode, layout, pool_name)
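To make the shape of that call concrete, here is a rough sketch of what the
caller side might carry. Nothing here is existing libcephfs API: the struct
name, the pool table, the pool names, and both helpers are hypothetical, and
the sanity check (all fields nonzero, object size a whole multiple of the
stripe unit) is an assumption about what a valid layout looks like rather
than the authoritative rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the object-related subset of struct ceph_file_layout. */
struct open_layout {
    uint32_t stripe_unit;   /* bytes per stripe unit */
    uint32_t stripe_count;  /* number of objects striped across */
    uint32_t object_size;   /* bytes per object */
};

/* Hypothetical: one row of a bindings-side table that maps a requested
 * replication level to a pool configured with that level. */
struct pool_entry {
    int replication;
    const char *pool_name;
};

/* Assumed check: every field is nonzero and object_size is a whole
 * multiple of stripe_unit. Returns 1 if the layout passes, 0 otherwise. */
static int layout_is_sane(const struct open_layout *l)
{
    if (l == NULL || l->stripe_unit == 0 ||
        l->stripe_count == 0 || l->object_size == 0)
        return 0;
    return l->object_size % l->stripe_unit == 0;
}

/* Translate a replication request into a pool name.
 * Returns NULL when no pool in the table has that replication level. */
static const char *pool_for_replication(const struct pool_entry *pools,
                                        size_t n, int replication)
{
    for (size_t i = 0; i < n; i++)
        if (pools[i].replication == replication)
            return pools[i].pool_name;
    return NULL;
}
```

The bindings would then pass the checked layout and the looked-up pool name
to ceph_open_layout(), and fail the create if no pool matches the requested
replication.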

BTW, for anyone interested, there's a nice description of
the layout parameters here:

http://ceph.com/docs/master/dev/file-striping/

> where layout and replication are used with O_CREAT | O_EXCL, or an
> interface for setting these values explicitly on newly created files:
>
>    ceph_open(path, O_CREAT|O_EXCL)
>    ceph_set_layout(path, layout, replication)
>
> where ceph_set_layout would presumably succeed only on zero-length files.
>
> Any thoughts on how to handle this?
>
> Thanks,
> Noah

