From: Josh Durgin <josh.durgin@inktank.com>
To: Sage Weil <sage@inktank.com>
Cc: elder@inktank.com, ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: RBD format changes and layering
Date: Fri, 25 May 2012 18:43:17 -0700
Message-ID: <4FC03535.8060409@inktank.com>
In-Reply-To: <Pine.LNX.4.64.1205251513290.5998@cobra.newdream.net>
On 05/25/2012 03:26 PM, Sage Weil wrote:
> On Fri, 25 May 2012, Josh Durgin wrote:
>> On 05/25/2012 07:57 AM, Alex Elder wrote:
>>>> /**
>>>> * Get the metadata about the image required to do I/O
>>>> * to it. In the future this may include extra information for
>>>> * features that require it, like encryption/compression type.
>>>> * This extra data will be added at the end of the response, so
>>>> * clients that don't support it don't interpret it.
>>>> *
>>>> * Features that would require clients to be updated to access
>>>> * the image correctly (such as image bitmaps) are set in
>>>> * the incompat_features field. A client that doesn't understand
>>>> * those features will return an error when it tries to open
>>>> * the image.
>>>> *
>>>> * The size and any extra information are read from the appropriate
>>>> * snapshot metadata, if snapid is not CEPH_NOSNAP.
>>>> *
>>>> * Returns __le64 size, __le64 order, __le64 features,
>>>> * __le64 incompat_features, __le64 snapseq and
>>>> * list of __le64 snapids
>>>> */
>>>> get_info(__le64 snapid)
>>>
>>> I think I would prefer to see these bits of information broken
>>> out into a few routines that group related information, or to
>>> separate what's supplied based on the time or frequency it might
>>> need to be accessed, or the "effort" involved in collecting it.
>>
>> I was thinking that we might want these all in one operation for
>> atomicity, but we could add support for multi-operation transactions to
>> the kernel instead. These were added to userspace a few months ago.
>
> I would prefer separate operations too (e.g., get-size, get-order,
> get-features, etc.). IIRC there is already some infrastructure to handle
> compound operations. Atomicity shouldn't be a concern either
> way. This makes it simple to expand the header with other information
> without creating a get-info2 command or something similar.
>
> A couple other comments:
>
> - The pools currently can't be renamed, but there isn't any reason why
> they couldn't be... at least until we start referring to them by name in
> the rbd parent pointers. I'd rather use the pool ids to keep our options
> open.
Sounds good.
> - Requiring parents be snapshots seems fine to me. It just means the
> child lists need to be per-snapshot, so that we know when it is safe to
> remove snaps on the parent.
>
> - I don't think that creating snapshots on the child needs to touch the
> parent (if that is still the plan). The child can remove itself as a
> child once the final reference (head or snap) is removed; no need to bother
> the parent with that information. (It could also cause a lot of noise for
> the parent 12.04 image with 10,000 children getting snapped regularly.)
>
> - I wonder if it makes sense to create an 'open' method (and maybe
> corresponding 'close'). I'm imagining future *compat* features (e.g.,
> bitmaps), where a new client creates some bitmaps, and then an old client
> mounts the image. The bitmap doesn't have to be incompat if the old
> client invalidates it (e.g., via open with old feature set).
This sounds like a good idea too. I imagine when we add compat features
like this, we might want extra methods to add them to existing images
too.
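
To make the 'open' idea concrete, here is a rough sketch of how opening with a feature set might behave. Nothing in it is an agreed design: the class, the feature bit values, and the exception are all hypothetical names made up for illustration. The only behavior taken from the discussion is that an unknown incompat feature fails the open, while an unknown compat feature (such as a bitmap) is invalidated instead.

```python
# Hypothetical feature bits; the thread does not assign any values.
FEATURE_BITMAP = 1 << 0    # example compat feature
FEATURE_LAYERING = 1 << 1  # example incompat feature


class UnsupportedFeatures(Exception):
    """Raised when the image needs features the client lacks (hypothetical)."""


class ImageHeader:
    """Toy stand-in for the header object that an 'open' method would act on."""

    def __init__(self, compat_features=0, incompat_features=0):
        self.compat_features = compat_features
        self.incompat_features = incompat_features

    def open(self, client_features):
        # Incompat features the client does not know about: refuse the open.
        missing = self.incompat_features & ~client_features
        if missing:
            raise UnsupportedFeatures(hex(missing))
        # Compat features the client does not know about: invalidate them so
        # the old client cannot leave stale data (e.g. an outdated bitmap).
        stale = self.compat_features & ~client_features
        self.compat_features &= ~stale
        return stale  # mask of features that were invalidated
```

With this shape, an old client opening an image that carries a bitmap gets back `FEATURE_BITMAP` as the invalidated mask and the header no longer advertises it, while a new client that passes `FEATURE_BITMAP` leaves the header untouched.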
> This might be useful also when we add locking (so that clients get EBUSY
> if multiple hosts try to map).
>
> - Will we have class methods for rbd_directory as well? That seems like
> the simplest way to maintain backwards compatibility. Also, if we keep
> the name, maybe rbd_header.* and rbd_data.* are more consistent.
Not sure what you mean about the object names.
We can add a class method for rbd_directory too, so we can change its
format when the old format is removed.
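
For reference, here is a minimal sketch of how a client might decode the get_info response described in the comment quoted above. The five fixed __le64 fields come from that comment. The list encoding is an assumption: I've taken it to be a __le32 count followed by that many __le64 snapids, which the comment does not spell out. The function and type names are hypothetical as well.

```python
import struct
from typing import NamedTuple, Tuple


class ImageInfo(NamedTuple):
    size: int
    order: int
    features: int
    incompat_features: int
    snapseq: int
    snapids: Tuple[int, ...]


_FIXED = struct.Struct("<5Q")  # size, order, features, incompat_features, snapseq
_COUNT = struct.Struct("<I")   # assumed __le32 length prefix for the snapid list


def decode_get_info(buf: bytes, supported_incompat: int = 0) -> ImageInfo:
    """Decode a get_info reply; trailing bytes this client doesn't know are ignored."""
    size, order, features, incompat, snapseq = _FIXED.unpack_from(buf, 0)
    # A client that doesn't understand an incompat feature must fail the open.
    unknown = incompat & ~supported_incompat
    if unknown:
        raise ValueError("unsupported incompat features: %#x" % unknown)
    (count,) = _COUNT.unpack_from(buf, _FIXED.size)
    snapids = struct.unpack_from("<%dQ" % count, buf, _FIXED.size + _COUNT.size)
    return ImageInfo(size, order, features, incompat, snapseq, snapids)
```

Because the decoder reads only the fields it knows and never checks the total length, extra data appended to the response by a newer server is skipped, which is the forward-compatibility property the comment asks for.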