* better temp objects
From: Sage Weil @ 2013-08-29 20:40 UTC (permalink / raw)
To: sam.just; +Cc: greg.farnum, ceph-devel
Here's what I'm thinking:
Above the ObjectStore, we clear everything from temp on restart anyway.
We always write bits of a temp object in pieces, and then at the end
copy/move it into the main collection.
On replay, we should *only* do that final move/rename if the temp object
was replayed in its entirety.
So:
- clear out temp collections in the filestore on startup.
- give temp objects unique names so that they don't collide with non-temp
object fd caching (or whatever else). for the DBObjectMap part there is
probably some futzing though to make this work right.
- add a new 'move_from_temp' type operation that renames an object from a temp
(coll_t::is_temp()) collection to a non-temp one. it will succeed iff the
temp source exists.
- all operations that write to temp objects fail if the object doesn't
already exist, except an explicit 'create' op
- all transactions the osds generate that write to temp objects start with
that explicit create.
The combination of these things means that we will only have a temp source
for the move_from_temp op if it is complete. Which I think means we can
avoid any of the fsync guard stuff entirely.
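A rough way to see why the rules above are enough is a toy in-memory model. This is only a sketch under stated assumptions: the class `TempStoreModel`, the tuple-shaped ops, and `restart_and_replay` are hypothetical names for illustration, not the ObjectStore or FileStore API, and `create` is assumed to start from an empty object.

```python
class TempStoreModel:
    """Toy model of a store with one temp and one main collection.

    Hypothetical illustration only; not Ceph's ObjectStore interface.
    """

    def __init__(self):
        self.temp = {}  # temp object name -> bytearray
        self.main = {}  # final object name -> bytes

    def clear_temp(self):
        # Rule: clear out temp collections on startup.
        self.temp.clear()

    def apply(self, op):
        """Apply one op; return True on success, False on failure."""
        kind = op[0]
        if kind == "create":
            _, name = op
            # Assumption: an explicit create always starts from empty.
            self.temp[name] = bytearray()
            return True
        if kind == "write":
            _, name, offset, data = op
            obj = self.temp.get(name)
            if obj is None:
                # Rule: writes to a temp object fail if it does not exist.
                return False
            if len(obj) < offset:
                obj.extend(b"\0" * (offset - len(obj)))
            obj[offset:offset + len(data)] = data
            return True
        if kind == "move_from_temp":
            _, name, dest = op
            if name not in self.temp:
                # Rule: succeeds iff the temp source exists.
                return False
            self.main[dest] = bytes(self.temp.pop(name))
            return True
        raise ValueError(f"unknown op: {kind!r}")


def restart_and_replay(store, journal, start):
    """Model a restart: clear temp, then replay journal[start:]."""
    store.clear_temp()
    for op in journal[start:]:
        store.apply(op)
    return store
```

With a journal of `create`, two `write` ops, and a final `move_from_temp`, replaying from index 0 promotes the whole object, while replaying from the middle leaves the main collection untouched, because every later op fails once the temp source is gone.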
The DBObjectMap I'm very fuzzy on, so I suspect that's where the tricky
part will be. Maybe the temp object name includes the intended hobject_t
in it somewhere, or something, so that the rename can be reflected
in leveldb at the end.
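One possible shape for the naming idea, purely as a sketch: the temp name carries an opaque key for the intended target plus a unique suffix, so the target can be recovered at rename time. The `temp:` prefix, the counter, and both helper names are assumptions made up for illustration; a real hobject_t has more fields than a single string, and nothing here reflects how DBObjectMap actually keys its entries.

```python
import itertools

_PREFIX = "temp:"             # hypothetical marker for temp object names
_counter = itertools.count()  # hypothetical per-process uniqueness source


def make_temp_name(target_key):
    """Build a unique temp name that embeds the intended target key."""
    return f"{_PREFIX}{target_key}:{next(_counter)}"


def target_of(temp_name):
    """Recover the intended target key from a temp name."""
    if not temp_name.startswith(_PREFIX):
        raise ValueError(f"not a temp name: {temp_name!r}")
    # The unique suffix is always last, so split from the right; the
    # target key itself may contain ':' characters.
    body, sep, seq = temp_name[len(_PREFIX):].rpartition(":")
    if not sep or not seq.isdigit():
        raise ValueError(f"malformed temp name: {temp_name!r}")
    return body
```

Two temp names for the same target stay distinct, and either one maps back to the same target key.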
Thoughts? Maybe we can do a quick hangout this afternoon to make sure
this will work before I start putting it together...
sage
* Re: better temp objects
From: Loic Dachary @ 2013-08-29 22:29 UTC (permalink / raw)
To: Sage Weil; +Cc: sam.just, greg.farnum, ceph-devel
Hi Sage,
What is the temp collection used for? Is there a thread or document I could read to learn about it, or a piece of code that mainly deals with it?
Cheers
On 29/08/2013 22:40, Sage Weil wrote:
> Here's what I'm thinking:
>
> Above the ObjectStore, we clear everything from temp on restart anyway.
>
> We always write bits of a temp object in pieces, and then at the end
> copy/move it into the main collection.
>
> On replay, we should *only* do that final move/rename if the temp object
> was replayed in its entirety.
>
> So:
>
> - clear out temp collections in the filestore on startup.
>
> - give temp objects unique names so that they don't collide with non-temp
> object fd caching (or whatever else). for the DBObjectMap part there is
> probably some futzing though to make this work right.
>
> - add a new 'move_from_temp' type operation that renames an object from a temp
> (coll_t::is_temp()) collection to a non-temp one. it will succeed iff the
> temp source exists.
>
> - all operations that write to temp objects fail if the object doesn't
> already exist, except an explicit 'create' op
>
> - all transactions the osds generate that write to temp objects start with
> that explicit create.
>
> The combination of these things means that we will only have a temp source
> for the move_from_temp op if it is complete. Which I think means we can
> avoid any of the fsync guard stuff entirely.
>
> The DBObjectMap I'm very fuzzy on, so I suspect that's where the tricky
> part will be. Maybe the temp object name includes the intended hobject_t
> in it somewhere, or something, so that the rename can be reflected
> in leveldb at the end.
>
> Thoughts? Maybe we can do a quick hangout this afternoon to make sure
> this will work before I start putting it together...
>
> sage
--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.