* better temp objects
From: Sage Weil @ 2013-08-29 20:40 UTC
  To: sam.just; +Cc: greg.farnum, ceph-devel

Here's what I'm thinking:

Above the ObjectStore, we clear everything from temp on restart anyway.

We always write bits of a temp object in pieces, and then at the end 
copy/move it into the main collection.  

On replay, we should *only* do that final move/rename if the temp object 
was replayed in its entirety.

So:

- clear out temp collections in the filestore on startup.

- give temp objects unique names so that they don't collide with non-temp 
object fd caching (or whatever else).  for the DBObjectMap part there is 
probably some futzing needed to make this work right.

- add a new 'move_from_temp' type operation that renames an object from a 
temp (coll_t::is_temp()) collection to a non-temp one.  it will succeed iff 
the temp source exists.

- all operations that write to temp objects fail if the object doesn't 
already exist, except an explicit 'create' op

- all transactions the osds generate that write to temp objects start with 
that explicit create.

The combination of these things means that we will only have a temp source 
for the move_from_temp op if it is complete.  Which I think means we can 
avoid any of the fsync guard stuff entirely.
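The rules above can be sketched as a toy in-memory model. This is not FileStore code: ToyStore, apply, replay and the tuple-shaped ops are hypothetical names made up for illustration, and the sketch assumes that only the first transaction in a sequence of temp writes carries the explicit create, and that a failed op is simply skipped during replay.

```python
class ToyStore:
    """Hypothetical stand-in for an object store with one temp and one main collection."""

    def __init__(self):
        self.temp = {}  # temp object name -> bytearray
        self.main = {}  # final object name -> bytes

    def clear_temp(self):
        # Rule 1: temp collections are emptied on startup.
        self.temp.clear()

    def apply(self, op):
        """Apply one op; return True on success, False if it was refused."""
        kind = op[0]
        if kind == "create":
            _, name = op
            self.temp[name] = bytearray()
            return True
        if kind == "write":
            _, name, data = op
            # Writes to a temp object fail unless it was explicitly created.
            if name not in self.temp:
                return False
            self.temp[name] += data
            return True
        if kind == "move_from_temp":
            _, name, dest = op
            # The move succeeds iff the temp source exists.
            if name not in self.temp:
                return False
            self.main[dest] = bytes(self.temp.pop(name))
            return True
        raise ValueError(f"unknown op {kind!r}")


def replay(store, transactions):
    """Simulate a restart: clear temp, then re-apply the journaled transactions."""
    store.clear_temp()
    for txn in transactions:
        for op in txn:
            store.apply(op)  # assumption: failed ops are ignored on replay


journal = [
    [("create", "t1"), ("write", "t1", b"aa")],
    [("write", "t1", b"bb")],
    [("write", "t1", b"cc"), ("move_from_temp", "t1", "obj")],
]

full = ToyStore()
replay(full, journal)          # whole sequence replayed: object appears

partial = ToyStore()
partial.temp["t1"] = bytearray(b"aa")  # leftover from before the restart
replay(partial, journal[1:])   # create is missing: nothing reaches main
```

In the partial case the leftover temp object is cleared, every later write is refused because there is no create, and the final move finds no source, so no half-written object lands in the main collection.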

The DBObjectMap I'm very fuzzy on, so I suspect that's where the tricky 
part will be.  Maybe the temp object name includes the intended hobject_t 
in it somewhere, or something, so that the rename can be reflected 
in leveldb at the end.
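One way to picture the naming idea, purely as a sketch: the temp name carries a unique sequence number plus the intended final name, and a plain dict keyed by (object name, key) stands in for the key/value store. None of this models DBObjectMap or leveldb; make_temp_name, target_of, rename_omap and the "temp.<seq>.<target>" layout are all hypothetical.

```python
import itertools

_seq = itertools.count()  # hypothetical source of unique temp suffixes


def make_temp_name(target):
    """Build a unique temp name that embeds the intended final object name."""
    return f"temp.{next(_seq)}.{target}"


def target_of(temp_name):
    """Recover the intended final name from a temp name built above."""
    prefix, _number, target = temp_name.split(".", 2)
    if prefix != "temp":
        raise ValueError(f"not a temp name: {temp_name!r}")
    return target


def rename_omap(kv, temp_name):
    """Re-key every (temp_name, key) entry in kv to (target, key); return target."""
    target = target_of(temp_name)
    for obj, key in [k for k in kv if k[0] == temp_name]:
        kv[(target, key)] = kv.pop((obj, key))
    return target


# Example with a made-up object name.
t = make_temp_name("rbd_data.1.0")
kv = {(t, "a"): b"1", ("other", "a"): b"2"}
final = rename_omap(kv, t)
```

Because the target is recoverable from the temp name alone, the key/value side of the rename needs no extra argument beyond the temp name itself.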

Thoughts?  Maybe we can do a quick hangout this afternoon to make sure 
this will work before I start putting it together...

sage


* Re: better temp objects
From: Loic Dachary @ 2013-08-29 22:29 UTC
  To: Sage Weil; +Cc: sam.just, greg.farnum, ceph-devel


Hi Sage,

What is the temp collection used for? Is there a thread or document I could read to learn about it, or a piece of code that is mainly about it?

Cheers


-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.


