[RFC] Design of name-addressed data portion

Git development

* [RFC] Design of name-addressed data portion
@ 2005-04-24 18:17 Daniel Barkalow
  2005-04-24 20:54 ` Petr Baudis
  2005-04-24 22:58 ` Fabian Franz
  0 siblings, 2 replies; 5+ messages in thread
From: Daniel Barkalow @ 2005-04-24 18:17 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds, Petr Baudis

I think it is time to have a standard mechanism for
name-addressed data in the .git directory. We currently have one
agreed-upon item, HEAD, as well as a number of items in cogito: heads,
tags, and, to a certain extent, remotes. (Even in core git, when we
support tags, we'll want a mapping from tag names to tag objects, even if
this mapping doesn't get transferred by push and pull operations; nobody's
going to want to cut and paste a hash from email every time they want to
refer to the tag, and fsck-cache could stand to know which tags you mean
to have so that it can report the rest to git-prune-script.)

It would be useful to have a bit more structure to the repository, such
that there are a fixed number of paths that hold all of the information
about the state of the repository, while the rest of the directory has
information that is particular to a working directory's state (e.g.,
index).

I'd propose the following structure:

 objects/    the content-addressed repository portion
 references/ the name-addressed repository portion
   heads/    the heads that are being used out of this repository
     DEFAULT the head that people pulling this repository mean by default
     ...     other heads, by name, that fsck-cache should mark reachable
   tags/     the tags
     ...     files with the symbolic name of the tags, containing the hash
 info/       other per-repository information
   remotes   URLs of remote repositories
   complete  hashes that the repository contains all references from
   missing   hashes that the repository lacks but wants
   excluded  hashes that the repository doesn't want
 ...         other files are per .git directory, not shared on push/pull
 index       
 HEAD        symlink to the head that is the local default
 tracked     remote that this working directory tracks

All of the files in references/*/* contain hex for objects in the
database, and are not synced between repositories in situ (but some sync
operations will read some of them and write them under different
names). fsck-cache would use as its reachability starting point $(cat
references/*/*).
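
As a rough illustration of the reachability starting point described above, the following Python sketch collects the hashes stored under references/*/*. The function name `reachability_roots` and the 40-character hex check are assumptions made for the example; neither is part of the proposal or of fsck-cache.

```python
import re
from pathlib import Path

# Assumption: each ref file holds one 40-character SHA-1 in hex,
# optionally followed by a newline.
HEX_SHA1 = re.compile(r"[0-9a-f]{40}")


def reachability_roots(git_dir):
    """Return the set of hashes named by references/*/* under git_dir.

    Rough equivalent of $(cat references/*/*): every regular file exactly
    two levels below references/ contributes one root. Files whose content
    is not a plain hex hash are skipped.
    """
    roots = set()
    for ref_file in sorted(Path(git_dir).glob("references/*/*")):
        if not ref_file.is_file():
            continue
        value = ref_file.read_text().strip()
        if HEX_SHA1.fullmatch(value):
            roots.add(value)
    return roots
```

A checker could start its walk from the returned set and treat everything it never visits as a candidate for pruning.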

In info/ are, generically, other files that relate to operations which
work on the repository rather than a working directory. Transfer programs
would use and maintain this information.

I think we'd still eventually want some way of getting from a commit-id to
any tags about it (I think git log would do well to mention any tags you
have when it shows a commit), but I don't want to design this quite
yet. It should also work for going from the real history to cached delta
info, when we have comparison tools that are sufficiently smart,
expensive, and intermediate-dependent to want to cache this.

	-Daniel
*This .sig left intentionally blank*


* Re: [RFC] Design of name-addressed data portion
  2005-04-24 18:17 [RFC] Design of name-addressed data portion Daniel Barkalow
@ 2005-04-24 20:54 ` Petr Baudis
  2005-04-24 21:14   ` Daniel Barkalow
  2005-04-24 22:58 ` Fabian Franz
  1 sibling, 1 reply; 5+ messages in thread
From: Petr Baudis @ 2005-04-24 20:54 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Linus Torvalds

Dear diary, on Sun, Apr 24, 2005 at 08:17:23PM CEST, I got a letter
where Daniel Barkalow <barkalow@iabervon.org> told me that...
> It would be useful to have a bit more structure to the repository, such
> that there are a fixed number of paths that hold all of the information
> about the state of the repository, while the rest of the directory has
> information that is particular to a working directory's state (e.g.,
> index).

Agreed.

> 
> 
> I'd propose the following structure:
> 
>  objects/    the content-addressed repository portion
>  references/ the name-addressed repository portion

references/ is just too long for my taste. ;-) What about just refs/ ?

>    heads/    the heads that are being used out of this repository
>      DEFAULT the head that people pulling this repository mean by default
>      ...     other heads, by name, that fsck-cache should mark reachable
>    tags/     the tags
>      ...     files with the symbolic name of the tags, containing the hash
>  info/       other per-repository information
>    remotes   URLs of remote repositories
>    complete  hashes that the repository contains all references from
>    missing   hashes that the repository lacks but wants
>    excluded  hashes that the repository doesn't want
>  ...         other files are per .git directory, not shared on push/pull
>  index       
>  HEAD        symlink to the head that is the local default
>  tracked     remote that this working directory tracks

I will probably throw the local stuff to local/.

I think I like this otherwise.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: [RFC] Design of name-addressed data portion
  2005-04-24 20:54 ` Petr Baudis
@ 2005-04-24 21:14   ` Daniel Barkalow
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Barkalow @ 2005-04-24 21:14 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git, Linus Torvalds

On Sun, 24 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sun, Apr 24, 2005 at 08:17:23PM CEST, I got a letter
> where Daniel Barkalow <barkalow@iabervon.org> told me that...
> > I'd propose the following structure:
> > 
> >  objects/    the content-addressed repository portion
> >  references/ the name-addressed repository portion
> 
> references/ is just too long for my taste. ;-) What about just refs/ ?

Fine with me. I guess you can't just hit tab when writing a script. :)

> >    heads/    the heads that are being used out of this repository
> >      DEFAULT the head that people pulling this repository mean by default
> >      ...     other heads, by name, that fsck-cache should mark reachable
> >    tags/     the tags
> >      ...     files with the symbolic name of the tags, containing the hash
> >  info/       other per-repository information
> >    remotes   URLs of remote repositories
> >    complete  hashes that the repository contains all references from
> >    missing   hashes that the repository lacks but wants
> >    excluded  hashes that the repository doesn't want
> >  ...         other files are per .git directory, not shared on push/pull
> >  index       
> >  HEAD        symlink to the head that is the local default
> >  tracked     remote that this working directory tracks
> 
> I will probably throw the local stuff to local/.

That seems to encourage confusion with the local/remote repository
contrast. I think branch/ or fork/ would be more clear. Putting it in a
directory doesn't seem so important to me, since it won't be shared
anyway. (The reason I want info/ is so that you just symlink info/ to the
master info/, and you don't have to remember to make a link for each
file).
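
The point about info/ can be sketched in Python as follows: a second .git directory shares the master's per-repository information through a single symlink rather than one link per file. The function name `share_info` is hypothetical, and the sketch assumes a platform with symlink support. How objects/ and the refs get shared is left out.

```python
import os
from pathlib import Path


def share_info(master_git_dir, branch_git_dir):
    """Point branch_git_dir/info at master_git_dir/info with one symlink.

    Files added to the master's info/ later on (remotes, complete, ...)
    become visible through the link without any further setup.
    """
    master_info = Path(master_git_dir).resolve() / "info"
    link = Path(branch_git_dir) / "info"
    # Make sure the branch .git directory exists before linking into it.
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(master_info, link, target_is_directory=True)
    return link
```

With per-file links, every new file under info/ would need its own link in each working directory; with the directory link there is nothing to remember.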

	-Daniel
*This .sig left intentionally blank*



* Re: [RFC] Design of name-addressed data portion
  2005-04-24 18:17 [RFC] Design of name-addressed data portion Daniel Barkalow
  2005-04-24 20:54 ` Petr Baudis
@ 2005-04-24 22:58 ` Fabian Franz
  2005-04-24 23:12   ` Daniel Barkalow
  1 sibling, 1 reply; 5+ messages in thread
From: Fabian Franz @ 2005-04-24 22:58 UTC (permalink / raw)
  To: Daniel Barkalow, git; +Cc: Linus Torvalds, Petr Baudis

On Sunday, 24 April 2005 20:17, Daniel Barkalow wrote:
> I'd propose the following structure:
>
> [...]
>    tags/     the tags
>      ...     files with the symbolic name of the tags, containing the hash

Couldn't you use symbolic or hard links here and in references/?

cu

Fabian



* Re: [RFC] Design of name-addressed data portion
  2005-04-24 22:58 ` Fabian Franz
@ 2005-04-24 23:12   ` Daniel Barkalow
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Barkalow @ 2005-04-24 23:12 UTC (permalink / raw)
  To: Fabian Franz; +Cc: git, Linus Torvalds, Petr Baudis

On Mon, 25 Apr 2005, Fabian Franz wrote:

> On Sunday, 24 April 2005 20:17, Daniel Barkalow wrote:
> > I'd propose the following structure:
> >
> > [...]
> >    tags/     the tags
> >      ...     files with the symbolic name of the tags, containing the hash
> 
> Couldn't you use symbolic or hard links here and in references/?

For most uses of the refs/ directory (of which tags/ is a subdirectory),
we want to get from it the hash, not just the contents of the referenced
object, and we potentially want to get the hash from something like a web
server. Finding out what http://.../foo.git/refs/heads/DEFAULT is a
symlink (or, worse, hard link) to so that you can decide if it's different
from what you have would be a major pain.
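
To make the argument concrete, here is a small Python sketch of the check that plain files make easy: fetch the ref as an ordinary file and compare the hash in it with the local one. The function names `read_remote_ref` and `head_differs` and the `repo_url` parameter are hypothetical, and the sketch assumes the server simply serves the repository directory as static files.

```python
from pathlib import Path
from urllib.request import urlopen


def read_remote_ref(repo_url, ref):
    """Fetch a ref file such as refs/heads/DEFAULT and return the hash in it."""
    with urlopen(f"{repo_url.rstrip('/')}/{ref}") as response:
        return response.read().decode("ascii").strip()


def head_differs(repo_url, local_git_dir, ref="refs/heads/DEFAULT"):
    """Return True when the remote ref names a different hash than the local one.

    A missing local ref counts as different.
    """
    local_file = Path(local_git_dir) / ref
    local = local_file.read_text().strip() if local_file.is_file() else None
    return read_remote_ref(repo_url, ref) != local
```

If the ref were a symlink, the server would hand back the contents of the object it points at, and the client would have no direct way to learn the hash.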

	-Daniel
*This .sig left intentionally blank*

