* [RFC] Design of name-addressed data portion
@ 2005-04-24 18:17 Daniel Barkalow
  2005-04-24 20:54 ` Petr Baudis
  2005-04-24 22:58 ` Fabian Franz
  0 siblings, 2 replies; 5+ messages in thread
From: Daniel Barkalow @ 2005-04-24 18:17 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds, Petr Baudis

I think it is time to have a standard mechanism for
name-addressed data in the .git directory. We currently have one
agreed-upon item, HEAD, as well as a number of items in cogito: heads,
tags, and, to a certain extent, remotes. (Even in core git, when we
support tags, we'll want a mapping from tag names to tag objects, even if
this mapping doesn't get transferred by push and pull operations; nobody's
going to want to cut and paste a hash from email every time they want to
refer to the tag, and fsck-cache could stand to know which tags you mean
to have so that it can report the rest to git-prune-script.)

It would be useful to have a bit more structure to the repository, such
that there are a fixed number of paths that hold all of the information
about the state of the repository, while the rest of the directory has
information that is particular to a working directory's state (e.g.,
index).



I'd propose the following structure:

 objects/    the content-addressed repository portion
 references/ the name-addressed repository portion
   heads/    the heads that are being used out of this repository
     DEFAULT the head that people pulling this repository mean by default
     ...     other heads, by name, that fsck-cache should mark reachable
   tags/     the tags
     ...     files with the symbolic name of the tags, containing the hash
 info/       other per-repository information
   remotes   URLs of remote repositories
   complete  hashes that the repository contains all references from
   missing   hashes that the repository lacks but wants
   excluded  hashes that the repository doesn't want
 ...         other files are per .git directory, not shared on push/pull
 index       
 HEAD        symlink to the head that is the local default
 tracked     remote that this working directory tracks

All of the files in references/*/* contain hex for objects in the
database, and are not synced between repositories in situ (but some sync
operations will read some of them and write them under different
names). fsck-cache would use $(cat references/*/*) as its reachability
starting point.
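
As a rough illustration of that starting point, here is the same glob in Python with a sanity check on each file. The function name reachability_roots is hypothetical, and the 40-digit hex check is an assumption (SHA-1 object names); this is not how fsck-cache is implemented.

```python
import glob
import os
import re

# Assumption: object names are 40 lowercase hex digits (SHA-1).
HEX_ID = re.compile(r"[0-9a-f]{40}")


def reachability_roots(git_dir):
    """Return {relative path: object id} for every file in references/*/*."""
    roots = {}
    pattern = os.path.join(git_dir, "references", "*", "*")
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            value = f.read().strip()
        if not HEX_ID.fullmatch(value):
            raise ValueError(f"{path}: not a hex object id: {value!r}")
        roots[os.path.relpath(path, git_dir)] = value
    return roots
```

Keeping the file name next to each hash makes it possible to report which head or tag a broken reference came from.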

In info/ are, generically, other files that relate to operations which
work on the repository rather than a working directory. Transfer programs
would use and maintain this information.

I think we'd still eventually want some way of getting from a commit-id to
any tags about it (I think git log would do well to mention any tags you
have when it shows a commit), but I don't want to design this quite
yet. It should also work for going from the real history to cached delta
info, when we have comparison tools that are sufficiently smart,
expensive, and intermediate-dependent to want to cache this.
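
One possible shape for the commit-to-tags lookup, sketched by inverting references/tags/ in memory. This is not a proposed design: tags_by_target is a hypothetical name, and the resolve parameter is a caller-supplied stand-in for whatever maps the hash stored in a tag file to the commit it is about. By default it is the identity function, which is only right if the tag file names the commit directly.

```python
import os
from collections import defaultdict


def tags_by_target(git_dir, resolve=lambda object_id: object_id):
    """Map each resolved object id to the sorted tag names that refer to it."""
    index = defaultdict(list)
    tags_dir = os.path.join(git_dir, "references", "tags")
    for name in sorted(os.listdir(tags_dir)):
        path = os.path.join(tags_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            object_id = f.read().strip()
        # resolve() is an assumption; see the note above the code.
        index[resolve(object_id)].append(name)
    return dict(index)
```

A log tool could build this index once per run and look up each commit as it is printed.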

	-Daniel
*This .sig left intentionally blank*

