git.vger.kernel.org archive mirror
* Handling very large numbers of symbolic references?
@ 2006-07-25 19:29 Nix
  2006-07-25 21:29 ` Rene Scharfe
  2006-07-25 22:23 ` Linus Torvalds
  0 siblings, 2 replies; 7+ messages in thread
From: Nix @ 2006-07-25 19:29 UTC
  To: git

I'm about to start writing my first git porcelain (to try to convert my
workplace from the world's oldest and cruftiest version control system
to something not based on the bastard offspring of SCCS and VMS's CMS,
with less power than either) and have run into a problem that I'm not
sure how to solve.

The biggest problem with git for totally naive users is that they get
scared by the sha1 IDs used as version numbers (assuming the index is
porcelained away: but that would confuse them, not scare them). They're
not pronounceable, not memorable, and so on. So the porcelain I'm
whipping up conceals them in large part by using instead bug IDs, as the
workflow of the place I'm doing this for is driven entirely by Bugzilla
bug numbers.

I'm taking a leaf from the `git for the ignorant' document and arranging
that every fix that fixes some Bugzilla bug is on a branch named after
that bug, e.g. #2243, #10155, whatever. (I'm going to have to go further
than that and track dependency relationships between bugs, i.e. `if you
merge bug #1404's branch, you must merge #1306's and #1505's as well'. I
could do that by adding a new bug-dependency object, respected by a
wrapper around git-merge, but I'm not sure how kosher it is to add new
types of objects only used by porcelain. Hell, I'm not even sure if it's
possible yet.)
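
The dependency rule above (`if you merge #1404, you must merge #1306 and
#1505 as well') can be sketched without any new object type, assuming the
porcelain keeps a plain mapping from each bug to the bugs it depends on.
The function name and the `deps` mapping are hypothetical, and where that
mapping is stored is left open; this only shows the ordering a wrapper
around git-merge might compute:

```python
from typing import Dict, List, Set


def branches_to_merge(bug: int, deps: Dict[int, List[int]]) -> List[int]:
    """Return `bug` and everything it transitively depends on,
    with dependencies listed before the bugs that need them."""
    order: List[int] = []
    seen: Set[int] = set()

    def visit(b: int, stack: Set[int]) -> None:
        if b in stack:
            # A bug that (indirectly) depends on itself cannot be ordered.
            raise ValueError(f"dependency cycle involving bug #{b}")
        if b in seen:
            return
        stack.add(b)
        for d in deps.get(b, []):
            visit(d, stack)
        stack.remove(b)
        seen.add(b)
        order.append(b)

    visit(bug, set())
    return order


if __name__ == "__main__":
    # Hypothetical dependency table matching the example in the text.
    example = {1404: [1306, 1505]}
    print(branches_to_merge(1404, example))  # [1306, 1505, 1404]
```

The wrapper would then merge the branches in the returned order. The
recursion is fine for shallow dependency chains; a very deep chain would
want an explicit stack instead.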

However, this causes a potential problem. There are tens of thousands of
these bugs, and the .git/refs/heads directory gets *enormous* and thus
the system gets terribly terribly slow (crappy old Solaris filesystem
syndrome).

It seems to me there are two ways to fix this:

 - restructure .git/refs/* in a similar way to .git/objects, i.e. as a
   one- or two-level tree.

 - the vast majority of these bugs are closed. They still need to be got
   at now and again for branch merges, but they could be got out of
   .git/refs/heads at delete_branch time, and pushed into a tree consisting
   entirely of deleted branches, which would in turn be pointed at from
   some new place under .git/refs; perhaps .git/refs/heads/heavy (by analogy to
   non-lightweight tags). The problem here is that whenever we delete
   a tag, we'll leak that tree (at least we will if it's in a pack), and
   that leakage really could add up in the end.

   (Deleting branches corresponding to closed bugs is good for other
   reasons: e.g., it cleans up gitweb output. But certain tools *will*
   need to get at those closed bug branches: I'm inclined to say that
   all of them will sooner or later, because the users aren't going to
   tolerate being told that they can't do anything to a closed
   bug. Except for adding code to it: we can reasonably declare the
   addition of commits to those branches over. Of course once we have
   the sha1 id, it's all academic, really.)
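
A rough sketch of the first option, assuming the porcelain is free to
pick its own ref names and that ref names may contain slashes (so each
path component becomes a subdirectory). The `refs/heads/bugs` prefix and
the function name are made up for illustration. Unlike .git/objects,
which fans out on the leading hex digits of the sha1, this uses the
trailing decimal digits of the bug number, because sequentially assigned
IDs spread more evenly that way:

```python
def sharded_ref(bug_id: int, levels: int = 1,
                prefix: str = "refs/heads/bugs") -> str:
    """Map a bug number to a ref name fanned out over `levels`
    two-digit directories taken from the low-order digits."""
    if bug_id < 0 or levels < 1:
        raise ValueError("bug_id must be >= 0 and levels >= 1")
    # Pad so that short IDs still yield a full two-digit directory name.
    digits = str(bug_id).zfill(2 * levels)
    n = len(digits)
    parts = [digits[n - 2 * (i + 1): n - 2 * i] for i in range(levels)]
    return "/".join([prefix, *parts, str(bug_id)])


if __name__ == "__main__":
    print(sharded_ref(10155))            # refs/heads/bugs/55/10155
    print(sharded_ref(10155, levels=2))  # refs/heads/bugs/55/01/10155
    print(sharded_ref(7))                # refs/heads/bugs/07/7
```

With one level, tens of thousands of bug branches end up as at most a
hundred directories of a few hundred entries each, which should be
kinder to a filesystem that scans directories linearly.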

I'm not sure which way is preferable. Suggestions? Is the entire idea
lunatic?


And, in case this hasn't been said enough: thank you for git, it's the
nicest version control system I've used in years, and the way it's
structured encourages everyone to play :)

-- 
`We're sysadmins. We deal with the inconceivable so often I can clearly 
 see the need to define levels of inconceivability.' --- Rik Steenwinkel

* Re: Handling very large numbers of symbolic references?
@ 2006-07-26 18:38 linux
  0 siblings, 0 replies; 7+ messages in thread
From: linux @ 2006-07-26 18:38 UTC
  To: nix; +Cc: git, linux

Just to contribute a little brainstorming....

- Remember that git refs only point to one end of a commit chain.
  The origin is kind of implicit.  If bug IDs correspond to *changes*,
  especially ones that you want to mix and match rebasing, is this a
  job for StGit or quilt or something else that tracks patches rather
  than states?
- If you do use core git to label bits of development history, are the
  labels supposed to be mutable heads or mostly frozen tags?
- Assuming they're tags, do you need them to be part of the root set
  for garbage collection purposes?  Or do you assume they are already
  referenced by the development history, and the bug ID links are
  symlinks that might be broken if the patch isn't merged?

I really should look at StGit more, because from my current position
of ignorance, it looks like possibly a better match to the problem.
The main problems I see are that its patches are per-branch, not global,
and there's no fetch/push mechanism for sharing them.

Also, you might want to have a "patch" with a single name be a
patch SERIES, which I don't think StGit does.



Thread overview: 7+ messages
2006-07-25 19:29 Handling very large numbers of symbolic references? Nix
2006-07-25 21:29 ` Rene Scharfe
2006-07-25 21:52   ` Nix
2006-07-25 22:23 ` Linus Torvalds
2006-07-25 23:08   ` Nix
2006-07-25 23:20     ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2006-07-26 18:38 linux
