From: Nix <nix@esperi.org.uk>
To: git@vger.kernel.org
Subject: Handling very large numbers of symbolic references?
Date: Tue, 25 Jul 2006 20:29:46 +0100
Message-ID: <87psfteb4l.fsf@hades.wkstn.nix>

I'm about to start writing my first git porcelain (to try to convert my
workplace from the world's oldest and cruftiest version control system
to something not based on the bastard offspring of SCCS and VMS's CMS,
with less power than either) and have run into a problem that I'm not
sure how to solve.
The biggest problem with git for totally naive users is that they get
scared by the sha1 IDs used as version numbers (assuming the index is
porcelained away: but that would confuse them, not scare them). They're
not pronounceable, not memorable, and so on. So the porcelain I'm
whipping up conceals them in large part by using instead bug IDs, as the
workflow of the place I'm doing this for is driven entirely by Bugzilla
bug numbers.
I'm taking a leaf from the `git for the ignorant' document and arranging
that every fix that fixes some Bugzilla bug is on a branch named after
that bug, e.g. #2243, #10155, whatever. (I'm going to have to go further
than that and track dependency relationships between bugs, i.e. `if you
merge bug #1404's branch, you must merge #1306's and #1505's as well'. I
could do that by adding a new bug-dependency object, respected by a
wrapper around git-merge, but I'm not sure how kosher it is to add new
types of objects only used by porcelain. Hell, I'm not even sure if it's
possible yet.)
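
To make the dependency rule concrete, here is a minimal sketch in Python of what a wrapper around git-merge might compute before merging. It assumes the dependencies are already available as a plain mapping from bug number to the bugs it requires; the function name `merge_order` and the `deps` mapping are hypothetical, and nothing here says how or where git would store that information.

```python
def merge_order(bug, deps):
    """Return the bug IDs to merge, dependencies first, ending with `bug`.

    `deps` is a hypothetical mapping: bug number -> list of bug numbers
    whose branches must be merged along with it.
    """
    order = []
    seen = set()

    def visit(b, stack):
        # `stack` holds the bugs on the current path, to detect cycles.
        if b in stack:
            raise ValueError(f"dependency cycle involving bug #{b}")
        if b in seen:
            return
        seen.add(b)
        for d in deps.get(b, ()):
            visit(d, stack | {b})
        # Appended only after all of its dependencies have been appended.
        order.append(b)

    visit(bug, frozenset())
    return order


# The example from the text: merging #1404 pulls in #1306 and #1505.
print(merge_order(1404, {1404: [1306, 1505]}))  # [1306, 1505, 1404]
```

The wrapper would then merge the branches in that order. A cycle between bugs is reported as an error rather than silently ignored.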
However, this causes a potential problem. There are tens of thousands of
these bugs, and the .git/refs/heads directory gets *enormous* and thus
the system gets terribly terribly slow (crappy old Solaris filesystem
syndrome).
It seems to me there are two ways to fix this:

 - restructure .git/refs/* in a similar way to .git/objects, i.e. as a
   one- or two-level tree.

 - the vast majority of these bugs are closed. They still need to be got
   at now and again for branch merges, but they could be got out of
   .git/refs/heads at delete_branch time, and pushed into a tree consisting
   entirely of deleted branches, which would in turn be pointed at from
   some new place under .git/refs; perhaps .git/refs/heads/heavy (by
   analogy to non-lightweight tags). The problem here is that whenever we
   delete a tag, we'll leak that tree (at least we will if it's in a
   pack), and that leakage really could add up in the end.

   (Deleting branches corresponding to closed bugs is good for other
   reasons: e.g., it cleans up gitweb output. But certain tools *will*
   need to get at those closed bug branches: I'm inclined to say that
   all of them will sooner or later, because the users aren't going to
   tolerate being told that they can't do anything to a closed
   bug. Except for adding code to it: we can reasonably declare the
   addition of commits to those branches over. Of course once we have
   the sha1 id, it's all academic, really.)
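
As a rough illustration of the first option, the porcelain could fan the bug branches out into subdirectories by itself, since branch names may contain slashes. The sketch below is only an assumption about how that naming might look: the `bugs/` prefix, the function name `sharded_ref`, and the choice of sharding on the last two digits are all hypothetical. The trailing digits are used because sequentially assigned bug numbers spread evenly over them, whereas leading digits would pile up in a few buckets.

```python
def sharded_ref(bug_id, width=2):
    """Map a bug number to a fanned-out ref name.

    Hypothetical layout: refs/heads/bugs/<last `width` digits>/<bug number>
    """
    name = str(bug_id)
    # Short bug numbers are zero-padded so every shard name has the same width.
    shard = name[-width:].zfill(width)
    return f"refs/heads/bugs/{shard}/{name}"


print(sharded_ref(10155))  # refs/heads/bugs/55/10155
print(sharded_ref(7))      # refs/heads/bugs/07/7
```

With two decimal digits there are 100 buckets, so tens of thousands of bug branches come out at a few hundred entries per directory.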
I'm not sure which way is preferable. Suggestions? Is the entire idea
lunatic?
And, in case this hasn't been said enough: thank you for git, it's the
nicest version control system I've used in years, and the way it's
structured encourages everyone to play :)
--
`We're sysadmins. We deal with the inconceivable so often I can clearly
see the need to define levels of inconceivability.' --- Rik Steenwinkel