From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Subject: [RFC] New commit object headers: generation and note headers
Date: Sat, 9 Feb 2008 17:46:07 +0100
Message-ID: <200802091746.09102.jnareb@gmail.com>
As the new major git release 1.6.0 is close (BTW, I wonder if git will
ever reach a 2.0.0 release...), I'd like to sum up here ideas about
extending the commit object with new headers, adding my own thoughts and
comments. I think it would be better to introduce such a major feature
in a major release, and not when only the minor number changes.
For some headers, the sooner they are introduced the better.
1. 'generation' header
In the "[BUG?] git log picks up bad commit" thread:
http://permalink.gmane.org/gmane.comp.version-control.git/72274
(later "[RFH] revision limiting sometimes ignored") the idea of a
'generation' header was resurrected. This header is meant to simplify
excluding uninteresting commits in the presence of clock skew, replacing
the various commit-time based heuristics.
The proposed solution (which was discussed at least once before on the
git mailing list) is to use a "generation number" for this:
1. For parentless (root) commits it equals 1 (or 0)
2. For each commit, it equals maximum of generation numbers of parents,
plus 1.
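A minimal sketch of this definition, assuming the history is given as a
mapping from commit ID to its list of parent IDs and that roots get 1.
The names `generation`, `parents` and `memo` are made up for the
illustration and are not part of git:

```python
from typing import Dict, List


def generation(commit: str,
               parents: Dict[str, List[str]],
               memo: Dict[str, int]) -> int:
    """Return the generation number of `commit`, filling `memo` on the way."""
    if commit not in memo:
        # A root commit has no parents, so max() falls back to 0 and the
        # root gets generation 1.
        memo[commit] = 1 + max(
            (generation(p, parents, memo) for p in parents[commit]),
            default=0,
        )
    return memo[commit]


# Toy history: E is a merge of D and C; A is the root.
history = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "E": ["D", "C"],
}
memo: Dict[str, int] = {}
print(generation("E", history, memo))  # 4 (longest path is A-B-D-E)
```

The recursion is only there to keep the sketch close to the definition;
on a long real history it would hit Python's recursion limit, so an
explicit stack would be needed.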
Of course, to avoid recalculating it from the beginning each time, it
must be saved somewhere. The best solution is to use a 'generation'
header for that. Unfortunately there is a complication: commits written
before this header was introduced don't have a generation number handy.
It was then proposed to use the generation number where it exists, to
fall back to the old date-based heuristic where it does not, and not to
(re)calculate it; the idea is to avoid that cost.
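The fallback rule could look roughly like the sketch below. Everything
here is an assumption made for illustration: the `Commit` record, the
`CLOCK_SKEW_SLOP` tolerance and `may_be_ancestor` are hypothetical
names, and the timestamp comparison is only a stand-in for a date-based
heuristic, not the one git actually implements:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tolerance for clock skew, in seconds (one day).
CLOCK_SKEW_SLOP = 86400


@dataclass
class Commit:
    commit_time: int                  # committer timestamp, seconds since epoch
    generation: Optional[int] = None  # None: written before the header existed


def may_be_ancestor(candidate: Commit, tip: Commit) -> bool:
    """Return False only if `candidate` can be ruled out as a proper
    ancestor of `tip`."""
    if candidate.generation is not None and tip.generation is not None:
        # A proper ancestor always has a strictly smaller generation
        # number, so this test is immune to clock skew.
        return candidate.generation < tip.generation
    # At least one commit lacks the header: fall back to comparing
    # timestamps with some slop, without computing anything.
    return candidate.commit_time <= tip.commit_time + CLOCK_SKEW_SLOP
```

With both headers present, a candidate whose clock ran ahead is still
accepted as a possible ancestor as long as its generation number is
smaller; with a header missing, the answer is only as good as the
timestamps.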
My comments:
============
The problem is twofold: when to calculate the generation header, and
what to do with commits that lack it. We could require calculating the
generation header when creating a commit (commit, amend, rebase,
filter-branch), but this might mean that the first few commits after the
'generation' header is introduced would be much slower.
As for older commits which lack the generation number header: perhaps
some (pack-)index-like external storage/cache, where generation numbers
would be saved as we generate them? And perhaps some command to generate
generation numbers in advance, in idle time.
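A rough sketch of such an external cache, assuming a JSON file as the
storage and a caller-supplied `parents_of` function for walking the
history. `GenerationCache`, its file format and its method names are
invented for this example; a real implementation would more likely use a
binary, pack-index-like layout:

```python
import json
import os
from typing import Callable, Dict, List


class GenerationCache:
    """Generation numbers kept outside the object database."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.gen: Dict[str, int] = {}
        if os.path.exists(path):
            with open(path) as f:
                self.gen = json.load(f)

    def lookup(self, commit: str,
               parents_of: Callable[[str], List[str]]) -> int:
        """Return the generation number of `commit`, computing and
        remembering any numbers that are not cached yet."""
        stack = [commit]  # explicit stack, so deep histories are fine
        while stack:
            cur = stack[-1]
            if cur in self.gen:
                stack.pop()
                continue
            parents = parents_of(cur)
            missing = [p for p in parents if p not in self.gen]
            if missing:
                # Parents first; `cur` stays on the stack for later.
                stack.extend(missing)
            else:
                self.gen[cur] = 1 + max(
                    (self.gen[p] for p in parents), default=0)
                stack.pop()
        return self.gen[commit]

    def save(self) -> None:
        with open(self.path, "w") as f:
            json.dump(self.gen, f)
```

A command that generates the numbers in advance would then amount to
calling `lookup()` on every branch tip and `save()` once at the end.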
Note that keeping generation numbers outside the object database is more
error prone (the cache has to be kept in sync), and they would not
propagate with the commits.
The question is whether to take grafts and shallow clones into account
when computing generation numbers: if they are to be saved in the object
database, then no. If saved to external pack-index-like storage, then
perhaps.
2. 'note' header (no semantic meaning)
Some time ago there was a discussion about adding a 'note' header,
initially to save the original sha-1 of a commit for cherry-pick and
rebase; then for saving explicit or corrected rename info, for saving
the chosen merge strategy, and for saving the original ID from an SCM
import.
My comments:
============
Of all those, I think what makes the most sense is saving the foreign
SCM ID for commits imported from another SCM. This way we do not have to
parse the commit message (fragile and ugly, and it makes two-way
exchange harder: no pristine commit message), or store the IDs
externally (not propagated, prone to be lost).
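To make the contrast with message parsing concrete, here is a small
sketch that reads headers from the text of a commit object. The split
into header lines, a blank line and the message is how commit objects
are laid out; the `note svn-revision 1234` line is purely hypothetical,
since no syntax for a 'note' header has been settled, and
`split_commit` is a name made up for the example:

```python
from typing import Dict, List, Tuple


def split_commit(raw: str) -> Tuple[Dict[str, List[str]], str]:
    """Split commit object text into (headers, message).

    Headers may repeat (e.g. several 'parent' lines), so each key maps
    to a list of values. Multi-line header values are not handled.
    """
    head, _, message = raw.partition("\n\n")
    headers: Dict[str, List[str]] = {}
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        headers.setdefault(key, []).append(value)
    return headers, message


raw = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author A U Thor <author@example.com> 1202575567 +0100\n"
    "committer A U Thor <author@example.com> 1202575567 +0100\n"
    "note svn-revision 1234\n"  # hypothetical header, format is an assumption
    "\n"
    "Fix the frobnicator\n"
)
headers, message = split_commit(raw)
print(headers["note"])  # ['svn-revision 1234']
print(repr(message))    # 'Fix the frobnicator\n'
```

The foreign ID is found without looking at the message at all, and the
message itself stays exactly as it was in the original SCM.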
Another would be to save rename and copy info when importing from
another SCM which tracks renames rather than detecting code movement.
This would allow (at least theoretically) for lossless import. When
detecting renames, in the process of finding common merge base(s), we
could check and take into account such information. It would be purely
advisory...
--
Jakub Narebski
Poland