From: Simon 'corecode' Schubert <corecode@fs.ei.tum.de>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: More precise tag following
Date: Sat, 27 Jan 2007 10:04:52 +0100 [thread overview]
Message-ID: <45BB15B4.7030009@fs.ei.tum.de> (raw)
In-Reply-To: <20070127080126.GC9966@spearce.org>
[-- Attachment #1: Type: text/plain, Size: 2828 bytes --]
Shawn O. Pearce wrote:
> This triplet would probably be encoded with descendant-commit using
> OBJ_REF, ancestor-commit being an OBJ_OFS style back reference within
> the pack (or OBJ_REF if not in this pack) and ancestor-object would
> also be an OBJ_REF. So a triplet probably would wind up costing
> ~60 bytes.
I'd propose the following:
Have a "shadow" tree which is storing DAG information per entry of the original tree. keep this shadow tree sorted in the same order the original tree object is. for simplicity i will now add the new information to the tree examples, but of course this new data cannot be stored in the tree itself and can't be hashed.
tree
<filename> <mode> <object> (<new: ancestor commit> <new: ancestor object>)*
the ancestor object isn't necessary, it would just speed up annotations. considering that we want to look only at the path, it is superfluous. if we want to traverse copies/moves as well, this could be helpful.
now, this information allows us to build a object-level (read: path identifies object) DAG, by starting out on the current tree and retrieving the associated information. using this it is possible to jump to the next commit/tree which changed the object and start over.
expense: 8 or 40 bytes per parent, per object, per commit, for each tree modified.
per parent: we of course store information about all parents
per object: as this is a "shadow" tree, we need to annotate all entries and not just changed ones
for each tree modified: (sub)trees not being modified of course do not need annotation *again*
but: this shadow tree can be deltified quite tightly, i'd say. possibly with a different, specialized tree delta method. then this boils down to 8 or 40 bytes per *changed* object, plus delta overhead.
> This of course penalizes objects which don't ever change, as we'd
> have to walk back a good chunk of the DAG before we find a matching
> triplet. But I would suspect that files which never change are
> also not given to log/blame very often either. And once we do find
> a triplet, we can skip through the DAG in time proportional to the
> rate of change for the path, rather than to the entire repository.
using my proposal this penalty does not exist. i think it would be really awkward to have the annotation of a never-changed Makefile to take way longer than the operation on a recently/often changed file.
we'd have to compare the space requirements of both approaches.
cheers
simon
--
Serve - BSD +++ RENT this banner advert +++ ASCII Ribbon /"\
Work - Mac +++ space for low €€€ NOW!1 +++ Campaign \ /
Party Enjoy Relax | http://dragonflybsd.org Against HTML \
Dude 2c 2 the max ! http://golden-apple.biz Mail + News / \
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]
next prev parent reply other threads:[~2007-01-27 9:05 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-01-26 11:07 More precise tag following Junio C Hamano
2007-01-26 11:53 ` Junio C Hamano
2007-01-27 8:01 ` Shawn O. Pearce
2007-01-27 8:41 ` Junio C Hamano
2007-01-27 13:33 ` Jeff King
2007-01-27 17:47 ` Nicolas Pitre
2007-01-27 9:04 ` Simon 'corecode' Schubert [this message]
2007-01-27 12:58 ` Johannes Schindelin
2007-01-27 13:50 ` Simon 'corecode' Schubert
2007-01-27 16:30 ` Jakub Narebski
2007-01-27 17:36 ` Linus Torvalds
2007-01-27 16:46 ` Johannes Schindelin
2007-01-27 17:12 ` Simon 'corecode' Schubert
2007-01-27 19:13 ` Johannes Schindelin
2007-01-27 19:55 ` Simon 'corecode' Schubert
2007-01-27 19:41 ` Nicolas Pitre
2007-01-27 17:22 ` Linus Torvalds
2007-01-27 17:56 ` Linus Torvalds
2007-01-27 22:00 ` Junio C Hamano
2007-01-27 22:54 ` Linus Torvalds
2007-01-28 9:27 ` Junio C Hamano
2007-01-28 9:44 ` [PATCH] git-blame --porcelain: quote filename in c-style when needed Junio C Hamano
2007-01-28 14:25 ` [PATCH] git-blame --incremental: don't use pager René Scharfe
2007-01-28 19:09 ` Junio C Hamano
2007-01-28 19:14 ` Junio C Hamano
2007-01-29 0:32 ` René Scharfe
2007-01-29 2:35 ` [PATCH] git blame --progress Junio C Hamano
2007-01-29 7:00 ` Simon 'corecode' Schubert
2007-01-29 16:54 ` Alex Riesen
2007-01-29 18:12 ` Matthias Lederhofer
2007-01-29 19:06 ` Junio C Hamano
2007-01-29 19:59 ` René Scharfe
2007-01-29 20:24 ` Linus Torvalds
2007-01-30 1:53 ` Junio C Hamano
2007-01-28 19:08 ` More precise tag following Linus Torvalds
2007-01-28 19:18 ` Junio C Hamano
2007-01-28 19:57 ` Linus Torvalds
2007-01-28 20:01 ` Junio C Hamano
2007-01-28 20:20 ` [PATCH] document 'blame --incremental' Junio C Hamano
2007-01-28 21:06 ` More precise tag following Junio C Hamano
2007-01-28 23:01 ` Jeff King
2007-01-30 9:22 ` Junio C Hamano
2007-01-30 15:31 ` Shawn O. Pearce
2007-01-30 17:02 ` Linus Torvalds
2007-01-28 19:58 ` Junio C Hamano
2007-01-29 6:18 ` Shawn O. Pearce
2007-01-29 10:17 ` Junio C Hamano
2007-01-29 10:31 ` Shawn O. Pearce
2007-01-29 16:24 ` Linus Torvalds
2007-01-29 18:07 ` Simon 'corecode' Schubert
2007-01-29 19:29 ` Theodore Tso
2007-01-29 19:45 ` Linus Torvalds
2007-01-29 20:25 ` Jakub Narebski
2007-01-29 20:47 ` Shawn O. Pearce
2007-01-29 21:02 ` Jakub Narebski
2007-02-09 7:41 ` Shawn O. Pearce
2007-01-31 8:39 ` David Kågedal
2007-01-31 10:59 ` David Kågedal
2007-01-31 16:12 ` Peter Eriksen
2007-01-31 17:04 ` David Kågedal
2007-01-31 17:12 ` Peter Eriksen
2007-01-31 17:35 ` Jakub Narebski
2007-01-31 20:59 ` David Kågedal
2007-01-27 18:40 ` Simon 'corecode' Schubert
2007-01-27 19:02 ` Johannes Schindelin
2007-01-27 19:12 ` Simon 'corecode' Schubert
2007-01-27 19:19 ` Johannes Schindelin
2007-01-27 19:59 ` Jakub Narebski
2007-01-27 19:15 ` Linus Torvalds
2007-01-27 19:25 ` Linus Torvalds
2007-01-27 19:54 ` Jakub Narebski
2007-01-27 20:13 ` Linus Torvalds
2007-01-27 19:36 ` Chris Lee
2007-01-28 18:10 ` Theodore Tso
2007-01-28 18:27 ` Linus Torvalds
2007-01-28 22:26 ` David Lang
2007-01-29 17:34 ` Nicolas Pitre
2007-01-29 17:42 ` Linus Torvalds
2007-01-29 17:58 ` Nicolas Pitre
2007-01-29 19:16 ` Chris Lee
2007-01-29 23:00 ` Eric Wong
2007-01-30 0:42 ` Eric Wong
2007-01-30 0:48 ` Eric Wong
2007-01-30 8:51 ` Eric Wong
2007-01-27 18:52 ` Jakub Narebski
2007-01-27 20:16 ` Jeff King
2007-01-27 22:39 ` Linus Torvalds
2007-01-27 23:52 ` Jeff King
2007-01-28 2:39 ` Theodore Tso
2007-01-28 3:17 ` Randal L. Schwartz
2007-01-28 13:15 ` Jeff King
2007-01-28 7:40 ` Shawn O. Pearce
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45BB15B4.7030009@fs.ei.tum.de \
--to=corecode@fs.ei.tum.de \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
--cc=spearce@spearce.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).