git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Simon 'corecode' Schubert <corecode@fs.ei.tum.de>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: More precise tag following
Date: Sat, 27 Jan 2007 10:04:52 +0100	[thread overview]
Message-ID: <45BB15B4.7030009@fs.ei.tum.de> (raw)
In-Reply-To: <20070127080126.GC9966@spearce.org>

[-- Attachment #1: Type: text/plain, Size: 2828 bytes --]

Shawn O. Pearce wrote:
> This triplet would probably be encoded with descendant-commit using
> OBJ_REF, ancestor-commit being an OBJ_OFS style back reference within
> the pack (or OBJ_REF if not in this pack) and ancestor-object would
> also be an OBJ_REF.  So a triplet probably would wind up costing
> ~60 bytes.

I'd propose the following:

Have a "shadow" tree which is storing DAG information per entry of the original tree.  keep this shadow tree sorted in the same order the original tree object is.  for simplicity i will now add the new information to the tree examples, but of course this new data cannot be stored in the tree itself and can't be hashed.

tree
<filename> <mode> <object> (<new: ancestor commit> <new: ancestor object>)*

the ancestor object isn't necessary, it would just speed up annotations.  considering that we want to look only at the path, it is superfluous.  if we want to traverse copies/moves as well, this could be helpful.

now, this information allows us to build a object-level (read: path identifies object) DAG, by starting out on the current tree and retrieving the associated information.  using this it is possible to jump to the next commit/tree which changed the object and start over.

expense:  8 or 40 bytes per parent, per object, per commit, for each tree modified.

per parent:  we of course store information about all parents
per object:  as this is a "shadow" tree, we need to annotate all entries and not just changed ones
for each tree modified:  (sub)trees not being modified of course do not need annotation *again*

but:  this shadow tree can be deltified quite tightly, i'd say.  possibly with a different, specialized tree delta method.  then this boils down to 8 or 40 bytes per *changed* object, plus delta overhead.

> This of course penalizes objects which don't ever change, as we'd
> have to walk back a good chunk of the DAG before we find a matching
> triplet.  But I would suspect that files which never change are
> also not given to log/blame very often either.  And once we do find
> a triplet, we can skip through the DAG in time proportional to the
> rate of change for the path, rather than to the entire repository.

using my proposal this penalty does not exist.  i think it would be really awkward to have the annotation of a never-changed Makefile to take way longer than the operation on a recently/often changed file.

we'd have to compare the space requirements of both approaches.

cheers
  simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 252 bytes --]

  parent reply	other threads:[~2007-01-27  9:05 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-26 11:07 More precise tag following Junio C Hamano
2007-01-26 11:53 ` Junio C Hamano
2007-01-27  8:01 ` Shawn O. Pearce
2007-01-27  8:41   ` Junio C Hamano
2007-01-27 13:33     ` Jeff King
2007-01-27 17:47     ` Nicolas Pitre
2007-01-27  9:04   ` Simon 'corecode' Schubert [this message]
2007-01-27 12:58   ` Johannes Schindelin
2007-01-27 13:50     ` Simon 'corecode' Schubert
2007-01-27 16:30       ` Jakub Narebski
2007-01-27 17:36         ` Linus Torvalds
2007-01-27 16:46       ` Johannes Schindelin
2007-01-27 17:12         ` Simon 'corecode' Schubert
2007-01-27 19:13           ` Johannes Schindelin
2007-01-27 19:55             ` Simon 'corecode' Schubert
2007-01-27 19:41           ` Nicolas Pitre
2007-01-27 17:22   ` Linus Torvalds
2007-01-27 17:56     ` Linus Torvalds
2007-01-27 22:00       ` Junio C Hamano
2007-01-27 22:54         ` Linus Torvalds
2007-01-28  9:27           ` Junio C Hamano
2007-01-28  9:44             ` [PATCH] git-blame --porcelain: quote filename in c-style when needed Junio C Hamano
2007-01-28 14:25             ` [PATCH] git-blame --incremental: don't use pager René Scharfe
2007-01-28 19:09               ` Junio C Hamano
2007-01-28 19:14                 ` Junio C Hamano
2007-01-29  0:32                   ` René Scharfe
2007-01-29  2:35                     ` [PATCH] git blame --progress Junio C Hamano
2007-01-29  7:00                       ` Simon 'corecode' Schubert
2007-01-29 16:54                       ` Alex Riesen
2007-01-29 18:12                       ` Matthias Lederhofer
2007-01-29 19:06                         ` Junio C Hamano
2007-01-29 19:59                       ` René Scharfe
2007-01-29 20:24                         ` Linus Torvalds
2007-01-30  1:53                         ` Junio C Hamano
2007-01-28 19:08             ` More precise tag following Linus Torvalds
2007-01-28 19:18               ` Junio C Hamano
2007-01-28 19:57                 ` Linus Torvalds
2007-01-28 20:01                   ` Junio C Hamano
2007-01-28 20:20                   ` [PATCH] document 'blame --incremental' Junio C Hamano
2007-01-28 21:06                   ` More precise tag following Junio C Hamano
2007-01-28 23:01                     ` Jeff King
2007-01-30  9:22                   ` Junio C Hamano
2007-01-30 15:31                     ` Shawn O. Pearce
2007-01-30 17:02                     ` Linus Torvalds
2007-01-28 19:58                 ` Junio C Hamano
2007-01-29  6:18             ` Shawn O. Pearce
2007-01-29 10:17               ` Junio C Hamano
2007-01-29 10:31                 ` Shawn O. Pearce
2007-01-29 16:24               ` Linus Torvalds
2007-01-29 18:07                 ` Simon 'corecode' Schubert
2007-01-29 19:29                 ` Theodore Tso
2007-01-29 19:45                   ` Linus Torvalds
2007-01-29 20:25                   ` Jakub Narebski
2007-01-29 20:47                     ` Shawn O. Pearce
2007-01-29 21:02                       ` Jakub Narebski
2007-02-09  7:41                   ` Shawn O. Pearce
2007-01-31  8:39                 ` David Kågedal
2007-01-31 10:59                   ` David Kågedal
2007-01-31 16:12                   ` Peter Eriksen
2007-01-31 17:04                     ` David Kågedal
2007-01-31 17:12                       ` Peter Eriksen
2007-01-31 17:35                         ` Jakub Narebski
2007-01-31 20:59                         ` David Kågedal
2007-01-27 18:40     ` Simon 'corecode' Schubert
2007-01-27 19:02       ` Johannes Schindelin
2007-01-27 19:12         ` Simon 'corecode' Schubert
2007-01-27 19:19           ` Johannes Schindelin
2007-01-27 19:59           ` Jakub Narebski
2007-01-27 19:15       ` Linus Torvalds
2007-01-27 19:25         ` Linus Torvalds
2007-01-27 19:54           ` Jakub Narebski
2007-01-27 20:13             ` Linus Torvalds
2007-01-27 19:36         ` Chris Lee
2007-01-28 18:10           ` Theodore Tso
2007-01-28 18:27             ` Linus Torvalds
2007-01-28 22:26           ` David Lang
2007-01-29 17:34             ` Nicolas Pitre
2007-01-29 17:42               ` Linus Torvalds
2007-01-29 17:58                 ` Nicolas Pitre
2007-01-29 19:16                   ` Chris Lee
2007-01-29 23:00           ` Eric Wong
2007-01-30  0:42             ` Eric Wong
2007-01-30  0:48               ` Eric Wong
2007-01-30  8:51               ` Eric Wong
2007-01-27 18:52     ` Jakub Narebski
2007-01-27 20:16     ` Jeff King
2007-01-27 22:39       ` Linus Torvalds
2007-01-27 23:52         ` Jeff King
2007-01-28  2:39           ` Theodore Tso
2007-01-28  3:17             ` Randal L. Schwartz
2007-01-28 13:15             ` Jeff King
2007-01-28  7:40     ` Shawn O. Pearce

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45BB15B4.7030009@fs.ei.tum.de \
    --to=corecode@fs.ei.tum.de \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).