Git development
* Re: Cygwin can't handle huge packfiles?
From: Junio C Hamano @ 2006-04-06 23:53 UTC (permalink / raw)
  To: linux; +Cc: git
In-Reply-To: <20060406205724.12216.qmail@science.horizon.com>

linux@horizon.com writes:

>> Right now we LRU the pack files and evict older ones when we
>> mmap too many, but the unit of eviction is the whole file, so it
>> would not help the case like yours at all.  It might be possible
>> to mmap only part of a packfile, but it would involve fairly
>> major surgery to sha1_file.c.
>
> The simplest solution seems to be to limit pack file size to a reasonable
> fraction of a 32-bit address space.  Say, 0.5 G.

I do not think that would help the original poster's situation
where only 5 revs result in a 1.5G pack.  I would _almost_ say
"do not pack such a repository", but there is the initial
cloning over git-aware transports which always results in a
repository with a single pack.


* [PATCH] rev-list: honor --abbrev=<n> when doing --pretty=oneline
From: Eric Wong @ 2006-04-07  0:44 UTC (permalink / raw)
  To: Junio C Hamano, git

This should make --pretty=oneline a whole lot more readable for
people using 80-column terminals.

Note that --abbrev=DEFAULT_ABBREV was on by default before (but
it only affected the printing of the Merge: header).  Let me
know if anybody doesn't want the default behavior to change.
Also note that --abbrev without arguments is not supported by
rev-list, but --no-abbrev is supported if you want the old
behavior.

Originally I made abbrev affect the commit sha1 output
regardless of the pretty setting, but that broke some tests and
I figured it's most/only useful for --pretty=oneline (at least
that's why *I* want it :)

Signed-off-by: Eric Wong <normalperson@yhbt.net>

---

 rev-list.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

c4da073e8256499950e25e2c20ea0b3ec4c29b46
diff --git a/rev-list.c b/rev-list.c
index 22141e2..392209d 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -52,7 +52,10 @@ static void show_commit(struct commit *c
 		fputs(commit_prefix, stdout);
 	if (commit->object.flags & BOUNDARY)
 		putchar('-');
-	fputs(sha1_to_hex(commit->object.sha1), stdout);
+	if (abbrev && commit_format == CMIT_FMT_ONELINE)
+		fputs(find_unique_abbrev(commit->object.sha1, abbrev), stdout);
+	else
+		fputs(sha1_to_hex(commit->object.sha1), stdout);
 	if (revs.parents) {
 		struct commit_list *parents = commit->parents;
 		while (parents) {
-- 
1.3.0.rc2.g454a-dirty


* Re: [PATCH] rev-list: honor --abbrev=<n> when doing --pretty=oneline
From: Junio C Hamano @ 2006-04-07  1:29 UTC (permalink / raw)
  To: Eric Wong; +Cc: git
In-Reply-To: <20060407004455.GF15743@hand.yhbt.net>

Eric Wong <normalperson@yhbt.net> writes:

> Note that --abbrev=DEFAULT_ABBREV was on by default before (but
> it only affected the printing of the Merge: header).  Let me
> know if anybody doesn't want the default behavior to change.

I've never felt the need for abbreviating commit object names, so I
only had the abbrev variable to determine how the merge parents
are shown.  If you want to abbreviate the commit object names as
well, you _could_ do independent precision for parents and
commits, but that would be overkill.  So I'd rather see a switch
to turn abbreviation for commits on, perhaps like this:

        $ git-rev-list --pretty=oneline --abbrev-commit -n 3 master
        454a35b Add documentation for git-imap-send.
        ba3c937 blame.c: fix completely broken ancestry traversal.
        6cbd5d7 Tweaks to make asciidoc play nice.

        $ git-rev-list --pretty=oneline --abbrev=4 --abbrev-commit -n 3 master
        454a Add documentation for git-imap-send.
        ba3c9 blame.c: fix completely broken ancestry traversal.
        6cbd5 Tweaks to make asciidoc play nice.

Otherwise you might break Porcelains and people's scripts that
read from --pretty or --header output.
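
In the second example above, --abbrev=4 still prints five digits for
some commits because the abbreviation has to stay unique.  A rough
Python sketch of that idea follows; unique_abbrev() and the sample
object names are made up for illustration, and this is not the actual
find_unique_abbrev() code:

```python
def unique_abbrev(name, all_names, min_len):
    """Return the shortest prefix of `name`, at least `min_len` hex
    digits long, that no other name in `all_names` starts with."""
    others = [n for n in all_names if n != name]
    for length in range(min_len, len(name) + 1):
        prefix = name[:length]
        if not any(o.startswith(prefix) for o in others):
            return prefix
    return name  # nothing shorter than the full name is unique


# Hypothetical object names, shortened to 12 digits for readability.
names = ["454a35b00000", "ba3c93700000", "ba3c1f000000", "6cbd5d700000"]
print(unique_abbrev("454a35b00000", names, 4))  # prints 454a
print(unique_abbrev("ba3c93700000", names, 4))  # prints ba3c9
```

The requested length is only a lower bound, so scripts should not
assume a fixed column width in the abbreviated output.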

-- >8 --
diff --git a/rev-list.c b/rev-list.c
index 22141e2..1301502 100644
--- a/rev-list.c
+++ b/rev-list.c
@@ -30,6 +30,7 @@ static const char rev_list_usage[] =
 "    --unpacked\n"
 "    --header | --pretty\n"
 "    --abbrev=nr | --no-abbrev\n"
+"    --abbrev-commit\n"
 "  special purpose:\n"
 "    --bisect"
 ;
@@ -39,6 +40,7 @@ struct rev_info revs;
 static int bisect_list = 0;
 static int verbose_header = 0;
 static int abbrev = DEFAULT_ABBREV;
+static int abbrev_commit = 0;
 static int show_timestamp = 0;
 static int hdr_termination = 0;
 static const char *commit_prefix = "";
@@ -52,7 +54,10 @@ static void show_commit(struct commit *c
 		fputs(commit_prefix, stdout);
 	if (commit->object.flags & BOUNDARY)
 		putchar('-');
-	fputs(sha1_to_hex(commit->object.sha1), stdout);
+	if (abbrev_commit && abbrev)
+		fputs(find_unique_abbrev(commit->object.sha1, abbrev), stdout);
+	else
+		fputs(sha1_to_hex(commit->object.sha1), stdout);
 	if (revs.parents) {
 		struct commit_list *parents = commit->parents;
 		while (parents) {
@@ -317,6 +322,14 @@ int main(int argc, const char **argv)
 		}
 		if (!strcmp(arg, "--no-abbrev")) {
 			abbrev = 0;
+			continue;
+		}
+		if (!strcmp(arg, "--abbrev")) {
+			abbrev = DEFAULT_ABBREV;
+			continue;
+		}
+		if (!strcmp(arg, "--abbrev-commit")) {
+			abbrev_commit = 1;
 			continue;
 		}
 		if (!strncmp(arg, "--abbrev=", 9)) {


* Re: Cygwin can't handle huge packfiles?
From: linux @ 2006-04-07  3:05 UTC (permalink / raw)
  To: junkio, linux; +Cc: git
In-Reply-To: <7vk6a2uupy.fsf@assigned-by-dhcp.cox.net>

> I do not think that would help the original poster's situation
> where only 5 revs result in a 1.5G pack.  I would _almost_ say
> "do not pack such a repository", but there is the initial
> cloning over git-aware transports which always results in a
> repository with a single pack.

Huh?  Why not?  That repository has a lot of files.  For compression,
you want all versions of a file in one pack, and with few versions that
makes it easier to split up, not harder.

As for network transport of packs, I haven't studied the details,
but if you allow "thin packs" that have deltas relative to
objects not in the pack, then breaking up the pack anywhere
should be legal.

Or, if necessary, you can stuff an arbitrarily large file through
git-unpack-objects, which reads a stream from stdin without
attempting to mmap it.


(Speaking of unpack-objects.c, what's that "static unsigned long eof"
variable in there?  It never seems to be set to a non-zero value.)


* Re: [PATCH] rev-list: honor --abbrev=<n> when doing --pretty=oneline
From: Eric Wong @ 2006-04-07  3:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v64lmuqa5.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
> 
> > Note that --abbrev=DEFAULT_ABBREV was on by default before (but
> > it only affected the printing of the Merge: header).  Let me
> > know if anybody doesn't want the default behavior to change.
> 
> I've never felt the need for abbreviating commit object names, so I
> only had the abbrev variable to determine how the merge parents
> are shown.  If you want to abbreviate the commit object names as
> well, you _could_ do independent precision for parents and
> commits, but that would be overkill.  So I'd rather see a switch
> to turn abbreviation for commits on, perhaps like this:
> 
>         $ git-rev-list --pretty=oneline --abbrev-commit -n 3 master
>         454a35b Add documentation for git-imap-send.
>         ba3c937 blame.c: fix completely broken ancestry traversal.
>         6cbd5d7 Tweaks to make asciidoc play nice.
> 
>         $ git-rev-list --pretty=oneline --abbrev=4 --abbrev-commit -n 3 master
>         454a Add documentation for git-imap-send.
>         ba3c9 blame.c: fix completely broken ancestry traversal.
>         6cbd5 Tweaks to make asciidoc play nice.
> 
> Otherwise you might break Porcelains and people's scripts that
> read from --pretty or --header output.
> 
> -- >8 --

Sounds good, I like your patch.  I'm not thrilled with the length of the
'--abbrev-commit' switch, but I guess that's what aliases are for :>

-- 
Eric Wong


* [PATCH] git-svnimport: Don't assume that copied files haven't changed
From: Karl  Hasselström @ 2006-04-07  6:06 UTC (permalink / raw)
  To: Git Mailing List

Don't assume that a file that SVN claims was copied from somewhere
else is bit-for-bit identical with its parent, since SVN allows
changes to copied files before they are committed.

Without this fix, such copy-modify-commit operations cause the
imported file to lack the "modify" part -- that is, we get subtle data
corruption.

Signed-off-by: Karl Hasselström <kha@treskal.com>

---

 git-svnimport.perl |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/git-svnimport.perl b/git-svnimport.perl
index 114784f..4d5371c 100755
--- a/git-svnimport.perl
+++ b/git-svnimport.perl
@@ -616,9 +616,7 @@ sub commit {
 			}
 			if(($action->[0] eq "A") || ($action->[0] eq "R")) {
 				my $node_kind = node_kind($branch,$path,$revision);
-				if($action->[1]) {
-					copy_path($revision,$branch,$path,$action->[1],$action->[2],$node_kind,\@new,\@parents);
-				} elsif ($node_kind eq $SVN::Node::file) {
+				if ($node_kind eq $SVN::Node::file) {
 					my $f = get_file($revision,$branch,$path);
 					if ($f) {
 						push(@new,$f) if $f;
@@ -627,8 +625,15 @@ sub commit {
 						print STDERR "$revision: $branch: could not fetch '$opath'\n";
 					}
 				} elsif ($node_kind eq $SVN::Node::dir) {
-					get_ignore(\@new, \@old, $revision,
-						   $branch,$path);
+					if($action->[1]) {
+						copy_path($revision, $branch,
+							  $path, $action->[1],
+							  $action->[2], $node_kind,
+							  \@new, \@parents);
+					} else {
+						get_ignore(\@new, \@old, $revision,
+							   $branch, $path);
+					}
 				}
 			} elsif ($action->[0] eq "D") {
 				push(@old,$path);


* Re: parsecvs tool now creates git repositories
From: Keith Packard @ 2006-04-07  7:24 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: keithp, Jim Radford, Git Mailing List
In-Reply-To: <46a038f90604061622s5a7bee4eq6666d9b3796f70f6@mail.gmail.com>


On Fri, 2006-04-07 at 11:22 +1200, Martin Langhoff wrote:

> parsecvs is committing them with the "added file foo.x" message, not
> the actual commit message.

heh. my cvs repositories are all so kludged that no files have ever been
added, it appears. I'll fix this when I've got a copy of the moodle
repository. sf.net is as useful as always.

I suspect the change is as simple as checking the format of the log
message and the time stamps of the commits and then just dropping the
1.1 revision from the tree entirely.

-- 
keith.packard@intel.com


* Re: Cygwin can't handle huge packfiles?
From: Junio C Hamano @ 2006-04-07  8:15 UTC (permalink / raw)
  To: git; +Cc: Kees-Jan Dijkzeul, Linus Torvalds
In-Reply-To: <Pine.LNX.4.64.0604030734440.3781@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Mon, 3 Apr 2006, Linus Torvalds wrote:
>> 
>> That said, I think git _does_ have problems with large pack-files. We have 
>> some 32-bit issues etc
>
> I should clarify that. git _itself_ shouldn't have any 32-bit issues, but 
> the packfile data structure does. The index has 32-bit offsets into 
> individual pack-files. 
>
> That's not hugely fundamental,...

Linus _does_ understand what he means, but let me clarify and
outline a possible future direction.

 * pack-*.pack file has the following format:

   - The header appears at the beginning and consists of the following:

     4-byte signature
     4-byte version number (network byte order)
     4-byte number of objects contained in the pack (network byte order)

     Observation: we cannot have more than 4G versions ;-) and
     more than 4G objects in a pack.

   - The header is followed by number of object entries, each of
     which looks like this:

     (undeltified representation)
     n-byte type and length (3-bit type, (n-1)*7+4-bit length)
     compressed data

     (deltified representation)
     n-byte type and length (3-bit type, (n-1)*7+4-bit length)
     20-byte base object name
     compressed delta data

     Observation: length of each object is encoded in a variable
     length format and is not constrained to 32-bit or anything.

  - The trailer records 20-byte SHA1 checksum of all of the above.

 * pack-*.idx file has the following format:

  - The header consists of 256 4-byte network byte order
    integers.  N-th entry of this table records the number of
    objects in the corresponding pack, the first byte of whose
    object name is less than or equal to N.

    Observation: we would need to extend this to an array of
    8-byte integers to go beyond 4G objects per pack, but it is
    not strictly necessary.

  - The header is followed by sorted 24-byte entries, one entry
    per object in the pack.  Each entry is:

    4-byte network byte order integer, recording where the
    object is stored in the packfile as the offset from the
    beginning.

    20-byte object name.

    Observation: we would definitely need to extend this to
    8-byte integer plus 20-byte object name to handle a packfile
    that is larger than 4GB.

  - The file is concluded with a trailer:

    A copy of the 20-byte SHA1 checksum at the end of
    corresponding packfile.

    20-byte SHA1-checksum of all of the above.
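
To make the idx layout concrete, here is a small Python sketch that
builds the fan-out table plus main table in memory and looks an object
up in it.  It assumes fanout[N] counts the objects whose first name
byte is N or less (which is what makes the last entry equal to the
total object count).  build_idx_body() and find_offset() are
illustrative names, the trailer checksums are left out, and none of
this is the code in sha1_file.c:

```python
import struct

ENTRY = 24           # 4-byte offset + 20-byte object name
HEADER = 256 * 4     # fan-out table


def build_idx_body(objects):
    """objects: list of (offset, 20-byte name).  Returns the fan-out
    table followed by the sorted entries (no trailer in this sketch)."""
    objects = sorted(objects, key=lambda o: o[1])
    fanout = [0] * 256
    for _, name in objects:
        fanout[name[0]] += 1
    for i in range(1, 256):
        fanout[i] += fanout[i - 1]   # cumulative: first byte <= i
    out = struct.pack(">256I", *fanout)
    for offset, name in objects:
        out += struct.pack(">I", offset) + name
    return out


def find_offset(idx, name):
    """Binary-search only the slice of entries the fan-out selects."""
    fanout = struct.unpack_from(">256I", idx, 0)
    first = name[0]
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    while lo < hi:
        mid = (lo + hi) // 2
        pos = HEADER + mid * ENTRY
        entry_name = idx[pos + 4:pos + ENTRY]
        if entry_name == name:
            return struct.unpack_from(">I", idx, pos)[0]
        if entry_name < name:
            lo = mid + 1
        else:
            hi = mid
    return None


# Three made-up object names: two starting with 00, one with 01.
a = bytes([0x00]) + b"\x11" * 19
b = bytes([0x00]) + b"\x22" * 19
c = bytes([0x01]) + b"\x33" * 19
idx = build_idx_body([(300, c), (12, a), (150, b)])
print(find_offset(idx, c))  # prints 300
```

The ">I" in the entry layout is the 32-bit offset that limits a
packfile to 4GB.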

This is not fundamental, in that pack idx file is something we
can regenerate from a packfile.  The push/fetch transfer over
git native protocols does not even transfer pack idx file;
instead, the recipient uses git-index-pack to generate pack idx.
git-index-pack would need to be updated to write the necessary
fields as 8-byte integers, without breaking existing packfiles.

The code to read idx file currently has a sanity check logic to
make sure that the size of the idx file is consistent with
24-byte entries (the last entry in the header matches the number
of objects recorded in the pack).  So we could reliably tell
between the current 24-byte version and 28-byte "beyond 4GB"
version, and support both formats at the same time.
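
A sketch of how such a check could tell the two layouts apart, in
Python for brevity.  The 28-byte layout does not exist yet, and
idx_entry_size() and fake_idx() are hypothetical names used only for
this illustration:

```python
import struct

HEADER = 256 * 4    # fan-out table
TRAILER = 20 + 20   # packfile checksum + idx checksum


def idx_entry_size(idx):
    """Return 24 or 28, whichever entry size is consistent with the
    file size and the object count stored in the last fan-out slot."""
    nr = struct.unpack_from(">I", idx, 255 * 4)[0]
    body = len(idx) - HEADER - TRAILER
    for size in (24, 28):
        if body == nr * size:
            return size
    raise ValueError("idx file size does not match its object count")


def fake_idx(nr, entry_size):
    """Build a zero-filled idx of the right shape, for testing only."""
    fanout = struct.pack(">256I", *([nr] * 256))
    return fanout + bytes(nr * entry_size) + bytes(TRAILER)


print(idx_entry_size(fake_idx(3, 24)))  # prints 24
print(idx_entry_size(fake_idx(3, 28)))  # prints 28
```

An empty pack (zero objects) matches both sizes; the sketch reports 24
in that case, which is harmless because there are no entries to read.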

Even after we start supporting the 28-byte "beyond 4GB" format,
we can and we should continue writing the current 24-byte
version of pack idx file when the packfile offset can be
expressed with 32-bit.

Having said that, I have to warn that this is not for the weak of
heart.  The necessary changes would be somewhat involved.


----------------------------------------------------------------

Pack idx file

	idx
	    +--------------------------------+
	    | fanout[0] = 2                  |-.
	    +--------------------------------+ |
	    | fanout[1]                      | |
	    +--------------------------------+ |
	    | fanout[2]                      | |
	    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
	    | fanout[255]                    | |
	    +--------------------------------+ |
main	    | offset                         | |
index	    | object name 00XXXXXXXXXXXXXXXX | |
table	    +--------------------------------+ | 
	    | offset                         | |
	    | object name 00XXXXXXXXXXXXXXXX | |
	    +--------------------------------+ |
	  .-| offset                         |<+
	  | | object name 01XXXXXXXXXXXXXXXX |
	  | +--------------------------------+
	  | | offset                         |
	  | | object name 01XXXXXXXXXXXXXXXX |
	  | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	  | | offset                         |
	  | | object name FFXXXXXXXXXXXXXXXX |
	  | +--------------------------------+
trailer	  | | packfile checksum              |
	  | +--------------------------------+
	  | | idxfile checksum               |
	  | +--------------------------------+
          .-------.      
                  |
Pack file entry: <+

     packed object header:
	1-byte type (bit 4-6)
	       size0 (bit 0-3)
               end-of-length (bit 7)
        n-byte sizeN (as long as MSB is set, each 7-bit)
		size0..sizeN form 4+7+7+..+7 bit integer, size0
		is the least significant part.
     packed object data:
        If it is not DELTA, then deflated bytes (the size above
		is the size before compression).
	If it is DELTA, then
	  20-byte base object name SHA1 (the size above is the
	  	size of the delta data that follows).
          delta data, deflated.
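
The packed object header can be sketched in Python as follows.  This
assumes each continuation byte contributes the next more significant
7 bits, with size0 being the low 4 bits, which is how the existing
reader accumulates the size.  The function names are made up for the
illustration:

```python
def decode_object_header(buf, pos=0):
    """Return (type, size, next_pos) for the header starting at buf[pos]."""
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x7    # bits 4-6
    size = byte & 0x0F              # size0: bits 0-3
    shift = 4
    while byte & 0x80:              # bit 7 set: another size byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return obj_type, size, pos


def encode_object_header(obj_type, size):
    """Inverse of decode_object_header()."""
    byte = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(byte | 0x80)     # more size bytes to come
        byte = size & 0x7F
        size >>= 7
    out.append(byte)
    return bytes(out)


print(decode_object_header(bytes([0xB8, 0x3E])))  # prints (3, 1000, 2)
```

Nothing here caps the size at 32 bits; a 5GB object simply takes more
continuation bytes.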


* Re: Cygwin can't handle huge packfiles?
From: Jakub Narebski @ 2006-04-07  8:27 UTC (permalink / raw)
  To: git
In-Reply-To: <7vhd55ls24.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano wrote:

>  * pack-*.pack file has the following format:
[...]
>  * pack-*.idx file has the following format:
[...]
Could you please put the information in parent post somewhere in
Documentation, for example Documentation/technical/pack-format.txt
(perhaps together with putting description of packing heuristic from
http://marc.theaimsgroup.com/?l=git&m=114134881923320 by Jon Loeliger in
Documentation/technical/pack-heuristics.txt even if it doesn't conform to
"serious documentation" standards)?

Thanks in advance
-- 
Jakub Narebski
Warsaw, Poland


* blame now knows -S
From: Junio C Hamano @ 2006-04-07  9:28 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git, Fredrik Kuivinen

I've made a few changes to "git blame" myself:

 - fix breakage caused by recent revision walker reorganization;
 - use built-in xdiff instead of spawning GNU diff;
 - implement -S <ancestry-file> like annotate does.

Depending on the density of changes, it now appears that blame
is 10%-30% faster than annotate.  I thought CVS emulator might
be interested to give it a whirl..


* Re: blame now knows -S
From: Junio C Hamano @ 2006-04-07  9:32 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git, Fredrik Kuivinen
In-Reply-To: <7v1ww9loon.fsf@assigned-by-dhcp.cox.net>

Junio C Hamano <junkio@cox.net> writes:

> I've made a few changes to "git blame" myself:
>
>  - fix breakage caused by recent revision walker reorganization;
>  - use built-in xdiff instead of spawning GNU diff;
>  - implement -S <ancestry-file> like annotate does.
>
> Depending on the density of changes, it now appears that blame
> is 10%-30% faster than annotate.  I thought CVS emulator might
> be interested to give it a whirl..

Sorry, forgot to mention... The updated blame will be in "next",
not in "master" yet.


* Re: Cygwin can't handle huge packfiles?
From: Nicolas Pitre @ 2006-04-07 14:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Kees-Jan Dijkzeul, Linus Torvalds
In-Reply-To: <7vhd55ls24.fsf@assigned-by-dhcp.cox.net>

On Fri, 7 Apr 2006, Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > On Mon, 3 Apr 2006, Linus Torvalds wrote:
> >> 
> >> That said, I think git _does_ have problems with large pack-files. We have 
> >> some 32-bit issues etc
> >
> > I should clarify that. git _itself_ shouldn't have any 32-bit issues, but 
> > the packfile data structure does. The index has 32-bit offsets into 
> > individual pack-files. 
> >
> > That's not hugely fundamental,...
> 
> Linus _does_ understand what he means, but let me clarify and
> outline a possible future direction.
> 
[...]

For the record, the delta code also has 32-bit limitations of its own 
presently.  It cannot encode a delta against a buffer which is larger 
than 4GB.

I however made sure the byte 0 could be used as a prefix for future 
encoding extensions, like 64-bit file offsets for example.


Nicolas


* Git is one year old today
From: Luck, Tony @ 2006-04-07 16:16 UTC (permalink / raw)
  To: git

Happy birthday to git ... one year old today.  Counting
the "birth" as the point at which Linus made the first commit
of the git sources into git:

 commit e83c5163316f89bfbde7d9ab23ca2e25604af290
 Author: Linus Torvalds <torvalds@ppc970.osdl.org>
 Date:   Thu Apr 7 15:13:13 2005 -0700

    Initial revision of "git", the information manager from hell

-Tony


* Re: Cygwin can't handle huge packfiles?
From: Junio C Hamano @ 2006-04-07 18:31 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0604071002530.2215@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> On Fri, 7 Apr 2006, Junio C Hamano wrote:
>
>> Linus Torvalds <torvalds@osdl.org> writes:
>> 
>> > On Mon, 3 Apr 2006, Linus Torvalds wrote:
>> >> 
>> >> That said, I think git _does_ have problems with large pack-files. We have 
>> >> some 32-bit issues etc
>> >
>> > I should clarify that. git _itself_ shouldn't have any 32-bit issues, but 
>> > the packfile data structure does. The index has 32-bit offsets into 
>> > individual pack-files. 
>> >
>> > That's not hugely fundamental,...
>> 
>> Linus _does_ understand what he means, but let me clarify and
>> outline a possible future direction.
>
> For the record, the delta code also has 32-bit limitations of its own 
> presently.  It cannot encode a delta against a buffer which is larger 
> than 4GB.
>
> I however made sure the byte 0 could be used as a prefix for future 
> encoding extensions, like 64-bit file offsets for example.

True, the delta data representation, not just the "delta code",
has that limitation, but I do not think you issue an "insert 0-byte
literal data" command from the deltifier side right now, so we
should be OK.

Maybe we would want to check (cmd == 0) case to detect delta
extension that we do not handle right now?
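
Roughly what such a check would look like, sketched in Python rather
than in patch-delta.c.  The copy-command bit layout below (up to four
offset bytes, up to three size bytes, size 0 meaning 0x10000) is
written from memory of the delta encoding and should be treated as an
assumption, and apply_delta_ops() is a made-up name that skips the
size header at the start of a real delta:

```python
def apply_delta_ops(src, ops):
    """Apply the command part of a delta to `src` and return the result."""
    out = bytearray()
    pos = 0
    while pos < len(ops):
        cmd = ops[pos]
        pos += 1
        if cmd & 0x80:
            # Copy from source; the low bits say which bytes follow.
            offset = size = 0
            for bit in range(4):            # at most 4 offset bytes: 32 bits
                if cmd & (0x01 << bit):
                    offset |= ops[pos] << (8 * bit)
                    pos += 1
            for bit in range(3):            # at most 3 size bytes
                if cmd & (0x10 << bit):
                    size |= ops[pos] << (8 * bit)
                    pos += 1
            if size == 0:
                size = 0x10000
            out += src[offset:offset + size]
        elif cmd:
            # Insert `cmd` literal bytes taken from the delta itself.
            out += ops[pos:pos + cmd]
            pos += cmd
        else:
            # cmd == 0 would mean "insert 0 bytes"; keep it reserved.
            raise ValueError("unexpected delta opcode 0")
    return bytes(out)


# Copy 5 bytes from offset 6, then insert the literal "!".
ops = bytes([0x91, 6, 5, 1]) + b"!"
print(apply_delta_ops(b"hello world", ops))  # prints b'world!'
```

Rejecting opcode 0 today means an old reader fails loudly instead of
silently misapplying a delta that uses a later extension.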


* Can't export whole repo as patches
From: Peter Baumann @ 2006-04-07 18:47 UTC (permalink / raw)
  To: git

I'd like to export the whole history of a project of mine via patches
but I can't get the initial commit.

How can I get the initial commit as a patch?

That's what I tried:

  git --version
  git version 1.2.4				# debian sarge

  mkdir /tmp/testrepo && cd /tmp/testrepo
  git-init-db
  echo a > a_file.txt
  git-add a_file.txt
  git-commit -a -m "a_file added"
  echo b >> a_file.txt
  git-commit -a -m "a_file modified"
  xp:/tmp/testrepo git-format-patch master~1
  0001-a_file-modified.txt
  cat 0001-a_file-modified.txt
  From nobody Mon Sep 17 00:00:00 2001
  From: Peter Baumann <peter.baumann@gmail.com>
  Date: Fri Apr 7 12:20:54 2006 +0200
  Subject: [PATCH] a_file modified

  ---

   a_file.txt |    1 +
   1 files changed, 1 insertions(+), 0 deletions(-)

  d8ceeed82a29004c066a98e0d390818e65fa9da7
  diff --git a/a_file.txt b/a_file.txt
  index 7898192..422c2b7 100644
  --- a/a_file.txt
  +++ b/a_file.txt
  @@ -1 +1,2 @@
   a
  +b
  --
  1.2.4


As you can see, there is only a patch of the second commit. But it seems that
this behaviour is correct, because I asked for the diff between master^..master

Obviously, I wanted a way to get the diff of master~2..master.

Trying harder:

  git-format-patch master~2
  Not a valid rev master~2 (master~2..HEAD)

Any hint to the correct way is appreciated.

</me thinking loudly>
The best would be if git would have an implicit tag or branch called "init"
(name doesn't really matter) which is the root of an empty repository. In that case
one can do git-format-patch root..master and it would do the right thing.

Greetings,
  Peter Baumann


* Re: Cygwin can't handle huge packfiles?
From: Nicolas Pitre @ 2006-04-07 18:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhd55jkz0.fsf@assigned-by-dhcp.cox.net>

On Fri, 7 Apr 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Fri, 7 Apr 2006, Junio C Hamano wrote:
> >
> >> Linus Torvalds <torvalds@osdl.org> writes:
> >> 
> >> > On Mon, 3 Apr 2006, Linus Torvalds wrote:
> >> >> 
> >> >> That said, I think git _does_ have problems with large pack-files. We have 
> >> >> some 32-bit issues etc
> >> >
> >> > I should clarify that. git _itself_ shouldn't have any 32-bit issues, but 
> >> > the packfile data structure does. The index has 32-bit offsets into 
> >> > individual pack-files. 
> >> >
> >> > That's not hugely fundamental,...
> >> 
> >> Linus _does_ understand what he means, but let me clarify and
> >> outline a possible future direction.
> >
> > For the record, the delta code also has 32-bit limitations of its own 
> > presently.  It cannot encode a delta against a buffer which is larger 
> > than 4GB.
> >
> > I however made sure the byte 0 could be used as a prefix for future 
> > encoding extensions, like 64-bit file offsets for example.
> 
> True, the delta data representation, not just the "delta code",
> has that limitation, but I do not think you issue an "insert 0-byte
> literal data" command from the deltifier side right now, so we
> should be OK.
> 
> Maybe we would want to check (cmd == 0) case to detect delta
> extension that we do not handle right now?

Good idea.  Will send you a patch.


Nicolas


* Re: Can't export whole repo as patches
From: Junio C Hamano @ 2006-04-07 19:18 UTC (permalink / raw)
  To: Peter Baumann; +Cc: git
In-Reply-To: <20060407184701.GA6686@xp.machine.de>

Peter Baumann <peter.baumann@gmail.com> writes:

> How can I get the initial commit as a patch?

format-patch is designed to get a patch to send to upstream, and
does not handle the root commit.  In your two revisions
repository, you could do something like this:

	$ git diff-tree -p --root master~1

Or more in general:

	$ git rev-list master |
          git diff-tree --stdin --root --pretty=fuller -p

BTW, I've been meaning to add --pretty=patch to give
format-patch compatible output to diff-tree, but haven't got
around to actually do it.  Another thing I've been meaning to do
is "git log --diff" which is more or less "git whatchanged".


* realloc
From: Morten Welinder @ 2006-04-07 20:11 UTC (permalink / raw)
  To: GIT Mailing List

I could be wrong, but shouldn't

      var = realloc (var, whatever);

be changed to call xrealloc?  That, or assign to a different variable and check
for NULL.

This should affect the last four hits below.

M.




/scratch/welinder/git> grep -w realloc *.c
daemon.c:               newlist = realloc(socklist, sizeof(int) *
(socknum + 1));
diff-delta.c:                           out = realloc(out, outsize);
git.c:          cmdname = realloc(cmdname, cmdname_alloc * sizeof(*cmdname));
ls-files.c:             which->excludes = realloc(which->excludes,
sha1_file.c:                            buf = realloc(buf, size);


* Re: realloc
From: Junio C Hamano @ 2006-04-07 20:35 UTC (permalink / raw)
  To: git
In-Reply-To: <118833cc0604071311v1da93f83n112cc2ea44552ca9@mail.gmail.com>

"Morten Welinder" <mwelinder@gmail.com> writes:

> I could be wrong, but shouldn't
>
>       var = realloc (var, whatever);
>
> be changed to call xrealloc?  That, or assign to a different variable and check
> for NULL.
>
> This should affect the last four hits below.
>
> M.
>
>
>
>
> /scratch/welinder/git> grep -w realloc *.c
> daemon.c:               newlist = realloc(socklist, sizeof(int) *
> (socknum + 1));
> diff-delta.c:                           out = realloc(out, outsize);
> git.c:          cmdname = realloc(cmdname, cmdname_alloc * sizeof(*cmdname));
> ls-files.c:             which->excludes = realloc(which->excludes,
> sha1_file.c:                            buf = realloc(buf, size);

There is no excuse for not using xrealloc() in git.c,
ls-files.c, and sha1_file.c.

The diff-delta.c code wants to be independent from the rest of
git code, so it probably should check the returned value itself.

Historically, daemon.c also wanted to be independent from the
rest of git to a certain degree, and I suspect that is still the
case (it uses small pieces from the packet interface but that is
about it).


* [PATCH] Fix paths on FreeBSD by processing gitk like other scripts
From: Eric Anholt @ 2006-04-07 21:03 UTC (permalink / raw)
  To: git



The paths for python and tk are not /usr/bin for FreeBSD, so I moved
gitk to gitk.tk and added a rule to sed in the proper path to "wish" in
making gitk, and also added the appropriate default path for python.

-- 
Eric Anholt                     anholt@FreeBSD.org
eric@anholt.net                 eric.anholt@intel.com

[-- Attachment #1.2: git-freebsd.diff --]
[-- Type: text/x-patch, Size: 1635 bytes --]

diff --git a/.gitignore b/.gitignore
index b5959d6..e9d5a7b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -121,6 +121,7 @@ git-verify-tag
 git-whatchanged
 git-write-tree
 git-core-*/?*
+gitk
 test-date
 test-delta
 common-cmds.h
diff --git a/Makefile b/Makefile
index 3367b8c..de28dec 100644
--- a/Makefile
+++ b/Makefile
@@ -136,6 +136,9 @@ SCRIPT_PERL = \
 SCRIPT_PYTHON = \
 	git-merge-recursive.py
 
+SCRIPT_TK = \
+	gitk.tk
+
 SCRIPTS = $(patsubst %.sh,%,$(SCRIPT_SH)) \
 	  $(patsubst %.perl,%,$(SCRIPT_PERL)) \
 	  $(patsubst %.py,%,$(SCRIPT_PYTHON)) \
@@ -174,6 +177,15 @@ # Backward compatibility -- to be remove
 PROGRAMS += git-ssh-pull$X git-ssh-push$X
 
 # Set paths to tools early so that they can be used for version tests.
+ifeq ($(uname_S),FreeBSD)
+	ifndef PYTHON_PATH
+		PYTHON_PATH = /usr/local/bin/python
+	endif
+	ifndef WISH_PATH
+		WISH_PATH = /usr/local/bin/wish8.4
+	endif	
+endif
+
 ifndef SHELL_PATH
 	SHELL_PATH = /bin/sh
 endif
@@ -183,6 +195,9 @@ endif
 ifndef PYTHON_PATH
 	PYTHON_PATH = /usr/bin/python
 endif
+ifndef WISH_PATH
+	WISH_PATH = wish
+endif
 
 PYMODULES = \
 	gitMergeCommon.py
@@ -484,6 +499,12 @@ common-cmds.h: Documentation/git-*.txt
 	    -e 's|@@GIT_PYTHON_PATH@@|$(GIT_PYTHON_DIR_SQ)|g' \
 	    -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
 	    $@.py >$@
+	chmod +x $@
+
+$(patsubst %.tk,%,$(SCRIPT_TK)) : % : %.tk
+	rm -f $@
+	sed -e "s|exec wish|exec $(WISH_PATH)|" \
+	    $@.tk >$@
 	chmod +x $@
 
 git-cherry-pick: git-revert
diff --git a/gitk b/gitk.tk
similarity index 100%
rename from gitk
rename to gitk.tk


* [ANNOUNCE] Stacked GIT 0.9
From: Catalin Marinas @ 2006-04-07 22:05 UTC (permalink / raw)
  To: git

Stacked GIT 0.9 release is available from http://www.procode.org/stgit/

StGIT is a Python application providing similar functionality to Quilt
(i.e. pushing/popping patches to/from a stack) on top of GIT. These
operations are performed using GIT commands and the patches are stored
as GIT commit objects, allowing easy merging of the StGIT patches into
other repositories using standard GIT functionality.

The main features in this release:

  # Faster three-way merge by using 'git-read-tree --aggressive' and
    dealing with conflicts internally (gitmergeonefile.py removed)
  # StGIT repositories are now 'git prune'-safe
  # 'show' command for displaying a given patch
  # 'uncommit' command for reversing the effects of 'commit'
  # '--series' option added to the 'import' command
  # '--merged' option added to the 'push' and 'pull' commands to check
    for patches merged upstream
  # '--undo' option added to 'refresh'
  # Patch refreshing can be done for individual files only
  # '--stdout' option added to 'export'
  # '--mbox' option added to 'mail'
  # 'smtpdelay' configuration option for delays between sending messages
  # $PAGER or the 'pager' configuration option used for the 'show' and
    'diff' commands
  # '--force' option removed from the 'new' command
  # Bug fixes

Acknowledgements (generated with 'git shortlog'):

Catalin Marinas:
      Fix the clone command failure
      Fix the 'status --reset' for individual files
      Remove the --force option for new
      Allow patch refreshing for some files only
      Use the GIT-specific environment as default
      Check whether the file exists in the fold command
      Add prune-safety to StGIT
      Allow tag objects to be passed on the command line
      Add --stdout option to export
      Add --mbox option to the 'mail' command
      Fix the e-mail address escaping
      Fix the reset command to set HEAD to a proper id
      Allow stg to be loaded in pydb and not run main()
      Print a shorter usage message with the --help option
      Add a merged upstream test for pull and push
      Add --series to import
      Cache the base_dir value for subsequent calls
      Pass the --aggressive flag to git-read-tree
      gitmergeonefile.py should use git.get_base_dir()
      Deal with merge conflicts directly
      Add the --patch option to export
      Add the --strip option to import
      Fix the patch name stripping in import
      Update the TODO file
      The gitmergeonefile config section is deprecated
      Add the "smtpdelay" config option
      Create stgit/basedir.py for determining the .git directory
      Remove the checking for the default configuration values
      Add extra headers to the e-mail messages
      Add the '--undo' option to 'refresh'
      Add a 'show' command
      Remove the basedir exception throwing
      Use a pager for diff and show commands
      Use 'git-*' instead of 'git *'
      Release 0.9

Chuck Lever:
      "stg pull" says "popping all patches" even when it doesn't
      Use a separate directory for patches under each branch subdir
      Add an option to "stg branch" to convert the internal format

Karl Hasselström:
      [PATCH 2/2] Add 'stg uncommit' command
      Use --refid option even when sending a cover mail
      Change the signature start string to "-- \n"
      Update .git/refs/heads/base after patch deletion

Kirill Smelkov:
      [trivial]  fix spelling typos

Paolo 'Blaisorblade' Giarrusso:
      Stgit - gitmergeonefile.py: handle removal vs. changes
      Pass --directory to git-ls-files for stg status

Pavel Roskin:
      stgit: typo fixes
      Make tutorial a valid asciidoc article.
      stg export: check if there are any patches to export
      Treat "stg --help cmd" and "stg help cmd" like "stg cmd
      Improve "stg uncommit" help text.

Sam Vilain:
      common: parse 'email (name)' correctly

--
Catalin


* [ANNOUNCE] GIT 1.2.6
From: Junio C Hamano @ 2006-04-08  0:56 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

The latest maintenance release GIT 1.2.6 is available at the
usual places:

	http://www.kernel.org/pub/software/scm/git/

	git-1.2.6.tar.{gz,bz2}			(tarball)
	RPMS/$arch/git-*-1.2.6-1.$arch.rpm	(RPM)

These fixes are my birthday present to git ;-).  I'll also do
the 1.3.0-rc3 tonight.

----------------------------------------------------------------

Changes since v1.2.5 are as follows:

Junio C Hamano:
      parse_date(): fix parsing 03/10/2006
      diff_flush(): leakfix.
      count-delta: match get_delta_hdr_size() changes.

Nicolas Pitre:
      check patch_delta bounds more carefully


* [ANNOUNCE] Cogito-0.17.2
From: Petr Baudis @ 2006-04-08  1:06 UTC (permalink / raw)
  To: git; +Cc: linux-kernel

  Hello,

  to join the series of git-related announcements, Cogito-0.17.2, the next
maintenance release on the current stable (v0.17) branch of Cogito, the
human-friendly version control system on top of Git, is available now.

  There are only a few changes; it looks like we are pretty stable:

Chris Wright:
      cogito spec BuildRequires update

Dennis Stosberg:
      cogito: Push tags over http

Petr Baudis:
      Improved cg-version output (use cg-object-id -d)
      cg-patch -c: Stop also at ^diff --git when slurping the commit message
      Fixed embarassing cg-admin-rewritehist bug
      Make cg-add/rm warnings less confusing: s/files/items/
      cogito-0.17.2


P.S.: Visit us at #git @ FreeNode!

  Happy hacking,

-- 
				Petr "Stable Pasky" Baudis
Stuff: http://pasky.or.cz/
Of the 3 great composers Mozart tells us what it's like to be human,
Beethoven tells us what it's like to be Beethoven and Bach tells us
what it's like to be the universe.  -- Douglas Adams


* Re: git/cogito suggestion: tags with descriptions
From: Petr Baudis @ 2006-04-08  2:35 UTC (permalink / raw)
  To: Zack Brown; +Cc: Junio C Hamano, git
In-Reply-To: <20050912010051.GJ15630@pasky.or.cz>

Dear diary, on Mon, Sep 12, 2005 at 03:00:51AM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Dear diary, on Mon, Sep 05, 2005 at 11:24:31PM CEST, I got a letter
> where Zack Brown <zbrown@tumblerings.org> told me that...
> > I'm not sure. I'm not as familiar with the low-level git commands as I am with
> > cogito. But cogito has a -d option for giving a tag description. I guess what
> > would be closest to what I was thinking about would be this:
> > 
> > $ cg-tag -d "First draft, everything in place." 0.3 7540e503b9b9c1b03e44ee7fd700c844b2a02224
> > $ cg-tag-ls
> > 0.1     Initial idea complete                 f953b71b21a0bea682c2bed91362f2dce2cc204f
> > 0.3     First draft, everything in place.     7540e503b9b9c1b03e44ee7fd700c844b2a02224 
> > $
> > 
> > or something like that. Currently when I do the above cg-tag command,
> > a subsequent cg-tag-ls gives just:
> > 
> > $ cg-tag-ls
> > 0.1     f953b71b21a0bea682c2bed91362f2dce2cc204f
> > 0.3     7540e503b9b9c1b03e44ee7fd700c844b2a02224
> > 
> > In fact, I probably wouldn't even be interested in seeing the actual hash key
> > unless I gave a special flag, maybe -f (for "full"):
> > 
> > $ cg-tag-ls
> > 0.1     Initial idea complete
> > 0.3     First draft, everything in place.
> > $ cg-tag-ls -f
> > 0.1     Initial idea complete                 f953b71b21a0bea682c2bed91362f2dce2cc204f
> > 0.3     First draft, everything in place.     7540e503b9b9c1b03e44ee7fd700c844b2a02224
> 
> That's a nice idea (except that I'd prefer -l). I'll implement this
> after cogito-0.14.

So, I did. ;-) (In the master branch now.) The format is slightly
different from the proposed one:

	S cogito-0.16rc2   7766e3ba0664
	S cogito-0.17      51392f2dd82a  Poetic cogito-0.17.
	S cogito-0.17rc1   7cb4d8972d5b  Behold, cogito-0.17rc1! Plenty new features and cool stuff.
	% cogito-0.8       f9f0459b5b39
	% cogito-0.9       cc5517b4ea41
	  test             05862786175d

Object IDs are still shown, but abbreviated so they shouldn't get in the
way too much; the full first line is shown in the list output,
untrimmed. The initial flag column denotes signed tags by 'S', "direct
tags" (not pointing to a tag object) by '%' and broken tags by '!'.
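
For anyone who wants to post-process that listing, a minimal Python sketch
of a parser for the format shown above follows. It assumes the columns are
separated by runs of whitespace and that tag names and abbreviated IDs
contain no spaces; the TagLine type, the parse_tag_line function and the
"unflagged" label for a blank flag column are names invented for this
example, not part of Cogito.

```python
from typing import NamedTuple

# Flag column as described above; the blank entry's label is an assumption.
FLAGS = {"S": "signed", "%": "direct", "!": "broken", " ": "unflagged"}


class TagLine(NamedTuple):
    kind: str      # meaning of the flag column
    name: str      # tag name
    short_id: str  # abbreviated object ID
    summary: str   # first line of the description, or "" if there is none


def parse_tag_line(line: str) -> TagLine:
    """Split one listing line into flag, name, abbreviated ID and summary."""
    line = line.strip("\t\n")  # the samples above are tab-indented
    flag = line[:1] or " "
    # Name and ID are whitespace-separated; the summary is whatever remains.
    fields = line[1:].split(None, 2)
    name, short_id = fields[0], fields[1]
    summary = fields[2].rstrip() if len(fields) > 2 else ""
    return TagLine(FLAGS.get(flag, "unknown"), name, short_id, summary)
```

Because only the first two whitespace runs after the flag are treated as
separators, a description containing spaces stays in one piece.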

P.S.: Also, cg-tag received a lot of improvements in the last two days.
It now features the same cool editor as cg-commit (but only if run with
-e), -d was renamed to -m (but will stay aliased for quite some time),
cg-tag now also accepts multiple -m options for creating multi-paragraph
descriptions from the command line, and a bunch of other minor stuff was
implemented.
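
To illustrate the multiple -m idea, here is a small Python sketch of how
repeated -m options can be collected into one multi-paragraph description.
It is not how cg-tag is written; the build_description function and the
"tag-sketch" program name are hypothetical, and joining the paragraphs
with a blank line is an assumption about the resulting format.

```python
import argparse


def build_description(argv):
    """Return (tag name, description) from a cg-tag-like argument list."""
    parser = argparse.ArgumentParser(prog="tag-sketch")
    # action="append" collects every -m value, in command-line order.
    parser.add_argument("-m", dest="messages", action="append", default=None)
    parser.add_argument("name")
    args = parser.parse_args(argv)
    # Assumption: each -m becomes one paragraph, separated by a blank line.
    return args.name, "\n\n".join(args.messages or [])
```

For example, build_description(["-m", "First draft.", "-m", "More
details.", "0.3"]) yields the name "0.3" and a two-paragraph description.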

Thanks for the idea,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.


* Re: [PATCH] Script for automated historical Git tree grafting
From: Petr Baudis @ 2006-04-08  3:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel, git
In-Reply-To: <20060406175246.3bd1c972.akpm@osdl.org>

Dear diary, on Fri, Apr 07, 2006 at 02:52:46AM CEST, I got a letter
where Andrew Morton <akpm@osdl.org> said that...
> Petr Baudis <pasky@suse.cz> wrote:
> >
> > This script enables Git users to easily graft the historical Git tree
> >  (Bitkeeper history import) to the current history.
> 
> What impact will that have on the (already rather poor) performance of
> git-whatchanged, gitk, etc?

Negative. ;-)

I didn't try gitk myself, but according to Nick Riviera it eats 1.6G...
Otherwise, assuming that you have at least git-1.2.5, git-whatchanged on
the whole tree should be roughly as fast as it was before grafting,
but git-whatchanged on individual paths is _significantly_ slower.

That said, 1.3.0rc2 should already have Linus' optimization which should
fix or at least mitigate the performance hit on narrowed-down
git-whatchanged.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
Right now I am having amnesia and deja-vu at the same time.  I think
I have forgotten this before.
