Git development
 help / color / mirror / Atom feed
* Re: gitweb.cgi history not shown
From: Jakub Narebski @ 2006-06-11 16:59 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0606110933040.5498@g5.osdl.org>

Linus Torvalds wrote:


> Btw, this is also why I suggested adding a "--no-simplify-history" flag, 
> because in this case, that's exactly what _you_ want. The reason git is 
> doing something unexpected - and in your case inferior - is exactly that 
> what you want in this case is really not "explain the STATE of this file", 
> but you want "give me ALL THE HISTORY concerning this filename".
[...]
> Btw, the original "git whatchanged -p" answered exactly the question you 
> had, and the semantics changed when we rewrite "git whatchanged" to act 
> like "git log -p". 
[...]
> And I do agree that we should teach "git log" and friends to be able to 
> answer both questions, and that's what my suggested patch (fleshed out 
> properly, of course) should do.

Could we please 'git whatchanged -p' default to the original (before
rewrite) behavior, i.e. ALL THE HISTORY?

-- 
Jakub Narebski
Warsaw, Poland

^ permalink raw reply

* Re: gitweb.cgi history not shown
From: Linus Torvalds @ 2006-06-11 16:54 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0606110933040.5498@g5.osdl.org>


I just like talking to myself.

On Sun, 11 Jun 2006, Linus Torvalds wrote:
> 
> 	git-rev-list --all | git-diff-tree -p -- <filename>

That obviously wants a "--stdin" argument to git-diff-tree, and I might as 
well point out that it has a few other differences to doing this with the 
"--no-simplify-history" flag:

 - git-diff-tree obviously doesn't show merges normally, and when it does, 
   it would show only merges that change the file. In contrast, the "git 
   log" approach would show all merges that are part of the resulting 
   history (which, since you don't simplify merges, is _all_ of them).

 - the extra flag to "git log" approach allows "--parents" to work, ie the 
   stretches of commits in between merges would have their parents 
   rewritten, so that the history would be a unified whole, and you can 
   use qgit/gitk to show the result. The above pipeline obviously cannot 
   do that, since doing the filename limiter in git-diff-tree means that 
   it doesn't ever even _see_ the "history" part, it's just doing it one 
   commit at a time.

That concludes my monologue on the matter, I hope. If somebody wants to 
condense the issue of "show all history" vs "show how we got to this 
state" and add it to the Wiki FAQ thing, that would probably be good.

		Linus

^ permalink raw reply

* invalid command name "listrefs"
From: Junio C Hamano @ 2006-06-11 16:50 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: git

Starting gitk, "File->Reread references" results in an error
dialog.

This patch resurrects the procedure from an older version, and it
seems to work for me, but with the updated code it might be that
you wanted to use a different mechanism to implement rereadrefs
procedure -- I dunno.

-- >8 --
gitk: rereadrefs wants listrefs 

The listrefs procedure was removed during the course of
development, but there is still a user of it, so resurrect it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/gitk b/gitk
index 9be10a4..ba4644f 100755
--- a/gitk
+++ b/gitk
@@ -5196,6 +5196,24 @@ proc rereadrefs {} {
     }
 }
 
+proc listrefs {id} {
+    global idtags idheads idotherrefs
+
+    set x {}
+    if {[info exists idtags($id)]} {
+	set x $idtags($id)
+    }
+    set y {}
+    if {[info exists idheads($id)]} {
+	set y $idheads($id)
+    }
+    set z {}
+    if {[info exists idotherrefs($id)]} {
+	set z $idotherrefs($id)
+    }
+    return [list $x $y $z]
+}
+
 proc showtag {tag isnew} {
     global ctext tagcontents tagids linknum
 

^ permalink raw reply related

* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Linus Torvalds @ 2006-06-11 16:45 UTC (permalink / raw)
  To: Peter Eriksen; +Cc: git
In-Reply-To: <20060611103358.GB10430@bohr.gbar.dtu.dk>



On Sun, 11 Jun 2006, Peter Eriksen wrote:
> > 
> > Please include full copyright information.
> 
> Where should this information go?

Gaah, for soemthing as trivial as strlcpy(), please either just rewrite 
the function (you can only do it in so many ways), since the copyright 
message is _bigger_ and more annoying than the code itself.

Or, just attribute it. Say "this is a trivial implementation 'strlcpy()' 
shamelessly stolen from OpenBSD, originally under a BSD license, converted 
to GPLv2"

That whole "big copyright messages" thign is a _disease_. 

		Linus

^ permalink raw reply

* Re: gitweb.cgi history not shown
From: Linus Torvalds @ 2006-06-11 16:40 UTC (permalink / raw)
  To: Marco Costalba; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <Pine.LNX.4.64.0606110902090.5498@g5.osdl.org>



On Sun, 11 Jun 2006, Linus Torvalds wrote:
> 
> See? No renames. The renames is not what is fundamental here. What is 
> fundamental is the _STATE_ of the tree. Remember: that's what git tracks, 
> and that is what "git log" shows you.

Btw, this is also why I suggested adding a "--no-simplify-history" flag, 
because in this case, that's exactly what _you_ want. The reason git is 
doing something unexpected - and in your case inferior - is exactly that 
what you want in this case is really not "explain the STATE of this file", 
but you want "give me ALL THE HISTORY concerning this filename".

Both are very valid things to ask for, it just happens that "git log" by 
default answers the _other_ question. It does NOT answer the "what is all 
the history" question that you're asking, it answers the "how did this 
state come to be" question.

Btw, the original "git whatchanged -p" answered exactly the question you 
had, and the semantics changed when we rewrite "git whatchanged" to act 
like "git log -p". But you can still get the old semantics by hand, if you 
really want it, by doing

	git-rev-list --all | git-diff-tree -p -- <filename>

because (and this actually makes total sense when you look at it), you now 
actually say "first give me all the history" and then "show the actual 
changes in that history as it pertains to <filename>".

See? 

I hope this explains the not-so-subtle (but still easy to overlook) 
difference between the two.

And I do agree that we should teach "git log" and friends to be able to 
answer both questions, and that's what my suggested patch (fleshed out 
properly, of course) should do.

Not that I ever _tested_ it, of course. Me? Testing? You make me laugh. Ho 
Ho Ho.

			Linus

^ permalink raw reply

* Re: gitweb.cgi history not shown
From: Linus Torvalds @ 2006-06-11 16:19 UTC (permalink / raw)
  To: Marco Costalba; +Cc: junkio, git
In-Reply-To: <e5bfff550606102332v6afb7d51m43ad5d74ba226cf0@mail.gmail.com>



On Sun, 11 Jun 2006, Marco Costalba wrote:
> 
> Why I still get empty results if I run git-rev-list from gitweb merge point?

Because the rename happened _inside_ the merge. 

So when you give the revision 0a8f4f00, that really means the state 
_after_ the merge. At that point, the filename doesn't actually exist.

git-rev-list then looks at the parents, one by one, and sees that the 
first parent _matches_ the state as far as your path spec is concerned (in 
this case, it matches "it was empty before, it was empty after"), so it 
will literally _always_ pick the parent that you're not interested in 
(regardless of whether it would have been merged into, or was the one that 
got merged), because that's the one with the minimal history difference.

Going the other way (the way you actually wish it went) would have 
introduced more history changes that aren't needed to explain the final 
state, so git-rev-list - by virtue of trying to generate a _minimal_ 
history - will actively avoid it.

> Is this because path changed: gitweb.cgi -> gitweb/gitweb.cgi

Well, in one sense yes, but in a much more fundamental sense that rename 
really has nothing to do with the real issue.

The real issue is that you asked how the state of a non-existent file came 
to be, and git-rev-list told you the simplest answer: it never existed at 
all.

And that answer is actually _true_. Along one history, that filename never 
existed.

And this really has nothing to do with renames. You can see the exact same 
thing with files that are there. Let's take an example:

	   A	<-- top of tree
	  / \
	 B   C
	 |   |
	 D   E
	  \ /
	   F
	   |
	   .	<-- old history
	   .

Let's say that you have had a file called "file" for all of history, and 
it got changed sligtly differently in _all_ commits B, C, D _and_ E.

Now, despite all the different changes, let's say that the end result was 
identical in B and C - even though the diffs of those two commits were not 
necessarily the same (because they started out from different points: D 
and E respectively). 

In other words, there was a branch, but both branches ended up fixing the 
same bug the same way (and this is less unusual than you'd think, even if 
they are independent, but even more so if the branches "knew of each 
other" some other way, ie the developers talked about the problem and 
perhaps sent patches back-and-forth that both people applied).

What do you think git-rev-list will do when you give it that "file" as a 
path limiter?

What it will do is to notice that merge A has the same state (wrt that 
file) as commit B (it's first parent), SO IT WILL TOTALLY IGNORE THE WHOLE 
HISTORY THAT IS REACHABLE FROM C.

So git-rev-list will first simplify the tree to just A -> B -> D -> F .., 
and then, since A and B were identical in the path (and let's say F was 
identical to it's parents too), it will actually decide that as far as 
those commits were concerned, nothing changed, so the actual end result is 
just "B -> D -> ...", and none of A, C, E and F show up at all, even 
though both C and E really did change something (they just didn't 
_matter_, because all the changes could be explained by just picking B and 
D).

See? No renames. The renames is not what is fundamental here. What is 
fundamental is the _STATE_ of the tree. Remember: that's what git tracks, 
and that is what "git log" shows you.

So when you do

	git log -- gitweb.cgi

you're really asking for: "Please explain the state of the current tree 
with regards to gitweb.cgi that doesn't exist at this point in time". And 
that's _exactly_ what "git log" will do. It will say "hey, I can explain 
it by the file not existing in one of the previous parents either: maybe 
it got removed there", and walk back as far as it possibly can to explain 
that the file doesn't exist.

And it turns out that it can walk all the way back to the root, and the 
file didn't exist there, so the end result is what? The empty set. 

			Linus

^ permalink raw reply

* [ANNOUNCE] Stacked GIT 0.10
From: Catalin Marinas @ 2006-06-11 16:15 UTC (permalink / raw)
  To: git, linux-kernel

Stacked GIT 0.10 release is available from http://www.procode.org/stgit/.

StGIT is a Python application providing similar functionality to Quilt
(i.e. pushing/popping patches to/from a stack) on top of GIT. These
operations are performed using GIT commands and the patches are stored
as GIT commit objects, allowing easy merging of the StGIT patches into
other repositories using standard GIT functionality.

The main features in this release:

    *  Handle branch names with slashes
    * Testsuite framework
    * Configurable file extensions for merge conflicts
    * 'goto' command, equivalent to 'pop --to' or 'push --to'
    * '--update' option for 'pick' to only fold the changes to files
      already in the current patch
    * '--replace' option for 'import' to replace existing patches in
      the series
    * Look for templates in ~/.stgit/templates as well
    * Allow multiple patch names on the 'push' command line
    * Classify commands in 'stg --help' output
    * Bug fixes

Acknowledgements (generated with 'git shortlog'):

Catalin Marinas:
      Allow git.checkout() to work on unmerged indexes
      Fix the added to both but different conflict
      Fix git.reset() to remove the added files
      Allow configurable file extensions for merge conflicts
      Add the --update option to pick
      Implement the 'goto' command
      Add the --replace option to import
      Fix the t1201-pull-trailing.sh test
      Show the stderr for failed GIT commands
      Generate an empty commit for the newly created patches
      Fix the add and rm commands to fail if no patch is applied
      Fix the t0001-subdir-branches.sh test
      Update the TODO file
      Release 0.10

Karl Hasselström:
      Fix infinite recursion on absolute paths
      Fix indexing error during "diff -r/"
      Explicitly specify utf-8 coding in file

Karl Hasselström:
      Don't die when there are no branches
      Handle branch names with slashes
      Tests for branch names with slashes

Pavel Roskin:
      Add .gitignore files, list generated files there
      Add a simple makefile

Yann Dirson:
      Add a testsuite framework copied from git-core
      Add list of bugs to TODO
      Add a couple of safety checks to series creation
      Make branch creation atomic
      Correctly handle refs/patches on series rename
      Fix a seriously bad interaction between .git caching and repo cloning
      Test that pulls a patch creating a file that got modified afterwards
      Exercise "stg pull" on patches just appending lines.
      Look for templates in ~/.stgit/templates as well
      Fixes to the pull testcases.
      Allow to specify multiple patch names on push command-line
      Classify commands in stg --help output.


-- 
Catalin

^ permalink raw reply

* [PATCH] gitweb: Adding a `blame' interface.
From: Florian Forster @ 2006-06-11 15:45 UTC (permalink / raw)
  To: git; +Cc: Florian Forster

This patch adds an interface for `git-blame' to `gitweb.cgi'. Links to it are
place in `git_blob'.
Internally the code uses `git-annotate' because `git-blame's output differs for
files that have been renamed in the past. However, I like the term `blame'
better.

Signed-off-by: Florian Forster <octo@verplant.org>


---

 gitweb/gitweb.cgi |  108 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 107 insertions(+), 1 deletions(-)

b11522d270365b293197680e43e8feb87328a352
diff --git a/gitweb/gitweb.cgi b/gitweb/gitweb.cgi
index ea21fbe..91c075d 100755
--- a/gitweb/gitweb.cgi
+++ b/gitweb/gitweb.cgi
@@ -203,6 +203,9 @@ if (!defined $action || $action eq "summ
 } elsif ($action eq "tag") {
 	git_tag();
 	exit;
+} elsif ($action eq "blame") {
+	git_blame();
+	exit;
 } else {
 	undef $action;
 	die_error(undef, "Unknown action.");
@@ -1228,6 +1231,107 @@ sub git_tag {
 	git_footer_html();
 }
 
+sub git_blame {
+	my $fd;
+	die_error('404 Not Found', "What file will it be, master?") if (!$file_name);
+	$hash_base ||= git_read_head($project);
+	die_error(undef, "Reading commit failed.") unless ($hash_base);
+	my %co = git_read_commit($hash_base)
+		or die_error(undef, "Reading commit failed.");
+	if (!defined $hash) {
+		$hash = git_get_hash_by_path($hash_base, $file_name, "blob")
+			or die_error(undef, "Error lookup file.");
+	}
+	open ($fd, "-|", "$gitbin/git-annotate", '-l', '-t', '-r', $file_name, $hash_base)
+		or die_error(undef, "Open failed.");
+	git_header_html();
+	print "<div class=\"page_nav\">\n" .
+		$cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=summary")}, "summary") .
+		" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=shortlog")}, "shortlog") .
+		" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=log")}, "log") .
+		" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=commit;h=$hash_base")}, "commit") .
+		" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=commitdiff;h=$hash_base")}, "commitdiff") .
+		" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=tree;h=$co{'tree'};hb=$hash_base")}, "tree") . "<br/>\n";
+	print $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blob;h=$hash;hb=$hash_base;f=$file_name")}, "blob") .
+		" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blame;f=$file_name")}, "head") . "<br/>\n";
+	print "</div>\n".
+		"<div>" .
+		$cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=commit;h=$hash_base"), -class => "title"}, esc_html($co{'title'})) .
+		"</div>\n";
+	print "<div class=\"page_path\"><b>" . esc_html($file_name) . "</b></div>\n";
+	print "<div class=\"page_body\">\n";
+	print <<HTML;
+<table style="border-collapse: collapse;">
+  <tr>
+    <th>Commit</th>
+    <th>Age</th>
+    <th>Author</th>
+    <th>Line</th>
+    <th>Data</th>
+  </tr>
+HTML
+	my @line_class = (qw(light dark));
+	my $line_class_len = scalar (@line_class);
+	my $line_class_num = $#line_class;
+	while (my $line = <$fd>) {
+		my $long_rev;
+		my $short_rev;
+		my $author;
+		my $time;
+		my $lineno;
+		my $data;
+		my $age;
+		my $age_str;
+		my $age_style;
+
+		chomp $line;
+		$line_class_num = ($line_class_num + 1) % $line_class_len;
+
+		if ($line =~ m/^([0-9a-fA-F]{40})\t\(\s*([^\t]+)\t(\d+) \+\d\d\d\d\t(\d+)\)(.*)$/) {
+			$long_rev = $1;
+			$author   = $2;
+			$time     = $3;
+			$lineno   = $4;
+			$data     = $5;
+		} else {
+			print qq(  <tr><td colspan="5" style="color: red; background-color: yellow;">Unable to parse: $line</td></tr>\n);
+			next;
+		}
+		$short_rev  = substr ($long_rev, 0, 8);
+		$age        = time () - $time;
+		$age_str    = age_string ($age);
+		$age_str    =~ s/ /&nbsp;/g;
+		$age_style  = 'font-style: italic;';
+		$age_style .= ' color: #009900; background: transparent;' if ($age < 60*60*24*2);
+		$age_style .= ' font-weight: bold;' if ($age < 60*60*2);
+		$author     = esc_html ($author);
+		$author     =~ s/ /&nbsp;/g;
+		# escape tabs
+		while ((my $pos = index($data, "\t")) != -1) {
+			if (my $count = (8 - ($pos % 8))) {
+				my $spaces = ' ' x $count;
+				$data =~ s/\t/$spaces/;
+			}
+		}
+		$data = esc_html ($data);
+		$data =~ s/ /&nbsp;/g;
+
+		print <<HTML;
+  <tr class="$line_class[$line_class_num]">
+    <td style="font-family: monospace;"><a href="$my_uri?${\esc_param ("p=$project;a=commit;h=$long_rev")}" class="text">$short_rev..</a></td>
+    <td style="$age_style">$age_str</td>
+    <td>$author</td>
+    <td style="text-align: right;"><a id="$lineno" href="#$lineno" class="linenr">$lineno</a></td>
+    <td style="font-family: monospace;">$data</td>
+  </tr>
+HTML
+	} # while (my $line = <$fd>)
+	print "</table>\n\n";
+	close $fd or print "Reading blob failed.\n";
+	print "</div>";
+	git_footer_html();
+}
+
 sub git_tags {
 	my $head = git_read_head($project);
 	git_header_html();
@@ -1375,7 +1479,8 @@ sub git_blob {
 		      " | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=commitdiff;h=$hash_base")}, "commitdiff") .
 		      " | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=tree;h=$co{'tree'};hb=$hash_base")}, "tree") . "<br/>\n";
 		if (defined $file_name) {
-			print $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blob_plain;h=$hash;f=$file_name")}, "plain") .
+			print $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blame;h=$hash;hb=$hash_base;f=$file_name")}, "blame") .
+			" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blob_plain;h=$hash;f=$file_name")}, "plain") .
 			" | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blob;hb=HEAD;f=$file_name")}, "head") . "<br/>\n";
 		} else {
 			print $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blob_plain;h=$hash")}, "plain") . "<br/>\n";
@@ -1496,6 +1601,7 @@ sub git_tree {
 			      "</td>\n" .
 			      "<td class=\"link\">" .
 			      $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blob;h=$t_hash$base_key;f=$base$t_name")}, "blob") .
+#			      " | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=blame;h=$t_hash$base_key;f=$base$t_name")}, "blame") .
 			      " | " . $cgi->a({-href => "$my_uri?" . esc_param("p=$project;a=history;h=$hash_base;f=$base$t_name")}, "history") .
 			      "</td>\n";
 		} elsif ($t_type eq "tree") {
-- 
1.3.3

^ permalink raw reply related

* Re: Redhat stateless Linux and git
From: Jon Smirl @ 2006-06-11 15:07 UTC (permalink / raw)
  To: Geert Bosch; +Cc: git, stateless-list
In-Reply-To: <D5AC73C4-5A2F-482E-9B45-71A72C62D670@adacore.com>

On 6/11/06, Geert Bosch <bosch@adacore.com> wrote:
>
> On Jun 9, 2006, at 18:59, Jon Smirl wrote:
> > Redhat is looking for a scheme to sync the disk system of their
> > stateless Linux client. They were using rsync and now they are looking
> > at doing it with LVM.
> >
> > What about using git?
>
> The data model is fine in principle, but git as-is isn't suitable
> for general backup/sync-like schemes. Large (multi-GB) files
> are not really supported yet. Still, I think the underlying
> data model, with some modifications to split large files on
> content-determined boundaries, would be really great for
> distributed filesystems.
>
> Many people using laptops these days connect to different
> filesystems on their office networks, home networks,
> digital cameras and even their PDA, cellphone and MP3-player.
> What is commonly described as "synching", really is just a
> merge between different branches. All arguments in favor
> of using a distributed SCM hold here too.
>
> Right now I'm using a hodge-podge of different manual and
> semi-automated methods to keep my local filesystem with 1.5M
> files totalling 90GB somewhat in synch with various
> homedirectories on different remote systems and backup disks.
> IMO, git is tantalizing close to be able to handle this, just
> needs to get a bit more scalable. Probably you'd want to use
> a different user interface as well, but all the underlying
> data structures and merge strategies may be equally valid.

That's why I though stateless Linux was a good place to start. The
client is read only so it is the simplest case to start with. I would
much prefer a file orientated system for syncing over a block oriented
one, with the block one there is no easy way to tell what is being
copied to your machine.

I added the stateless list to the cc, maybe they'll join in.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more. [Take 2]
From: Peter Eriksen @ 2006-06-11 13:05 UTC (permalink / raw)
  To: git
In-Reply-To: <20060611123332.GA3832@robert.daprodeges.fqdn.th-h.de>

On Sun, Jun 11, 2006 at 12:33:32PM +0000, Rocco Rutte wrote:
> Hi,
> 
> * Peter Eriksen [06-06-11 14:03:28 +0200] wrote:
> 
> >-char *safe_strncpy(char *dest, const char *src, size_t n)
> >+size_t safe_strncpy(char *dest, const char *src, size_t size)
> >{
> >-	strncpy(dest, src, n);
> >-	dest[n - 1] = '\0';
> >+	size_t ret = strlen(src);
> 
> At least FreeBSD's strlen() requires a non-NULL argument, i.e. with 
> src==NULL, this will segfault.
> 
> If you can ensure that src!=NULL, then it's okay, but the safe_ prefix 
> implies something different.

By eyeballing the source code of strlcpy() from FreeBSD and OpenBSD
(which are quite similar), it seems they will segfault if given source
string, which is NULL.  So, from what I've understood, safe_strncpy()
is not more unsafe than strlcpy() or the current safe_strncpy().  It does
have different semantics, because the current one pads will NULL, since
it uses strncpy().

Peter

^ permalink raw reply

* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more. [Take 2]
From: Rocco Rutte @ 2006-06-11 12:33 UTC (permalink / raw)
  To: git
In-Reply-To: <20060611120328.GC10430@bohr.gbar.dtu.dk>

Hi,

* Peter Eriksen [06-06-11 14:03:28 +0200] wrote:

>-char *safe_strncpy(char *dest, const char *src, size_t n)
>+size_t safe_strncpy(char *dest, const char *src, size_t size)
> {
>-	strncpy(dest, src, n);
>-	dest[n - 1] = '\0';
>+	size_t ret = strlen(src);

At least FreeBSD's strlen() requires a non-NULL argument, i.e. with 
src==NULL, this will segfault.

If you can ensure that src!=NULL, then it's okay, but the safe_ prefix 
implies something different.

   bye, Rocco
-- 
:wq!

^ permalink raw reply

* Collecting cvsps patches
From: Yann Dirson @ 2006-06-11 12:27 UTC (permalink / raw)
  To: GIT list; +Cc: cvsps

Since there are has been some work done here and there on cvsps, but
upstream does not seem to have time to issue a new release, I have
started to collect the patches I found.

I guess this is a good place for a heads-up: if you know of any other
bugfixes or feature patches to cvsps, I'd like to hear about it, so I
can add it to my repo.

Not that the master branch is an octopus merge of all works in there,
including my preliminary work on multiple-tag support, so for now you
may want to do your own mix.

For now it has:

* bugfixes and such:

Anand Kumria:
      FreeBSD isn't evil - just misguided

Linus Torvalds:
      Increase log-length limit to 64kB
      Improve handling of file collisions in the same patchset
      Fix branch ancestor calculation

Yann Dirson:
      Cleanup the tag handling to simplify multi-tag handling
      Dependency handling

* features

Yann Dirson:
      Allow to have multiple tags on a single patchset.

-- 
Yann Dirson    <ydirson@altern.org> |
Debian-related: <dirson@debian.org> |   Support Debian GNU/Linux:
                                    |  Freedom, Power, Stability, Gratis
     http://ydirson.free.fr/        | Check <http://www.debian.org/>

^ permalink raw reply

* Re: Redhat stateless Linux and git
From: Geert Bosch @ 2006-06-11 12:21 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git
In-Reply-To: <9e4733910606091559m6a88e864m16f9d75a507ee684@mail.gmail.com>


On Jun 9, 2006, at 18:59, Jon Smirl wrote:
> Redhat is looking for a scheme to sync the disk system of their
> stateless Linux client. They were using rsync and now they are looking
> at doing it with LVM.
>
> What about using git?

The data model is fine in principle, but git as-is isn't suitable
for general backup/sync-like schemes. Large (multi-GB) files
are not really supported yet. Still, I think the underlying
data model, with some modifications to split large files on
content-determined boundaries, would be really great for
distributed filesystems.

Many people using laptops these days connect to different
filesystems on their office networks, home networks,
digital cameras and even their PDA, cellphone and MP3-player.
What is commonly described as "synching", really is just a
merge between different branches. All arguments in favor
of using a distributed SCM hold here too.

Right now I'm using a hodge-podge of different manual and
semi-automated methods to keep my local filesystem with 1.5M
files totalling 90GB somewhat in synch with various
homedirectories on different remote systems and backup disks.
IMO, git is tantalizing close to be able to handle this, just
needs to get a bit more scalable. Probably you'd want to use
a different user interface as well, but all the underlying
data structures and merge strategies may be equally valid.

   -Geert

^ permalink raw reply

* [PATCH 3/3] cg-admin-rewritehist: seed the map with the parent of the -r arg, not with itself
From: Yann Dirson @ 2006-06-11 12:05 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060611120431.12116.74005.stgit@gandelf.nowhere.earth>


This is a fix for 95621e54cedef1c4a270af5570a72fc1331b5fcb.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---

 cg-admin-rewritehist |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/cg-admin-rewritehist b/cg-admin-rewritehist
index 7cbdb30..6dd8b92 100755
--- a/cg-admin-rewritehist
+++ b/cg-admin-rewritehist
@@ -157,7 +157,7 @@ while optparse; do
 		git-rev-parse "$OPTARG" >/dev/null || die "Unknown revision '$OPTARG'"
 		git-rev-parse "$OPTARG^" >/dev/null || die "Revision '$OPTARG' does not have parents, check what you really want"
 		startrev="^$OPTARG^ $OPTARG $startrev"
-		startrevparents="$OPTARG $startrevparents"
+		startrevparents="$OPTARG^ $startrevparents"
 	elif optparse --env-filter=; then
 		filter_env="$OPTARG"
 	elif optparse --tree-filter=; then

^ permalink raw reply related

* [PATCH 2/3] cg-admin-rewritehist: catch errors in -r argument early
From: Yann Dirson @ 2006-06-11 12:04 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060611120431.12116.74005.stgit@gandelf.nowhere.earth>




Signed-off-by: Yann Dirson <ydirson@altern.org>
---

 cg-admin-rewritehist |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/cg-admin-rewritehist b/cg-admin-rewritehist
index fe3f210..7cbdb30 100755
--- a/cg-admin-rewritehist
+++ b/cg-admin-rewritehist
@@ -154,6 +154,8 @@ while optparse; do
 	if optparse -d=; then
 		tempdir="$OPTARG"
 	elif optparse -r=; then
+		git-rev-parse "$OPTARG" >/dev/null || die "Unknown revision '$OPTARG'"
+		git-rev-parse "$OPTARG^" >/dev/null || die "Revision '$OPTARG' does not have parents, check what you really want"
 		startrev="^$OPTARG^ $OPTARG $startrev"
 		startrevparents="$OPTARG $startrevparents"
 	elif optparse --env-filter=; then

^ permalink raw reply related

* [PATCH 1/3] cg-admin-rewritehist: catch git-rev-list returning no commit
From: Yann Dirson @ 2006-06-11 12:04 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060611120431.12116.74005.stgit@gandelf.nowhere.earth>




Signed-off-by: Yann Dirson <ydirson@altern.org>
---

 cg-admin-rewritehist |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/cg-admin-rewritehist b/cg-admin-rewritehist
index 861c7f6..fe3f210 100755
--- a/cg-admin-rewritehist
+++ b/cg-admin-rewritehist
@@ -199,6 +199,10 @@ done
 git-rev-list --topo-order HEAD $startrev | tac >../revs
 commits=$(cat ../revs | wc -l)
 
+if [ $commits -eq 0 ]; then
+    die "Found nothing to rewrite"
+fi
+
 i=0
 while read commit; do
 	i=$((i+1))

^ permalink raw reply related

* [PATCH 0/3] another series of cg-admin-rewritehist -r fixes
From: Yann Dirson @ 2006-06-11 10:12 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

The -r flag is a bit confusing, in that it expects the 1st revision de rewrite, and I
caught myself feeding it the last revision not to rewrite instead.  Patch #2 catches
this early.  Although Patch #2 should take care of most problems, a non-zero status
returned by a command not-last in a pipe is not caught, even under "set -e", so Patch #1
adds an additional safeguard.

Patch #3 corrects seeding of the rewrite map from -r arguments.

-- 
Yann Dirson    <ydirson@altern.org> |
Debian-related: <dirson@debian.org> |   Support Debian GNU/Linux:
                                    |  Freedom, Power, Stability, Gratis
     http://ydirson.free.fr/        | Check <http://www.debian.org/>

^ permalink raw reply

* [PATCH] Implement safe_strncpy() as strlcpy() and use it more. [Take 2]
From: Peter Eriksen @ 2006-06-11 12:03 UTC (permalink / raw)
  To: git

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
---

This time, as René suggested, I've taken strlcpy() from the Linux kernel
lib/string.c.  Is it OK to not include copyright information then?

My other comments from take 1 still applies.

Peter
 
 builtin-log.c      |    2 +-
 builtin-tar-tree.c |    4 ++--
 cache.h            |    2 +-
 config.c           |    6 +++---
 http-fetch.c       |   10 ++++------
 http-push.c        |   10 +++++-----
 ident.c            |    5 ++---
 path.c             |   13 +++++++++----
 sha1_name.c        |    3 +--
 9 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index 29a8851..5b0ea28 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -112,7 +112,7 @@ static void reopen_stdout(struct commit 
 	int len = 0;
 
 	if (output_directory) {
-		strncpy(filename, output_directory, 1010);
+		safe_strncpy(filename, output_directory, 1010);
 		len = strlen(filename);
 		if (filename[len - 1] != '/')
 			filename[len++] = '/';
diff --git a/builtin-tar-tree.c b/builtin-tar-tree.c
index 58a8ccd..f6310b9 100644
--- a/builtin-tar-tree.c
+++ b/builtin-tar-tree.c
@@ -240,8 +240,8 @@ static void write_entry(const unsigned c
 	/* XXX: should we provide more meaningful info here? */
 	sprintf(header.uid, "%07o", 0);
 	sprintf(header.gid, "%07o", 0);
-	strncpy(header.uname, "git", 31);
-	strncpy(header.gname, "git", 31);
+	safe_strncpy(header.uname, "git", sizeof(header.uname));
+	safe_strncpy(header.gname, "git", sizeof(header.gname));
 	sprintf(header.devmajor, "%07o", 0);
 	sprintf(header.devminor, "%07o", 0);
 
diff --git a/cache.h b/cache.h
index d5d7fe4..f630cf4 100644
--- a/cache.h
+++ b/cache.h
@@ -210,7 +210,7 @@ int git_mkstemp(char *path, size_t n, co
 
 int adjust_shared_perm(const char *path);
 int safe_create_leading_directories(char *path);
-char *safe_strncpy(char *, const char *, size_t);
+size_t safe_strncpy(char *, const char *, size_t);
 char *enter_repo(char *path, int strict);
 
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
diff --git a/config.c b/config.c
index c474970..984c75f 100644
--- a/config.c
+++ b/config.c
@@ -280,17 +280,17 @@ int git_default_config(const char *var, 
 	}
 
 	if (!strcmp(var, "user.name")) {
-		strncpy(git_default_name, value, sizeof(git_default_name));
+		safe_strncpy(git_default_name, value, sizeof(git_default_name));
 		return 0;
 	}
 
 	if (!strcmp(var, "user.email")) {
-		strncpy(git_default_email, value, sizeof(git_default_email));
+		safe_strncpy(git_default_email, value, sizeof(git_default_email));
 		return 0;
 	}
 
 	if (!strcmp(var, "i18n.commitencoding")) {
-		strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
+		safe_strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
 		return 0;
 	}
 
diff --git a/http-fetch.c b/http-fetch.c
index d3602b7..da1a7f5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -584,10 +584,8 @@ static void process_alternates_response(
 			// skip 'objects' at end
 			if (okay) {
 				target = xmalloc(serverlen + posn - i - 6);
-				strncpy(target, base, serverlen);
-				strncpy(target + serverlen, data + i,
-					posn - i - 7);
-				target[serverlen + posn - i - 7] = '\0';
+				safe_strncpy(target, base, serverlen);
+				safe_strncpy(target + serverlen, data + i, posn - i - 6);
 				if (get_verbosely)
 					fprintf(stderr,
 						"Also look at %s\n", target);
@@ -728,8 +726,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static int remote_ls(struct alt_base *repo, const char *path, int flags,
diff --git a/http-push.c b/http-push.c
index b39b36b..2d9441e 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1269,8 +1269,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static struct remote_lock *lock_remote(char *path, long timeout)
@@ -1472,7 +1472,7 @@ static void process_ls_object(struct rem
 		return;
 	path += 8;
 	obj_hex = xmalloc(strlen(path));
-	strncpy(obj_hex, path, 2);
+	safe_strncpy(obj_hex, path, 3);
 	strcpy(obj_hex + 2, path + 3);
 	one_remote_object(obj_hex);
 	free(obj_hex);
@@ -2160,8 +2160,8 @@ static void fetch_symref(char *path, cha
 
 	/* If it's a symref, set the refname; otherwise try for a sha1 */
 	if (!strncmp((char *)buffer.buffer, "ref: ", 5)) {
-		*symref = xcalloc(buffer.posn - 5, 1);
-		strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 6);
+		*symref = xmalloc(buffer.posn - 5);
+		safe_strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 5);
 	} else {
 		get_sha1_hex(buffer.buffer, sha1);
 	}
diff --git a/ident.c b/ident.c
index 7c81fe8..7b44cbd 100644
--- a/ident.c
+++ b/ident.c
@@ -71,10 +71,9 @@ int setup_ident(void)
 		len = strlen(git_default_email);
 		git_default_email[len++] = '.';
 		if (he && (domainname = strchr(he->h_name, '.')))
-			strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
+			safe_strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
 		else
-			strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
-		git_default_email[sizeof(git_default_email) - 1] = 0;
+			safe_strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
 	}
 	/* And set the default date */
 	datestamp(git_default_date, sizeof(git_default_date));
diff --git a/path.c b/path.c
index 5168b5f..194e0b5 100644
--- a/path.c
+++ b/path.c
@@ -83,14 +83,19 @@ int git_mkstemp(char *path, size_t len, 
 }
 
 
-char *safe_strncpy(char *dest, const char *src, size_t n)
+size_t safe_strncpy(char *dest, const char *src, size_t size)
 {
-	strncpy(dest, src, n);
-	dest[n - 1] = '\0';
+	size_t ret = strlen(src);
 
-	return dest;
+	if (size) {
+		size_t len = (ret >= size) ? size - 1 : ret;
+		memcpy(dest, src, len);
+		dest[len] = '\0';
+	}
+	return ret;
 }
 
+
 int validate_symref(const char *path)
 {
 	struct stat st;
diff --git a/sha1_name.c b/sha1_name.c
index fbbde1c..8fe9b7a 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -262,8 +262,7 @@ static int get_sha1_basic(const char *st
 		if (str[am] == '@' && str[am+1] == '{' && str[len-1] == '}') {
 			int date_len = len - am - 3;
 			char *date_spec = xmalloc(date_len + 1);
-			strncpy(date_spec, str + am + 2, date_len);
-			date_spec[date_len] = 0;
+			safe_strncpy(date_spec, str + am + 2, date_len + 1);
 			at_time = approxidate(date_spec);
 			free(date_spec);
 			len = am;
-- 
1.3.3.g16a4

^ permalink raw reply related

* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Rene Scharfe @ 2006-06-11 11:17 UTC (permalink / raw)
  To: Peter Eriksen; +Cc: git
In-Reply-To: <20060611103358.GB10430@bohr.gbar.dtu.dk>

Peter Eriksen schrieb:
> On Sun, Jun 11, 2006 at 07:15:40PM +0900, YOSHIFUJI Hideaki / ?$B5HF#1QL@ wrote:
>> In article <20060611100628.GA10430@bohr.gbar.dtu.dk> (at Sun, 11 Jun 2006 12:06:28 +0200), "Peter Eriksen" <s022018@student.dtu.dk> says:
>>
>>> I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
>>> allowed?  If it is, how should it be stated?
>> Please include full copyright information.
> 
> Where should this information go?  Just above the function
> safe_strncpy(), or at the top of path.c?  I believe path.c is GPL, so
> can this be mixed freely with BSD licensed code?  Should I put
> safe_strncpy() into a seperate file as with strlcpy()?

Yes...  Or you could avoid all of this by using a GPL'd version, like
the one from the Linux kernel (in lib/string.c).

René

^ permalink raw reply

* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Peter Eriksen @ 2006-06-11 10:33 UTC (permalink / raw)
  To: git
In-Reply-To: <20060611.191540.68073375.yoshfuji@linux-ipv6.org>

On Sun, Jun 11, 2006 at 07:15:40PM +0900, YOSHIFUJI Hideaki / ?$B5HF#1QL@ wrote:
> In article <20060611100628.GA10430@bohr.gbar.dtu.dk> (at Sun, 11 Jun 2006 12:06:28 +0200), "Peter Eriksen" <s022018@student.dtu.dk> says:
> 
> > I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
> > allowed?  If it is, how should it be stated?
> 
> Please include full copyright information.

Where should this information go?  Just above the function
safe_strncpy(), or at the top of path.c?  I believe path.c is GPL, so
can this be mixed freely with BSD licensed code?  Should I put
safe_strncpy() into a seperate file as with strlcpy()?
This seems to be the copyright information:

/*
 * Copyright (c) 1998 Todd C. Miller <Todd.Miller@courtesan.com>
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

Peter

^ permalink raw reply

* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2006-06-11 10:15 UTC (permalink / raw)
  To: s022018; +Cc: git, yoshfuji
In-Reply-To: <20060611100628.GA10430@bohr.gbar.dtu.dk>

In article <20060611100628.GA10430@bohr.gbar.dtu.dk> (at Sun, 11 Jun 2006 12:06:28 +0200), "Peter Eriksen" <s022018@student.dtu.dk> says:

> I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
> allowed?  If it is, how should it be stated?

Please include full copyright information.

--yoshfuji

^ permalink raw reply

* [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Peter Eriksen @ 2006-06-11 10:06 UTC (permalink / raw)
  To: git

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
---

I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
allowed?  If it is, how should it be stated?

I think this fixes some small errors, but might introduce some new ones.
I've tried to be very careful, but this really needs some more eyeballs.
What do you think?

Peter

P.S. By the way, the diff of safe_strncpy() isn't so pretty, because
what I really did was replace the entire function, not edit it.

 builtin-log.c      |    2 +-
 builtin-tar-tree.c |    4 ++--
 cache.h            |    2 +-
 config.c           |    6 +++---
 http-fetch.c       |   10 ++++------
 http-push.c        |   10 +++++-----
 ident.c            |    5 ++---
 path.c             |   31 +++++++++++++++++++++++++++----
 sha1_name.c        |    3 +--
 9 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index 29a8851..5b0ea28 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -112,7 +112,7 @@ static void reopen_stdout(struct commit 
 	int len = 0;
 
 	if (output_directory) {
-		strncpy(filename, output_directory, 1010);
+		safe_strncpy(filename, output_directory, 1010);
 		len = strlen(filename);
 		if (filename[len - 1] != '/')
 			filename[len++] = '/';
diff --git a/builtin-tar-tree.c b/builtin-tar-tree.c
index 58a8ccd..f6310b9 100644
--- a/builtin-tar-tree.c
+++ b/builtin-tar-tree.c
@@ -240,8 +240,8 @@ static void write_entry(const unsigned c
 	/* XXX: should we provide more meaningful info here? */
 	sprintf(header.uid, "%07o", 0);
 	sprintf(header.gid, "%07o", 0);
-	strncpy(header.uname, "git", 31);
-	strncpy(header.gname, "git", 31);
+	safe_strncpy(header.uname, "git", sizeof(header.uname));
+	safe_strncpy(header.gname, "git", sizeof(header.gname));
 	sprintf(header.devmajor, "%07o", 0);
 	sprintf(header.devminor, "%07o", 0);
 
diff --git a/cache.h b/cache.h
index d5d7fe4..f630cf4 100644
--- a/cache.h
+++ b/cache.h
@@ -210,7 +210,7 @@ int git_mkstemp(char *path, size_t n, co
 
 int adjust_shared_perm(const char *path);
 int safe_create_leading_directories(char *path);
-char *safe_strncpy(char *, const char *, size_t);
+size_t safe_strncpy(char *, const char *, size_t);
 char *enter_repo(char *path, int strict);
 
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
diff --git a/config.c b/config.c
index c474970..984c75f 100644
--- a/config.c
+++ b/config.c
@@ -280,17 +280,17 @@ int git_default_config(const char *var, 
 	}
 
 	if (!strcmp(var, "user.name")) {
-		strncpy(git_default_name, value, sizeof(git_default_name));
+		safe_strncpy(git_default_name, value, sizeof(git_default_name));
 		return 0;
 	}
 
 	if (!strcmp(var, "user.email")) {
-		strncpy(git_default_email, value, sizeof(git_default_email));
+		safe_strncpy(git_default_email, value, sizeof(git_default_email));
 		return 0;
 	}
 
 	if (!strcmp(var, "i18n.commitencoding")) {
-		strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
+		safe_strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
 		return 0;
 	}
 
diff --git a/http-fetch.c b/http-fetch.c
index d3602b7..da1a7f5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -584,10 +584,8 @@ static void process_alternates_response(
 			// skip 'objects' at end
 			if (okay) {
 				target = xmalloc(serverlen + posn - i - 6);
-				strncpy(target, base, serverlen);
-				strncpy(target + serverlen, data + i,
-					posn - i - 7);
-				target[serverlen + posn - i - 7] = '\0';
+				safe_strncpy(target, base, serverlen);
+				safe_strncpy(target + serverlen, data + i, posn - i - 6);
 				if (get_verbosely)
 					fprintf(stderr,
 						"Also look at %s\n", target);
@@ -728,8 +726,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static int remote_ls(struct alt_base *repo, const char *path, int flags,
diff --git a/http-push.c b/http-push.c
index b39b36b..2d9441e 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1269,8 +1269,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static struct remote_lock *lock_remote(char *path, long timeout)
@@ -1472,7 +1472,7 @@ static void process_ls_object(struct rem
 		return;
 	path += 8;
 	obj_hex = xmalloc(strlen(path));
-	strncpy(obj_hex, path, 2);
+	safe_strncpy(obj_hex, path, 3);
 	strcpy(obj_hex + 2, path + 3);
 	one_remote_object(obj_hex);
 	free(obj_hex);
@@ -2160,8 +2160,8 @@ static void fetch_symref(char *path, cha
 
 	/* If it's a symref, set the refname; otherwise try for a sha1 */
 	if (!strncmp((char *)buffer.buffer, "ref: ", 5)) {
-		*symref = xcalloc(buffer.posn - 5, 1);
-		strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 6);
+		*symref = xmalloc(buffer.posn - 5);
+		safe_strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 5);
 	} else {
 		get_sha1_hex(buffer.buffer, sha1);
 	}
diff --git a/ident.c b/ident.c
index 7c81fe8..7b44cbd 100644
--- a/ident.c
+++ b/ident.c
@@ -71,10 +71,9 @@ int setup_ident(void)
 		len = strlen(git_default_email);
 		git_default_email[len++] = '.';
 		if (he && (domainname = strchr(he->h_name, '.')))
-			strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
+			safe_strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
 		else
-			strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
-		git_default_email[sizeof(git_default_email) - 1] = 0;
+			safe_strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
 	}
 	/* And set the default date */
 	datestamp(git_default_date, sizeof(git_default_date));
diff --git a/path.c b/path.c
index 5168b5f..86f51e0 100644
--- a/path.c
+++ b/path.c
@@ -83,14 +83,37 @@ int git_mkstemp(char *path, size_t len, 
 }
 
 
-char *safe_strncpy(char *dest, const char *src, size_t n)
+/*
+ * Copy src to string dst of size siz.  At most siz-1 characters
+ * will be copied.  Always NUL terminates (unless siz == 0).
+ * Returns strlen(src); if retval >= siz, truncation occurred.
+ */
+size_t safe_strncpy(char *dst, const char *src, size_t siz)
 {
-	strncpy(dest, src, n);
-	dest[n - 1] = '\0';
+	char *d = dst;
+	const char *s = src;
+	size_t n = siz;
 
-	return dest;
+	/* Copy as many bytes as will fit */
+	if (n != 0) {
+		while (--n != 0) {
+			if ((*d++ = *s++) == '\0')
+				break;
+		}
+	}
+
+	/* Not enough room in dst, add NUL and traverse rest of src */
+	if (n == 0) {
+		if (siz != 0)
+			*d = '\0';		/* NUL-terminate dst */
+		while (*s++)
+			;
+	}
+
+	return(s - src - 1);	/* count does not include NUL */
 }
 
+
 int validate_symref(const char *path)
 {
 	struct stat st;
diff --git a/sha1_name.c b/sha1_name.c
index fbbde1c..8fe9b7a 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -262,8 +262,7 @@ static int get_sha1_basic(const char *st
 		if (str[am] == '@' && str[am+1] == '{' && str[len-1] == '}') {
 			int date_len = len - am - 3;
 			char *date_spec = xmalloc(date_len + 1);
-			strncpy(date_spec, str + am + 2, date_len);
-			date_spec[date_len] = 0;
+			safe_strncpy(date_spec, str + am + 2, date_len + 1);
 			at_time = approxidate(date_spec);
 			free(date_spec);
 			len = am;
-- 
1.3.3.g16a4

^ permalink raw reply related

* [PATCH] cvsimport: complete the cvsps run before starting the import - take 2
From: Martin Langhoff @ 2006-06-11  8:12 UTC (permalink / raw)
  To: junkio, git; +Cc: Martin Langhoff

On 5/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> It's entirely possible that the fact that it now seems to work for me is
> purely timing-related, since I also ended up using "-P cvsps-output" to
> avoid having a huge cvsps binary in memory at the same time.

We now capture the output of cvsps to a tempfile, and then read it in.
cvsps 2.1 works quite a bit "in memory", and only prints its patchset info
once it has finished talking with cvs, but apparently retaining all that
memory allocation. With this patch, cvsps is finished and reaped before
cvsimport start working (and growing). So the footprint of the whole
process is much lower.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
---

This is a more reliable implementation, which fork/execs and passes the cvsps
output into the tempfile.
---
 git-cvsimport.perl |   42 ++++++++++++++++++++++++++++--------------
 1 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 07b3203..9a7408b 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -529,25 +529,39 @@ if ($opt_A) {
 	write_author_info("$git_dir/cvs-authors");
 }
 
-my $pid = open(CVS,"-|");
-die "Cannot fork: $!\n" unless defined $pid;
-unless($pid) {
-	my @opt;
-	@opt = split(/,/,$opt_p) if defined $opt_p;
-	unshift @opt, '-z', $opt_z if defined $opt_z;
-	unshift @opt, '-q'         unless defined $opt_v;
-	unless (defined($opt_p) && $opt_p =~ m/--no-cvs-direct/) {
-		push @opt, '--cvs-direct';
+
+#
+# run cvsps into a file unless we are getting
+# it passed as a file via $opt_P
+#
+unless ($opt_P) {
+	print "Running cvsps...\n" if $opt_v;
+	my $pid = open(CVSPS,"-|");
+	die "Cannot fork: $!\n" unless defined $pid;
+	unless($pid) {
+		my @opt;
+		@opt = split(/,/,$opt_p) if defined $opt_p;
+		unshift @opt, '-z', $opt_z if defined $opt_z;
+		unshift @opt, '-q'         unless defined $opt_v;
+		unless (defined($opt_p) && $opt_p =~ m/--no-cvs-direct/) {
+			push @opt, '--cvs-direct';
+		}
+		exec("cvsps","--norc",@opt,"-u","-A",'--root',$opt_d,$cvs_tree);
+		die "Could not start cvsps: $!\n";
 	}
-	if ($opt_P) {
-	    exec("cat", $opt_P);
-	} else {
-	    exec("cvsps","--norc",@opt,"-u","-A",'--root',$opt_d,$cvs_tree);
-	    die "Could not start cvsps: $!\n";
+	my ($cvspsfh, $cvspsfile) = tempfile('gitXXXXXX', SUFFIX => '.cvsps',
+					     DIR => File::Spec->tmpdir());
+	while (<CVSPS>) {
+	    print $cvspsfh $_;
 	}
+	close CVSPS;
+	close $cvspsfh;
+	$opt_P = $cvspsfile;
 }
 
 
+open(CVS, "<$opt_P") or die $!;
+
 ## cvsps output:
 #---------------------
 #PatchSet 314
-- 
1.4.0.gcda2

^ permalink raw reply related

* [PATCH] cvsimport: ignore CVSPS_NO_BRANCH and impossible branches
From: Martin Langhoff @ 2006-06-11  8:12 UTC (permalink / raw)
  To: junkio, git; +Cc: Martin Langhoff

cvsps output often contains references to CVSPS_NO_BRANCH, commits that it
could not trace to a branch. Ignore that branch.

Additionally, cvsps will sometimes draw circular relationships between
branches -- where two branches are recorded as opening from the other.
In those cases, and where the ancestor branch hasn't been seen, ignore
it.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
---
 git-cvsimport.perl |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 76f6246..07b3203 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -595,7 +595,11 @@ sub write_tree () {
 }
 
 my($patchset,$date,$author_name,$author_email,$branch,$ancestor,$tag,$logmsg);
-my(@old,@new,@skipped);
+my(@old,@new,@skipped,%ignorebranch);
+
+# commits that cvsps cannot place anywhere...
+$ignorebranch{'#CVSPS_NO_BRANCH'} = 1; 
+
 sub commit {
 	update_index(@old, @new);
 	@old = @new = ();
@@ -751,7 +755,16 @@ while(<CVS>) {
 			$state = 11;
 			next;
 		}
+		if (exists $ignorebranch{$branch}) {
+			print STDERR "Skipping $branch\n";
+			$state = 11;
+			next;
+		}
 		if($ancestor) {
+			if($ancestor eq $branch) {
+				print STDERR "Branch $branch erroneously stems from itself -- changed ancestor to $opt_o\n";
+				$ancestor = $opt_o;
+			}
 			if(-f "$git_dir/refs/heads/$branch") {
 				print STDERR "Branch $branch already exists!\n";
 				$state=11;
@@ -759,6 +772,7 @@ while(<CVS>) {
 			}
 			unless(open(H,"$git_dir/refs/heads/$ancestor")) {
 				print STDERR "Branch $ancestor does not exist!\n";
+				$ignorebranch{$branch} = 1;
 				$state=11;
 				next;
 			}
@@ -766,6 +780,7 @@ while(<CVS>) {
 			close(H);
 			unless(open(H,"> $git_dir/refs/heads/$branch")) {
 				print STDERR "Could not create branch $branch: $!\n";
+				$ignorebranch{$branch} = 1;
 				$state=11;
 				next;
 			}
-- 
1.4.0.gcda2

^ permalink raw reply related

* [PATCH 4/5] git-svn: restore original LC_ALL setting (or unset) for commit
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Eric Wong
In-Reply-To: <11500094292561-git-send-email-normalperson@yhbt.net>

svn forces UTF-8 for commit messages, and with LC_ALL set to 'C'
it is unable to determine encoding of the git commit message.

Now we'll just assume the user has set LC_* correctly for
the commit message they're using.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/git-svn.perl |   34 +++++++++++++++++++++++-----------
 1 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index 8d2e7f7..8bc3d69 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -14,6 +14,7 @@ use Cwd qw/abs_path/;
 $GIT_DIR = abs_path($ENV{GIT_DIR} || '.git');
 $ENV{GIT_DIR} = $GIT_DIR;
 
+my $LC_ALL = $ENV{LC_ALL};
 # make sure the svn binary gives consistent output between locales and TZs:
 $ENV{TZ} = 'UTC';
 $ENV{LC_ALL} = 'C';
@@ -704,23 +705,34 @@ sub svn_commit_tree {
 	my ($oneline) = ($log_msg{msg} =~ /([^\n\r]+)/);
 	print "Committing $commit: $oneline\n";
 
+	if (defined $LC_ALL) {
+		$ENV{LC_ALL} = $LC_ALL;
+	} else {
+		delete $ENV{LC_ALL};
+	}
 	my @ci_output = safe_qx(qw(svn commit -F),$commit_msg);
-	my ($committed) = grep(/^Committed revision \d+\./,@ci_output);
+	$ENV{LC_ALL} = 'C';
 	unlink $commit_msg;
-	defined $committed or croak
+	my ($committed) = ($ci_output[$#ci_output] =~ /(\d+)/);
+	if (!defined $committed) {
+		my $out = join("\n",@ci_output);
+		print STDERR "W: Trouble parsing \`svn commit' output:\n\n",
+				$out, "\n\nAssuming English locale...";
+		($committed) = ($out =~ /^Committed revision \d+\./sm);
+		defined $committed or die " FAILED!\n",
 			"Commit output failed to parse committed revision!\n",
-			join("\n",@ci_output),"\n";
-	my ($rev_committed) = ($committed =~ /^Committed revision (\d+)\./);
+		print STDERR " OK\n";
+	}
 
 	my @svn_up = qw(svn up);
 	push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
-	if ($rev_committed == ($svn_rev + 1)) {
-		push @svn_up, "-r$rev_committed";
+	if ($committed == ($svn_rev + 1)) {
+		push @svn_up, "-r$committed";
 		sys(@svn_up);
 		my $info = svn_info('.');
 		my $date = $info->{'Last Changed Date'} or die "Missing date\n";
-		if ($info->{'Last Changed Rev'} != $rev_committed) {
-			croak "$info->{'Last Changed Rev'} != $rev_committed\n"
+		if ($info->{'Last Changed Rev'} != $committed) {
+			croak "$info->{'Last Changed Rev'} != $committed\n"
 		}
 		my ($Y,$m,$d,$H,$M,$S,$tz) = ($date =~
 					/(\d{4})\-(\d\d)\-(\d\d)\s
@@ -728,16 +740,16 @@ sub svn_commit_tree {
 					 or croak "Failed to parse date: $date\n";
 		$log_msg{date} = "$tz $Y-$m-$d $H:$M:$S";
 		$log_msg{author} = $info->{'Last Changed Author'};
-		$log_msg{revision} = $rev_committed;
+		$log_msg{revision} = $committed;
 		$log_msg{msg} .= "\n";
 		my $parent = file_to_s("$REV_DIR/$svn_rev");
 		git_commit(\%log_msg, $parent, $commit);
-		return $rev_committed;
+		return $committed;
 	}
 	# resync immediately
 	push @svn_up, "-r$svn_rev";
 	sys(@svn_up);
-	return fetch("$rev_committed=$commit")->{revision};
+	return fetch("$committed=$commit")->{revision};
 }
 
 # read the entire log into a temporary file (which is removed ASAP)
-- 
1.3.3.g2dc7b-dirty

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox