Git development
* Re: Redhat stateless Linux and git
From: Jon Smirl @ 2006-06-11 15:07 UTC
  To: Geert Bosch; +Cc: git, stateless-list
In-Reply-To: <D5AC73C4-5A2F-482E-9B45-71A72C62D670@adacore.com>

On 6/11/06, Geert Bosch <bosch@adacore.com> wrote:
>
> On Jun 9, 2006, at 18:59, Jon Smirl wrote:
> > Redhat is looking for a scheme to sync the disk system of their
> > stateless Linux client. They were using rsync and now they are looking
> > at doing it with LVM.
> >
> > What about using git?
>
> The data model is fine in principle, but git as-is isn't suitable
> for general backup/sync-like schemes. Large (multi-GB) files
> are not really supported yet. Still, I think the underlying
> data model, with some modifications to split large files on
> content-determined boundaries, would be really great for
> distributed filesystems.
>
> Many people using laptops these days connect to different
> filesystems on their office networks, home networks,
> digital cameras and even their PDA, cellphone and MP3-player.
> What is commonly described as "synching", really is just a
> merge between different branches. All arguments in favor
> of using a distributed SCM hold here too.
>
> Right now I'm using a hodge-podge of different manual and
> semi-automated methods to keep my local filesystem with 1.5M
> files totalling 90GB somewhat in synch with various
> homedirectories on different remote systems and backup disks.
> IMO, git is tantalizing close to be able to handle this, just
> needs to get a bit more scalable. Probably you'd want to use
> a different user interface as well, but all the underlying
> data structures and merge strategies may be equally valid.

That's why I thought stateless Linux was a good place to start. The
client is read-only, so it is the simplest case to start with. I would
much prefer a file-oriented system for syncing over a block-oriented
one; with the block-oriented one there is no easy way to tell what is
being copied to your machine.

I added the stateless list to the Cc; maybe they'll join in.

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more. [Take 2]
From: Peter Eriksen @ 2006-06-11 13:05 UTC
  To: git
In-Reply-To: <20060611123332.GA3832@robert.daprodeges.fqdn.th-h.de>

On Sun, Jun 11, 2006 at 12:33:32PM +0000, Rocco Rutte wrote:
> Hi,
> 
> * Peter Eriksen [06-06-11 14:03:28 +0200] wrote:
> 
> >-char *safe_strncpy(char *dest, const char *src, size_t n)
> >+size_t safe_strncpy(char *dest, const char *src, size_t size)
> >{
> >-	strncpy(dest, src, n);
> >-	dest[n - 1] = '\0';
> >+	size_t ret = strlen(src);
> 
> At least FreeBSD's strlen() requires a non-NULL argument, i.e. with 
> src==NULL, this will segfault.
> 
> If you can ensure that src!=NULL, then it's okay, but the safe_ prefix 
> implies something different.

By eyeballing the source code of strlcpy() from FreeBSD and OpenBSD
(which are quite similar), it seems they will segfault if given a source
string that is NULL.  So, from what I've understood, the new safe_strncpy()
is no more unsafe than strlcpy() or the current safe_strncpy().  It does
have different semantics, because the current one pads with NUL bytes, since
it uses strncpy().

Peter
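
As a minimal sketch (not part of the original message), the difference in
semantics described above can be shown in a few lines of C. The helper
names copy_truncating(), was_truncated() and tail_is_zeroed() are made up
for illustration; copy_truncating() simply mirrors the body of the
safe_strncpy() proposed in the Take 2 patch.

```c
#include <stddef.h>
#include <string.h>

/* strlcpy-style copy: same logic as the proposed safe_strncpy(). */
static size_t copy_truncating(char *dest, const char *src, size_t size)
{
	size_t ret = strlen(src);

	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;
		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}

/* The return value lets the caller detect truncation. */
static int was_truncated(char *dest, const char *src, size_t size)
{
	return copy_truncating(dest, src, size) >= size;
}

/*
 * Fill a buffer with 'x', copy "git" into it, and report whether the
 * last byte was overwritten with NUL.  strncpy() pads the remainder of
 * the buffer with NUL bytes; the strlcpy-style copy writes only the
 * string and a single terminator.
 */
static int tail_is_zeroed(int use_strncpy)
{
	char buf[8];

	memset(buf, 'x', sizeof(buf));
	if (use_strncpy)
		strncpy(buf, "git", sizeof(buf));
	else
		copy_truncating(buf, "git", sizeof(buf));
	return buf[sizeof(buf) - 1] == '\0';
}
```

Callers that rely on the padding (for example, fixed-width fields that
must be fully zeroed) would need to clear the buffer themselves before
switching to the strlcpy-style copy.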


* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more. [Take 2]
From: Rocco Rutte @ 2006-06-11 12:33 UTC
  To: git
In-Reply-To: <20060611120328.GC10430@bohr.gbar.dtu.dk>

Hi,

* Peter Eriksen [06-06-11 14:03:28 +0200] wrote:

>-char *safe_strncpy(char *dest, const char *src, size_t n)
>+size_t safe_strncpy(char *dest, const char *src, size_t size)
> {
>-	strncpy(dest, src, n);
>-	dest[n - 1] = '\0';
>+	size_t ret = strlen(src);

At least FreeBSD's strlen() requires a non-NULL argument, i.e. with 
src==NULL, this will segfault.

If you can ensure that src!=NULL, then it's okay, but the safe_ prefix 
implies something different.

   bye, Rocco
-- 
:wq!
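
One way to address the NULL concern raised above, sketched here purely as
an illustration: treat a NULL source as an empty string before measuring
it. Nobody in the thread proposes this, the name copy_or_empty() is
hypothetical, and whether silently accepting NULL is desirable is a
design decision the thread leaves open.

```c
#include <stddef.h>
#include <string.h>

/*
 * strlcpy-style copy that tolerates src == NULL by treating it as "".
 * Assumption for this sketch: a NULL source should yield an empty,
 * NUL-terminated destination rather than a crash.
 */
static size_t copy_or_empty(char *dest, const char *src, size_t size)
{
	size_t ret;

	if (!src)
		src = "";
	ret = strlen(src);

	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;
		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}
```

The alternative is to document that src must be non-NULL, which matches
what strlcpy() itself expects.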


* Collecting cvsps patches
From: Yann Dirson @ 2006-06-11 12:27 UTC
  To: GIT list; +Cc: cvsps

Since there has been some work done here and there on cvsps, but
upstream does not seem to have time to issue a new release, I have
started to collect the patches I found.

I guess this is a good place for a heads-up: if you know of any other
bugfixes or feature patches to cvsps, I'd like to hear about them, so I
can add them to my repo.

Note that the master branch is an octopus merge of all the work in there,
including my preliminary work on multiple-tag support, so for now you
may want to do your own mix.

For now it has:

* bugfixes and such:

Anand Kumria:
      FreeBSD isn't evil - just misguided

Linus Torvalds:
      Increase log-length limit to 64kB
      Improve handling of file collisions in the same patchset
      Fix branch ancestor calculation

Yann Dirson:
      Cleanup the tag handling to simplify multi-tag handling
      Dependency handling

* features

Yann Dirson:
      Allow to have multiple tags on a single patchset.

-- 
Yann Dirson    <ydirson@altern.org> |
Debian-related: <dirson@debian.org> |   Support Debian GNU/Linux:
                                    |  Freedom, Power, Stability, Gratis
     http://ydirson.free.fr/        | Check <http://www.debian.org/>


* Re: Redhat stateless Linux and git
From: Geert Bosch @ 2006-06-11 12:21 UTC
  To: Jon Smirl; +Cc: git
In-Reply-To: <9e4733910606091559m6a88e864m16f9d75a507ee684@mail.gmail.com>


On Jun 9, 2006, at 18:59, Jon Smirl wrote:
> Redhat is looking for a scheme to sync the disk system of their
> stateless Linux client. They were using rsync and now they are looking
> at doing it with LVM.
>
> What about using git?

The data model is fine in principle, but git as-is isn't suitable
for general backup/sync-like schemes. Large (multi-GB) files
are not really supported yet. Still, I think the underlying
data model, with some modifications to split large files on
content-determined boundaries, would be really great for
distributed filesystems.

Many people using laptops these days connect to different
filesystems on their office networks, home networks,
digital cameras and even their PDA, cellphone and MP3-player.
What is commonly described as "synching", really is just a
merge between different branches. All arguments in favor
of using a distributed SCM hold here too.

Right now I'm using a hodge-podge of different manual and
semi-automated methods to keep my local filesystem with 1.5M
files totalling 90GB somewhat in synch with various
homedirectories on different remote systems and backup disks.
IMO, git is tantalizingly close to being able to handle this; it just
needs to get a bit more scalable. Probably you'd want to use
a different user interface as well, but all the underlying
data structures and merge strategies may be equally valid.

   -Geert
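
To make "content-determined boundaries" concrete, here is a minimal
sketch, not taken from git or from the message above. It cuts a chunk
wherever a sum over the last WINDOW bytes has a particular bit pattern,
so a boundary depends on nearby content rather than on the absolute file
offset. The names next_chunk_len() and count_chunks() and the constants
WINDOW and MASK are assumptions chosen for illustration; a practical
implementation would use a stronger rolling hash and larger chunk sizes.

```c
#include <stddef.h>
#include <stdint.h>

#define WINDOW 16    /* bytes that decide whether a boundary falls here */
#define MASK   0x3f  /* cut when the low 6 bits of the window sum are set */

/*
 * Return the length of the first chunk of data[0..len).
 * A chunk ends at the first position, at least WINDOW bytes in, where
 * the sum of the last WINDOW bytes has all MASK bits set.  If no such
 * position exists, the chunk is the whole input.
 */
static size_t next_chunk_len(const unsigned char *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		sum += data[i];
		if (i >= WINDOW)
			sum -= data[i - WINDOW];	/* slide the window */
		if (i + 1 >= WINDOW && (sum & MASK) == MASK)
			return i + 1;
	}
	return len;
}

/* Count the chunks a buffer splits into. */
static size_t count_chunks(const unsigned char *data, size_t len)
{
	size_t n = 0;

	while (len) {
		size_t c = next_chunk_len(data, len);

		data += c;
		len -= c;
		n++;
	}
	return n;
}
```

Each chunk could then be stored as its own object, so that a small edit
to a large file would typically change only the chunks near the edit.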


* [PATCH 3/3] cg-admin-rewritehist: seed the map with the parent of the -r arg, not with itself
From: Yann Dirson @ 2006-06-11 12:05 UTC
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060611120431.12116.74005.stgit@gandelf.nowhere.earth>


This is a fix for 95621e54cedef1c4a270af5570a72fc1331b5fcb.

Signed-off-by: Yann Dirson <ydirson@altern.org>
---

 cg-admin-rewritehist |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/cg-admin-rewritehist b/cg-admin-rewritehist
index 7cbdb30..6dd8b92 100755
--- a/cg-admin-rewritehist
+++ b/cg-admin-rewritehist
@@ -157,7 +157,7 @@ while optparse; do
 		git-rev-parse "$OPTARG" >/dev/null || die "Unknown revision '$OPTARG'"
 		git-rev-parse "$OPTARG^" >/dev/null || die "Revision '$OPTARG' does not have parents, check what you really want"
 		startrev="^$OPTARG^ $OPTARG $startrev"
-		startrevparents="$OPTARG $startrevparents"
+		startrevparents="$OPTARG^ $startrevparents"
 	elif optparse --env-filter=; then
 		filter_env="$OPTARG"
 	elif optparse --tree-filter=; then


* [PATCH 2/3] cg-admin-rewritehist: catch errors in -r argument early
From: Yann Dirson @ 2006-06-11 12:04 UTC
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060611120431.12116.74005.stgit@gandelf.nowhere.earth>




Signed-off-by: Yann Dirson <ydirson@altern.org>
---

 cg-admin-rewritehist |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/cg-admin-rewritehist b/cg-admin-rewritehist
index fe3f210..7cbdb30 100755
--- a/cg-admin-rewritehist
+++ b/cg-admin-rewritehist
@@ -154,6 +154,8 @@ while optparse; do
 	if optparse -d=; then
 		tempdir="$OPTARG"
 	elif optparse -r=; then
+		git-rev-parse "$OPTARG" >/dev/null || die "Unknown revision '$OPTARG'"
+		git-rev-parse "$OPTARG^" >/dev/null || die "Revision '$OPTARG' does not have parents, check what you really want"
 		startrev="^$OPTARG^ $OPTARG $startrev"
 		startrevparents="$OPTARG $startrevparents"
 	elif optparse --env-filter=; then


* [PATCH 1/3] cg-admin-rewritehist: catch git-rev-list returning no commit
From: Yann Dirson @ 2006-06-11 12:04 UTC
  To: Petr Baudis; +Cc: git
In-Reply-To: <20060611120431.12116.74005.stgit@gandelf.nowhere.earth>




Signed-off-by: Yann Dirson <ydirson@altern.org>
---

 cg-admin-rewritehist |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/cg-admin-rewritehist b/cg-admin-rewritehist
index 861c7f6..fe3f210 100755
--- a/cg-admin-rewritehist
+++ b/cg-admin-rewritehist
@@ -199,6 +199,10 @@ done
 git-rev-list --topo-order HEAD $startrev | tac >../revs
 commits=$(cat ../revs | wc -l)
 
+if [ $commits -eq 0 ]; then
+    die "Found nothing to rewrite"
+fi
+
 i=0
 while read commit; do
 	i=$((i+1))


* [PATCH 0/3] another series of cg-admin-rewritehist -r fixes
From: Yann Dirson @ 2006-06-11 10:12 UTC
  To: Petr Baudis; +Cc: git

The -r flag is a bit confusing, in that it expects the first revision to rewrite, and I
caught myself feeding it the last revision not to rewrite instead.  Patch #2 catches
this early.  Although Patch #2 should take care of most problems, a non-zero status
returned by a command that is not last in a pipe is not caught, even under "set -e", so
Patch #1 adds an additional safeguard.

Patch #3 corrects seeding of the rewrite map from -r arguments.

-- 
Yann Dirson    <ydirson@altern.org> |
Debian-related: <dirson@debian.org> |   Support Debian GNU/Linux:
                                    |  Freedom, Power, Stability, Gratis
     http://ydirson.free.fr/        | Check <http://www.debian.org/>


* [PATCH] Implement safe_strncpy() as strlcpy() and use it more. [Take 2]
From: Peter Eriksen @ 2006-06-11 12:03 UTC
  To: git

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
---

This time, as René suggested, I've taken strlcpy() from the Linux kernel
lib/string.c.  Is it OK to not include copyright information then?

My other comments from take 1 still apply.

Peter
 
 builtin-log.c      |    2 +-
 builtin-tar-tree.c |    4 ++--
 cache.h            |    2 +-
 config.c           |    6 +++---
 http-fetch.c       |   10 ++++------
 http-push.c        |   10 +++++-----
 ident.c            |    5 ++---
 path.c             |   13 +++++++++----
 sha1_name.c        |    3 +--
 9 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index 29a8851..5b0ea28 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -112,7 +112,7 @@ static void reopen_stdout(struct commit 
 	int len = 0;
 
 	if (output_directory) {
-		strncpy(filename, output_directory, 1010);
+		safe_strncpy(filename, output_directory, 1010);
 		len = strlen(filename);
 		if (filename[len - 1] != '/')
 			filename[len++] = '/';
diff --git a/builtin-tar-tree.c b/builtin-tar-tree.c
index 58a8ccd..f6310b9 100644
--- a/builtin-tar-tree.c
+++ b/builtin-tar-tree.c
@@ -240,8 +240,8 @@ static void write_entry(const unsigned c
 	/* XXX: should we provide more meaningful info here? */
 	sprintf(header.uid, "%07o", 0);
 	sprintf(header.gid, "%07o", 0);
-	strncpy(header.uname, "git", 31);
-	strncpy(header.gname, "git", 31);
+	safe_strncpy(header.uname, "git", sizeof(header.uname));
+	safe_strncpy(header.gname, "git", sizeof(header.gname));
 	sprintf(header.devmajor, "%07o", 0);
 	sprintf(header.devminor, "%07o", 0);
 
diff --git a/cache.h b/cache.h
index d5d7fe4..f630cf4 100644
--- a/cache.h
+++ b/cache.h
@@ -210,7 +210,7 @@ int git_mkstemp(char *path, size_t n, co
 
 int adjust_shared_perm(const char *path);
 int safe_create_leading_directories(char *path);
-char *safe_strncpy(char *, const char *, size_t);
+size_t safe_strncpy(char *, const char *, size_t);
 char *enter_repo(char *path, int strict);
 
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
diff --git a/config.c b/config.c
index c474970..984c75f 100644
--- a/config.c
+++ b/config.c
@@ -280,17 +280,17 @@ int git_default_config(const char *var, 
 	}
 
 	if (!strcmp(var, "user.name")) {
-		strncpy(git_default_name, value, sizeof(git_default_name));
+		safe_strncpy(git_default_name, value, sizeof(git_default_name));
 		return 0;
 	}
 
 	if (!strcmp(var, "user.email")) {
-		strncpy(git_default_email, value, sizeof(git_default_email));
+		safe_strncpy(git_default_email, value, sizeof(git_default_email));
 		return 0;
 	}
 
 	if (!strcmp(var, "i18n.commitencoding")) {
-		strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
+		safe_strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
 		return 0;
 	}
 
diff --git a/http-fetch.c b/http-fetch.c
index d3602b7..da1a7f5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -584,10 +584,8 @@ static void process_alternates_response(
 			// skip 'objects' at end
 			if (okay) {
 				target = xmalloc(serverlen + posn - i - 6);
-				strncpy(target, base, serverlen);
-				strncpy(target + serverlen, data + i,
-					posn - i - 7);
-				target[serverlen + posn - i - 7] = '\0';
+				safe_strncpy(target, base, serverlen + 1);
+				safe_strncpy(target + serverlen, data + i, posn - i - 6);
 				if (get_verbosely)
 					fprintf(stderr,
 						"Also look at %s\n", target);
@@ -728,8 +726,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static int remote_ls(struct alt_base *repo, const char *path, int flags,
diff --git a/http-push.c b/http-push.c
index b39b36b..2d9441e 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1269,8 +1269,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static struct remote_lock *lock_remote(char *path, long timeout)
@@ -1472,7 +1472,7 @@ static void process_ls_object(struct rem
 		return;
 	path += 8;
 	obj_hex = xmalloc(strlen(path));
-	strncpy(obj_hex, path, 2);
+	safe_strncpy(obj_hex, path, 3);
 	strcpy(obj_hex + 2, path + 3);
 	one_remote_object(obj_hex);
 	free(obj_hex);
@@ -2160,8 +2160,8 @@ static void fetch_symref(char *path, cha
 
 	/* If it's a symref, set the refname; otherwise try for a sha1 */
 	if (!strncmp((char *)buffer.buffer, "ref: ", 5)) {
-		*symref = xcalloc(buffer.posn - 5, 1);
-		strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 6);
+		*symref = xmalloc(buffer.posn - 5);
+		safe_strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 5);
 	} else {
 		get_sha1_hex(buffer.buffer, sha1);
 	}
diff --git a/ident.c b/ident.c
index 7c81fe8..7b44cbd 100644
--- a/ident.c
+++ b/ident.c
@@ -71,10 +71,9 @@ int setup_ident(void)
 		len = strlen(git_default_email);
 		git_default_email[len++] = '.';
 		if (he && (domainname = strchr(he->h_name, '.')))
-			strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
+			safe_strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
 		else
-			strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
-		git_default_email[sizeof(git_default_email) - 1] = 0;
+			safe_strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
 	}
 	/* And set the default date */
 	datestamp(git_default_date, sizeof(git_default_date));
diff --git a/path.c b/path.c
index 5168b5f..194e0b5 100644
--- a/path.c
+++ b/path.c
@@ -83,14 +83,19 @@ int git_mkstemp(char *path, size_t len, 
 }
 
 
-char *safe_strncpy(char *dest, const char *src, size_t n)
+size_t safe_strncpy(char *dest, const char *src, size_t size)
 {
-	strncpy(dest, src, n);
-	dest[n - 1] = '\0';
+	size_t ret = strlen(src);
 
-	return dest;
+	if (size) {
+		size_t len = (ret >= size) ? size - 1 : ret;
+		memcpy(dest, src, len);
+		dest[len] = '\0';
+	}
+	return ret;
 }
 
+
 int validate_symref(const char *path)
 {
 	struct stat st;
diff --git a/sha1_name.c b/sha1_name.c
index fbbde1c..8fe9b7a 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -262,8 +262,7 @@ static int get_sha1_basic(const char *st
 		if (str[am] == '@' && str[am+1] == '{' && str[len-1] == '}') {
 			int date_len = len - am - 3;
 			char *date_spec = xmalloc(date_len + 1);
-			strncpy(date_spec, str + am + 2, date_len);
-			date_spec[date_len] = 0;
+			safe_strncpy(date_spec, str + am + 2, date_len + 1);
 			at_time = approxidate(date_spec);
 			free(date_spec);
 			len = am;
-- 
1.3.3.g16a4


* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Rene Scharfe @ 2006-06-11 11:17 UTC
  To: Peter Eriksen; +Cc: git
In-Reply-To: <20060611103358.GB10430@bohr.gbar.dtu.dk>

Peter Eriksen wrote:
> On Sun, Jun 11, 2006 at 07:15:40PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote:
>> In article <20060611100628.GA10430@bohr.gbar.dtu.dk> (at Sun, 11 Jun 2006 12:06:28 +0200), "Peter Eriksen" <s022018@student.dtu.dk> says:
>>
>>> I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
>>> allowed?  If it is, how should it be stated?
>> Please include full copyright information.
> 
> Where should this information go?  Just above the function
> safe_strncpy(), or at the top of path.c?  I believe path.c is GPL, so
> can this be mixed freely with BSD licensed code?  Should I put
> safe_strncpy() into a separate file as with strlcpy()?

Yes...  Or you could avoid all of this by using a GPL'd version, like
the one from the Linux kernel (in lib/string.c).

René


* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Peter Eriksen @ 2006-06-11 10:33 UTC
  To: git
In-Reply-To: <20060611.191540.68073375.yoshfuji@linux-ipv6.org>

On Sun, Jun 11, 2006 at 07:15:40PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <20060611100628.GA10430@bohr.gbar.dtu.dk> (at Sun, 11 Jun 2006 12:06:28 +0200), "Peter Eriksen" <s022018@student.dtu.dk> says:
> 
> > I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
> > allowed?  If it is, how should it be stated?
> 
> Please include full copyright information.

Where should this information go?  Just above the function
safe_strncpy(), or at the top of path.c?  I believe path.c is GPL, so
can this be mixed freely with BSD licensed code?  Should I put
safe_strncpy() into a separate file as with strlcpy()?
This seems to be the copyright information:

/*
 * Copyright (c) 1998 Todd C. Miller <Todd.Miller@courtesan.com>
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

Peter


* Re: [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2006-06-11 10:15 UTC
  To: s022018; +Cc: git, yoshfuji
In-Reply-To: <20060611100628.GA10430@bohr.gbar.dtu.dk>

In article <20060611100628.GA10430@bohr.gbar.dtu.dk> (at Sun, 11 Jun 2006 12:06:28 +0200), "Peter Eriksen" <s022018@student.dtu.dk> says:

> I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
> allowed?  If it is, how should it be stated?

Please include full copyright information.

--yoshfuji


* [PATCH] Implement safe_strncpy() as strlcpy() and use it more.
From: Peter Eriksen @ 2006-06-11 10:06 UTC
  To: git

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
---

I've taken strlcpy() from the OpenBSD CVS without attribution.  Is this
allowed?  If it is, how should it be stated?

I think this fixes some small errors, but might introduce some new ones.
I've tried to be very careful, but this really needs some more eyeballs.
What do you think?

Peter

P.S. By the way, the diff of safe_strncpy() isn't so pretty, because
what I really did was replace the entire function, not edit it.

 builtin-log.c      |    2 +-
 builtin-tar-tree.c |    4 ++--
 cache.h            |    2 +-
 config.c           |    6 +++---
 http-fetch.c       |   10 ++++------
 http-push.c        |   10 +++++-----
 ident.c            |    5 ++---
 path.c             |   31 +++++++++++++++++++++++++++----
 sha1_name.c        |    3 +--
 9 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index 29a8851..5b0ea28 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -112,7 +112,7 @@ static void reopen_stdout(struct commit 
 	int len = 0;
 
 	if (output_directory) {
-		strncpy(filename, output_directory, 1010);
+		safe_strncpy(filename, output_directory, 1010);
 		len = strlen(filename);
 		if (filename[len - 1] != '/')
 			filename[len++] = '/';
diff --git a/builtin-tar-tree.c b/builtin-tar-tree.c
index 58a8ccd..f6310b9 100644
--- a/builtin-tar-tree.c
+++ b/builtin-tar-tree.c
@@ -240,8 +240,8 @@ static void write_entry(const unsigned c
 	/* XXX: should we provide more meaningful info here? */
 	sprintf(header.uid, "%07o", 0);
 	sprintf(header.gid, "%07o", 0);
-	strncpy(header.uname, "git", 31);
-	strncpy(header.gname, "git", 31);
+	safe_strncpy(header.uname, "git", sizeof(header.uname));
+	safe_strncpy(header.gname, "git", sizeof(header.gname));
 	sprintf(header.devmajor, "%07o", 0);
 	sprintf(header.devminor, "%07o", 0);
 
diff --git a/cache.h b/cache.h
index d5d7fe4..f630cf4 100644
--- a/cache.h
+++ b/cache.h
@@ -210,7 +210,7 @@ int git_mkstemp(char *path, size_t n, co
 
 int adjust_shared_perm(const char *path);
 int safe_create_leading_directories(char *path);
-char *safe_strncpy(char *, const char *, size_t);
+size_t safe_strncpy(char *, const char *, size_t);
 char *enter_repo(char *path, int strict);
 
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
diff --git a/config.c b/config.c
index c474970..984c75f 100644
--- a/config.c
+++ b/config.c
@@ -280,17 +280,17 @@ int git_default_config(const char *var, 
 	}
 
 	if (!strcmp(var, "user.name")) {
-		strncpy(git_default_name, value, sizeof(git_default_name));
+		safe_strncpy(git_default_name, value, sizeof(git_default_name));
 		return 0;
 	}
 
 	if (!strcmp(var, "user.email")) {
-		strncpy(git_default_email, value, sizeof(git_default_email));
+		safe_strncpy(git_default_email, value, sizeof(git_default_email));
 		return 0;
 	}
 
 	if (!strcmp(var, "i18n.commitencoding")) {
-		strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
+		safe_strncpy(git_commit_encoding, value, sizeof(git_commit_encoding));
 		return 0;
 	}
 
diff --git a/http-fetch.c b/http-fetch.c
index d3602b7..da1a7f5 100644
--- a/http-fetch.c
+++ b/http-fetch.c
@@ -584,10 +584,8 @@ static void process_alternates_response(
 			// skip 'objects' at end
 			if (okay) {
 				target = xmalloc(serverlen + posn - i - 6);
-				strncpy(target, base, serverlen);
-				strncpy(target + serverlen, data + i,
-					posn - i - 7);
-				target[serverlen + posn - i - 7] = '\0';
+				safe_strncpy(target, base, serverlen + 1);
+				safe_strncpy(target + serverlen, data + i, posn - i - 6);
 				if (get_verbosely)
 					fprintf(stderr,
 						"Also look at %s\n", target);
@@ -728,8 +726,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static int remote_ls(struct alt_base *repo, const char *path, int flags,
diff --git a/http-push.c b/http-push.c
index b39b36b..2d9441e 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1269,8 +1269,8 @@ xml_cdata(void *userData, const XML_Char
 	struct xml_ctx *ctx = (struct xml_ctx *)userData;
 	if (ctx->cdata)
 		free(ctx->cdata);
-	ctx->cdata = xcalloc(len+1, 1);
-	strncpy(ctx->cdata, s, len);
+	ctx->cdata = xmalloc(len + 1);
+	safe_strncpy(ctx->cdata, s, len + 1);
 }
 
 static struct remote_lock *lock_remote(char *path, long timeout)
@@ -1472,7 +1472,7 @@ static void process_ls_object(struct rem
 		return;
 	path += 8;
 	obj_hex = xmalloc(strlen(path));
-	strncpy(obj_hex, path, 2);
+	safe_strncpy(obj_hex, path, 3);
 	strcpy(obj_hex + 2, path + 3);
 	one_remote_object(obj_hex);
 	free(obj_hex);
@@ -2160,8 +2160,8 @@ static void fetch_symref(char *path, cha
 
 	/* If it's a symref, set the refname; otherwise try for a sha1 */
 	if (!strncmp((char *)buffer.buffer, "ref: ", 5)) {
-		*symref = xcalloc(buffer.posn - 5, 1);
-		strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 6);
+		*symref = xmalloc(buffer.posn - 5);
+		safe_strncpy(*symref, (char *)buffer.buffer + 5, buffer.posn - 5);
 	} else {
 		get_sha1_hex(buffer.buffer, sha1);
 	}
diff --git a/ident.c b/ident.c
index 7c81fe8..7b44cbd 100644
--- a/ident.c
+++ b/ident.c
@@ -71,10 +71,9 @@ int setup_ident(void)
 		len = strlen(git_default_email);
 		git_default_email[len++] = '.';
 		if (he && (domainname = strchr(he->h_name, '.')))
-			strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
+			safe_strncpy(git_default_email + len, domainname + 1, sizeof(git_default_email) - len);
 		else
-			strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
-		git_default_email[sizeof(git_default_email) - 1] = 0;
+			safe_strncpy(git_default_email + len, "(none)", sizeof(git_default_email) - len);
 	}
 	/* And set the default date */
 	datestamp(git_default_date, sizeof(git_default_date));
diff --git a/path.c b/path.c
index 5168b5f..86f51e0 100644
--- a/path.c
+++ b/path.c
@@ -83,14 +83,37 @@ int git_mkstemp(char *path, size_t len, 
 }
 
 
-char *safe_strncpy(char *dest, const char *src, size_t n)
+/*
+ * Copy src to string dst of size siz.  At most siz-1 characters
+ * will be copied.  Always NUL terminates (unless siz == 0).
+ * Returns strlen(src); if retval >= siz, truncation occurred.
+ */
+size_t safe_strncpy(char *dst, const char *src, size_t siz)
 {
-	strncpy(dest, src, n);
-	dest[n - 1] = '\0';
+	char *d = dst;
+	const char *s = src;
+	size_t n = siz;
 
-	return dest;
+	/* Copy as many bytes as will fit */
+	if (n != 0) {
+		while (--n != 0) {
+			if ((*d++ = *s++) == '\0')
+				break;
+		}
+	}
+
+	/* Not enough room in dst, add NUL and traverse rest of src */
+	if (n == 0) {
+		if (siz != 0)
+			*d = '\0';		/* NUL-terminate dst */
+		while (*s++)
+			;
+	}
+
+	return(s - src - 1);	/* count does not include NUL */
 }
 
+
 int validate_symref(const char *path)
 {
 	struct stat st;
diff --git a/sha1_name.c b/sha1_name.c
index fbbde1c..8fe9b7a 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -262,8 +262,7 @@ static int get_sha1_basic(const char *st
 		if (str[am] == '@' && str[am+1] == '{' && str[len-1] == '}') {
 			int date_len = len - am - 3;
 			char *date_spec = xmalloc(date_len + 1);
-			strncpy(date_spec, str + am + 2, date_len);
-			date_spec[date_len] = 0;
+			safe_strncpy(date_spec, str + am + 2, date_len + 1);
 			at_time = approxidate(date_spec);
 			free(date_spec);
 			len = am;
-- 
1.3.3.g16a4


* [PATCH] cvsimport: complete the cvsps run before starting the import - take 2
From: Martin Langhoff @ 2006-06-11  8:12 UTC
  To: junkio, git; +Cc: Martin Langhoff

On 5/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> It's entirely possible that the fact that it now seems to work for me is
> purely timing-related, since I also ended up using "-P cvsps-output" to
> avoid having a huge cvsps binary in memory at the same time.

We now capture the output of cvsps to a tempfile, and then read it in.
cvsps 2.1 works quite a bit "in memory", and only prints its patchset info
once it has finished talking with cvs, but apparently retains all of that
allocated memory. With this patch, cvsps has finished and been reaped
before cvsimport starts working (and growing), so the footprint of the
whole process is much lower.

Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
---

This is a more reliable implementation, which fork/execs and passes the cvsps
output into the tempfile.
---
 git-cvsimport.perl |   42 ++++++++++++++++++++++++++++--------------
 1 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 07b3203..9a7408b 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -529,25 +529,39 @@ if ($opt_A) {
 	write_author_info("$git_dir/cvs-authors");
 }
 
-my $pid = open(CVS,"-|");
-die "Cannot fork: $!\n" unless defined $pid;
-unless($pid) {
-	my @opt;
-	@opt = split(/,/,$opt_p) if defined $opt_p;
-	unshift @opt, '-z', $opt_z if defined $opt_z;
-	unshift @opt, '-q'         unless defined $opt_v;
-	unless (defined($opt_p) && $opt_p =~ m/--no-cvs-direct/) {
-		push @opt, '--cvs-direct';
+
+#
+# run cvsps into a file unless we are getting
+# it passed as a file via $opt_P
+#
+unless ($opt_P) {
+	print "Running cvsps...\n" if $opt_v;
+	my $pid = open(CVSPS,"-|");
+	die "Cannot fork: $!\n" unless defined $pid;
+	unless($pid) {
+		my @opt;
+		@opt = split(/,/,$opt_p) if defined $opt_p;
+		unshift @opt, '-z', $opt_z if defined $opt_z;
+		unshift @opt, '-q'         unless defined $opt_v;
+		unless (defined($opt_p) && $opt_p =~ m/--no-cvs-direct/) {
+			push @opt, '--cvs-direct';
+		}
+		exec("cvsps","--norc",@opt,"-u","-A",'--root',$opt_d,$cvs_tree);
+		die "Could not start cvsps: $!\n";
 	}
-	if ($opt_P) {
-	    exec("cat", $opt_P);
-	} else {
-	    exec("cvsps","--norc",@opt,"-u","-A",'--root',$opt_d,$cvs_tree);
-	    die "Could not start cvsps: $!\n";
+	my ($cvspsfh, $cvspsfile) = tempfile('gitXXXXXX', SUFFIX => '.cvsps',
+					     DIR => File::Spec->tmpdir());
+	while (<CVSPS>) {
+	    print $cvspsfh $_;
 	}
+	close CVSPS;
+	close $cvspsfh;
+	$opt_P = $cvspsfile;
 }
 
 
+open(CVS, "<$opt_P") or die $!;
+
 ## cvsps output:
 #---------------------
 #PatchSet 314
-- 
1.4.0.gcda2


* [PATCH] cvsimport: ignore CVSPS_NO_BRANCH and impossible branches
From: Martin Langhoff @ 2006-06-11  8:12 UTC
  To: junkio, git; +Cc: Martin Langhoff

cvsps output often contains references to CVSPS_NO_BRANCH, commits that it
could not trace to a branch. Ignore that branch.

Additionally, cvsps will sometimes draw circular relationships between
branches -- where two branches are recorded as opening from the other.
In those cases, and where the ancestor branch hasn't been seen, ignore
it.
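
The bookkeeping described above can be sketched roughly as follows. This
is a minimal illustration in Python rather than the Perl of the patch, and
the function name, the (id, branch, ancestor) tuples and the in-memory
"heads" set are assumptions made for the example, not part of
git-cvsimport:

```python
def plan_import(patchsets, known_heads, default_branch="origin"):
    """Decide which patchsets get imported.

    patchsets: iterable of (patchset_id, branch, ancestor_or_None).
    known_heads: branch names that already exist before the import.
    Hypothetical helper: mirrors the skip rules only, not the real importer.
    """
    ignored = {"#CVSPS_NO_BRANCH"}      # commits cvsps could not place
    heads = set(known_heads)
    imported = []
    for patchset_id, branch, ancestor in patchsets:
        if branch in ignored:
            continue                    # skip everything on an ignored branch
        if ancestor:
            if ancestor == branch:
                ancestor = default_branch   # branch "stems from itself"
            if branch in heads:
                continue                # branch already exists: skip patchset
            if ancestor not in heads:
                ignored.add(branch)     # unseen ancestor: ignore from now on
                continue
            heads.add(branch)           # branch opened from a known ancestor
        imported.append(patchset_id)
    return imported, ignored
```

Once a branch lands in the ignored set, every later patchset on it is
skipped without re-checking the ancestor, which is the point of keeping
the set around.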
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
---
 git-cvsimport.perl |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 76f6246..07b3203 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -595,7 +595,11 @@ sub write_tree () {
 }
 
 my($patchset,$date,$author_name,$author_email,$branch,$ancestor,$tag,$logmsg);
-my(@old,@new,@skipped);
+my(@old,@new,@skipped,%ignorebranch);
+
+# commits that cvsps cannot place anywhere...
+$ignorebranch{'#CVSPS_NO_BRANCH'} = 1; 
+
 sub commit {
 	update_index(@old, @new);
 	@old = @new = ();
@@ -751,7 +755,16 @@ while(<CVS>) {
 			$state = 11;
 			next;
 		}
+		if (exists $ignorebranch{$branch}) {
+			print STDERR "Skipping $branch\n";
+			$state = 11;
+			next;
+		}
 		if($ancestor) {
+			if($ancestor eq $branch) {
+				print STDERR "Branch $branch erroneously stems from itself -- changed ancestor to $opt_o\n";
+				$ancestor = $opt_o;
+			}
 			if(-f "$git_dir/refs/heads/$branch") {
 				print STDERR "Branch $branch already exists!\n";
 				$state=11;
@@ -759,6 +772,7 @@ while(<CVS>) {
 			}
 			unless(open(H,"$git_dir/refs/heads/$ancestor")) {
 				print STDERR "Branch $ancestor does not exist!\n";
+				$ignorebranch{$branch} = 1;
 				$state=11;
 				next;
 			}
@@ -766,6 +780,7 @@ while(<CVS>) {
 			close(H);
 			unless(open(H,"> $git_dir/refs/heads/$branch")) {
 				print STDERR "Could not create branch $branch: $!\n";
+				$ignorebranch{$branch} = 1;
 				$state=11;
 				next;
 			}
-- 
1.4.0.gcda2


* [PATCH 4/5] git-svn: restore original LC_ALL setting (or unset) for commit
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Eric Wong
In-Reply-To: <11500094292561-git-send-email-normalperson@yhbt.net>

svn forces UTF-8 for commit messages, and with LC_ALL set to 'C'
it is unable to determine encoding of the git commit message.

Now we'll just assume the user has set LC_* correctly for
the commit message they're using.
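
The save/restore pattern is roughly the following. This is a sketch in
Python rather than Perl; the helper names are made up for illustration and
the command passed in is whatever the caller wants to run under the user's
own locale:

```python
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def original_lc_all(saved, forced="C"):
    """Temporarily put LC_ALL back to what the user had (or unset it).

    `saved` is the value captured before LC_ALL was forced to `forced`;
    None means the variable was not set at all.
    """
    if saved is None:
        os.environ.pop("LC_ALL", None)
    else:
        os.environ["LC_ALL"] = saved
    try:
        yield
    finally:
        # Go back to the forced locale so later output parsing stays stable.
        os.environ["LC_ALL"] = forced


def run_with_user_locale(cmd, saved):
    """Run `cmd` (a list of arguments) with the user's original LC_ALL."""
    with original_lc_all(saved):
        return subprocess.run(cmd, capture_output=True, text=True)
```

The value would be captured once at startup, before forcing 'C', and
passed in as `saved` for the one command that needs the real locale.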

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/git-svn.perl |   34 +++++++++++++++++++++++-----------
 1 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index 8d2e7f7..8bc3d69 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -14,6 +14,7 @@ use Cwd qw/abs_path/;
 $GIT_DIR = abs_path($ENV{GIT_DIR} || '.git');
 $ENV{GIT_DIR} = $GIT_DIR;
 
+my $LC_ALL = $ENV{LC_ALL};
 # make sure the svn binary gives consistent output between locales and TZs:
 $ENV{TZ} = 'UTC';
 $ENV{LC_ALL} = 'C';
@@ -704,23 +705,34 @@ sub svn_commit_tree {
 	my ($oneline) = ($log_msg{msg} =~ /([^\n\r]+)/);
 	print "Committing $commit: $oneline\n";
 
+	if (defined $LC_ALL) {
+		$ENV{LC_ALL} = $LC_ALL;
+	} else {
+		delete $ENV{LC_ALL};
+	}
 	my @ci_output = safe_qx(qw(svn commit -F),$commit_msg);
-	my ($committed) = grep(/^Committed revision \d+\./,@ci_output);
+	$ENV{LC_ALL} = 'C';
 	unlink $commit_msg;
-	defined $committed or croak
+	my ($committed) = ($ci_output[$#ci_output] =~ /(\d+)/);
+	if (!defined $committed) {
+		my $out = join("\n",@ci_output);
+		print STDERR "W: Trouble parsing \`svn commit' output:\n\n",
+				$out, "\n\nAssuming English locale...";
+		($committed) = ($out =~ /^Committed revision (\d+)\./sm);
+		defined $committed or die " FAILED!\n",
 			"Commit output failed to parse committed revision!\n",
-			join("\n",@ci_output),"\n";
-	my ($rev_committed) = ($committed =~ /^Committed revision (\d+)\./);
+		print STDERR " OK\n";
+	}
 
 	my @svn_up = qw(svn up);
 	push @svn_up, '--ignore-externals' unless $_no_ignore_ext;
-	if ($rev_committed == ($svn_rev + 1)) {
-		push @svn_up, "-r$rev_committed";
+	if ($committed == ($svn_rev + 1)) {
+		push @svn_up, "-r$committed";
 		sys(@svn_up);
 		my $info = svn_info('.');
 		my $date = $info->{'Last Changed Date'} or die "Missing date\n";
-		if ($info->{'Last Changed Rev'} != $rev_committed) {
-			croak "$info->{'Last Changed Rev'} != $rev_committed\n"
+		if ($info->{'Last Changed Rev'} != $committed) {
+			croak "$info->{'Last Changed Rev'} != $committed\n"
 		}
 		my ($Y,$m,$d,$H,$M,$S,$tz) = ($date =~
 					/(\d{4})\-(\d\d)\-(\d\d)\s
@@ -728,16 +740,16 @@ sub svn_commit_tree {
 					 or croak "Failed to parse date: $date\n";
 		$log_msg{date} = "$tz $Y-$m-$d $H:$M:$S";
 		$log_msg{author} = $info->{'Last Changed Author'};
-		$log_msg{revision} = $rev_committed;
+		$log_msg{revision} = $committed;
 		$log_msg{msg} .= "\n";
 		my $parent = file_to_s("$REV_DIR/$svn_rev");
 		git_commit(\%log_msg, $parent, $commit);
-		return $rev_committed;
+		return $committed;
 	}
 	# resync immediately
 	push @svn_up, "-r$svn_rev";
 	sys(@svn_up);
-	return fetch("$rev_committed=$commit")->{revision};
+	return fetch("$committed=$commit")->{revision};
 }
 
 # read the entire log into a temporary file (which is removed ASAP)
-- 
1.3.3.g2dc7b-dirty


* [PATCH 5/5] git-svn: don't allow commit if svn tree is not current
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Eric Wong
In-Reply-To: <11500094313384-git-send-email-normalperson@yhbt.net>

If new revisions are fetched, that implies we haven't merged,
acked, or nacked them yet, and attempting to write the tree
we're committing means we'd silently clobber the newly fetched
changes.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/git-svn.perl |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index 8bc3d69..72129de 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -309,9 +309,16 @@ sub commit {
 	}
 	chomp @revs;
 
-	fetch();
-	chdir $SVN_WC or croak $!;
+	chdir $SVN_WC or croak "Unable to chdir $SVN_WC: $!\n";
 	my $info = svn_info('.');
+	my $fetched = fetch();
+	if ($info->{Revision} != $fetched->{revision}) {
+		print STDERR "There are new revisions that were fetched ",
+				"and need to be merged (or acknowledged) ",
+				"before committing.\n";
+		exit 1;
+	}
+	$info = svn_info('.');
 	read_uuid($info);
 	my $svn_current_rev =  $info->{'Last Changed Rev'};
 	foreach my $c (@revs) {
-- 
1.3.3.g2dc7b-dirty


* [PATCH] git-svn: bug fixes (some resends)
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git


[PATCH 1/5] git-svn: t0000: add -f flag to checkout
[PATCH 2/5] git-svn: fix handling of filenames with embedded '@'
	These two are resends; patch 2 only affects 1.1.0-pre.

[PATCH 3/5] git-svn: eol_cp corner-case fixes
	Kinda urgent (but only affects 1.1.0-pre)

[PATCH 4/5] git-svn: restore original LC_ALL setting (or unset) for commit
	For people that want to commit UTF-8 commit messages.

[PATCH 5/5] git-svn: don't allow commit if svn tree is not current
	Extra sanity check, just in case.


* [PATCH 3/5] git-svn: eol_cp corner-case fixes
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Eric Wong
In-Reply-To: <11500094281515-git-send-email-normalperson@yhbt.net>

If we read the maximum size of our buffer into $buf, and the
last character is '\015', there's a chance that the next character
is '\012', which means our regex won't work correctly.  In the
worst case, this could introduce an extra newline into the code.
We'll now read an extra character if we see '\015' is the last
character in $buf.

We also forgot to recalculate the length of $buf after doing the
newline substitution, causing some files to appear truncated.
We'll do that now and force byte semantics in length() for good
measure.
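
To illustrate the boundary problem, here is a small sketch in Python (not
the Perl in the patch; the function name and the tiny chunk size in the
example are assumptions chosen to make the effect easy to see). The patch
reads a single extra character; the loop below keeps reading while the
buffer still ends in '\015', which also covers several in a row:

```python
import io
import re

# Longest alternative first, so CRLF is treated as one line ending.
EOL_RE = re.compile(rb"\r\n|\r|\n")


def eol_copy(src, dst, eol, chunk_size=4096):
    """Copy src to dst, rewriting every line ending to `eol` (bytes)."""
    while True:
        buf = src.read(chunk_size)
        if not buf:
            return
        # A chunk ending in CR may be the first half of a CRLF pair:
        # pull in one more byte until the chunk no longer ends in CR.
        while buf.endswith(b"\r"):
            extra = src.read(1)
            if not extra:
                break
            buf += extra
        # The length is taken from the converted buffer, not from the read.
        dst.write(EOL_RE.sub(lambda m: eol, buf))


if __name__ == "__main__":
    out = io.BytesIO()
    eol_copy(io.BytesIO(b"a\r\nb\rc\nd\r\n"), out, b"\n", chunk_size=2)
    print(out.getvalue())
```

Without the extra read, a two-byte chunk "a\r" followed by "\nb" would be
converted as two separate line endings, which is the stray newline
described above.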

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/git-svn.perl |   15 +++++++++++----
 1 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index 7ed11ef..8d2e7f7 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -866,19 +866,26 @@ sub eol_cp {
 	binmode $wfd or croak $!;
 
 	my $eol = $EOL{$es} or undef;
-	if ($eol) {
-		print  "$eol: $from => $to\n";
-	}
 	my $buf;
+	use bytes;
 	while (1) {
 		my ($r, $w, $t);
 		defined($r = sysread($rfd, $buf, 4096)) or croak $!;
 		return unless $r;
-		$buf =~ s/(?:\015|\012|\015\012)/$eol/gs if $eol;
+		if ($eol) {
+			if ($buf =~ /\015$/) {
+				my $c;
+				defined($r = sysread($rfd,$c,1)) or croak $!;
+				$buf .= $c if $r > 0;
+			}
+			$buf =~ s/(?:\015\012|\015|\012)/$eol/gs;
+			$r = length($buf);
+		}
 		for ($w = 0; $w < $r; $w += $t) {
 			$t = syswrite($wfd, $buf, $r - $w, $w) or croak $!;
 		}
 	}
+	no bytes;
 }
 
 sub do_update_index {
-- 
1.3.3.g2dc7b-dirty


* [PATCH 1/5] git-svn: t0000: add -f flag to checkout
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Eric Wong
In-Reply-To: <11500094252972-git-send-email-normalperson@yhbt.net>

Some changes to the latest git.git made this test croak.  So
we'll always just force everything when using a new branch.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/t/t0000-contrib-git-svn.sh |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/contrib/git-svn/t/t0000-contrib-git-svn.sh b/contrib/git-svn/t/t0000-contrib-git-svn.sh
index 8b3a0d9..a07fbad 100644
--- a/contrib/git-svn/t/t0000-contrib-git-svn.sh
+++ b/contrib/git-svn/t/t0000-contrib-git-svn.sh
@@ -32,7 +32,7 @@ test_expect_success \
 
 
 name='try a deep --rmdir with a commit'
-git checkout -b mybranch remotes/git-svn
+git checkout -f -b mybranch remotes/git-svn
 mv dir/a/b/c/d/e/file dir/file
 cp dir/file file
 git update-index --add --remove dir/a/b/c/d/e/file dir/file file
@@ -58,7 +58,7 @@ test_expect_code 1 "$name" \
 
 name='detect node change from directory to file #1'
 rm -rf dir $GIT_DIR/index
-git checkout -b mybranch2 remotes/git-svn
+git checkout -f -b mybranch2 remotes/git-svn
 mv bar/zzz zzz
 rm -rf bar
 mv zzz bar
@@ -73,7 +73,7 @@ test_expect_code 1 "$name" \
 
 name='detect node change from file to directory #2'
 rm -f $GIT_DIR/index
-git checkout -b mybranch3 remotes/git-svn
+git checkout -f -b mybranch3 remotes/git-svn
 rm bar/zzz
 git-update-index --remove bar/zzz
 mkdir bar/zzz
@@ -88,7 +88,7 @@ test_expect_code 1 "$name" \
 
 name='detect node change from directory to file #2'
 rm -f $GIT_DIR/index
-git checkout -b mybranch4 remotes/git-svn
+git checkout -f -b mybranch4 remotes/git-svn
 rm -rf dir
 git update-index --remove -- dir/file
 touch dir
@@ -103,7 +103,7 @@ test_expect_code 1 "$name" \
 
 name='remove executable bit from a file'
 rm -f $GIT_DIR/index
-git checkout -b mybranch5 remotes/git-svn
+git checkout -f -b mybranch5 remotes/git-svn
 chmod -x exec.sh
 git update-index exec.sh
 git commit -m "$name"
-- 
1.3.3.g2dc7b-dirty


* [PATCH 2/5] git-svn: fix handling of filenames with embedded '@'
From: Eric Wong @ 2006-06-11  7:03 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Eric Wong
In-Reply-To: <11500094271080-git-send-email-normalperson@yhbt.net>

svn has trouble parsing files with embedded '@' characters.  For
example,

  svn propget svn:keywords foo@bar.c
  svn: Syntax error parsing revision 'bar.c'

I asked about this on #svn and the workaround suggested was to append
an explicit revision specifier:

  svn propget svn:keywords foo@bar.c@BASE

This patch appends '@BASE' to the filename in all calls to 'svn
propget'.

Patch originally by Seth Falcon <sethfalcon@gmail.com>
Seth: signoff?

[ew: Made to work with older svn that don't support peg revisions]
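
In outline, the logic is: check once whether 'svn propget -h' advertises
TARGET[@REV], and only then append '@BASE' to the filename. A minimal
sketch in Python (the function names are hypothetical, and the help text
is passed in as a string rather than obtained by running svn):

```python
import re

# Same pattern the patch greps for in the `svn propget -h` output.
PEG_REV_RE = re.compile(r"\[TARGET\[@REV\]\.\.\.\]")


def supports_peg_revs(propget_help):
    """True if the help text says propget accepts TARGET[@REV]."""
    return PEG_REV_RE.search(propget_help) is not None


def propget_args(prop, path, peg_revs_supported):
    """Build the argument list for `svn propget`.

    With peg revision support, '@BASE' is appended so that an '@'
    embedded in the filename is not parsed as a revision specifier.
    """
    target = path + "@BASE" if peg_revs_supported else path
    return ["svn", "propget", prop, target]
```

On an svn without peg revision support the filename is passed through
unchanged, which is the old behaviour.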

Signed-off-by: Eric Wong <normalperson@yhbt.net>
---
 contrib/git-svn/git-svn.perl |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/contrib/git-svn/git-svn.perl b/contrib/git-svn/git-svn.perl
index aac8779..7ed11ef 100755
--- a/contrib/git-svn/git-svn.perl
+++ b/contrib/git-svn/git-svn.perl
@@ -34,7 +34,7 @@ my $sha1_short = qr/[a-f\d]{4,40}/;
 my ($_revision,$_stdin,$_no_ignore_ext,$_no_stop_copy,$_help,$_rmdir,$_edit,
 	$_find_copies_harder, $_l, $_version, $_upgrade, $_authors);
 my (@_branch_from, %tree_map, %users);
-my $_svn_co_url_revs;
+my ($_svn_co_url_revs, $_svn_pg_peg_revs);
 
 my %fc_opts = ( 'no-ignore-externals' => \$_no_ignore_ext,
 		'branch|b=s' => \@_branch_from,
@@ -336,7 +336,7 @@ sub show_ignore {
 	my %ign;
 	File::Find::find({wanted=>sub{if(lstat $_ && -d _ && -d "$_/.svn"){
 		s#^\./##;
-		@{$ign{$_}} = safe_qx(qw(svn propget svn:ignore),$_);
+		@{$ign{$_}} = svn_propget_base('svn:ignore', $_);
 		}}, no_chdir=>1},'.');
 
 	print "\n# /\n";
@@ -859,7 +859,7 @@ sub sys { system(@_) == 0 or croak $? }
 
 sub eol_cp {
 	my ($from, $to) = @_;
-	my $es = safe_qx(qw/svn propget svn:eol-style/, $to);
+	my $es = svn_propget_base('svn:eol-style', $to);
 	open my $rfd, '<', $from or croak $!;
 	binmode $rfd or croak $!;
 	open my $wfd, '>', $to or croak $!;
@@ -897,7 +897,7 @@ sub do_update_index {
 	while (my $x = <$p>) {
 		chomp $x;
 		if (!$no_text_base && lstat $x && ! -l _ &&
-				safe_qx(qw/svn propget svn:keywords/,$x)) {
+				svn_propget_base('svn:keywords', $x)) {
 			my $mode = -x _ ? 0755 : 0644;
 			my ($v,$d,$f) = File::Spec->splitpath($x);
 			my $tb = File::Spec->catfile($d, '.svn', 'tmp',
@@ -1135,6 +1135,9 @@ sub svn_compat_check {
 	if (grep /usage: checkout URL\[\@REV\]/,@co_help) {
 		$_svn_co_url_revs = 1;
 	}
+	if (grep /\[TARGET\[\@REV\]\.\.\.\]/, `svn propget -h`) {
+		$_svn_pg_peg_revs = 1;
+	}
 
 	# I really, really hope nobody hits this...
 	unless (grep /stop-on-copy/, (safe_qx(qw(svn log -h)))) {
@@ -1214,6 +1217,12 @@ sub load_authors {
 	close $authors or croak $!;
 }
 
+sub svn_propget_base {
+	my ($p, $f) = @_;
+	$f .= '@BASE' if $_svn_pg_peg_revs;
+	return safe_qx(qw/svn propget/, $p, $f);
+}
+
 __END__
 
 Data structures:
-- 
1.3.3.g2dc7b-dirty


* Re: gitweb.cgi history not shown
From: Marco Costalba @ 2006-06-11  6:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: junkio, git
In-Reply-To: <Pine.LNX.4.64.0606102248360.5498@g5.osdl.org>

>
> Now, look what happens if you instead of starting the history search from
> all the _current_ heads, you start it from a location that actually _had_
> that file:
>
>         git log 1130ef362fc8d9c3422c23f5d5 -- gitweb.cgi
>
> and suddenly there the history is - in all its glory.
>

Why do I still get empty results if I run git-rev-list from the gitweb merge point?

$ git-rev-list 0a8f4f0020cb35095005852c0797f0b90e9ebb74 -- gitweb.cgi
$
$ git-rev-list 0a8f4f0020cb35095005852c0797f0b90e9ebb74 -- gitweb/gitweb.cgi
0a8f4f0020cb35095005852c0797f0b90e9ebb74

Is this because the path changed: gitweb.cgi -> gitweb/gitweb.cgi?

I would like to think the problem is the path change, because in the
case of gitk, a merge of a parallel branch but with _no_ path change,
everything worked as expected.

So the question is whether the path change was "fixed up" by hand or
done as part of the gitweb branch merge process. In the latter case,
git-rev-list should probably already take this into account without
special flags _and_ without removing the history traversal
optimizations that are good and useful in the remaining 99% of cases
(for a GUI tool it is difficult to know when to use a flag like
--no-simplify-merge or not on a per-request basis).

        Marco


* Re: gitweb.cgi history not shown
From: Linus Torvalds @ 2006-06-11  6:02 UTC (permalink / raw)
  To: Marco Costalba; +Cc: junkio, git
In-Reply-To: <e5bfff550606102231o756f6d11lc46fecdad29568c0@mail.gmail.com>



On Sun, 11 Jun 2006, Marco Costalba wrote:
>
> What am I doing wrong?
> 
> $ git-rev-list --all -- gitweb/gitweb.cgi
> 0a8f4f0020cb35095005852c0797f0b90e9ebb74
> $ git-rev-list --all -- gitweb.cgi
> $

[ no output ]

This is getting to be a FAQ, and I think we should add the 
"--no-prune-history" flag (or whatever I called it - I even sent out a 
patch for it) so that you can avoid it.

The thing that happens in

	git-rev-list --all -- gitweb.cgi

is that since your _current_ HEAD does not have that file at all, it 
starts going back in history, and at each merge it finds it will 
_simplify_ the history, and only look at that part of history that is 
identical _with_respect_to_the_name_you_gave_!

Now, in the main git history, that name has NEVER existed, so the 
simplified history for that particular name (as seen from the current 
branch) is simply empty. It's empty all the way back to the root. No 
commits at all add that name along the main history branch.

Now, that name obviously existed in the _side_ histories, but we don't 
show those, because they obviously didn't matter (as far as that 
particular name happened) within the particular history starting point you 
chose. See?

Now, look what happens if you instead of starting the history search from 
all the _current_ heads, you start it from a location that actually _had_ 
that file:

	git log 1130ef362fc8d9c3422c23f5d5 -- gitweb.cgi

and suddenly there the history is - in all its glory.

So what this boils down to is really: when you limit revision history by a 
set of filenames, GIT REALLY REWRITES AND SIMPLIFIES THE HISTORY AS PER 
_THAT_ PARTICULAR SET OF FILENAMES. In particular, it will generate the 
_simplest_ history that is consistent with the state of those filenames at 
the point you asked it to start.
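
A toy model may make this concrete. The sketch below is Python, not the C
in revision.c, and it leaves out most of what git really does (the
UNINTERESTING flag, date ordering, directory pathspecs); the commit ids
and trees are invented to mimic a main line with a separately developed
gitweb history merged in under a new path:

```python
def simplified_history(commits, start, path):
    """Return the commits a path-limited walk from `start` would list.

    `commits` maps a commit id to (parents, tree), where `tree` maps
    path -> blob id.  Toy model only; not git's actual implementation.
    """
    shown, seen, todo = [], set(), [start]
    while todo:
        cid = todo.pop()
        if cid in seen:
            continue
        seen.add(cid)
        parents, tree = commits[cid]
        blob = tree.get(path)
        # Parents whose view of `path` is identical to this commit's.
        same = [p for p in parents if commits[p][1].get(path) == blob]
        if same:
            # Simplify: nothing changed for `path` relative to that
            # parent, so hide the commit and follow only that parent.
            todo.append(same[0])
        else:
            # Differs from every parent (or is a root): list it, unless
            # it is a root that never had the path at all.
            if parents or blob is not None:
                shown.append(cid)
            todo.extend(parents)
    return shown


# Hypothetical miniature of the situation: m1/m2 are the main line,
# g1/g2 the side history, M the merge that moved the file.
COMMITS = {
    "m1": ([], {"Makefile": "1"}),
    "m2": (["m1"], {"Makefile": "2"}),
    "g1": ([], {"gitweb.cgi": "a"}),
    "g2": (["g1"], {"gitweb.cgi": "b"}),
    "M": (["m2", "g2"], {"Makefile": "2", "gitweb/gitweb.cgi": "b"}),
}
```

In this model, starting from "M" with "gitweb.cgi" lists nothing, because
the merge is identical to its main-line parent for that name and the walk
never enters the side history. Starting from "g2" lists both side commits,
and "gitweb/gitweb.cgi" from "M" lists only the merge itself.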

If you want to get the non-simplified history (ie you object to the fact 
that we give the simplest history, you want _all_ the possible history for 
that particular filename, whether it was the same along one branch or 
not), you need to apply something like the appended..

(And you obviously need to add that "no_simplify_merge" flag to the 
revision data structure, and you need to add some command line flag to 
enable it. Alternatively, try to find the patch I sent out a couple of 
months ago, I'm pretty sure I called it "--no-simplify-merge" or 
"--no-prune-history" or something like that).

		Linus
---
diff --git a/revision.c b/revision.c
index 6a6952c..5640cef 100644
--- a/revision.c
+++ b/revision.c
@@ -303,7 +303,7 @@ static void try_to_simplify_commit(struc
 		parse_commit(p);
 		switch (rev_compare_tree(revs, p->tree, commit->tree)) {
 		case REV_TREE_SAME:
-			if (p->object.flags & UNINTERESTING) {
+			if (revs->no_simplify_merge || (p->object.flags & UNINTERESTING)) {
 				/* Even if a merge with an uninteresting
 				 * side branch brought the entire change
 				 * we are interested in, we do not want


* gitweb.cgi history not shown
From: Marco Costalba @ 2006-06-11  5:31 UTC (permalink / raw)
  To: junkio; +Cc: git

What am I doing wrong?

$ git-rev-list --all -- gitweb/gitweb.cgi
0a8f4f0020cb35095005852c0797f0b90e9ebb74
$ git-rev-list --all -- gitweb.cgi
$

Also, the installed gitweb at kernel.org gives an empty history for
the file gitweb.cgi under the git repository, while the history is
correctly shown for the same file under the gitweb project.

    Marco


