Git development

* Re: [PATCH 0/2] tagsize < 8kb restriction
From: Junio C Hamano @ 2006-05-23 23:49 UTC
  To: Björn Engelmann; +Cc: git
In-Reply-To: <44737353.20904@gmx.de>

Björn Engelmann <BjEngelmann@gmx.de> writes:

> I hope this time I got it right.

Thanks.  Pushed out as a part of "next".  Will hopefully be part
of "master" by the end of the week if not earlier.

* Re: Make more commands builtin
From: Junio C Hamano @ 2006-05-23 23:48 UTC
  To: s022018; +Cc: git
In-Reply-To: <15865.7899396077$1148386583@news.gmane.org>

"Peter Eriksen" <s022018@student.dtu.dk> writes:

> Junio, I've formatted this batch of patches with -M, so
> they are easier to read.

Thanks.  Pushed out as a part of "next".  Will hopefully be part
of "master" by the end of the week if not earlier.

* Re: [PATCH 0/6] Detect non email patches in git-mailinfo
From: Junio C Hamano @ 2006-05-23 23:44 UTC
  To: Eric W. Biederman; +Cc: git
In-Reply-To: <m18xosjznu.fsf@ebiederm.dsl.xmission.com>

Thanks.  Merged to "next", this probably would graduate to
"master" by the end of the week if not earlier.

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Junio C Hamano @ 2006-05-23 23:41 UTC
  To: Jeff King; +Cc: Morten Welinder, Martin Langhoff, Matthias Urlichs, git
In-Reply-To: <20060523205944.GA16164@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Tue, May 23, 2006 at 01:47:01PM -0400, Morten Welinder wrote:
>
>> Why run "env" and not just muck with %ENV?
>> >+       my $pid = open2(my $commit_read, my $commit_write,
>> >+               'env',
>> >+               "GIT_AUTHOR_NAME=$author_name",
>> >+               "GIT_AUTHOR_EMAIL=$author_email",
>> >+               "GIT_AUTHOR_DATE=$commit_date",
>> >+               "GIT_COMMITTER_NAME=$author_name",
>> >+               "GIT_COMMITTER_EMAIL=$author_email",
>> >+               "GIT_COMMITTER_DATE=$commit_date",
>> >+               'git-commit-tree', $tree, @commit_args);
>
> Oops, that's an obvious fork optimization that I should have caught.

Are you two saying that running git-commit-tree via env costs two
fork-execs instead of just one?  Does that make a measurable
difference?

Not that I have anything against the updated code, but I do not
particularly think it is such a big issue.

> PS What is the preferred format for throwing patches into replies like
> this? Putting the patch at the end (as here) or throwing the reply
> comments in the ignored section near the diffstat?

You could do it either way.  Although I personally find the
former easier to read (it meshes well with the "do not top post"
mantra), it appears many other people find that the cover letter
material should come after the first '---' separator.

If you append the patch to your message, btw, you would need to
realize that the receiving end needs to edit your message to
remove the top part before running "git am" to apply.

* Re: [PATCH 0/2] tagsize < 8kb restriction
From: Junio C Hamano @ 2006-05-23 23:15 UTC
  To: Björn Engelmann; +Cc: git
In-Reply-To: <44737353.20904@gmx.de>

Björn Engelmann <BjEngelmann@gmx.de> writes:

> I am currently wondering where to store the reference to such a
> sub-repository. It certainly is a head, but I would like to avoid anyone
> committing code into this "branch". Maybe I will create a new directory
> .git/refs/annotations.

I would recommend against that.  Why shouldn't it be an ordinary
branch that is different from the default "master"?

If you are in a shared repository setting, then the update hook
is there for you to prevent people who do not have any business
touching that branch head from mucking with it.
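
A minimal sketch of such an update hook, under stated assumptions: git
runs the hook with the ref name, the old object name and the new object
name as arguments, and refuses the update when the hook exits non-zero.
The branch name refs/heads/annotations and the list of allowed users are
made up for illustration, as is the may_update helper.

```shell
#!/bin/sh
# Sketch of an update hook (.git/hooks/update in the shared repository).
# git calls it as: update <refname> <old-sha1> <new-sha1>
# and rejects the ref update when the hook exits non-zero.
# The protected branch and the user list are assumptions for illustration.
protected_ref="refs/heads/annotations"
allowed_users="scanner qa-admin"

may_update() {
    refname=$1
    user=$2
    # Any ref other than the protected branch is left alone.
    [ "$refname" = "$protected_ref" ] || return 0
    for u in $allowed_users; do
        [ "$u" = "$user" ] && return 0
    done
    echo "error: $user may not update $refname" >&2
    return 1
}

# In a real hook the last line would be something like:
#   may_update "$1" "${USER:-unknown}" || exit 1
```

How the pushing user is identified depends on the transport; using
$USER assumes pushes arrive over ssh with one account per person.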

> I am not sure how git would perform in such an environment. Do you think
> the "git-push"-implementation is sufficiently "thread-safe" for this ?

Yes.

And I do not necessarily think your workflow would want to have
such "an empty file works as a lock" convention.

Just do things locklessly.  If two people in the group happened
to do duplicated work, the first push would succeed and the
second person would be prevented from pushing (and suggested to
merge in the work first).  When the second person pulls, he
would realize the scan result by the first person is already
there.  If that is considered to be too much wasted work, then
it means your distributed workflow did not have sufficient
communication among people.  Being able to work distributed does
not mean you need no coordination, and a distributed SCM is not
a substitute for communication among participants.
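
The lockless flow described above can be sketched roughly as follows.
This is only an illustration: publish_result and its retry count are
hypothetical, and a real scanner would probably check after the pull
whether someone else's result already covers its commit before pushing
again.

```shell
#!/bin/sh
# Sketch: push, and if the push is refused because somebody else got
# there first, merge their work and try again.
# publish_result is a hypothetical helper, not part of git.
publish_result() {
    remote=$1
    branch=$2
    tries=${3:-3}
    while [ "$tries" -gt 0 ]; do
        # A push that is not a fast-forward is refused by the other side.
        if git push "$remote" "$branch"; then
            return 0
        fi
        # Merge in what the first pusher published, then retry.
        git pull "$remote" "$branch" || return 1
        tries=$((tries - 1))
    done
    return 1
}
```

Usage would be along the lines of "publish_result origin annotations".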

> 1.) Do you intend to add some more advanced metadata-functionality to
> git in the future or should I send a patch with my implementation once
> it is finished ? Will be just some scripts using similar commands to
> what Linus sent me (thanks for that, btw)

Neither, until/unless we have a clear design.

I think the annotation branch (or a separate repository) is a
very natural consequence of what the tools already give you, and
the tools work just fine as they are.  There is nothing
innovative in what I suggested above nor Linus outlined in the
other message.

If you are talking about an application that builds on top of
git to do issue management (or QA or whatever), that uses
metadata linked to the commits on the main development branch,
that would be a wonderful system, but that does not necessarily
have to come with git (it's just an application on top of git,
and the workflow of your organization may or may not match other
people's workflow).

> 2.) Searching for a way to add objects to the database I spent quite a
> while to find the right command. Don't you think it would be much more
> intuitive having an
>
>     git-create-object [-t <type>] [-n] [-f] [-z] [--stdin] <file> [-r
> <ref-name>]
>
> command for creating any type of object (-t blob as default).

No, I do not think we would want to make it too easy and relaxed
to create arbitrary object-looking things.  Each type has
defined format and semantics, and creation of an object of each
type should be validated.  I do not want to encourage bypassing
it by introducing such a backdoor.  The backdoor is easy to
write, but I suspect it would actively harm us, instead of
helping us, by encouraging "let's build a custom type of object,
we do not care if other people would not understand it"
mentality.

* Re: file name case-sensitivity issues
From: Junio C Hamano @ 2006-05-23 22:57 UTC
  To: Alex Riesen; +Cc: git
In-Reply-To: <20060523210615.GB5869@steel.home>

fork0@t-online.de (Alex Riesen) writes:

> Very simple to reproduce on FAT and NTFS, and under Windows, as usual,
> when a problem is especially annoying. I seem to have no chance to
> get my hands on this myself, so I at least let everyone know about the
> problem.

Isn't it like complaining that the following sequence loses your
precious file on a case-challenged filesystem?

	$ echo precious contents >foo
	$ rm -f FOO

Is it a problem for the user?  Certainly yes.  You lost your
precious file.

Is it a bug in the operating system and/or the filesystem?
Probably not; it is doing what it is asked to do -- its
definition of what string matches what file on the filesystem is
dubious, but that is how it sees the world and you accept that
view while you are on such a system.  Is it a bug in "rm"?
Probably not; it is doing what it is asked to do within the
context that you gave it.

I'd call that a PEBCAK.

If you _know_ you are working on a case challenged filesystem, I
think the best thing you can do is not to work on a project that
has files in different cases on such a filesystem.
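
One way to _know_ is to probe the directory before working in it. The
following is a sketch only; is_case_insensitive and the probe file names
are invented for illustration.

```shell
#!/bin/sh
# Sketch: detect whether a directory lives on a case-insensitive
# filesystem by creating a file and looking it up under another case.
# Returns 0 if case-insensitive, 1 if case-sensitive, 2 on error.
is_case_insensitive() {
    dir=${1:-.}
    probe="$dir/.CaseProbe.$$"
    lower="$dir/.caseprobe.$$"
    : >"$probe" || return 2
    if [ -e "$lower" ]; then
        result=0
    else
        result=1
    fi
    rm -f "$probe"
    return $result
}
```

A wrapper script could refuse to check out a project whose tree has two
paths differing only in case when this probe returns 0.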

* Re: file name case-sensitivity issues
From: Ben Clifford @ 2006-05-23 22:43 UTC
  To: Alex Riesen; +Cc: git, Junio C Hamano, Linus Torvalds
In-Reply-To: <20060523210615.GB5869@steel.home>

On OS X using whatever filesystem it comes with by default, I get the 
following, which doesn't seem right (but in a different way).

$ mkdir case-sensitivity-test
$ cd case-sensitivity-test
$ git init-db
defaulting to local storage area
$ echo foo > foo
$ echo bar > bar
$ git add foo bar
$ git commit -m initial\ commit
Committing initial tree 89ff1a2aefcbff0f09197f0fd8beeb19a7b6e51c
$ git checkout -b side
$ echo bar-side >> bar
$ git commit -m side\ commit -o bar
$ git checkout master
$ rm foo
$ git update-index --remove foo
$ echo FOO > FOO
$ git add FOO
$ git commit -m case\ change
$ ls
FOO bar
$ git pull . side
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with e1f1e78035b099fad2bbfb82af7ec31864d8e4c1
Merging: 
5d70969775bf595dd5144a2bacc25d32cc288352 case change 
e1f1e78035b099fad2bbfb82af7ec31864d8e4c1 side commit 
found 1 common ancestor(s): 
e35c42fad4f08c2ccf61d93409a0208e92028a51 initial commit 

Merge 98bf1cae75776c141ad3b61dc2cb938c71c303ef, made by recursive.
 bar |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
$ 
$ ls
bar
$ git ls-files -d
FOO
$ git ls-tree HEAD
100644 blob b7d6715e2df11b9c32b2341423273c6b3ad9ae8a    FOO
100644 blob 5f8b81e197a2cb27816112fb5a6b86b7031ffde8    bar

The checkout is losing the FOO file but the merged tree object has the 
merged FOO in it.

* Re: Current Issues #3
From: Jakub Narebski @ 2006-05-23 21:58 UTC
  To: git
In-Reply-To: <Pine.LNX.4.64.0605220216310.3697@g5.osdl.org>

Linus Torvalds wrote:

[...]
> But with the above, you can fairly naturally do:
> 
>  - "git pull" 
> 
>       No arguments. fetch the remote described by the current branch, 
>       and merge into current branch (we might decide to fetch all the 
>       remotes associated with that repo, just because once we do this, 
>       we might as well, but that's not that important to the end 
>       result).
> 
>  - "git pull <repo>"
>
>       fetch all remotes that use <repo>. IFF the current branch is 
>       matched to one of those remotes, merge the changes into the 
>       current branch. But if you happened to be on another unrelated 
>       branch, nothing happens aside from the fetch.
> 
>  - "git pull <remote>"
> 
>       fetch just the named remote. IFF that remote is also the remote 
>       for the current branch, do merge it into current. Again, we 
>       _might_ decide to just do the whole repo.
> 
>  - "git pull <repo> <branchname>"
> 
>       fetch the named branch from the named repository and merge it into 
>       current (no ifs, buts or maybes - now we've basically overridden 
>       the default relationships, so now the <repo> is just a pure 
>       shorthand for the location of the repository)

Fetch into the current branch, or into the one specified by the branch
configuration, falling back to the current one if unspecified?

>  - "git pull <repo> <src>:<dst>"
> 
>       same as now. fetch <repo> <src> into <dst>, and merge it into the 
>       current branch (again, we've overridden any default relationships).
> 
> but maybe this is overdesigned. Comments?

It all means that <repo> and <remote> names must be unique with respect
to each other (to know whether we mean "git pull <repo>" or "git pull <remote>").

Perhaps it would be nice to have

 - "git pull <repo> *:<dst>"
 - "git pull <repo> <src>:*"
 - "git pull <repo> *:*"
and
 - "git pull <repo> <src>:<dst>:<to-merge>"

as easier to remember options. Of course what is the remote branch related
to <dst>, and what is local branch related to <src> would be in
branch/remotes/repos configuration.

BTW. what about --use-separate-remotes option support?

-- 
Jakub Narebski
Warsaw, Poland

* [PATCH] Add a test-case for git-apply trying to add an ending line
From: Catalin Marinas @ 2006-05-23 21:48 UTC
  To: git

From: Catalin Marinas <catalin.marinas@gmail.com>

git-apply adding an ending line doesn't seem to fail if the ending line is
already present in the patched file.

Signed-off-by: Catalin Marinas <catalin.marinas@gmail.com>
---

 t/t4113-apply-ending.sh |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/t/t4113-apply-ending.sh b/t/t4113-apply-ending.sh
new file mode 100755
index 0000000..d021ae8
--- /dev/null
+++ b/t/t4113-apply-ending.sh
@@ -0,0 +1,35 @@
+#!/bin/sh
+#
+# Copyright (c) 2006 Catalin Marinas
+#
+
+test_description='git-apply trying to add an ending line.
+
+'
+. ./test-lib.sh
+
+# setup
+
+cat >test-patch <<\EOF
+diff --git a/file b/file
+--- a/file
++++ b/file
+@@ -1,2 +1,3 @@
+ a
+ b
++c
+EOF
+
+echo 'a' >file
+echo 'b' >>file
+echo 'c' >>file
+
+test_expect_success setup \
+    'git-update-index --add file'
+
+# test
+
+test_expect_failure apply \
+    'git-apply --index test-patch'
+
+test_done

* Re: file name case-sensitivity issues
From: Linus Torvalds @ 2006-05-23 21:30 UTC
  To: Alex Riesen; +Cc: git, Junio C Hamano
In-Reply-To: <Pine.LNX.4.64.0605231412350.5623@g5.osdl.org>

On Tue, 23 May 2006, Linus Torvalds wrote:
> 
> The closest I can imagine is to add a config option like "core.lowercase", 
> and that would make us always add files to the index in lower case.

Side note: doing it by just changing the name compare functions to ignore 
case is _not_ a good thing to do, because that would generate tree 
objects that simply don't work (or fsck) correctly on any other machine. 

The index and tree objects are all sorted by pathname, and thus the 
sorting order has to be something that everybody agrees on, and any locale 
dependencies are not appropriate.
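
To illustrate why the comparison rule matters, here is a small sketch
that sorts the same path names two ways. sort(1) only stands in for the
comparison function here; that substitution, the helper names and the
sample paths are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch: the same path names under two different comparison rules.
# sort(1) stands in for git's internal comparison (an assumption for
# illustration; git does its sorting in C, not by calling sort).

# Plain byte order: every machine agrees on it.
byte_order() {
    LC_ALL=C sort
}

# Case-folded order: what a "compare ignoring case" function would give.
folded_order() {
    LC_ALL=C sort -f
}

# Sample path names, invented for the example.
paths() {
    printf '%s\n' bar FOO Makefile zlib.c
}
```

"paths | byte_order" lists FOO first, while "paths | folded_order" lists
bar first, so two machines using different rules would disagree about
the order of entries in the very same tree.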

It might be worth asking the monotone guys what they do - they've worked 
on Windows for a long time.

		Linus

* Re: file name case-sensitivity issues
From: Linus Torvalds @ 2006-05-23 21:16 UTC
  To: Alex Riesen; +Cc: git, Junio C Hamano
In-Reply-To: <20060523210615.GB5869@steel.home>

On Tue, 23 May 2006, Alex Riesen wrote:
>
> Very simple to reproduce on FAT and NTFS, and under Windows, as usual,
> when a problem is especially annoying. I seem to have no chance to
> get my hands on this myself, so I at least let everyone know about the
> problem.

I don't think we can fix it.

At least not in the short term.

The closest I can imagine is to add a config option like "core.lowercase", 
and that would make us always add files to the index in lower case. That, 
together with making sure that "setup_pathspec()" &co always also 
lower-case their arguments might get things limping along with minimal 
trouble.

But it won't ever do things _well_. Anything non-ascii would be just a 
nightmare.
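
A rough sketch of the lower-casing step, to show where the non-ascii
trouble starts. lowercase_path is a hypothetical helper; nothing like
"core.lowercase" exists, it is only the idea floated above.

```shell
#!/bin/sh
# Sketch: lower-casing a path the naive, ASCII-only way.
# In the C locale tr only maps the bytes A-Z, so any non-ASCII
# upper-case letter passes through unchanged.
lowercase_path() {
    printf '%s' "$1" | LC_ALL=C tr 'A-Z' 'a-z'
}
```

"lowercase_path Documentation/README" gives "documentation/readme", but
a name starting with an upper-case A-umlaut keeps that letter as it was,
while the filesystem may well treat both cases of it as the same file.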

		Linus

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Martin Langhoff @ 2006-05-23 21:13 UTC
  To: Martin Langhoff, Linus Torvalds, Junio C Hamano, Matthias Urlichs,
	git
In-Reply-To: <20060523211016.GB16164@coredump.intra.peff.net>

On 5/24/06, Jeff King <peff@peff.net> wrote:
> On Wed, May 24, 2006 at 08:29:07AM +1200, Martin Langhoff wrote:
>
> > Strange! Cannot repro here with v5.8.8 (debian/etch 5.8.8-4) but
> > initialising it doesn't hurt, so let's do it:
>
> I can reproduce with debian perl 5.8.8-4. The bug is only triggered by
> 0-length files, so presumably your test repo doesn't have any.

Given that we are all working off the gentoo repo here, it means that
my machine is slower than Linus' unreleased Intel box. And that I am
too impatient...

In any case, the fix is correct as Junio points out.

cheers,


martin

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Jeff King @ 2006-05-23 21:10 UTC
  To: Martin Langhoff; +Cc: Linus Torvalds, Junio C Hamano, Matthias Urlichs, git
In-Reply-To: <46a038f90605231329w35d10cfdg1ac413ebf8d32e11@mail.gmail.com>

On Wed, May 24, 2006 at 08:29:07AM +1200, Martin Langhoff wrote:

> Strange! Cannot repro here with v5.8.8 (debian/etch 5.8.8-4) but
> initialising it doesn't hurt, so let's do it:

I can reproduce with debian perl 5.8.8-4. The bug is only triggered by
0-length files, so presumably your test repo doesn't have any.

-Peff

* file name case-sensitivity issues
From: Alex Riesen @ 2006-05-23 21:06 UTC
  To: git; +Cc: Junio C Hamano, Linus Torvalds

Very simple to reproduce on FAT and NTFS, and under Windows, as usual,
when a problem is especially annoying. I seem to have no chance to
get my hands on this myself, so I at least let everyone know about the
problem.

The case goes as follows:

  $ mkdir case-sensitivity-test
  $ cd case-sensitivity-test
  $ git init-db
  defaulting to local storage area
  $ echo foo > foo
  $ echo bar > bar
  $ git add foo bar
  $ git commit -m initial\ commit
  Committing initial tree 89ff1a2aefcbff0f09197f0fd8beeb19a7b6e51c
  $ git checkout -b side
  $ echo bar-side >> bar
  $ git commit -m side\ commit -o bar
  $ git checkout master
  $ rm foo
  $ git update-index --remove foo
  $ echo FOO > FOO # note case change
  $ git add FOO
# this is on linux, vfat  on an usbstick (mounted with default case
# conversion, which is "lower". That's why the file can't be found).
# Have no Windows at home. On Windows the FOO is created and "git add"
# just passes. We just assume it did add the file, as it would there.
  git-ls-files: error: pathspec 'FOO' did not match any.
  Maybe you misspelled it?
  $ git commit -m case\ change
  $ git pull . side
  Trying really trivial in-index merge...
  git-read-tree: fatal: Untracked working tree file 'foo' would be overwritten by merge.
  Nope. Really trivial in-index merge is not possible.
  Merging HEAD with 7b0cad3a104487fa92afa06736294338acb84281
  Merging:
  7f6a8ba3e41683ef5b55921d050092e766aad4a5 case change
  7b0cad3a104487fa92afa06736294338acb84281 side commit
  found 1 common ancestor(s):
  f857aaf5f1d3716d25ca7751f12de30420d9b2aa initial commit
  git-read-tree: git-read-tree: fatal: Untracked working tree file 'foo' would be overwritten by merge.

  No merge strategy handled the merge.

Well, what now?

What I did was to replace that die() with error() in
read-tree.c:verify_absent, which of course is not acceptable.
I'll try to find a solution sometime later, but I really hope
someone will find it sooner (because it'll take some time for me).
Hope it didn't bite anyone yet...

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Jeff King @ 2006-05-23 20:59 UTC
  To: Morten Welinder; +Cc: Martin Langhoff, Junio C Hamano, Matthias Urlichs, git
In-Reply-To: <118833cc0605231047o2012deefh5e77b8496da1e673@mail.gmail.com>

On Tue, May 23, 2006 at 01:47:01PM -0400, Morten Welinder wrote:

> Why run "env" and not just muck with %ENV?
> >+       my $pid = open2(my $commit_read, my $commit_write,
> >+               'env',
> >+               "GIT_AUTHOR_NAME=$author_name",
> >+               "GIT_AUTHOR_EMAIL=$author_email",
> >+               "GIT_AUTHOR_DATE=$commit_date",
> >+               "GIT_COMMITTER_NAME=$author_name",
> >+               "GIT_COMMITTER_EMAIL=$author_email",
> >+               "GIT_COMMITTER_DATE=$commit_date",
> >+               'git-commit-tree', $tree, @commit_args);

Oops, that's an obvious fork optimization that I should have caught.
Patch is below. Note that this will now affect the environment of all
sub-processes, but it shouldn't matter since we reset it right before
commit. However, if anyone is worried, we can stash the old %ENV in
another hash temporarily.
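
The same trade-off, sketched in shell rather than perl (an analogy, not
the cvsimport code): a per-command setting, a process-wide setting that
every later child inherits, and a scoped setting that leaves the
caller's environment alone. show_author is a hypothetical stand-in for
git-commit-tree.

```shell
#!/bin/sh
# Sketch: three ways to hand an identity variable to a child process.
# show_author stands in for git-commit-tree; it is a hypothetical helper.
show_author() {
    sh -c 'printf "%s\n" "${GIT_AUTHOR_NAME:-unset}"'
}

# 1. Per-command: only this one child sees the variable (what going
#    through "env" did, at the cost of an extra exec).
with_env() {
    env GIT_AUTHOR_NAME="$1" sh -c 'printf "%s\n" "${GIT_AUTHOR_NAME:-unset}"'
}

# 2. Process-wide: every later child inherits it (what assigning to
#    %ENV does in perl).
with_export() {
    GIT_AUTHOR_NAME="$1"
    export GIT_AUTHOR_NAME
    show_author
}

# 3. Scoped: export inside a subshell so the caller is untouched (the
#    shell analogue of stashing the old environment and restoring it).
with_scoped_export() {
    (
        GIT_AUTHOR_NAME="$1"
        export GIT_AUTHOR_NAME
        show_author
    )
}
```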

-Peff

PS What is the preferred format for throwing patches into replies like
this? Putting the patch at the end (as here) or throwing the reply
comments in the ignored section near the diffstat?

---
cvsimport: set up commit environment in perl instead of using env

---

44c4a9f67322302ca49146a7c143c07ea67da366
 git-cvsimport.perl |   13 ++++++-------
 1 files changed, 6 insertions(+), 7 deletions(-)

44c4a9f67322302ca49146a7c143c07ea67da366
diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 41ee9a6..83d7d3c 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -618,14 +618,13 @@ sub commit {
 	}
 
 	my $commit_date = strftime("+0000 %Y-%m-%d %H:%M:%S",gmtime($date));
+	$ENV{GIT_AUTHOR_NAME} = $author_name;
+	$ENV{GIT_AUTHOR_EMAIL} = $author_email;
+	$ENV{GIT_AUTHOR_DATE} = $commit_date;
+	$ENV{GIT_COMMITTER_NAME} = $author_name;
+	$ENV{GIT_COMMITTER_EMAIL} = $author_email;
+	$ENV{GIT_COMMITTER_DATE} = $commit_date;
 	my $pid = open2(my $commit_read, my $commit_write,
-		'env',
-		"GIT_AUTHOR_NAME=$author_name",
-		"GIT_AUTHOR_EMAIL=$author_email",
-		"GIT_AUTHOR_DATE=$commit_date",
-		"GIT_COMMITTER_NAME=$author_name",
-		"GIT_COMMITTER_EMAIL=$author_email",
-		"GIT_COMMITTER_DATE=$commit_date",
 		'git-commit-tree', $tree, @commit_args);
 
 	# compatibility with git2cvs
-- 
1.3.3.g40505-dirty

* Re: [osol-bugs] access() behaves strange when used as root
From: Linus Torvalds @ 2006-05-23 20:54 UTC
  To: Stefan Pfetzing; +Cc: Git Mailing List
In-Reply-To: <f3d7535d0605231343h51bfb2c9w1d15536b92874a88@mail.gmail.com>

On Tue, 23 May 2006, Stefan Pfetzing wrote:
>
> Hi Joerg,

Don't bother talking to Joerg.

He's a certified loon, and thinks Solaris is correct by definition. He's 
insane. He thinks that anybody who does anything different from Solaris is 
by definition not just wrong, but actively evil, even when the "anything 
different" is clearly superior (ie he thinks that Solaris device naming is 
not only sane, but claims that everybody else should do it that way).

If you ever wondered why "cdrecord" takes a default device argument of the 
form "dev=0,1,0" or other random numbers, it's Joerg.

So just ignore him. I hope the OpenSolaris lists have saner people around.

		Linus

* Re: [osol-bugs] access() behaves strange when used as root
From: Stefan Pfetzing @ 2006-05-23 20:43 UTC
  To: Git Mailing List
In-Reply-To: <44735D61.nail4RQ116Y06@burner>

Hi Joerg,

2006/5/23, Joerg Schilling <schilling@fokus.fraunhofer.de>:
> Before claiming that Solaris is not behaving correctly, you should have a look
> into the standard.......

I didn't say Solaris does not behave "correctly" - I just said it does
not behave like every other POSIX/SUS Unix I know.

> The behavior of Solaris access() is OK.

I know that it's completely ok with SUS - and I did say that before.

Primarily I was just wondering about this behaviour and IMHO the git
developers were too.

bye

Stefan

-- 
       http://www.dreamind.de/
Oroborus and Debian GNU/Linux Developer.

* Re: [PATCH 0/2] tagsize < 8kb restriction
From: Björn Engelmann @ 2006-05-23 20:40 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vac99c1hv.fsf@assigned-by-dhcp.cox.net>

Hi,

I hope this time I got it right. Is there some kind of style-guide I can
refer to in future ?

> Another question is if the QA data expected to be amended or
> annotated later, after it is created.
>
> If the answer is yes, then you probably would not want tags --
> you can create a new tag that points at the same commit to
> update the data, but then you have no structural relationships
> given by git between such tags that point at the same commit.
> You could infer their order by timestamp but that is about it.
> I think you are better off creating a separate QA project that
> adds one new file per commit on the main project, and have the
> file identify the commit object on the main project (either
> start your text file format for QA data with the commit object
> name, or name each such QA data file after the commit object
> name).  Then your automated procedure could scan and add a new
> file to the QA project every time a new commit is made to the
> main project, and the data in the QA project can be amended or
> annotated and the changes will be version controlled.
>   

Great idea ! Thanks a lot. Originally it was not planned to alter the
results once committed, but this way it would even be possible to rescan
a commit with a different tool and merge the results. Git would also be
able to use delta-encoding when packing, which should be extremely
efficient since most scan-results probably won't differ much.

I am currently wondering where to store the reference to such a
sub-repository. It certainly is a head, but I would like to avoid anyone
committing code into this "branch". Maybe I will create a new directory
.git/refs/annotations.

When thinking about this very elegant way to handle meta-data, I got
another idea:
The quality assurance system also works distributed. For scalability
reasons there are multiple scanners, each scanning one commit at a time.
Do you think git could also be used to handle "locking" ? The scanners
would then push a commit with an empty result-file into the
annotations-repository so all other scanners who are looking for
currently unscanned commits would ignore it in future. When finished the
result can be inserted by pushing a subsequent commit. This way one
avoids the need for a separate job-server / protocol.
I am not sure how git would perform in such an environment. Do you think
the "git-push"-implementation is sufficiently "thread-safe" for this ?
Or could simultaneously pushing into the same branch, for example, break the
repository ?

Hmm.. 2 more things on my mind:
1.) Do you intend to add some more advanced metadata-functionality to
git in the future or should I send a patch with my implementation once
it is finished ? Will be just some scripts using similar commands to
what Linus sent me (thanks for that, btw)

2.) Searching for a way to add objects to the database I spent quite a
while to find the right command. Don't you think it would be much more
intuitive having an

    git-create-object [-t <type>] [-n] [-f] [-z] [--stdin] <file> [-r
<ref-name>]

command for creating any type of object (-t blob as default), optionally
omitting writing it to the database (-n = no-write) (like
git-hash-object), by default validating its input  (overriding with -f)
(like git-mktag, git-mktree) and maybe even able to add a reference to
it with -r (like git-tag).
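
For reference, the name of a blob can be reproduced without git at all:
it is the SHA-1 of a short header ("blob", a space, the content size in
bytes, a NUL) followed by the content. The sketch below assumes
sha1sum(1) and mktemp(1) are available; blob_id is a made-up helper, not
a git command.

```shell
#!/bin/sh
# Sketch: compute the object name git-hash-object would give a blob,
# using only standard tools.  Reads the content from stdin.
# Assumes sha1sum(1) is installed; blob_id is a hypothetical helper.
blob_id() {
    tmp=$(mktemp) || return 1
    cat >"$tmp"
    # Size in bytes; strip the padding some wc implementations add.
    size=$(wc -c <"$tmp" | tr -d ' ')
    # Header "blob <size>\0" followed by the raw content.
    { printf 'blob %s\0' "$size"; cat "$tmp"; } | sha1sum | cut -d' ' -f1
    rm -f "$tmp"
}
```

"printf 'hello\n' | blob_id" prints
ce013625030ba8dba906f756967f9e9ca394464a, the same name
"echo hello | git-hash-object --stdin" reports.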

Bj

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Martin Langhoff @ 2006-05-23 20:32 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpsi55et5.fsf@assigned-by-dhcp.cox.net>

On 5/23/06, Junio C Hamano <junkio@cox.net> wrote:
> "Martin Langhoff" <martin.langhoff@gmail.com> writes:
>
> > Jeff,
> >
> > good stuff -- aiming at exactly the things that had been nagging me.
> > Some minor notes on top of what junio's mentioned...
> >
> >> +    die "unable to open $f: $!" unless $! == POSIX::ENOENT;
> >> +    return undef;
> >
> > Heh. Is that the return of the living dead?
>
> Note the trailing "unless" there.

Of course. I had actually missed the closing quotes, and thought the
error msg wanted to talk about POSIX. 'twas late in the day, seems
like most of my comments in this email were rather stoopid.

> >> +sub update_index (\@\@) {
> >> +       my $old = shift;
> >> +       my $new = shift;
> >
> > Would it not make more sense to just pass them as plain parameters?
>
> Meaning...?  Perl5 can pass only one flat array, so the above is
> a standard way to pass two arrays.

Meaning I am stupid :(

> >> +       print "Committed patch $patchset ($branch $commit_date)\n" if
> >
> > Given that we have that -- should we remember it and avoid re-reading
> > the headref from disk? A %seenheads cache would save us 99.9% of the
> > hassle.
> >
> > In related news, I've dealt with file reads from the socket being
> > memorybound. Should merge ok.
>
> Merged OK, and I think your last suggestion makes sense.  I'll
> go to bed after pushing out Jeff's two patches and yours.

I'll look into caching headrefs tonight if no one beats me to it.




martin

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Martin Langhoff @ 2006-05-23 20:29 UTC
  To: Linus Torvalds; +Cc: Junio C Hamano, Matthias Urlichs, git
In-Reply-To: <Pine.LNX.4.64.0605231232360.5623@g5.osdl.org>

On 5/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Martin, that problem seems to go away when I initialize $res to 0 in
> _fetchfile.
>
> I don't know perl, and maybe local variables are pre-initialized to empty.
>
> It's entirely possible that the fact that it now seems to work for me is
> purely timing-related, since I also ended up using "-P cvsps-output" to
> avoid having a huge cvsps binary in memory at the same time.

Strange! Cannot repro here with v5.8.8 (debian/etch 5.8.8-4) but
initialising it doesn't hurt, so let's do it:

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index ace7087..abbfd0b 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -371,7 +371,7 @@ sub file {
 }
 sub _fetchfile {
        my ($self, $fh, $cnt) = @_;
-       my $res;
+       my $res = 0;
        my $bufsize = 1024 * 1024;
        while($cnt) {
            if ($bufsize > $cnt) {

cheers,


martin

* Re: [PATCH 2/2] cvsimport: cleanup commit function
From: Junio C Hamano @ 2006-05-23 20:25 UTC
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605231232360.5623@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

>> 	Committing initial tree 34bd3dcd4bfd79bad35ce3fb08b2e21108195db8
>> 	Server has gone away while fetching BUGS-TODO 1.1, retrying...
>...
> Martin, that problem seems to go away when I initialize $res to 0 in 
> _fetchfile. 
>
> I don't know perl, and maybe local variables are pre-initialized to empty. 

When a new file that is empty is created, sub _line would call
sub _fetchfile with $cnt == 0, and it can return $res which
is initialized to 'undef'.  That explains why sub file says
$self->_line() returned an undef and I think what you did is the
right fix.

* Re: irc usage..
From: Jakub Narebski @ 2006-05-23 20:19 UTC
  To: git
In-Reply-To: <Pine.LNX.4.64.0605221055270.3697@g5.osdl.org>

Linus Torvalds wrote:

> [...] people _should_ realize that removing objects is very very special. 
> Whether it's done by "git prune-packed" or "git prune", that's a very 
> dangerous operation. "git prune" a lot more so than "git prune-packed", 
> of course (in fact, you should _never_ run "git prune" on a repository 
> that is active - you _will_ corrupt it).

Would it be possible to make the 'git prune' command safe against repository
corruption, even if some information might be lost (like 'git add')? Or does
_corruption_ mean that only recoverable information is lost? One cannot always
use a "one repository per developer" workflow.

One solution would be to use a reader/writer lock (filesystem
semaphore), with each command that modifies the repository taking the lock, and
git-prune waiting on the lock until no one is accessing the repository. Of course
the problem is with OSes and filesystems which do not support locking, and
with stale locks...
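
A sketch of that idea using flock(1) from util-linux, purely as an
illustration of the proposal: git takes no such lock, and the lock file
path and wrapper names are invented. Locks taken this way are dropped by
the kernel when the holding process exits, which helps with the stale
lock worry on systems where flock is available.

```shell
#!/bin/sh
# Sketch of the reader/writer idea using flock(1) from util-linux.
# The lock file path is hypothetical; git itself takes no such lock.
lockfile=${GIT_DIR:-.git}/prune.lock

# Commands that add objects or refs would share the lock...
with_shared_lock() {
    flock -s "$lockfile" "$@"
}

# ...while a prune would wait until it can hold the lock alone.
with_exclusive_lock() {
    flock -x "$lockfile" "$@"
}
```

Usage would be "with_shared_lock git commit ..." for ordinary commands
and "with_exclusive_lock git prune" for the prune itself; it only helps
if every writer goes through the wrapper.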

Second solution would be to [optionally] wait until no process is accessing
repository, copy repository in some safe place, [optionally] calculate
checksum, prune, [optionally] check if the repository was modified
meanwhile and either abort or repeat, and finally copy pruned repository
back.

-- 
Jakub Narebski
Warsaw, Poland

* [PATCH 6/6] Allow in body headers beyond the in body header prefix.
From: Eric W. Biederman @ 2006-05-23 19:58 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <m1mzd8iklr.fsf_-_@ebiederm.dsl.xmission.com>


- handle_from is fixed to not mangle its input line.

- Then handle_inbody_header is allowed to look in
  the body of a commit message for additional headers
  that we haven't already seen.

This allows patches with all of the right information in
unfortunate places to be imported.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


---

 mailinfo.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

eca59d2fd60af47170cdbfdebf3384465f0e7635
diff --git a/mailinfo.c b/mailinfo.c
index c642ff4..99374b3 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -72,11 +72,14 @@ static int bogus_from(char *line)
 	return 1;
 }
 
-static int handle_from(char *line)
+static int handle_from(char *in_line)
 {
-	char *at = strchr(line, '@');
+	char line[1000];
+	char *at;
 	char *dst;
 
+	strcpy(line, in_line);
+	at = strchr(line, '@');
 	if (!at)
 		return bogus_from(line);
 
@@ -242,8 +245,6 @@ #define SEEN_PREFIX  0x08
 /* First lines of body can have From:, Date:, and Subject: */
 static void handle_inbody_header(int *seen, char *line)
 {
-	if (*seen & SEEN_PREFIX)
-		return;
 	if (!memcmp("From:", line, 5) && isspace(line[5])) {
 		if (!(*seen & SEEN_FROM) && handle_from(line+6)) {
 			*seen |= SEEN_FROM;
-- 
1.3.2.g5041c-dirty

^ permalink raw reply related
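
The first point of [PATCH 6/6] above (do the destructive parsing on a
private copy so the caller's line survives) can be sketched in isolation.
The helper domain_of is hypothetical and is not part of mailinfo.c. It also
swaps the patch's strcpy for snprintf, which truncates over-long input
rather than relying on the caller's buffer size; that is a choice made for
this sketch, not what the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the part after '@' from in_line.
 * The split is destructive, but it is done on a local copy, so the
 * caller's string is left untouched and can be parsed again later.
 * Returns 1 when an '@' was found, 0 otherwise.
 */
static int domain_of(const char *in_line, char *domain, size_t domain_size)
{
	char line[1000];
	char *at;

	assert(in_line != NULL && domain != NULL && domain_size > 0);

	/* Bounded copy: snprintf truncates instead of overflowing. */
	snprintf(line, sizeof(line), "%s", in_line);

	at = strchr(line, '@');
	if (!at)
		return 0;

	*at = '\0';	/* this edit touches only the private copy */
	snprintf(domain, domain_size, "%s", at + 1);
	return 1;
}
```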

* [PATCH 5/6] More accurately detect header lines in read_one_header_line
From: Eric W. Biederman @ 2006-05-23 19:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <m1r72kiksz.fsf_-_@ebiederm.dsl.xmission.com>


Only count lines of the form '^.*: ' and '^From ' as email
header lines. 

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


---

 mailinfo.c |   25 +++++++++++++++++--------
 1 files changed, 17 insertions(+), 8 deletions(-)

b955444f0bfb4ee9a5cd31686dd7eeec0750e235
diff --git a/mailinfo.c b/mailinfo.c
index 99989c2..c642ff4 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -385,20 +385,29 @@ static int read_one_header_line(char *li
 {
 	int ofs = 0;
 	while (ofs < sz) {
+		const char *colon;
 		int peek, len;
 		if (fgets(line + ofs, sz - ofs, in) == NULL)
-			return ofs;
+			break;
 		len = eatspace(line + ofs);
 		if (len == 0)
-			return ofs;
-		peek = fgetc(in); ungetc(peek, in);
-		if (peek == ' ' || peek == '\t') {
-			/* Yuck, 2822 header "folding" */
-			ofs += len;
-			continue;
+			break;
+		colon = strchr(line, ':');
+		if (!colon || !isspace(colon[1])) {
+			/* Readd the newline */
+			line[ofs + len] = '\n';
+			line[ofs + len + 1] = '\0';
+			break;
 		}
-		return ofs + len;
+		ofs += len;
+		/* Yuck, 2822 header "folding" */
+		peek = fgetc(in); ungetc(peek, in);
+		if (peek != ' ' && peek != '\t')
+			break;
 	}
+	/* Count mbox From headers as headers */
+	if (!ofs && !memcmp(line, "From ", 5))
+		ofs = 1;
 	return ofs;
 }
 
-- 
1.3.2.g5041c-dirty

^ permalink raw reply related
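
The rule stated in the commit message of [PATCH 5/6] above (only '^.*: '
and '^From ' count as header lines) can be sketched as a standalone
predicate. The name looks_like_header is hypothetical; the sketch leaves
out the header folding and buffer handling that read_one_header_line does.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the line looks like an email header,
 * i.e. it is an mbox "From " line or contains a colon that is followed
 * by whitespace; return 0 otherwise.
 */
static int looks_like_header(const char *line)
{
	const char *colon;

	assert(line != NULL);

	/* mbox separator lines start with "From " and have no colon rule */
	if (!strncmp(line, "From ", 5))
		return 1;

	colon = strchr(line, ':');
	return colon && isspace((unsigned char)colon[1]);
}
```

With this sketch, "Subject: hello" is accepted and the first line of a
bare patch such as "diff --git a/mailinfo.c b/mailinfo.c" is rejected. A
body line such as "Note: see below" also passes, so the rule is a
heuristic and not a full RFC 2822 check.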

* [PATCH 4/6] In handle_body only read a line if we don't already have one.
From: Eric W. Biederman @ 2006-05-23 19:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <m1verwikvj.fsf_-_@ebiederm.dsl.xmission.com>

This prepares for detecting non-email patches that don't have
mail headers.  In that case we have already read the first
line, so handle_body should not ignore it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

---

 mailinfo.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

3ad0c255a351d771c7f301d4a4e9bfb6fdcbde5f
diff --git a/mailinfo.c b/mailinfo.c
index 3fa9505..99989c2 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -724,7 +724,7 @@ static void handle_body(void)
 {
 	int seen = 0;

-	if (fgets(line, sizeof(line), stdin) != NULL) {
+	if (line[0] || fgets(line, sizeof(line), stdin) != NULL) {
 		handle_commit_msg(&seen);
 		handle_patch();
 	}
-- 
1.3.2.g5041c-dirty

^ permalink raw reply related
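
The one-line change in [PATCH 4/6] above is a small read-ahead pattern:
the buffer is refilled only when it is empty. A sketch under that reading,
with a hypothetical helper ensure_line that takes the buffer and stream as
parameters (mailinfo.c uses its own "line" buffer and stdin directly):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: make sure buf holds a line to process.
 * If buf already contains one (its first byte is non-zero), it is kept
 * as-is; otherwise the next line is read from "in".
 * Returns 1 if a line is available, 0 at end of input.
 */
static int ensure_line(char *buf, int size, FILE *in)
{
	assert(buf != NULL && size > 0 && in != NULL);

	if (buf[0])
		return 1;	/* a line was read ahead earlier; do not drop it */
	return fgets(buf, size, in) != NULL;
}
```

The caller marks the buffer as consumed by setting buf[0] to '\0', so the
next call reads a fresh line.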
