Git development
* Re: How can I force git to recognize a change change in file modes?
From: Jay Soffian @ 2009-02-28 16:52 UTC
  To: Brent Goodrick; +Cc: Jan Krüger, git
In-Reply-To: <e38bce640902280824x3ae41d95qab1f1a450235e096@mail.gmail.com>

On Sat, Feb 28, 2009 at 11:24 AM, Brent Goodrick <bgoodr@gmail.com> wrote:
> Thanks Jan.  Was this choice made due to the conditional coding
> required to track the permission bits content between *NIX and
> non-*NIX platform(s)?

The short answer is: because Git was designed to track content. The
long answer is more complicated. Here's one of the more useful past
discussions:

http://thread.gmane.org/gmane.comp.version-control.git/91783

I'm sure you can find others by searching the git list for "metadata".

j.

* Re: git-svn, and which branch am I on?
From: Jakub Narebski @ 2009-02-28 17:14 UTC
  To: Daniel Pittman; +Cc: git
In-Reply-To: <87ljrr7xof.fsf@rimspace.net>

Daniel Pittman <daniel@rimspace.net> writes:

> The general question was: in git, how do I identify where this branch
> came from?

In general, you cannot. In specific cases, you can.
See below for details.
 
> Specifically, this was about 'git svn', but also generally how to
> identify this information in git.
> 
> So, with a repository branch layout like this:
> 
>   master        (local)
>   testing       (local)
>   trunk         (remote)
>   v100          (remote)
> 
> How would I find out which remote branch master and trunk came from?
> 
> 
> To restate that, because I am not sure if that is clear, given this
> layout of branches:
> 
>      trunk (remote)
>      |
>  o---o---o---o---o  branch master
>   \
>    \
>     o---o---o---o branch testing
>     |
>     v100 (remote)
> 
> How can I identify that 'testing' came from the 'v100' branch, and that
> master came from the 'trunk' branch?
> 
> 
> Ideally, I would like to work this out on the command line, without
> needing to reference gitk or another graphical tool, but even a solution
> that used them would be fine.

[...]
> ...and, finally, is the reason that I am finding it hard to explain this
> because I have an expectation of how things work that doesn't match up
> with git?  In other words, is the question actually meaningless?

On the plumbing level, or on the level of the graph of commits, git does
not store information about "where this branch came from".  For git,
from the point of view of commit objects, the following two pictures are
totally equivalent (by design):

        /-o---o      branch a
       /
  o---*---o---o---o  branch b

and
  
  o---*---o---o      branch a
       \
        \-o---o---o  branch b

What you _can_ do on this level is find the common ancestor (or common
ancestors) of branches 'a' and 'b', using "git merge-base a b"; this
would return the commit marked '*'.
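
As a minimal sketch of the merge-base lookup described above, the
following script builds a throwaway repository in a temporary directory
and asks git for the common ancestor.  The branch names 'a' and 'b' come
from the pictures; the identity and the commit messages are made up for
illustration.

```shell
#!/bin/sh
# Sketch only: everything happens in a scratch directory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/b          # start on 'b', whatever the default branch name is
git config user.name "Example User"         # throwaway identity (assumption)
git config user.email "example@example.invalid"

git commit -q --allow-empty -m "root"
git commit -q --allow-empty -m "fork point (*)"
fork=$(git rev-parse HEAD)

git branch a                                # 'a' starts at the fork point
git commit -q --allow-empty -m "b1"         # advance 'b'
git checkout -q a
git commit -q --allow-empty -m "a1"         # advance 'a'

base=$(git merge-base a b)
echo "fork point: $fork"
echo "merge-base: $base"
```

Creating 'b' from 'a' instead of the other way round would give the same
answer, which is the point: the commit graph does not record which
branch forked from which.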


However, when creating a branch, you can tell git that you want the
newly created branch to track the branch it is based on (with the
--track option).  By default, tracking information is saved when
branching off remote-tracking branches.

From git-branch(1):

  When a local branch is started off a remote branch, git sets up the
  branch so that 'git-pull' will appropriately merge from
  the remote branch. This behavior may be changed via the global
  `branch.autosetupmerge` configuration flag. That setting can be
  overridden by using the `--track` and `--no-track` options.

  [...]

  --track::
      When creating a new branch, set up configuration so that 'git-pull'
      will automatically retrieve data from the start point, which must be
      a branch. Use this if you always pull from the same upstream branch
      into the new branch, and if you don't want to use "git pull
      <repository> <refspec>" explicitly. This behavior is the default
      when the start point is a remote branch. Set the
      branch.autosetupmerge configuration variable to `false` if you want
      'git-checkout' and 'git-branch' to always behave as if '--no-track' were
      given. Set it to `always` if you want this behavior when the
      start-point is either a local or remote branch.

And from git-config(1):

  branch.<name>.remote::
      When in branch <name>, it tells 'git-fetch' which remote to fetch.
      If this option is not given, 'git-fetch' defaults to remote "origin".

  branch.<name>.merge::
      When in branch <name>, it tells 'git-fetch' the default
      refspec to be marked for merging in FETCH_HEAD. The value is
      handled like the remote part of a refspec, and must match a
      ref which is fetched from the remote given by
      "branch.<name>.remote".
      The merge information is used by 'git-pull' (which at first calls
      'git-fetch') to lookup the default branch for merging. Without
      this option, 'git-pull' defaults to merge the first refspec fetched.
      Specify multiple values to get an octopus merge.
      If you wish to setup 'git-pull' so that it merges into <name> from
      another branch in the local repository, you can point
      branch.<name>.merge to the desired branch, and use the special setting
      `.` (a period) for branch.<name>.remote

In this case you can get the _name_ of the branch this branch "came from"
with "git config branch.<branchname>.merge".
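
A rough sketch of reading that configuration back, assuming an ordinary
git remote rather than git-svn (whether git-svn records the same keys is
not covered here).  The directory names 'upstream' and 'clone' are
invented; 'testing' and 'v100' are borrowed from the question.

```shell
#!/bin/sh
# Sketch only: a local "upstream" repository stands in for a real remote.
set -e
work=$(mktemp -d)
cd "$work"

git init -q upstream
(
  cd upstream
  git symbolic-ref HEAD refs/heads/v100     # upstream's only branch is 'v100'
  git config user.name "Example User"       # throwaway identity (assumption)
  git config user.email "example@example.invalid"
  git commit -q --allow-empty -m "initial"
)

git clone -q upstream clone
cd clone
# Branch off the remote-tracking branch; --track is spelled out even
# though it is the default for a remote start point.
git branch -q --track testing origin/v100

remote=$(git config branch.testing.remote)
merge=$(git config branch.testing.merge)
echo "testing tracks $merge on remote $remote"
```

Note that the stored value is the ref name as it exists on the remote
side, so it has to be read together with branch.<name>.remote.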

HTH.
-- 
Jakub Narebski
Poland
ShadeHawk on #git

* Re: jgit and ignore
From: Shawn O. Pearce @ 2009-02-28 17:26 UTC
  To: Jon Smirl; +Cc: Git Mailing List
In-Reply-To: <9e4733910902280831j70448ce9h7239f14e13b92b76@mail.gmail.com>

Jon Smirl <jonsmirl@gmail.com> wrote:
> I'm using jgit in eclipse. Works great for me.
 
Yay!

> I have a couple of generated files in my working directory. There
> doesn't seem to be any UI for ignoring them. Is it there and I just
> can't find it?

EGit doesn't (yet) honor the .gitignore files like it should. Someone
(Ferry i-forget-the-rest-of-his-name) is working on adding ignore
support and has patches in flight for at least some of it.

-- 
Shawn.

* Re: [PATCH] fix git format-patch --cc=<email> format
From: Junio C Hamano @ 2009-02-28 17:29 UTC
  To: Jay Soffian; +Cc: Peng Tao, git
In-Reply-To: <76718490902280815if1c3fa7o790112b410d52224@mail.gmail.com>

Jay Soffian <jaysoffian@gmail.com> writes:

> On Sat, Feb 28, 2009 at 7:42 AM, Peng Tao <bergwolf@gmail.com> wrote:
>> If there are multiple --cc=<email> arguments, git format-patch will generate
>> patches with cc lines like:
>>  Cc: <email>,
>>      <email>
>> which git send-email fails to parse.
>> git send-email only accepts formats like:
>>  Cc: <email>
>>  Cc: <email>
>> So change git format-patch to generate patches in a proper format.
>
> This is fixed in next, but we fixed send-email instead to handle the
> messages that format-patch generates, as they should be valid.

Per RFC2822 3.6 (pp 19-20), "cc" is to appear at most once (same is true
for "to" and "bcc").  I think a fix to format-patch is necessary
regardless of what send-email does.
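
For reference, a small sketch of how the folded form quoted above reads
once it is unfolded: a header line that starts with whitespace continues
the previous field, so the two-line "Cc:" is a single field.  The
addresses are placeholders, and collapsing the leading whitespace to one
space is a simplification of the unfolding rule.

```shell
#!/bin/sh
# Sketch only: unfold the headers of a made-up message and count "Cc:" fields.
set -e
msg=$(printf 'From: a@example.invalid\nCc: <one@example.invalid>,\n    <two@example.invalid>\nSubject: test\n\nbody\n')

unfolded=$(printf '%s\n' "$msg" | awk '
  /^$/      { exit }                                   # headers end at the first empty line
  /^[ \t]/  { sub(/^[ \t]+/, " "); buf = buf $0; next } # continuation: append to current field
            { if (buf != "") print buf; buf = $0 }      # new field: flush the previous one
  END       { if (buf != "") print buf }
')

cc_count=$(printf '%s\n' "$unfolded" | grep -c '^Cc:')
printf '%s\n' "$unfolded"
echo "Cc fields: $cc_count"
```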

* Re: How can I force git to recognize a change change in file modes?
From: Brent Goodrick @ 2009-02-28 17:34 UTC
  To: Jay Soffian; +Cc: Jan Krüger, git
In-Reply-To: <76718490902280852y2f2657ck7459c138205bb874@mail.gmail.com>

> The short answer is: because Git was designed to track content. The
> long answer is more complicated. Here's one of the more useful past
> discussions:
>
> http://thread.gmane.org/gmane.comp.version-control.git/91783
>
> I'm sure you can find others by searching the git list for "metadata".

I read that thread you showed above. Sounds like a big squirmy
can-o-worms, and I see that thread died on the vine because of it. :)

All I want to do in my case is just chmod 700 a bunch of scripts after
they are checked out or updated.  I'll need to re-read the git-hooks
man page more closely.

Thanks for your help!
bg
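
One possible shape for such a hook, as a sketch rather than a
recommendation: a post-checkout hook that sets mode 700 on every tracked
*.sh file.  The "*.sh" policy, the file names and the identity are
assumptions made up for this example.  The script is committed as
executable first, because git records only the executable bit; narrowing
755 to 700 afterwards is then invisible to git.

```shell
#!/bin/sh
# Sketch only: everything happens in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # throwaway identity (assumption)
git config user.email "example@example.invalid"

mkdir scripts
printf '#!/bin/sh\necho hello\n' > scripts/hello.sh
chmod 755 scripts/hello.sh                  # recorded by git as mode 100755
git add scripts/hello.sh
git commit -q -m "add script"

# Install the hook.  It runs in the top-level directory of the work tree.
mkdir -p .git/hooks
cat > .git/hooks/post-checkout <<'EOF'
#!/bin/sh
# Hypothetical policy: every tracked *.sh file becomes mode 700.
git ls-files -z -- '*.sh' | xargs -0 chmod 700
EOF
chmod +x .git/hooks/post-checkout

git checkout -q -b other                    # any checkout triggers the hook

mode=$(ls -l scripts/hello.sh | cut -c1-10)
status=$(git status --porcelain)
echo "mode after checkout: $mode"
```

A post-checkout hook only covers checkouts; files updated by a pull
would need the same commands in a post-merge hook as well.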

* Re: [PATCH 3/4] diffcore-pickaxe: further refactor count_match()
From: Junio C Hamano @ 2009-02-28 17:40 UTC
  To: René Scharfe; +Cc: git
In-Reply-To: <49A937B8.1030205@lsrfire.ath.cx>

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

> I get this (Ubuntu 8.10 x64, Fedora 10 x64 using the same Linux repo,
> Windows Vista x64 using a different Linux repo with the same HEAD on
> NTFS and msysgit, numbers are the elapsed time in seconds, best of five
> runs):
>
>                            Ubuntu  Fedora  Windows
>    v1.6.2-rc2                8.14    8.16    9.236
>    v1.6.2-rc2+[1-4]          2.43    2.45    2.995
>    v1.6.2-rc2+[1-4]+memmem   1.31    1.25    2.917
>    v1.6.2-rc2+[1-3]+memmem   1.51    1.16    8.455
>
> Ubuntu has glibc 2.8, while Fedora 10 has glibc 2.9, with a new and more
> efficient memmem() implementation.  On Windows, we use our own naive
> memmem() implementation.

Yeah, what does glibc use these days?  Some variant of Boyer-Moore?

> So using memmem() is worthwhile.  And providing a better fall-back
> version in compat/ can speed up this particular case to the point where
> the fourth patch becomes moot.
>
> Hmm, gnulib (http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=summary)
> contains all parts ready for copy & paste, licensed under the GPL 2 or
> up.  That won't cause problems with the libgit2 relicensing effort, as
> memmem()  won't end up in there, right?

Correct.

* Re: [PATCH 0/6] "git repack -a -d" improvements
From: Junio C Hamano @ 2009-02-28 17:41 UTC
  To: Kjetil Barvik; +Cc: git
In-Reply-To: <8663iuwxrb.fsf@broadpark.no>

Kjetil Barvik <barvik@broadpark.no> writes:

>   OK, patch 2/6 failed for me when I was doing 'git am' to import the
>   patch-series, so sorry if I do not see all bits of the patch correctly.

Ah, I should have mentioned that the series was meant to apply to v1.6.0.6
(and merged upwards).

I am off to the dentist so I won't have time to think about below until I get
back.

>   Would it be an improvement to change the signature of the current
>   find_sha1_pack() function to:
>
>     struct packed_git *
>     find_pack_entry(const unsigned char *sha1, off_t *sha1_pack_offset,
>                     struct packed_git *packs)
>
>     - The currently existing 'struct pack_entry *e' parameter is only
>       used to return the offset, so make it more clear.  The struct
>       pack_entry can probably be deleted from the sha1_file.c file.
>
>     - When the 'git repack -a -d' command is used, one has to compute
>       the list of allowed pack-files to look into, and give this list to
>       find_pack_entry().
>
>     - The currently named find_sha1_pack() function can then be deleted.
>
>     - For example, when this function is now used in sha1_object_info()
>       it can be called like this:
>
>           found_pack = find_pack_entry(sha1, &offset, packed_git);
>
>   -- kjetil

* Re: git-svn and repository hierarchy?
From: Michael J Gruber @ 2009-02-28 17:59 UTC
  To: Josef Wolf, git
In-Reply-To: <20090227220512.GC14187@raven.wolf.lan>

Josef Wolf venit, vidit, dixit 27.02.2009 23:05:
> Thanks for your patience, Michael!
> 
> On Fri, Feb 27, 2009 at 06:45:44PM +0100, Michael J Gruber wrote:
>> Josef Wolf venit, vidit, dixit 27.02.2009 18:12:
>>> On Wed, Feb 25, 2009 at 10:26:10AM +0100, Michael J Gruber wrote:
>>>> Josef Wolf venit, vidit, dixit 24.02.2009 23:34:
> 
> [ ... ]
>>>   (cd git-svn-repos; git pull ../clone1)
>> Gives you 1-2-3-4
>>
>>>   (cd git-svn-repos; git svn rebase)
>> Does nothing here (but is good practice)
>>
>>>   (cd git-svn-repos; git svn dcommit)
>> Creates 2-3-4 on the svn side. *Then rebases* your master, which creates
>> 1-2'-3'-4' on master. Note that 2 is different from 2' (git-svn id).
> 
> So the sha1 is not preserved when it goes through svn?

No. Once your commits come back from svn through git-svn they have an
additional line in the commit message. Also, the commit time will be
different, and the author name might be too, depending on your name
remapping.

The patch-id (which only looks at the actual diff being introduced)
should be the same.
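
A small sketch of that claim using plain git only (no svn involved): the
same change is committed twice, once with an extra message line that
merely imitates what git-svn appends.  The git-svn-id value, the branch
names and the identity are made up.

```shell
#!/bin/sh
# Sketch only: compare commit ids and patch-ids of two commits that
# introduce the same diff with different commit messages.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # throwaway identity (assumption)
git config user.email "example@example.invalid"

echo change1 > test
git add test
git commit -q -m "commit 1"

git checkout -q -b plain
echo change2 >> test
git commit -q -a -m "commit 2"

git checkout -q -b via-svn master
echo change2 >> test
git commit -q -a -m "commit 2" \
    -m "git-svn-id: file:///hypothetical/repos/trunk@2 00000000-0000-0000-0000-000000000000"

sha_plain=$(git rev-parse plain)
sha_svn=$(git rev-parse via-svn)
pid_plain=$(git show plain | git patch-id | cut -d' ' -f1)
pid_svn=$(git show via-svn | git patch-id | cut -d' ' -f1)

echo "sha1:     $sha_plain vs $sha_svn"
echo "patch-id: $pid_plain vs $pid_svn"
```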

>>>   (cd git-svn-repos; git pull ../clone1)  # if this line is executed,
>> That's the problem. This creates a merge after which you have 1-2-3-4 and
>> 1-2'-3'-4' plus the merge of 4 and 4'.
> 
> --verbosity=on please ;-)

No such option "--verbosity". ;)

Uhm, I'm just not good at diagrams in ASCII. You had 1-2-3-4 in
git-svn-repo. 2, 3 and 4 were dcommitted to svn and came back as 2', 3',
4', such that git-svn rebased your master branch in git-svn-repo and
master looked like 1-2'-3'-4'. The primed versions are the ones with an
additional git-svn-id line in the commit: different sha1 from the
unprimed versions, same patch-id.

Now, if you say pull ../clone1, you fetch from there and merge
FETCH_HEAD, i.e. the tip of ../clone1, which is 4. So you get

1-2'-3'-4'-o
 \        /
  2 -3 -4

with o being the tip (HEAD) of master. And that is the problem, because
now history is not linear in master, and the next git-svn dcommit won't
know what to do.
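
The situation can be reproduced in miniature with plain git, as a sketch
under a few assumptions: a local branch 'clone1' stands in for
../clone1, and a second commit with the same change but a different
message stands in for the primed commit that dcommit brings back.
Merging yields the extra commit 'o'; rebasing keeps history linear.

```shell
#!/bin/sh
# Sketch only: merge versus rebase when both sides carry the same change.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # throwaway identity (assumption)
git config user.email "example@example.invalid"

echo 1 > test
git add test
git commit -q -m "1"

git checkout -q -b clone1                   # stand-in for ../clone1
echo 2 >> test
git commit -q -a -m "2"

git checkout -q master                      # stand-in for the rewritten commit
echo 2 >> test
git commit -q -a -m "2' (same change, different commit)"

git checkout -q -b rebased clone1           # roughly what 'pull --rebase' does
git rebase -q master

git checkout -q master                      # what a plain 'pull' does
git merge -q -m "o: merge of 2 and 2'" clone1

merges_master=$(git rev-list --merges master | wc -l | tr -d ' ')
merges_rebased=$(git rev-list --merges rebased | wc -l | tr -d ' ')
git log --graph --oneline --all
echo "merge commits on master:  $merges_master"
echo "merge commits on rebased: $merges_rebased"
```

Counting merge commits with "git rev-list --merges" before a dcommit is
one way to notice the problem without reading the graph by eye.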

>> Instead, use git pull --rebase here. You don't want merges in the branch
>> from which you dcommit.
> 
> Yeah, "pull --rebase" seems to help a lot.  So I've come up with the next
> version of my workflow-test-script:
> 
> (
>   set -ex
> 
>   # create test directory
>   #
>   TESTDIR=`mktemp --tmpdir=. git-svn-hierarchy-test-XXXXXXXX`
>   rm -rf $TESTDIR
>   mkdir -p $TESTDIR
>   cd $TESTDIR
> 
>   SUBVERSION_REPOS=file://`pwd`/subversion-repos
> 
>   # create subversion repos with some history
>   #
>   svnadmin create subversion-repos
>   svn -m "create standard layout" mkdir \
>       $SUBVERSION_REPOS/trunk \
>       $SUBVERSION_REPOS/branches \
>       $SUBVERSION_REPOS/tags
>   svn co $SUBVERSION_REPOS/trunk subversion-wc
>   echo change1 >>subversion-wc/test
>   svn add subversion-wc/test
>   svn ci -m "commit 0" subversion-wc
> 
>   # create git-svn-repos
>   #
>   git svn init --stdlayout $SUBVERSION_REPOS git-svn-repos
>   (cd git-svn-repos; git svn fetch)
> 
>   # create clones
>   #
>   git clone git-svn-repos clone1
>   git clone git-svn-repos clone2
>   git clone git-svn-repos clone3
> 
>   # now go several times to every clone, do some work on it, and sync
>   # the results
>   #
>   for cycle in 1 2 3; do
>     for clone in 1 2 3; do
>       for commit in 1 2 3; do
>         (
>           cd clone$clone
>           git pull --rebase
>           echo change $clone $commit >>test
>           git commit -a -m "commit $clone $commit"
>         )
>       done
>       (cd git-svn-repos; git pull --rebase ../clone$clone)
>       (cd git-svn-repos; git svn rebase)
>       (cd git-svn-repos; git svn dcommit)
>     done
>   done
> )
> 
> At least, this no longer seems to create collisions.  But I'm still
> not sure I fully understand what's going on here.  Guess, I'll have to
> get into the learning-by-doing mode :)

Yes, be sure to check the DAG (the graph of commits which you produced)
using something like gitk or git log --graph with the "--all" argument
so that you see all branches!

>> Borrowing from some other vcs:
>>
>> Repeat the soothing mantra: a merge is no merge is no merge - it it's in
>> svn ;)
> 
> Huh?

I meant "if it's", sorry for the typo.
If you don't get the plug don't worry (or look up hg) ;)

>>> Obviously, I'm doing something wrong.  But I can't figure what.  Any hints?
>> I guess when we said integrated we should have said rebase. Haven't we?
> 
> You like to talk in riddles? Aren't you?

No, I'm not ;)

Michael

* Re: [PATCH 3/4] diffcore-pickaxe: further refactor count_match()
From: René Scharfe @ 2009-02-28 18:15 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vmyc6foj3.fsf@gitster.siamese.dyndns.org>

Junio C Hamano schrieb:
> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:
> 
>> I get this (Ubuntu 8.10 x64, Fedora 10 x64 using the same Linux repo,
>> Windows Vista x64 using a different Linux repo with the same HEAD on
>> NTFS and msysgit, numbers are the elapsed time in seconds, best of five
>> runs):
>>
>>                            Ubuntu  Fedora  Windows
>>    v1.6.2-rc2                8.14    8.16    9.236
>>    v1.6.2-rc2+[1-4]          2.43    2.45    2.995
>>    v1.6.2-rc2+[1-4]+memmem   1.31    1.25    2.917
>>    v1.6.2-rc2+[1-3]+memmem   1.51    1.16    8.455
>>
>> Ubuntu has glibc 2.8, while Fedora 10 has glibc 2.9, with a new and more
>> efficient memmem() implementation.  On Windows, we use our own naive
>> memmem() implementation.
> 
> Yeah, what does glibc use these days?  Some variant of Boyer-Moore?

No, the algorithm is called Two Way, which, unlike Boyer-Moore, only
needs constant space.  The implementation seems to originate from this bug:

	http://sourceware.org/bugzilla/show_bug.cgi?id=5514

And the algorithm is documented here:

	http://www-igm.univ-mlv.fr/~lecroq/string/node26.html

René

* Re: How can I force git to recognize a change change in file modes?
From: Todd Zullinger @ 2009-02-28 18:34 UTC
  To: Brent Goodrick; +Cc: Jay Soffian, Jan Krüger, git
In-Reply-To: <e38bce640902280934u3d9da650ke64865d7149b3c66@mail.gmail.com>

Brent Goodrick wrote:
> All I want to do in my case is just chmod 700 a bunch of scripts
> after they are checked out or updated.  I'll need to re-read the
> git-hooks man page more closely.

You may want to check contrib/hooks/setgitperms.perl as well, if you
haven't seen it already.

-- 
Todd        OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Problems are opportunity in work clothes.


* [PATCH/resend] git-rebase: Update --whitespace documentation
From: Todd Zullinger @ 2009-02-28 18:42 UTC
  To: git; +Cc: Junio C Hamano

The parameters accepted by the --whitespace option of "git apply" have
changed over time, and the documentation for "git rebase" was out of
sync.  Remove the specific parameter list from the "git rebase"
documentation and simply point to the "git apply" documentation for
details, as is already done in the "git am" documentation.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
---

I sent this a few weeks back, but it may have arrived during a
particularly busy time and been lost among the many more important
patches. :)

 Documentation/git-rebase.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 30487de..da3c38c 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -243,7 +243,7 @@ OPTIONS
 	context exist they all must match.  By default no context is
 	ever ignored.

---whitespace=<nowarn|warn|error|error-all|strip>::
+--whitespace=<option>::
 	This flag is passed to the 'git-apply' program
 	(see linkgit:git-apply[1]) that applies the patch.
 	Incompatible with the --interactive option.
-- 
1.6.2.rc1

-- 
Todd        OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The power of accurate observation is frequently called cynicism by
those who don't have it.
    -- George Bernard Shaw

* [PATCH] import memmem() with linear complexity from Gnulib
From: René Scharfe @ 2009-02-28 19:16 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vmyc6foj3.fsf@gitster.siamese.dyndns.org>

Gnulib and glibc have gained a memmem() implementation using the Two-Way
algorithm, which needs constant space and linear time.  Import it to
compat/ in order to replace the simple quadratic implementation there.

memmem.c and str-two-way.h are copied verbatim from the repository at
git://git.savannah.gnu.org/gnulib.git, with the following changes to
memmem.c to make it fit into git's build environment:

	21,23c21
	< #ifndef _LIBC
	< # include <config.h>
	< #endif
	---
	> #include "../git-compat-util.h"
	40c38
	< memmem (const void *haystack_start, size_t haystack_len,
	---
	> gitmemmem(const void *haystack_start, size_t haystack_len,

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
 Makefile             |    1 +
 compat/memmem.c      |  103 +++++++++----
 compat/str-two-way.h |  429 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 504 insertions(+), 29 deletions(-)
 rewrite compat/memmem.c (91%)
 create mode 100644 compat/str-two-way.h

diff --git a/Makefile b/Makefile
index 0675c43..b2b15d9 100644
--- a/Makefile
+++ b/Makefile
@@ -359,6 +359,7 @@ LIB_H += cache-tree.h
 LIB_H += commit.h
 LIB_H += compat/cygwin.h
 LIB_H += compat/mingw.h
+LIB_H += compat/str-two-way.h
 LIB_H += csum-file.h
 LIB_H += decorate.h
 LIB_H += delta.h
diff --git a/compat/memmem.c b/compat/memmem.c
dissimilarity index 91%
index cd0d877..b0b7821 100644
--- a/compat/memmem.c
+++ b/compat/memmem.c
@@ -1,29 +1,74 @@
-#include "../git-compat-util.h"
-
-void *gitmemmem(const void *haystack, size_t haystack_len,
-                const void *needle, size_t needle_len)
-{
-	const char *begin = haystack;
-	const char *last_possible = begin + haystack_len - needle_len;
-
-	/*
-	 * The first occurrence of the empty string is deemed to occur at
-	 * the beginning of the string.
-	 */
-	if (needle_len == 0)
-		return (void *)begin;
-
-	/*
-	 * Sanity check, otherwise the loop might search through the whole
-	 * memory.
-	 */
-	if (haystack_len < needle_len)
-		return NULL;
-
-	for (; begin <= last_possible; begin++) {
-		if (!memcmp(begin, needle, needle_len))
-			return (void *)begin;
-	}
-
-	return NULL;
-}
+/* Copyright (C) 1991,92,93,94,96,97,98,2000,2004,2007,2008 Free Software
+   Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License along
+   with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* This particular implementation was written by Eric Blake, 2008.  */
+
+#include "../git-compat-util.h"
+
+/* Specification of memmem.  */
+#include <string.h>
+
+#ifndef _LIBC
+# define __builtin_expect(expr, val)   (expr)
+#endif
+
+#define RETURN_TYPE void *
+#define AVAILABLE(h, h_l, j, n_l) ((j) <= (h_l) - (n_l))
+#include "str-two-way.h"
+
+/* Return the first occurrence of NEEDLE in HAYSTACK.  Return HAYSTACK
+   if NEEDLE_LEN is 0, otherwise NULL if NEEDLE is not found in
+   HAYSTACK.  */
+void *
+gitmemmem(const void *haystack_start, size_t haystack_len,
+	const void *needle_start, size_t needle_len)
+{
+  /* Abstract memory is considered to be an array of 'unsigned char' values,
+     not an array of 'char' values.  See ISO C 99 section 6.2.6.1.  */
+  const unsigned char *haystack = (const unsigned char *) haystack_start;
+  const unsigned char *needle = (const unsigned char *) needle_start;
+
+  if (needle_len == 0)
+    /* The first occurrence of the empty string is deemed to occur at
+       the beginning of the string.  */
+    return (void *) haystack;
+
+  /* Sanity check, otherwise the loop might search through the whole
+     memory.  */
+  if (__builtin_expect (haystack_len < needle_len, 0))
+    return NULL;
+
+  /* Use optimizations in memchr when possible, to reduce the search
+     size of haystack using a linear algorithm with a smaller
+     coefficient.  However, avoid memchr for long needles, since we
+     can often achieve sublinear performance.  */
+  if (needle_len < LONG_NEEDLE_THRESHOLD)
+    {
+      haystack = memchr (haystack, *needle, haystack_len);
+      if (!haystack || __builtin_expect (needle_len == 1, 0))
+	return (void *) haystack;
+      haystack_len -= haystack - (const unsigned char *) haystack_start;
+      if (haystack_len < needle_len)
+	return NULL;
+      return two_way_short_needle (haystack, haystack_len, needle, needle_len);
+    }
+  else
+    return two_way_long_needle (haystack, haystack_len, needle, needle_len);
+}
+
+#undef LONG_NEEDLE_THRESHOLD
diff --git a/compat/str-two-way.h b/compat/str-two-way.h
new file mode 100644
index 0000000..b0338a7
--- /dev/null
+++ b/compat/str-two-way.h
@@ -0,0 +1,429 @@
+/* Byte-wise substring search, using the Two-Way algorithm.
+   Copyright (C) 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Written by Eric Blake <ebb9@byu.net>, 2008.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License along
+   with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* Before including this file, you need to include <config.h> and
+   <string.h>, and define:
+     RESULT_TYPE             A macro that expands to the return type.
+     AVAILABLE(h, h_l, j, n_l)
+			     A macro that returns nonzero if there are
+			     at least N_L bytes left starting at H[J].
+			     H is 'unsigned char *', H_L, J, and N_L
+			     are 'size_t'; H_L is an lvalue.  For
+			     NUL-terminated searches, H_L can be
+			     modified each iteration to avoid having
+			     to compute the end of H up front.
+
+  For case-insensitivity, you may optionally define:
+     CMP_FUNC(p1, p2, l)     A macro that returns 0 iff the first L
+			     characters of P1 and P2 are equal.
+     CANON_ELEMENT(c)        A macro that canonicalizes an element right after
+			     it has been fetched from one of the two strings.
+			     The argument is an 'unsigned char'; the result
+			     must be an 'unsigned char' as well.
+
+  This file undefines the macros documented above, and defines
+  LONG_NEEDLE_THRESHOLD.
+*/
+
+#include <limits.h>
+#include <stdint.h>
+
+/* We use the Two-Way string matching algorithm, which guarantees
+   linear complexity with constant space.  Additionally, for long
+   needles, we also use a bad character shift table similar to the
+   Boyer-Moore algorithm to achieve improved (potentially sub-linear)
+   performance.
+
+   See http://www-igm.univ-mlv.fr/~lecroq/string/node26.html#SECTION00260
+   and http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm
+*/
+
+/* Point at which computing a bad-byte shift table is likely to be
+   worthwhile.  Small needles should not compute a table, since it
+   adds (1 << CHAR_BIT) + NEEDLE_LEN computations of preparation for a
+   speedup no greater than a factor of NEEDLE_LEN.  The larger the
+   needle, the better the potential performance gain.  On the other
+   hand, on non-POSIX systems with CHAR_BIT larger than eight, the
+   memory required for the table is prohibitive.  */
+#if CHAR_BIT < 10
+# define LONG_NEEDLE_THRESHOLD 32U
+#else
+# define LONG_NEEDLE_THRESHOLD SIZE_MAX
+#endif
+
+#ifndef MAX
+# define MAX(a, b) ((a < b) ? (b) : (a))
+#endif
+
+#ifndef CANON_ELEMENT
+# define CANON_ELEMENT(c) c
+#endif
+#ifndef CMP_FUNC
+# define CMP_FUNC memcmp
+#endif
+
+/* Perform a critical factorization of NEEDLE, of length NEEDLE_LEN.
+   Return the index of the first byte in the right half, and set
+   *PERIOD to the global period of the right half.
+
+   The global period of a string is the smallest index (possibly its
+   length) at which all remaining bytes in the string are repetitions
+   of the prefix (the last repetition may be a subset of the prefix).
+
+   When NEEDLE is factored into two halves, a local period is the
+   length of the smallest word that shares a suffix with the left half
+   and shares a prefix with the right half.  All factorizations of a
+   non-empty NEEDLE have a local period of at least 1 and no greater
+   than NEEDLE_LEN.
+
+   A critical factorization has the property that the local period
+   equals the global period.  All strings have at least one critical
+   factorization with the left half smaller than the global period.
+
+   Given an ordered alphabet, a critical factorization can be computed
+   in linear time, with 2 * NEEDLE_LEN comparisons, by computing the
+   larger of two ordered maximal suffixes.  The ordered maximal
+   suffixes are determined by lexicographic comparison of
+   periodicity.  */
+static size_t
+critical_factorization (const unsigned char *needle, size_t needle_len,
+			size_t *period)
+{
+  /* Index of last byte of left half, or SIZE_MAX.  */
+  size_t max_suffix, max_suffix_rev;
+  size_t j; /* Index into NEEDLE for current candidate suffix.  */
+  size_t k; /* Offset into current period.  */
+  size_t p; /* Intermediate period.  */
+  unsigned char a, b; /* Current comparison bytes.  */
+
+  /* Invariants:
+     0 <= j < NEEDLE_LEN - 1
+     -1 <= max_suffix{,_rev} < j (treating SIZE_MAX as if it were signed)
+     min(max_suffix, max_suffix_rev) < global period of NEEDLE
+     1 <= p <= global period of NEEDLE
+     p == global period of the substring NEEDLE[max_suffix{,_rev}+1...j]
+     1 <= k <= p
+  */
+
+  /* Perform lexicographic search.  */
+  max_suffix = SIZE_MAX;
+  j = 0;
+  k = p = 1;
+  while (j + k < needle_len)
+    {
+      a = CANON_ELEMENT (needle[j + k]);
+      b = CANON_ELEMENT (needle[max_suffix + k]);
+      if (a < b)
+	{
+	  /* Suffix is smaller, period is entire prefix so far.  */
+	  j += k;
+	  k = 1;
+	  p = j - max_suffix;
+	}
+      else if (a == b)
+	{
+	  /* Advance through repetition of the current period.  */
+	  if (k != p)
+	    ++k;
+	  else
+	    {
+	      j += p;
+	      k = 1;
+	    }
+	}
+      else /* b < a */
+	{
+	  /* Suffix is larger, start over from current location.  */
+	  max_suffix = j++;
+	  k = p = 1;
+	}
+    }
+  *period = p;
+
+  /* Perform reverse lexicographic search.  */
+  max_suffix_rev = SIZE_MAX;
+  j = 0;
+  k = p = 1;
+  while (j + k < needle_len)
+    {
+      a = CANON_ELEMENT (needle[j + k]);
+      b = CANON_ELEMENT (needle[max_suffix_rev + k]);
+      if (b < a)
+	{
+	  /* Suffix is smaller, period is entire prefix so far.  */
+	  j += k;
+	  k = 1;
+	  p = j - max_suffix_rev;
+	}
+      else if (a == b)
+	{
+	  /* Advance through repetition of the current period.  */
+	  if (k != p)
+	    ++k;
+	  else
+	    {
+	      j += p;
+	      k = 1;
+	    }
+	}
+      else /* a < b */
+	{
+	  /* Suffix is larger, start over from current location.  */
+	  max_suffix_rev = j++;
+	  k = p = 1;
+	}
+    }
+
+  /* Choose the longer suffix.  Return the first byte of the right
+     half, rather than the last byte of the left half.  */
+  if (max_suffix_rev + 1 < max_suffix + 1)
+    return max_suffix + 1;
+  *period = p;
+  return max_suffix_rev + 1;
+}
+
+/* Return the first location of non-empty NEEDLE within HAYSTACK, or
+   NULL.  HAYSTACK_LEN is the minimum known length of HAYSTACK.  This
+   method is optimized for NEEDLE_LEN < LONG_NEEDLE_THRESHOLD.
+   Performance is guaranteed to be linear, with an initialization cost
+   of 2 * NEEDLE_LEN comparisons.
+
+   If AVAILABLE does not modify HAYSTACK_LEN (as in memmem), then at
+   most 2 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching.
+   If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 *
+   HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching.  */
+static RETURN_TYPE
+two_way_short_needle (const unsigned char *haystack, size_t haystack_len,
+		      const unsigned char *needle, size_t needle_len)
+{
+  size_t i; /* Index into current byte of NEEDLE.  */
+  size_t j; /* Index into current window of HAYSTACK.  */
+  size_t period; /* The period of the right half of needle.  */
+  size_t suffix; /* The index of the right half of needle.  */
+
+  /* Factor the needle into two halves, such that the left half is
+     smaller than the global period, and the right half is
+     periodic (with a period as large as NEEDLE_LEN - suffix).  */
+  suffix = critical_factorization (needle, needle_len, &period);
+
+  /* Perform the search.  Each iteration compares the right half
+     first.  */
+  if (CMP_FUNC (needle, needle + period, suffix) == 0)
+    {
+      /* Entire needle is periodic; a mismatch can only advance by the
+	 period, so use memory to avoid rescanning known occurrences
+	 of the period.  */
+      size_t memory = 0;
+      j = 0;
+      while (AVAILABLE (haystack, haystack_len, j, needle_len))
+	{
+	  /* Scan for matches in right half.  */
+	  i = MAX (suffix, memory);
+	  while (i < needle_len && (CANON_ELEMENT (needle[i])
+				    == CANON_ELEMENT (haystack[i + j])))
+	    ++i;
+	  if (needle_len <= i)
+	    {
+	      /* Scan for matches in left half.  */
+	      i = suffix - 1;
+	      while (memory < i + 1 && (CANON_ELEMENT (needle[i])
+					== CANON_ELEMENT (haystack[i + j])))
+		--i;
+	      if (i + 1 < memory + 1)
+		return (RETURN_TYPE) (haystack + j);
+	      /* No match, so remember how many repetitions of period
+		 on the right half were scanned.  */
+	      j += period;
+	      memory = needle_len - period;
+	    }
+	  else
+	    {
+	      j += i - suffix + 1;
+	      memory = 0;
+	    }
+	}
+    }
+  else
+    {
+      /* The two halves of needle are distinct; no extra memory is
+	 required, and any mismatch results in a maximal shift.  */
+      period = MAX (suffix, needle_len - suffix) + 1;
+      j = 0;
+      while (AVAILABLE (haystack, haystack_len, j, needle_len))
+	{
+	  /* Scan for matches in right half.  */
+	  i = suffix;
+	  while (i < needle_len && (CANON_ELEMENT (needle[i])
+				    == CANON_ELEMENT (haystack[i + j])))
+	    ++i;
+	  if (needle_len <= i)
+	    {
+	      /* Scan for matches in left half.  */
+	      i = suffix - 1;
+	      while (i != SIZE_MAX && (CANON_ELEMENT (needle[i])
+				       == CANON_ELEMENT (haystack[i + j])))
+		--i;
+	      if (i == SIZE_MAX)
+		return (RETURN_TYPE) (haystack + j);
+	      j += period;
+	    }
+	  else
+	    j += i - suffix + 1;
+	}
+    }
+  return NULL;
+}
+
+/* Return the first location of non-empty NEEDLE within HAYSTACK, or
+   NULL.  HAYSTACK_LEN is the minimum known length of HAYSTACK.  This
+   method is optimized for LONG_NEEDLE_THRESHOLD <= NEEDLE_LEN.
+   Performance is guaranteed to be linear, with an initialization cost
+   of 3 * NEEDLE_LEN + (1 << CHAR_BIT) operations.
+
+   If AVAILABLE does not modify HAYSTACK_LEN (as in memmem), then at
+   most 2 * HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching,
+   and sublinear performance O(HAYSTACK_LEN / NEEDLE_LEN) is possible.
+   If AVAILABLE modifies HAYSTACK_LEN (as in strstr), then at most 3 *
+   HAYSTACK_LEN - NEEDLE_LEN comparisons occur in searching, and
+   sublinear performance is not possible.  */
+static RETURN_TYPE
+two_way_long_needle (const unsigned char *haystack, size_t haystack_len,
+		     const unsigned char *needle, size_t needle_len)
+{
+  size_t i; /* Index into current byte of NEEDLE.  */
+  size_t j; /* Index into current window of HAYSTACK.  */
+  size_t period; /* The period of the right half of needle.  */
+  size_t suffix; /* The index of the right half of needle.  */
+  size_t shift_table[1U << CHAR_BIT]; /* See below.  */
+
+  /* Factor the needle into two halves, such that the left half is
+     smaller than the global period, and the right half is
+     periodic (with a period as large as NEEDLE_LEN - suffix).  */
+  suffix = critical_factorization (needle, needle_len, &period);
+
+  /* Populate shift_table.  For each possible byte value c,
+     shift_table[c] is the distance from the last occurrence of c to
+     the end of NEEDLE, or NEEDLE_LEN if c is absent from the NEEDLE.
+     shift_table[NEEDLE[NEEDLE_LEN - 1]] contains the only 0.  */
+  for (i = 0; i < 1U << CHAR_BIT; i++)
+    shift_table[i] = needle_len;
+  for (i = 0; i < needle_len; i++)
+    shift_table[CANON_ELEMENT (needle[i])] = needle_len - i - 1;
+
+  /* Perform the search.  Each iteration compares the right half
+     first.  */
+  if (CMP_FUNC (needle, needle + period, suffix) == 0)
+    {
+      /* Entire needle is periodic; a mismatch can only advance by the
+	 period, so use memory to avoid rescanning known occurrences
+	 of the period.  */
+      size_t memory = 0;
+      size_t shift;
+      j = 0;
+      while (AVAILABLE (haystack, haystack_len, j, needle_len))
+	{
+	  /* Check the last byte first; if it does not match, then
+	     shift to the next possible match location.  */
+	  shift = shift_table[CANON_ELEMENT (haystack[j + needle_len - 1])];
+	  if (0 < shift)
+	    {
+	      if (memory && shift < period)
+		{
+		  /* Since needle is periodic, but the last period has
+		     a byte out of place, there can be no match until
+		     after the mismatch.  */
+		  shift = needle_len - period;
+		  memory = 0;
+		}
+	      j += shift;
+	      continue;
+	    }
+	  /* Scan for matches in right half.  The last byte has
+	     already been matched, by virtue of the shift table.  */
+	  i = MAX (suffix, memory);
+	  while (i < needle_len - 1 && (CANON_ELEMENT (needle[i])
+					== CANON_ELEMENT (haystack[i + j])))
+	    ++i;
+	  if (needle_len - 1 <= i)
+	    {
+	      /* Scan for matches in left half.  */
+	      i = suffix - 1;
+	      while (memory < i + 1 && (CANON_ELEMENT (needle[i])
+					== CANON_ELEMENT (haystack[i + j])))
+		--i;
+	      if (i + 1 < memory + 1)
+		return (RETURN_TYPE) (haystack + j);
+	      /* No match, so remember how many repetitions of period
+		 on the right half were scanned.  */
+	      j += period;
+	      memory = needle_len - period;
+	    }
+	  else
+	    {
+	      j += i - suffix + 1;
+	      memory = 0;
+	    }
+	}
+    }
+  else
+    {
+      /* The two halves of needle are distinct; no extra memory is
+	 required, and any mismatch results in a maximal shift.  */
+      size_t shift;
+      period = MAX (suffix, needle_len - suffix) + 1;
+      j = 0;
+      while (AVAILABLE (haystack, haystack_len, j, needle_len))
+	{
+	  /* Check the last byte first; if it does not match, then
+	     shift to the next possible match location.  */
+	  shift = shift_table[CANON_ELEMENT (haystack[j + needle_len - 1])];
+	  if (0 < shift)
+	    {
+	      j += shift;
+	      continue;
+	    }
+	  /* Scan for matches in right half.  The last byte has
+	     already been matched, by virtue of the shift table.  */
+	  i = suffix;
+	  while (i < needle_len - 1 && (CANON_ELEMENT (needle[i])
+					== CANON_ELEMENT (haystack[i + j])))
+	    ++i;
+	  if (needle_len - 1 <= i)
+	    {
+	      /* Scan for matches in left half.  */
+	      i = suffix - 1;
+	      while (i != SIZE_MAX && (CANON_ELEMENT (needle[i])
+				       == CANON_ELEMENT (haystack[i + j])))
+		--i;
+	      if (i == SIZE_MAX)
+		return (RETURN_TYPE) (haystack + j);
+	      j += period;
+	    }
+	  else
+	    j += i - suffix + 1;
+	}
+    }
+  return NULL;
+}
+
+#undef AVAILABLE
+#undef CANON_ELEMENT
+#undef CMP_FUNC
+#undef MAX
+#undef RETURN_TYPE
-- 
1.6.2.rc2

^ permalink raw reply related

* Changing the defaults for send-email / suppress-cc ?
From: Paul Gortmaker @ 2009-02-28 19:29 UTC (permalink / raw)
  To: git

I've been helping people who are new git users, and the one thing
that seems to violate the principle of least surprise for them is the
default setting for sendemail.suppresscc: new users don't expect
additional CC lines to be added automatically based on what is
present in the content of the mbox (format-patch output).

The messages from send-email that indicate it is going to add CC lines
based on SOB etc. come *after* the last input from the user, and so
they don't have an opportunity to jump in and prevent the extra
e-mails from going out to whoever happens to be listed in the patch.
(Let's assume for the moment that they didn't see "--dry-run", or
simply figured the process looked fairly straightforward, and didn't
see the need for it.)

Here is the use case which I suspect is fairly pervasive, and that
I've already seen several times:

1) User is working on something involving kernel version X, which is
some amount behind the current mainstream HEAD.  (Okay, doesn't have
to be kernel, could even be git itself.)

2) They've created a branch off of X and they've added their own
commits, and also cherry picked relevant commits from upstream that
happened between X and HEAD into their branch.   One of the features
they've cherry picked onto their branch is a 25 patch series that has
"Signed-off-by: miserable@bofh.com" in it, a miserable person who
hates extra-emails.

3) They run "git format-patch -n --thread -o foo X..mybranch"

4) They run "git send-email --to coworker@mycompany.com foo"  so their
buddy within the company can have an mbox patchset.

5) They recoil in horror while smashing ^C as they try to stop
send-email from spamming miserable@bofh.com with 25 of his own
patches.

In light of this, I've simply advised new users to run something like:

git config --global sendemail.suppresscc all

...just so that they won't accidentally do what I've described above.

Apologies if this has been discussed before; I took a quick scan of my
archive and didn't see any discussions on it.  With the recent thread
about warning people of backward-incompatible changes that will appear
post 1.6.2 -- I thought perhaps this was a good time to
mention/consider it.

I'm not sure what the right thing to do here is -- I suspect if you
made suppress-cc=all the default, then there would be more experienced
users that would complain about having to explicitly add a
suppress-cc=self to get the old behaviour?  Would that be acceptable?
I don't know...

Thanks,
Paul.

^ permalink raw reply

* [PATCH] Ensure proper setup of git_dir for git-hash-object
From: newren @ 2009-02-28 19:56 UTC (permalink / raw)
  To: git; +Cc: gitster, Elijah Newren

From: Elijah Newren <newren@gmail.com>

Call setup_git_directory() before git_config() to make sure git_dir is set
to the proper value.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
Without this patch:
$ mkdir tmp
$ cd tmp/
$ git init --bare
Initialized empty Git repository in /home/newren/floss-development/git/tmp/
$ echo hi | git hash-object -w --stdin
error: unable to create temporary sha1 filename .git/objects/45: No such file or directory

fatal: Unable to add stdin to database
$ echo hi | git --git-dir=. hash-object -w --stdin
45b983be36b73c0788dc9cbcb76cbb80fc7bb057

 hash-object.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hash-object.c b/hash-object.c
index 37e6677..ebb3bed 100644
--- a/hash-object.c
+++ b/hash-object.c
@@ -84,8 +84,6 @@ int main(int argc, const char **argv)
 
 	git_extract_argv0_path(argv[0]);
 
-	git_config(git_default_config, NULL);
-
 	argc = parse_options(argc, argv, hash_object_options, hash_object_usage, 0);
 
 	if (write_object) {
@@ -95,6 +93,8 @@ int main(int argc, const char **argv)
 			vpath = prefix_filename(prefix, prefix_length, vpath);
 	}
 
+	git_config(git_default_config, NULL);
+
 	if (stdin_paths) {
 		if (hashstdin)
 			errstr = "Can't use --stdin-paths with --stdin";
-- 
1.6.0.6

^ permalink raw reply related

* [PATCH] added missing backtick in git-apply.txt
From: dt @ 2009-02-28 20:03 UTC (permalink / raw)
  To: git, gitster; +Cc: Danijel Tasov

From: Danijel Tasov <dt@korn.shell.la>

Signed-off-by: Danijel Tasov <dt@korn.shell.la>
---
 Documentation/git-apply.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 9400f6a..0566376 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -159,7 +159,7 @@ on the command line, and ignored if there is any include pattern.
 	considered whitespace errors.
 +
 By default, the command outputs warning messages but applies the patch.
-When `git-apply is used for statistics and not applying a
+When `git-apply` is used for statistics and not applying a
 patch, it defaults to `nowarn`.
 +
 You can use different `<action>` to control this
-- 
1.6.1.3

^ permalink raw reply related

* Re: [PATCH] Ensure proper setup of git_dir for git-hash-object
From: Junio C Hamano @ 2009-02-28 20:59 UTC (permalink / raw)
  To: newren; +Cc: git
In-Reply-To: <1235851009-16739-1-git-send-email-newren@gmail.com>

newren@gmail.com writes:

> Without this patch:
> $ mkdir tmp
> $ cd tmp/
> $ git init --bare
> Initialized empty Git repository in /home/newren/floss-development/git/tmp/
> $ echo hi | git hash-object -w --stdin
> error: unable to create temporary sha1 filename .git/objects/45: No such file or directory
>
> fatal: Unable to add stdin to database
> $ echo hi | git --git-dir=. hash-object -w --stdin
> 45b983be36b73c0788dc9cbcb76cbb80fc7bb057

Does the patched version work without -w option?  Should it?

^ permalink raw reply

* Re: [PATCH] added missing backtick in git-apply.txt
From: Junio C Hamano @ 2009-02-28 21:10 UTC (permalink / raw)
  To: dt; +Cc: git, Jonathan Nieder
In-Reply-To: <1235851434-16950-1-git-send-email-dt@korn.shell.la>

dt@korn.shell.la writes:

> From: Danijel Tasov <dt@korn.shell.la>
>
> Signed-off-by: Danijel Tasov <dt@korn.shell.la>
> ---
>  Documentation/git-apply.txt |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
> index 9400f6a..0566376 100644
> --- a/Documentation/git-apply.txt
> +++ b/Documentation/git-apply.txt
> @@ -159,7 +159,7 @@ on the command line, and ignored if there is any include pattern.
>  	considered whitespace errors.
>  +
>  By default, the command outputs warning messages but applies the patch.
> -When `git-apply is used for statistics and not applying a
> +When `git-apply` is used for statistics and not applying a
>  patch, it defaults to `nowarn`.
>  +
>  You can use different `<action>` to control this

Thanks.

This was caused by the large documentation churn 483bc4f (Documentation
formatting and cleanup, 2008-06-30) that was supposed to be a clean-up.
Can people lend eyeballs to see if there isn't any other such typo
remaining?  I briefly looked at the commit again and I think it is Ok now,
but I obviously missed this when I first applied the patch, so...

^ permalink raw reply

* [PATCH 1/2] Add init-serve, the remote side of "git init --remote=host:path"
From: Junio C Hamano @ 2009-02-28 21:12 UTC (permalink / raw)
  To: git

This is still sprinkled with a few NEEDSWORK, but should be good enough
to start developing and testing the requesting side of the protocol pair.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
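For readers unfamiliar with the framing used here: each pkt-line is a
four-digit hexadecimal length (which counts the four length bytes
themselves) followed by the payload, and a bare "0000" flush packet ends
the argument list.  Below is a minimal standalone sketch of that framing.
The names format_pkt_line and PKT_FLUSH are hypothetical and exist only
for illustration; they are not the helpers in pkt-line.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only; git's real helpers live in pkt-line.c. */
#define PKT_FLUSH "0000"

/*
 * Frame PAYLOAD as one pkt-line in OUT: four hex digits giving the
 * total length (payload plus the four length bytes), then the payload.
 * Returns the framed length, or -1 if the result does not fit.
 */
static int format_pkt_line(char *out, size_t outsz, const char *payload)
{
	size_t total;

	assert(out != NULL && payload != NULL);
	total = strlen(payload) + 4;
	if (total > 0xffff || total + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned int)total, payload);
	return (int)total;
}
```

Framing the argument "--bare" this way gives "000a--bare"; sending
PKT_FLUSH afterwards is what lets the serving side stop reading.
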
 Makefile             |    1 +
 builtin-init-serve.c |  117 ++++++++++++++++++++++++++++++++++++++++++++++++++
 builtin.h            |    1 +
 git.c                |    1 +
 4 files changed, 120 insertions(+), 0 deletions(-)
 create mode 100644 builtin-init-serve.c

diff --git a/Makefile b/Makefile
index 0675c43..c0d0cfd 100644
--- a/Makefile
+++ b/Makefile
@@ -544,6 +544,7 @@ BUILTIN_OBJS += builtin-gc.o
 BUILTIN_OBJS += builtin-grep.o
 BUILTIN_OBJS += builtin-help.o
 BUILTIN_OBJS += builtin-init-db.o
+BUILTIN_OBJS += builtin-init-serve.o
 BUILTIN_OBJS += builtin-log.o
 BUILTIN_OBJS += builtin-ls-files.o
 BUILTIN_OBJS += builtin-ls-remote.o
diff --git a/builtin-init-serve.c b/builtin-init-serve.c
new file mode 100644
index 0000000..9d701e7
--- /dev/null
+++ b/builtin-init-serve.c
@@ -0,0 +1,117 @@
+#include "cache.h"
+#include "builtin.h"
+#include "pkt-line.h"
+#include "run-command.h"
+#include "strbuf.h"
+
+/*
+ * The other end gives the command line arguments to "git init"
+ * one by one over pkt-line, and then expects a message back.
+ *
+ * We need to read them all even if we know we will reject
+ * the request before responding.
+ */
+static int serve(const char *errmsg)
+{
+	int argc = 0;
+	const char *argv[64];
+
+	argv[argc++] = "git";
+	argv[argc++] = "init";
+	for (;;) {
+		char line[1000];
+		int len;
+
+		len = packet_read_line(0, line, sizeof(line));
+		if (!len)
+			break;
+
+		if (!*errmsg) {
+			/*
+			 * Notice any command line arguments that we
+			 * may not want to invoke "git init" with when
+			 * we are doing this remotely, and reject the
+			 * request.
+			 */
+			if (!prefixcmp(line, "--template=")) {
+				static char err[1000];
+				snprintf(err, sizeof(err),
+					 "forbidden option to 'git init': %s",
+					 line);
+				errmsg = err;
+			} else if (argc + 1 < ARRAY_SIZE(argv))
+				argv[argc++] = xstrdup(line);
+			else
+				errmsg = "arg list too long";
+		}
+	}
+
+	if (*errmsg)
+		packet_write(1, "ng init - %s\n", errmsg);
+	else {
+		/*
+		 * NEEDSWORK: refactor this list in the codepath for
+		 * local pipe transport in connect.c and use it here
+		 * and also over there.
+		 */
+		const char *sanitize_env[] = {
+			ALTERNATE_DB_ENVIRONMENT,
+			DB_ENVIRONMENT,
+			GIT_DIR_ENVIRONMENT,
+			GIT_WORK_TREE_ENVIRONMENT,
+			GRAFT_ENVIRONMENT,
+			INDEX_ENVIRONMENT,
+			NULL
+		};
+		struct child_process child;
+
+		argv[argc] = NULL;
+		memset(&child, 0, sizeof(child));
+		child.argv = argv;
+		child.env = sanitize_env;
+
+		/*
+		 * NEEDSWORK: I do not currently think it is worth it,
+		 * but this might want to set up and use the sideband
+		 * to capture and send output from the child back to
+		 * the requestor.  At least this comment needs to be removed
+		 * once we make the decision.
+		 */
+		child.stdout_to_stderr = 1;
+
+		/*
+		 * NEEDSWORK: we might want to distinguish various
+		 * error codes from run_command() and return different
+		 * messages back.  I am too lazy to be bothered.
+		 */
+		if (run_command(&child))
+			packet_write(1, "ng init\n");
+		else
+			packet_write(1, "ok init\n");
+	}
+	return 0;
+}
+
+int cmd_init_serve(int argc, const char **argv, const char *prefix)
+{
+	const char *dir;
+	struct strbuf errmsg = STRBUF_INIT;
+
+	if (argc != 2)
+		return serve("init /p/a/th");
+	dir = argv[1];
+
+	/*
+	 * Perhaps lift avoid_alias() from daemon.c and check
+	 * dir with it, as programs like gitosis may
+	 * want to restrict the arguments to this service.
+	 */
+	if (mkdir(dir, 0777))
+		strbuf_addf(&errmsg,
+			    "cannot mkdir('%s'): %s", dir, strerror(errno));
+	else if (chdir(dir))
+		strbuf_addf(&errmsg,
+			    "cannot chdir('%s'): %s", dir, strerror(errno));
+
+	return serve(errmsg.buf);
+}
diff --git a/builtin.h b/builtin.h
index 1495cf6..e9f9ffb 100644
--- a/builtin.h
+++ b/builtin.h
@@ -59,6 +59,7 @@ extern int cmd_grep(int argc, const char **argv, const char *prefix);
 extern int cmd_help(int argc, const char **argv, const char *prefix);
 extern int cmd_http_fetch(int argc, const char **argv, const char *prefix);
 extern int cmd_init_db(int argc, const char **argv, const char *prefix);
+extern int cmd_init_serve(int argc, const char **argv, const char *prefix);
 extern int cmd_log(int argc, const char **argv, const char *prefix);
 extern int cmd_log_reflog(int argc, const char **argv, const char *prefix);
 extern int cmd_ls_files(int argc, const char **argv, const char *prefix);
diff --git a/git.c b/git.c
index c2b181e..1df8584 100644
--- a/git.c
+++ b/git.c
@@ -311,6 +311,7 @@ static void handle_internal_command(int argc, const char **argv)
 #endif
 		{ "init", cmd_init_db },
 		{ "init-db", cmd_init_db },
+		{ "init-serve", cmd_init_serve },
 		{ "log", cmd_log, RUN_SETUP | USE_PAGER },
 		{ "ls-files", cmd_ls_files, RUN_SETUP },
 		{ "ls-tree", cmd_ls_tree, RUN_SETUP },
-- 
1.6.2.rc2.99.g9f3bb

^ permalink raw reply related

* [PATCH 2/2] "git init --remote=host:path"
From: Junio C Hamano @ 2009-02-28 21:13 UTC (permalink / raw)
  To: git
In-Reply-To: <7vsklye05k.fsf@gitster.siamese.dyndns.org>

This implements the requesting side of the pair.

It probably should take an --exec=init-serve parameter, similar to the
way other programs like receive-pack and send-pack do, but I am too lazy
to add it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
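The reply handling in remote_init() boils down to a three-way
classification of the single line the serving side sends back.  A
standalone sketch of that check follows, assuming only the "ok " and
"ng " prefixes used by this series; classify_init_reply and the enum are
hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <string.h>

enum init_reply { INIT_REPLY_OK, INIT_REPLY_NG, INIT_REPLY_BAD };

/*
 * Classify one reply line from the serving side.  "ok " and "ng " are
 * the two prefixes the patch recognizes; anything else, including a
 * line shorter than three bytes, is treated as a protocol error.
 * Hypothetical helper for illustration only.
 */
static enum init_reply classify_init_reply(const char *line, size_t len)
{
	assert(line != NULL);
	if (len < 3)
		return INIT_REPLY_BAD;
	if (!memcmp(line, "ok ", 3))
		return INIT_REPLY_OK;
	if (!memcmp(line, "ng ", 3))
		return INIT_REPLY_NG;
	return INIT_REPLY_BAD;
}
```

Checking the length before the memcmp() calls matters because a short
read must not be compared past its end.
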
 builtin-init-db.c |   34 +++++++++++++++++++++++++++++++++-
 t/t0001-init.sh   |    7 +++++++
 2 files changed, 40 insertions(+), 1 deletions(-)

diff --git a/builtin-init-db.c b/builtin-init-db.c
index ee3911f..0b6da70 100644
--- a/builtin-init-db.c
+++ b/builtin-init-db.c
@@ -6,6 +6,7 @@
 #include "cache.h"
 #include "builtin.h"
 #include "exec_cmd.h"
+#include "pkt-line.h"
 
 #ifndef DEFAULT_GIT_TEMPLATE_DIR
 #define DEFAULT_GIT_TEMPLATE_DIR "/usr/share/git-core/templates"
@@ -363,6 +364,33 @@ static int guess_repository_type(const char *git_dir)
 	return 1;
 }
 
+static int remote_init(const char *remote, int argc, const char **argv)
+{
+	struct child_process *child;
+	int fd[2], i, len, status;
+	char line[1000];
+
+	child = git_connect(fd, remote, "git init-serve", 0);
+	for (i = 1; i < argc; i++) {
+		fprintf(stderr, "writing %d (%s)\n", i, argv[i]);
+		packet_write(fd[1], "%s", argv[i]);
+	}
+	packet_flush(fd[1]);
+
+	status = 0;
+	len = packet_read_line(fd[0], line, sizeof(line));
+	if (len < 3 ||
+	    (memcmp(line, "ok ", 3) && memcmp(line, "ng ", 3)))
+		die("protocol error: %s\n", line);
+
+	if (line[0] != 'o') {
+		error("%s", line);
+		status = -1;
+	}
+	status |= finish_connect(child);
+	return status;
+}
+
 static const char init_db_usage[] =
 "git init [-q | --quiet] [--bare] [--template=<template-directory>] [--shared[=<permissions>]]";
 
@@ -394,7 +422,11 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			init_shared_repository = git_config_perm("arg", arg+9);
 		else if (!strcmp(arg, "-q") || !strcmp(arg, "--quiet"))
 			flags |= INIT_DB_QUIET;
-		else
+		else if (!prefixcmp(arg, "--remote=")) {
+			if (i != 1)
+				die("--remote option must be given first");
+			return remote_init(arg+9, argc-1, argv+1);
+		} else
 			usage(init_db_usage);
 	}
 
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 5ac0a27..d1069ee 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -199,4 +199,11 @@ test_expect_success 'init honors global core.sharedRepository' '
 	x`git config -f shared-honor-global/.git/config core.sharedRepository`
 '
 
+test_expect_success 'init --remote' '
+	R="$(pwd)/test-of-remote" &&
+	git init --remote="$R" --bare &&
+	test -d "$R/objects/pack" &&
+	test_must_fail git init --remote="$R"
+'
+
 test_done
-- 
1.6.2.rc2.99.g9f3bb

^ permalink raw reply related

* Re: [PATCH] Ensure proper setup of git_dir for git-hash-object
From: Elijah Newren @ 2009-02-28 21:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3adyffax.fsf@gitster.siamese.dyndns.org>

On Sat, Feb 28, 2009 at 1:59 PM, Junio C Hamano <gitster@pobox.com> wrote:
> newren@gmail.com writes:
>
>> Without this patch:
>> $ mkdir tmp
>> $ cd tmp/
>> $ git init --bare
>> Initialized empty Git repository in /home/newren/floss-development/git/tmp/
>> $ echo hi | git hash-object -w --stdin
>> error: unable to create temporary sha1 filename .git/objects/45: No such file or directory
>>
>> fatal: Unable to add stdin to database
>> $ echo hi | git --git-dir=. hash-object -w --stdin
>> 45b983be36b73c0788dc9cbcb76cbb80fc7bb057
>
> Does the patched version work without -w option?  Should it?

Yes, the patched version works with or without the -w option (at least
in my testing -- maybe you know of a case I'm missing?)  I would
certainly expect it to work in both cases.

I basically arrived at the patch by realizing that git_config was
setting git_dir incorrectly as a side-effect, causing
setup_git_directory to notice it was already set and not try any of
its more detailed logic to figure out the correct value.  Then I did
some grepping and noticed that other source files (archive.c,
builtin-apply.c, builtin-diff.c, etc.) call
setup_git_directory[_gently] before git_config, and that hash-object.c
seemed to be the only one that didn't follow that trend.
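
To make the ordering problem concrete, here is a toy model of what I
described above.  It is not git's code: toy_git_dir, toy_read_config and
toy_setup_directory are made-up stand-ins, and the only behaviour they
model is "reading config falls back to .git if nothing is set yet" and
"setup skips discovery if a value is already set".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names exist in git. */
static const char *toy_git_dir; /* NULL until something sets it */

/* Stand-in for config reading: falls back to ".git" as a side effect. */
static void toy_read_config(void)
{
	if (!toy_git_dir)
		toy_git_dir = ".git";
}

/* Stand-in for repository discovery: skipped if a value is already set. */
static void toy_setup_directory(int cwd_is_bare)
{
	assert(cwd_is_bare == 0 || cwd_is_bare == 1);
	if (toy_git_dir)
		return;
	toy_git_dir = cwd_is_bare ? "." : ".git";
}
```

In this model, calling toy_read_config() first inside a bare repository
leaves toy_git_dir at ".git", which matches the ".git/objects/45" path in
the error message; calling toy_setup_directory(1) first yields ".".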

^ permalink raw reply

* Re: [PATCH] Ensure proper setup of git_dir for git-hash-object
From: Elijah Newren @ 2009-02-28 21:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v3adyffax.fsf@gitster.siamese.dyndns.org>

On Sat, Feb 28, 2009 at 1:59 PM, Junio C Hamano <gitster@pobox.com> wrote:
> newren@gmail.com writes:
>
>> Without this patch:
>> $ mkdir tmp
>> $ cd tmp/
>> $ git init --bare
>> Initialized empty Git repository in /home/newren/floss-development/git/tmp/
>> $ echo hi | git hash-object -w --stdin
>> error: unable to create temporary sha1 filename .git/objects/45: No such file or directory
>>
>> fatal: Unable to add stdin to database
>> $ echo hi | git --git-dir=. hash-object -w --stdin
>> 45b983be36b73c0788dc9cbcb76cbb80fc7bb057
>
> Does the patched version work without -w option?  Should it?

Sorry, I think I partially missed what you were asking earlier.  When
-w is not passed there is no dependence on git_dir, so it does not
matter if it is set up or not.  Some evidence that this is true (in
addition to my basic testing): The call to git_config was added to
hash-object.c in revision ff350ccf49a800c4c90f817d346fb1bcb96e02e7;
prior to that revision, when -w was not passed, there would be no
setup of git_dir by either git_config or setup_git_directory.

^ permalink raw reply

* [PATCH] Documentation: minor grammatical fixes.
From: David J. Mellor @ 2009-02-28 21:12 UTC (permalink / raw)
  To: gitster; +Cc: git


Signed-off-by: David J. Mellor <dmellor@whistlingcat.com>
---
 Documentation/git-add.txt |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index 7c129cb..e4c711b 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -136,7 +136,7 @@ $ git add Documentation/\\*.txt
 ------------
 +
 Note that the asterisk `\*` is quoted from the shell in this
-example; this lets the command to include the files from
+example; this lets the command include the files from
 subdirectories of `Documentation/` directory.
 
 * Considers adding content from all git-*.sh scripts:
@@ -145,7 +145,7 @@ subdirectories of `Documentation/` directory.
 $ git add git-*.sh
 ------------
 +
-Because this example lets shell expand the asterisk (i.e. you are
+Because this example lets the shell expand the asterisk (i.e. you are
 listing the files explicitly), it does not consider
 `subdir/git-foo.sh`.
 
@@ -198,8 +198,8 @@ one deletion).
 
 update::
 
-   This shows the status information and gives prompt
-   "Update>>".  When the prompt ends with double '>>', you can
+   This shows the status information and issues an "Update>>"
+   prompt.  When the prompt ends with double '>>', you can
    make more than one selection, concatenated with whitespace or
    comma.  Also you can say ranges.  E.g. "2-5 7,9" to choose
    2,3,4,5,7,9 from the list.  If the second number in a range is
@@ -238,8 +238,8 @@ add untracked::
 
 patch::
 
-  This lets you choose one path out of 'status' like selection.
-  After choosing the path, it presents diff between the index
+  This lets you choose one path out of a 'status' like selection.
+  After choosing the path, it presents the diff between the index
   and the working tree file and asks you if you want to stage
   the change of each hunk.  You can say:
 
-- 
1.6.1.3

^ permalink raw reply related

* Re: How can I force git to recognize a change change in file modes?
From: Brent Goodrick @ 2009-02-28 22:43 UTC (permalink / raw)
  To: Todd Zullinger; +Cc: Jay Soffian, Jan Krüger, git
In-Reply-To: <20090228183427.GN4505@inocybe.teonanacatl.org>

On Sat, Feb 28, 2009 at 10:34 AM, Todd Zullinger <tmz@pobox.com> wrote:
> You may want to check contrib/hooks/setgitperms.perl as well, if you
> haven't seen it already.

Yes I did see that. I'll be varying that approach a bit since I won't
need to preserve perm bits before and after the git operation, but
just force them to be a specific value all the time after checkout and
update.

bg

^ permalink raw reply

* Re: [PATCH] import memmem() with linear complexity from Gnulib
From: Mike Hommey @ 2009-02-28 22:44 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, git
In-Reply-To: <1235848615.7043.30.camel@ubuntu.ubuntu-domain>

On Sat, Feb 28, 2009 at 08:16:55PM +0100, René Scharfe wrote:
> Gnulib and glibc have gained a memmem() implementation using the Two-Way
> algorithm, which needs constant space and linear time.  Import it to
> compat/ in order to replace the simple quadratic implementation there.
> 
> memmem.c and str-two-way.h are copied verbatim from the repository at
> git://git.savannah.gnu.org/gnulib.git, with the following changes to
> memmem.c to make it fit into git's build environment:
> 
> 	21,23c21
> 	< #ifndef _LIBC
> 	< # include <config.h>
> 	< #endif
> 	---
> 	> #include "../git-compat-util.h"
> 	40c38
> 	< memmem (const void *haystack_start, size_t haystack_len,
> 	---
> 	> gitmemmem(const void *haystack_start, size_t haystack_len,
> 
> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
> ---
>  Makefile             |    1 +
>  compat/memmem.c      |  103 +++++++++----
>  compat/str-two-way.h |  429 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 504 insertions(+), 29 deletions(-)

Seeing how much memmem is being used in the codebase, is it really worth it?
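
For comparison, a "simple quadratic implementation" of the kind being
replaced looks roughly like the sketch below.  This is written from
scratch for illustration, not copied from compat/memmem.c, and the name
naive_memmem is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Naive substring search: try every start position and compare the
 * whole needle there.  Worst case is O(haystack_len * needle_len)
 * byte comparisons, which is the cost Two-Way avoids.
 */
static void *naive_memmem(const void *haystack, size_t haystack_len,
			  const void *needle, size_t needle_len)
{
	const char *begin = haystack;
	const char *last;

	assert(haystack != NULL && needle != NULL);
	if (needle_len == 0)
		return (void *)begin;
	if (haystack_len < needle_len)
		return NULL;
	last = begin + haystack_len - needle_len;
	for (; begin <= last; begin++) {
		if (!memcmp(begin, needle, needle_len))
			return (void *)begin;
	}
	return NULL;
}
```

The quadratic case only shows up with repetitive input (a long run of
"a" searched for "aa...ab", say), so whether the extra 400 lines pay off
depends on what the callers feed it.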

Mike

^ permalink raw reply

* [PATCH 1/4] Refactor list of environment variables to be sanitized
From: Junio C Hamano @ 2009-03-01  0:03 UTC (permalink / raw)
  To: git

When the local process-to-process pipe transport spawns a subprocess,
it clears various git-related environment variables to give the new
process a fresh environment.  The list of variables to cleanse is
useful in other places, so move it out of git_connect().
---
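As a rough illustration of what "sanitizing" with such a list means,
here is a standalone sketch that walks a NULL-terminated array of names
and removes each one from the current environment.  It is not how
git_connect() applies the list (there the array is simply handed to the
child via conn->env); clear_listed_env and example_env_to_clear are
hypothetical names, and apart from GIT_ALTERNATE_OBJECT_DIRECTORIES,
which is visible in the cache.h context below, the literal variable
names are my assumption of what the macros expand to.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>

/* Illustration only: same shape as sanitize_git_env[], literal names assumed. */
static const char *example_env_to_clear[] = {
	"GIT_ALTERNATE_OBJECT_DIRECTORIES",
	"GIT_OBJECT_DIRECTORY",
	"GIT_DIR",
	"GIT_WORK_TREE",
	"GIT_GRAFT_FILE",
	"GIT_INDEX_FILE",
	NULL
};

/*
 * Remove every listed variable from the current process environment.
 * Returns how many variables were actually set and got removed.
 */
static int clear_listed_env(const char **names)
{
	int cleared = 0;

	assert(names != NULL);
	for (; *names; names++) {
		if (getenv(*names) && !unsetenv(*names))
			cleared++;
	}
	return cleared;
}
```

A caller would typically run this in the child after fork() and before
exec, so the parent's own environment stays untouched.
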
 cache.h   |    2 ++
 connect.c |   26 +++++++++++++++-----------
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 189151d..b72434f 100644
--- a/cache.h
+++ b/cache.h
@@ -389,6 +389,8 @@ extern void set_git_work_tree(const char *tree);
 
 #define ALTERNATE_DB_ENVIRONMENT "GIT_ALTERNATE_OBJECT_DIRECTORIES"
 
+extern const char *sanitize_git_env[];
+
 extern const char **get_pathspec(const char *prefix, const char **pathspec);
 extern void setup_work_tree(void);
 extern const char *setup_git_directory_gently(int *);
diff --git a/connect.c b/connect.c
index 2f23ab3..b4705ff 100644
--- a/connect.c
+++ b/connect.c
@@ -6,6 +6,20 @@
 #include "run-command.h"
 #include "remote.h"
 
+/*
+ * When spawning a subprocess in a fresh environment,
+ * these variables may need to be cleared
+ */
+const char *sanitize_git_env[] = {
+	ALTERNATE_DB_ENVIRONMENT,
+	DB_ENVIRONMENT,
+	GIT_DIR_ENVIRONMENT,
+	GIT_WORK_TREE_ENVIRONMENT,
+	GRAFT_ENVIRONMENT,
+	INDEX_ENVIRONMENT,
+	NULL
+};
+
 static char *server_capabilities;
 
 static int check_ref(const char *name, int len, unsigned int flags)
@@ -625,17 +639,7 @@ struct child_process *git_connect(int fd[2], const char *url_orig,
 		*arg++ = host;
 	}
 	else {
-		/* remove these from the environment */
-		const char *env[] = {
-			ALTERNATE_DB_ENVIRONMENT,
-			DB_ENVIRONMENT,
-			GIT_DIR_ENVIRONMENT,
-			GIT_WORK_TREE_ENVIRONMENT,
-			GRAFT_ENVIRONMENT,
-			INDEX_ENVIRONMENT,
-			NULL
-		};
-		conn->env = env;
+		conn->env = sanitize_git_env;
 		*arg++ = "sh";
 		*arg++ = "-c";
 	}
-- 
1.6.2.rc2.99.g9f3bb

^ permalink raw reply related

