Git development

* Re: [PATCH] Simplified GIT usage guide
From: Miklos Vajna @ 2008-12-12 18:53 UTC (permalink / raw)
  To: David Howells; +Cc: torvalds, git, linux-kernel
In-Reply-To: <20081212182827.28408.40963.stgit@warthog.procyon.org.uk>

On Fri, Dec 12, 2008 at 06:28:27PM +0000, David Howells <dhowells@redhat.com> wrote:
> + (1) File objects.
> +
> +     A file object contains the contents of a source file and the attributes of
> +     that file (such as file mode).

This is incorrect: a 'blob' contains only the contents of the file; the
file mode is stored in the 'tree' object.
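
A minimal sketch of how one might check this, assuming a throwaway
repository (the temporary directory and the file names plain.txt and
script.sh are hypothetical). Two files with identical contents but
different modes should end up with the same blob ID, and the modes
should only be visible in the tree listing:

```shell
#!/bin/sh
# Sketch only: repository location and file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Two files with identical contents.
printf 'hello\n' > plain.txt
printf 'hello\n' > script.sh
git add plain.txt script.sh

# Mark one of them executable in the index (this works even on
# filesystems that do not track the executable bit).
git update-index --chmod=+x script.sh

# The blob IDs are the same, because a blob holds only the contents.
blob_plain=$(git rev-parse :plain.txt)
blob_script=$(git rev-parse :script.sh)
echo "plain.txt: $blob_plain"
echo "script.sh: $blob_script"

# The modes (100644 and 100755) show up only in the tree entries.
tree=$(git write-tree)
git ls-tree "$tree"
```

The two echo lines print the same ID, while git ls-tree lists the two
entries with different modes.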

> + (2) Directory objects.
> +
> +     A directory object contains the attributes of that directory plus a list
> +     of file and directory objects that are members of this directory.  The
> +     list includes the names of the entries within that directory and the
> +     object ID of each object.
> +
> + (3) Commit objects.
> +
> +     A commit object contains the attribute of that commit (the author and the
> +     date for instance), a textual description of the change imposed by that
> +     commit as provided by the committer, a list of object IDs for the commits
> +     on which this commit is based, and the object ID of the root directory
> +     object representing the result of this commit.
> +
> +     Note that a commit does not literally describe the changes that have been
> +     made in the way that, say, a diff file does; it merely carries the current
> +     state of the sources after that change, and points to the commits that
> +     describe the state of the sources before that change.  GIT's tools then
> +     infer the changes when asked.
> +
> +     A commit object will typically refer to one base commit when someone has
> +     merely committed some changes on top of the current state, and two base
> +     commits when a couple of trees have been merged.

Is there any reason you hide the tag object?
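
For illustration, here is a rough sketch showing all four object types
side by side. It assumes a throwaway repository; the identity, the file
name and the tag name v0.1 are placeholders, not anything from the
guide:

```shell
#!/bin/sh
# Sketch only: identity, file name and tag name are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

printf 'hello\n' > file.txt
git add file.txt
git commit -q -m 'initial commit'

# An annotated tag is stored as an object of its own.
git tag -a -m 'example annotated tag' v0.1

type_tag=$(git cat-file -t v0.1)                 # the tag object itself
type_commit=$(git cat-file -t 'v0.1^{commit}')   # the commit it points to
type_tree=$(git cat-file -t 'HEAD^{tree}')       # that commit's root tree
type_blob=$(git cat-file -t HEAD:file.txt)       # a file in that tree
echo "$type_tag $type_commit $type_tree $type_blob"
```

This prints "tag commit tree blob". A lightweight tag (git tag without
-a) would just be a name for the commit and create no tag object.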

> +where %HOUR is the hour you want it to go off every day.  For my local mirror
> +of Linus's upstream kernel, I use:
> +
> +	#!/bin/sh
> +	cd /warthog/git/linux-2.6 || exit $?
> +	exec git pull >/tmp/git-pull.log
> +
> +and:
> +
> +	0 6 * * *       /home/dhowells/bin/do-git-pull.sh
> +
> +which will do the update every day at 6am.

Using git clone --mirror would be much more efficient, I think.
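
Something along these lines, as a sketch. The "upstream" repository
here is a local stand-in created just for the demonstration, and the
directory names and identity are hypothetical; the cron script would
then presumably run git fetch in the mirror instead of git pull:

```shell
#!/bin/sh
# Sketch only: 'upstream' is a local stand-in for the real upstream
# repository; directory names and identity are placeholders.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

git init -q upstream
(cd upstream && printf 'v1\n' > f && git add f && git commit -q -m 'first')

# Bare mirror: no checkout, and every upstream ref is mapped 1:1.
git clone -q --mirror upstream mirror.git

# Upstream moves on...
(cd upstream && printf 'v2\n' > f && git commit -q -a -m 'second')

# ...and this is what the cron job would run to catch up.
(cd mirror.git && git fetch -q origin)

upstream_head=$(git -C upstream rev-parse HEAD)
mirror_head=$(git -C mirror.git rev-parse HEAD)
echo "upstream: $upstream_head"
echo "mirror:   $mirror_head"
```

Since the mirror has no working tree, the fetch only has to update
objects and refs; there is nothing to merge or check out.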

> +The "-l" tells git clone that the source (mirror) repository is on the local
> +machine, that it shouldn't go over the internet for it, and that it should
> +hardlink GIT objects from the source repository rather than copying them where
> +possible.

Here and later below, IIRC -l is the default for local clones.

> +	cd /my/git/trees
> +	git clone -n --bare %UPSTREAM_REPO %MY_DIR

--bare implies -n.
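
A quick way to convince yourself, sketched with a local stand-in for
the upstream repository (directory names and identity are
placeholders):

```shell
#!/bin/sh
# Sketch only: 'upstream' is a local stand-in; names are placeholders.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

git init -q upstream
(cd upstream && printf 'x\n' > f && git add f && git commit -q -m 'first')

# No -n given: a bare clone has no working tree to check out into.
git clone -q --bare upstream bare.git

is_bare=$(git --git-dir=bare.git rev-parse --is-bare-repository)
echo "bare: $is_bare"
ls bare.git
```

The listing shows only repository internals (HEAD, objects, refs and
so on); the file f is not checked out anywhere.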

> +If you haven't yet committed your changes, you'll have to siphon them off into
> +a file:
> +
> +	git diff >a.diff
> +
> +and deapply them:
> +
> +	patch -p1 -R <a.diff
> +
> +You can then update your tree from the upstream tree with no fear of a conflict
> +(assuming you don't also have changes that you have committed).  Once you've
> +updated your tree, you can reapply your changes:
> +
> +	patch -p1 <a.diff

Why not use git stash and git stash pop?

Or at least git apply and git checkout - leaving out patch(1) from the
game.
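
Roughly like this, as a sketch in a throwaway repository (file name
and identity are placeholders; the update from upstream is left as a
comment):

```shell
#!/bin/sh
# Sketch only: file name and identity are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com'
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

printf 'base\n' > file.txt
git add file.txt
git commit -q -m 'base'

# An uncommitted local change.
printf 'base\nlocal edit\n' > file.txt

# Put the change aside; the working tree goes back to HEAD.
git stash >/dev/null
clean=$(cat file.txt)

# ... update from upstream here, for example with git pull ...

# Bring the change back on top of the updated tree.
git stash pop >/dev/null
restored=$(cat file.txt)
printf '%s\n' "$restored"
```

This replaces the git diff >a.diff, patch -R, patch sequence and
leaves no stray a.diff in the tree.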

* gitweb and unicode special characters
From: Praveen A @ 2008-12-12 18:33 UTC (permalink / raw)
  To: git; +Cc: Santhosh Thottingal

Hi,

Git currently does not handle the Unicode special characters ZWJ and
ZWNJ. Both are heavily used in Malayalam and are common in other
languages that need complex text layout, such as Sinhala and Arabic.

An example of this is shown in the commit message here
http://git.savannah.gnu.org/gitweb/?p=smc.git;a=commit;h=c3f368c60aabdc380c77608c614d91b0a628590a

\20014 and \20015 should have been ZWNJ and ZWJ respectively. You just
need to handle them like any other Unicode character, especially since
this is a commit message and the expectation is normal plain text display.
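
As a small check of the numbers, assuming the escapes shown by gitweb
are octal code points: converting them to hexadecimal gives the code
points of the two characters.

```shell
#!/bin/sh
# printf treats a numeric argument with a leading 0 as octal.
zwnj=$(printf '%X' 020014)
zwj=$(printf '%X' 020015)
echo "U+$zwnj U+$zwj"
```

This prints "U+200C U+200D", which are ZERO WIDTH NON-JOINER and ZERO
WIDTH JOINER.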

I hope someone will fix this.

- Praveen
-- 
പ്രവീണ്‍ അരിമ്പ്രത്തൊടിയില്‍
<GPLv2> I know my rights; I want my phone call!
<DRM> What use is a phone call, if you are unable to speak?
(as seen on /.)
Join The DRM Elimination Crew Now!
http://fci.wikia.com/wiki/Anti-DRM-Campaign

* [PATCH] Simplified GIT usage guide
From: David Howells @ 2008-12-12 18:28 UTC (permalink / raw)
  To: torvalds; +Cc: dhowells, git, linux-kernel

Add a guide to using GIT's simpler features.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/git-haters-guide.txt | 1283 ++++++++++++++++++++++++++++++++++++
 1 files changed, 1283 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/git-haters-guide.txt


diff --git a/Documentation/git-haters-guide.txt b/Documentation/git-haters-guide.txt
new file mode 100644
index 0000000..51e4dac
--- /dev/null
+++ b/Documentation/git-haters-guide.txt
@@ -0,0 +1,1283 @@
+		      ===================================
+		      THE GIT HATER'S GUIDE TO THE GALAXY
+		      ===================================
+
+By David Howells <dhowells@redhat.com>
+
+Contents:
+
+ (*) Introduction.
+
+     - Disclaimer.
+
+ (*) Overview of GIT.
+
+     - Git objects.
+     - Symbolic pointers.
+     - The GIT tree.
+     - GIT trees after merging.
+
+ (*) Downloading upstream trees.
+
+     - Local mirroring.
+     - Automatic updates.
+     - Using your local mirror.
+
+ (*) Accessing the repository.
+
+     - Viewing the history.
+     - Viewing a commit.
+     - Viewing source differences.
+
+ (*) Making changes.
+
+     - Applying patches.
+     - Applying formatted patches.
+     - Incorporating GIT trees.
+
+ (*) Amending and reverting changes.
+
+     - Amending committed changes.
+     - Discarding committed changes.
+     - Reverting committed changes.
+
+ (*) Publishing changes by GIT tree.
+
+     - Setting up.
+     - Updating your development tree.
+     - Publishing your changes.
+
+ (*) Manually merging failed fetches.
+
+ (*) Locating bugs.
+
+     - Bisection.
+     - Blame.
+
+
+============
+INTRODUCTION
+============
+
+So, you want to do some Linux kernel development?  And you hear there's this
+piece of software called 'GIT' that you probably ought to be using when dealing
+with the kernel community?  Then you find out that not only was Linux started
+by this Linus Torvalds person, but GIT was too!  Perhaps it doesn't seem fair:
+Linus has not just _one_ huge piece of software named after himself, but _two_!
+And on top of that, globe spanning hardware vendors just queue up to give him
+all the herring he can eat!!
+
+Then you look at webpages about GIT.  You look at the manpages!  You run the
+commands with --help!  And you *still* don't know how to do anything complex
+with it!!  You feel certain that there's some secret rite you have to perform
+to become a GIT initiate - probably something involving two goats, an altar and
+a full moon - oh, and lots of beer (we *are* talking about kernel developers
+after all).
+
+Then you ask around, and people look at you blankly, hedge or say that it's
+easy and obvious (they should know - they wrote the damned thing).  You realise
+that the manpages are more an aide-memoire and that what you really want is
+some sort of crib sheet; something that can hold your hand whilst you cut and
+paste things out of it until you can see the point.
+
+Well, let's see if I can help...
+
+
+DISCLAIMER
+----------
+
+I don't really know what I'm doing with GIT either.  I'm not sure anyone really
+does, apart from Linus (and then only after some strange Finnish snack
+involving red and white mushrooms).  If you'd pause to wonder why things are
+like they are, you'd realise that only someone totally barking would try to
+write a kernel in the first place... and then it'd dawn on you what the mental
+state must be like of someone who'd try writing something like a source code
+management system from scratch...  and then you'd consider what it must take to
+be someone who'd do *both*.
+
+
+===============
+OVERVIEW OF GIT
+===============
+
+GIT is a source code management system.  You give it your sources to retain,
+and it manages the history of all the changes and provides you with a set of
+tools by which that history can be viewed, extracted and extended.
+
+GIT is unusual in its design in that the objects it retains are referred to by
+hashes of their content.  Because it is mathematically possible for object IDs
+to collide, large hash IDs are used to reduce the probability of a collision.
+If the content of an object changes, rather than updating the existing object,
+GIT will create a new object with a new hash ID.  Objects are _invariant_.
+
+The GIT database in a GIT tree has two sets of data:
+
+ (1) A set of objects, indexed by the object hash ID.
+
+ (2) A set of symbolic object tree heads, as object hash IDs.
+
+
+GIT OBJECTS
+-----------
+
+There are three basic types of object:
+
+ (1) File objects.
+
+     A file object contains the contents of a source file and the attributes of
+     that file (such as file mode).
+
+ (2) Directory objects.
+
+     A directory object contains the attributes of that directory plus a list
+     of file and directory objects that are members of this directory.  The
+     list includes the names of the entries within that directory and the
+     object ID of each object.
+
+ (3) Commit objects.
+
+     A commit object contains the attribute of that commit (the author and the
+     date for instance), a textual description of the change imposed by that
+     commit as provided by the committer, a list of object IDs for the commits
+     on which this commit is based, and the object ID of the root directory
+     object representing the result of this commit.
+
+     Note that a commit does not literally describe the changes that have been
+     made in the way that, say, a diff file does; it merely carries the current
+     state of the sources after that change, and points to the commits that
+     describe the state of the sources before that change.  GIT's tools then
+     infer the changes when asked.
+
+     A commit object will typically refer to one base commit when someone has
+     merely committed some changes on top of the current state, and two base
+     commits when a couple of trees have been merged.
+
+Because objects are invariant, and because they can thus be referred to by a
+hash of their contents, objects can be shared between trees simply by using the
+same object ID in two different places.  This allows objects to be compared to
+see whether they are the same thing or not simply by comparing the object ID.
+
+
+SYMBOLIC POINTERS
+-----------------
+
+GIT retains its historical information in a set of overlapping, shared trees,
+but the notion of where a tree starts isn't really a primary concept with GIT.
+What it has instead is a number of symbolic pointers to commits within the tree
+that are considered to be of some sort of significance.  These are called
+'heads' and include:
+
+ (1) The base for the current working state of the checked out sources (HEAD).
+
+ (2) Branches (by branch name).
+
+ (3) Tags (by tag name).
+
+ (4) Merge base (for incomplete merges).
+
+ (5) Points of interest, such as those that pertain to a git fetch (FETCH_HEAD
+     and ORIG_HEAD).
+
+ (6) Bisection points (when bisection is being used to find a bug).
+
+In essence, these symbolic pointers are just names or conventions for
+particular roots in the tree.  They are a name that maps to the object ID of a
+commit object.
+
+Some of them have special meanings, such as branches, that can be configured to
+behave in various ways under certain conditions (such as when a git fetch is
+performed).
+
+
+THE GIT TREE
+------------
+
+The GIT tree in its simplest terms is a backbone of commits that point to
+directories that point to files.  To give a simple example of the commit
+process, consider the sources for a project that contains one directory, D,
+which contains three files, F1, F2 and F3.
+
+This could then be committed into GIT to begin a project, in this case as
+commit C0.  This would hold version D0 of the directory, and versions F1A, F2A
+and F3A of the three files, and the GIT repository HEAD pointer would point to
+C0:
+
+	                                        +-----+
+	                                    +-->| F3A |
+	                                    |   +-----+
+	                                    |
+	        +-----+        +-----+      |   +-----+
+	HEAD--->| C0  |------->| D0  |------+-->| F2A |
+	        +-----+        +-----+      |   +-----+
+	                                    |
+	                                    |   +-----+
+	                                    +-->| F1A |
+	                                        +-----+
+
+Now imagine that someone changes file F2 and commits the change.  F1A and F3A
+are still useful, and can be shared by the new view of the world, but F2 is now
+on a new version, F2B.  The old directory object, D0, pointed to F2A, so that
+cannot be reused, and so D1 is generated.  The commit process then writes a new
+commit object, C1, that points to D1 as the state of the tree after this
+commit, and points to C0 as the commit on which C1 was based.  Finally, HEAD is
+changed to point to C1.
+
+	                                        +-----+
+	                                  +---->| F2B |
+	        +-----+        +-----+    |     +-----+
+	HEAD--->| C1  |------->| D1  |----+
+	        +-----+        +-----+    |
+	           |                      |
+	           |                      |     +-----+
+	           |                      +---->| F3A |
+	           |                      | +-->+-----+
+	           V                      | |
+	        +-----+        +-----+    | |   +-----+
+	        | C0  |------->| D0  |------+-->| F2A |
+	        +-----+        +-----+    | |   +-----+
+	                                  | |
+	                                  +-|-->+-----+
+	                                    +-->| F1A |
+	                                        +-----+
+
+Then imagine that someone changes file F1 and commits the change.  F3A is still
+viable in its original state, and F2B is usable from commit C1, but F1A is now
+obsolete and gets replaced by version F1B.  This means that neither D0 nor D1
+are usable, so directory object D2 has to be created, and new commit C2 is
+created to point to that and base commit C1.  Then HEAD is set to point to C2:
+
+	                                        +-----+
+	                                +------>| F1B |
+	        +-----+        +-----+  |       +-----+
+	HEAD--->| C2  |------->| D2  |--+
+	        +-----+        +-----+  |
+	           |                    |
+	           |                    +------>+-----+
+	           V                    | +---->| F2B |
+	        +-----+        +-----+  | |     +-----+
+	        | C1  |------->| D1  |----+
+	        +-----+        +-----+  | |
+	           |                    | |
+	           |                    +-|---->+-----+
+	           |                      +---->| F3A |
+	           |                      | +-->+-----+
+	           V                      | |
+	        +-----+        +-----+    | |   +-----+
+	        | C0  |------->| D0  |------+-->| F2A |
+	        +-----+        +-----+    | |   +-----+
+	                                  | |
+	                                  +-|-->+-----+
+	                                    +-->| F1A |
+	                                        +-----+
+
+Now, consider what would have happened if, instead of changing F1A to be F1B to
+produce C2, F2B had been reverted to the same state as F2A.  GIT would realise
+that it already has a file object to represent F2A (by comparing object IDs)
+and would use that rather than creating a new one.  The new set of files in the
+directory would then be F1A, F2A and F3A - but there's already a directory
+object for that: D0.  This would also be discovered by object ID matching, and
+would be used instead.  Commit C3 would then point to base commit C1 and
+directory D0, and HEAD would be moved to point to C3:
+
+	        +-----+
+	HEAD--->| C3  |---+
+	        +-----+   |
+	           |      |
+	           |      |                     +-----+
+	           V      |               +---->| F2B |
+	        +-----+   |    +-----+    |     +-----+
+	        | C1  |------->| D1  |----+
+	        +-----+   |    +-----+    |
+	           |      |               |
+	           |      |               |     +-----+
+	           |      |               +---->| F3A |
+	           |      |               | +-->+-----+
+	           V      |               | |
+	        +-----+   +--->+-----+    | |   +-----+
+	        | C0  |------->| D0  |------+-->| F2A |
+	        +-----+        +-----+    | |   +-----+
+	                                  | |
+	                                  +-|-->+-----+
+	                                    +-->| F1A |
+	                                        +-----+
+
+
+GIT TREES AFTER MERGING
+-----------------------
+
+Now, imagine that two GIT trees are merged.  You start off with two sets of
+commits (for convenience, I'm going to leave out the directories and files, but
+you can just assume they're there):
+
+	        +-----+                         +-----+
+	HEAD--->| C3  |                 Branch->| B3  |
+	        +-----+                         +-----+
+	           |                               |
+	           V                               V
+	        +-----+                         +-----+
+	        | C2  |                         | B2  |
+	        +-----+                         +-----+
+	           |                               |
+	           V                               V
+	        +-----+                         +-----+
+	        | C1  |<------------------------| B1  |
+	        +-----+                         +-----+
+	           |
+	           V
+	        +-----+
+	        | C0  |
+	        +-----+
+
+In the above example, I've assumed that you've got your own tree with the head
+at commit C3, and that you've got a branch that you want to merge, which has
+its head at commit B3.  After merging them, you'd end up with a directed
+acyclic graph:
+
+	        +-----+
+	HEAD--->| C4  |----------------------------+
+	        +-----+                            |
+	           |                               |
+	           V                               V
+	        +-----+                         +-----+
+	        | C3  |                 Branch->| B3  |
+	        +-----+                         +-----+
+	           |                               |
+	           V                               V
+	        +-----+                         +-----+
+	        | C2  |                         | B2  |
+	        +-----+                         +-----+
+	           |                               |
+	           V                               V
+	        +-----+                         +-----+
+	        | C1  |<------------------------| B1  |
+	        +-----+                         +-----+
+	           |
+	           V
+	        +-----+
+	        | C0  |
+	        +-----+
+
+and the C4 commit will have pointers to *both* contributing commits, C3 and B3.
+If GIT stored the differences at each commit rather than the terminal state, it
+would have to store a delta for each contributing commit.
+
+
+==========================
+DOWNLOADING UPSTREAM TREES
+==========================
+
+The first thing you'll usually want to do with GIT is to grab a copy of the
+cutting edge version of an upstream project and build it; perhaps you want to
+work on it, perhaps because it has a fix in it that you need or perhaps because
+you like living on the cutting edge and enjoy grepping your disks to recover
+your data when things go wrong.  Whatever your reasons, you need to be able
+to make a local copy of an upstream GIT tree.
+
+With GIT-based projects, grabbing a local copy of an upstream repository is
+very easy:
+
+	git clone %UPSTREAM_REPO %MY_DIR
+
+This will create a checked-out copy of the upstream repository
+(%UPSTREAM_REPO) by pulling over the internet and sticking it in a directory on
+the local machine.
+
+For example, to fetch Linus's cutting edge kernel tree, you'd do:
+
+	git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git \
+		linux-2.6-local
+
+Then you look in linux-2.6-local and there is what you're looking for.
+
+
+LOCAL MIRRORING
+---------------
+
+You might find that you wish to run several concurrent, separate developments
+all based upon a single upstream repository.  You could simply clone each one
+as mentioned above, but that has the potential to use excessive amounts of disk
+space as each clone would include an independent copy of the entire source
+repository.
+
+What you might want to do is to set up a mirror of the upstream repository, and
+then share that mirror with each of the clones.  Even better, you can share it
+with other people who can also access the filesystem it is stored upon.
+
+So what you can do is create a local mirror:
+
+	git clone -n %UPSTREAM_REPO %MIRROR_DIR
+
+For example:
+
+	git clone -n git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git \
+		/warthog/git/linux-2.6.git
+
+The -n flag tells git to save space by not bothering to check the files out of
+the repository.  You don't really need the checkout if all you're going to do is
+use this as a reference, but you can still check out if you like by omitting
+the -n.
+
+
+AUTOMATIC UPDATES
+-----------------
+
+Furthermore, you might want to automatically update your sources at some
+unfeasible hour of the morning when only Australians are awake because, say,
+your internet supply is rated more cheaply then - but you don't necessarily
+want the automatic update to dump into the sources you're actively meddling
+with.  A local mirror can help with this too.
+
+One way of automatically updating your mirror is to use cron.  To do this
+create a script that looks something like:
+
+	#!/bin/sh
+	cd %MIRROR_DIR || exit $?
+	exec git pull >/tmp/git-pull.log
+
+and chmod u+x it.  Then run the crontab program to modify your personal cron
+schedule and add something like the following line to it (not forgetting to
+remove the leading tab!):
+
+	0 %HOUR * * *       %MIRROR_SCRIPT
+
+where %HOUR is the hour you want it to go off every day.  For my local mirror
+of Linus's upstream kernel, I use:
+
+	#!/bin/sh
+	cd /warthog/git/linux-2.6 || exit $?
+	exec git pull >/tmp/git-pull.log
+
+and:
+
+	0 6 * * *       /home/dhowells/bin/do-git-pull.sh
+
+which will do the update every day at 6am.
+
+
+USING YOUR LOCAL MIRROR
+-----------------------
+
+You can then create a directory to actually do your development in by:
+
+	git clone -l -s %MIRROR_DIR %MY_DIR
+
+The "-l" tells git clone that the source (mirror) repository is on the local
+machine, that it shouldn't go over the internet for it, and that it should
+hardlink GIT objects from the source repository rather than copying them where
+possible.
+
+The "-s" says that git clone should insert a reference under %MY_DIR that
+points to the %MIRROR_DIR's collection of objects.  This means that GIT won't
+bother to copy the objects that it can get from %MIRROR_DIR at all, it'll just
+use them out of %MIRROR_DIR.
+
+    [!] NOTE: This makes %MY_DIR dependent on %MIRROR_DIR: if you delete
+    	%MIRROR_DIR or prune it you may make %MY_DIR unusable!
+
+You can repeat this again and again from the same mirror.  You can even share a
+mirror with other people that can access the filesystem holding the mirror.
+You don't need write access to it, only read.
+
+
+========================
+ACCESSING THE REPOSITORY
+========================
+
+One of the things you'll want to be able to do with what you've downloaded is
+look at changes other people have made.  GIT has some powerful tools to allow
+you to do this.
+
+
+VIEWING THE HISTORY
+-------------------
+
+You might wish, for example, to look back through the commit tree and see what
+changes have been made.  The command to do this is:
+
+	git log
+
+This will take you back through the commit information, starting at the current
+HEAD and going all the way back to the beginning if you let it:
+
+	warthog>git log
+	commit 8b1fae4e4200388b64dd88065639413cb3f1051c
+	Author: Linus Torvalds <torvalds@linux-foundation.org>
+	Date:   Wed Dec 10 15:11:51 2008 -0800
+
+	    Linux 2.6.28-rc8
+
+	commit f9fc05e7620b3ffc93eeeda6d02fc70436676152
+	Merge: b88ed20... 9a2bd24...
+	Author: Linus Torvalds <torvalds@linux-foundation.org>
+	Date:   Wed Dec 10 14:41:06 2008 -0800
+
+	    Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
+
+	    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
+	      sched: CPU remove deadlock fix
+
+	commit b88ed20594db2c685555b68c52b693b75738b2f5
+	Author: Hugh Dickins <hugh@veritas.com>
+	Date:   Wed Dec 10 20:48:52 2008 +0000
+	...
+
+
+VIEWING A COMMIT
+----------------
+
+Now that you can see the commit IDs in the history, you can examine one more
+closely:
+
+	git show
+
+to see the current HEAD commit, or:
+
+	git show %COMMIT_ID
+
+to see a particular commit:
+
+	warthog>git show 1da177e
+	commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
+	Author: Linus Torvalds <torvalds@ppc970.osdl.org>
+	Date:   Sat Apr 16 15:20:36 2005 -0700
+
+	    Linux-2.6.12-rc2
+
+	    Initial git repository build. I'm not bothering with the full history,
+	...
+	diff --git a/COPYING b/COPYING
+	new file mode 100644
+	index 0000000..2a7e338
+	--- /dev/null
+	+++ b/COPYING
+	@@ -0,0 +1,356 @@
+	+
+	+   NOTE! This copyright does *not* cover user programs that use kernel
+	...
+
+
+VIEWING SOURCE DIFFERENCES
+--------------------------
+
+The 'git-show' command shows you what it thinks the differences are that you
+want to see, between a commit and its first listed base commit.  However, there
+are other differences you might wish to see.
+
+Firstly, you might like to see the differences between what's in the current
+HEAD commit, and what you've got checked out:
+
+	git diff
+
+or you might wish to see the differences between two particular commits, for
+example:
+
+	git diff v2.6.24 v2.6.25
+
+
+==============
+MAKING CHANGES
+==============
+
+So you've got a fresh development GIT tree and you want to make changes in it
+and commit them to it.  The first is easy enough: just use your preferred text
+editor to edit the files directly, or you could use sed or perl to apply some
+textual transformations - that's entirely up to you.
+
+However, once you've made those changes and you've compiled and tested them,
+you'll probably want to consign them to GIT.
+
+Files you've added must be marked by:
+
+	git add <filename>
+
+and files you've deleted must be noted by:
+
+	git rm <filename>
+
+so that GIT knows to include or exclude these files from its tree.
+Furthermore, you must tell GIT about any files that have changed that you want
+to be updated also:
+
+	git add <filename>
+
+You can then commit your changes.  This is done by running:
+
+	git commit
+
+Rather than doing lots of git add and git rm commands to register updated and
+removed files, you can give git commit a '-a' flag.  Note, though, that this
+takes no account of new files that git doesn't already know about.  Those must
+be added manually.
+
+git commit will pop up your favourite editor, asking you to enter a commit
+message describing your changes (don't forget to add your sign-off).  It will
+list the files it sees that have been added, altered and removed, and will
+differentiate between those that it has been told about (and thus will include)
+and those it hasn't (which will be ignored).
+
+After git commit completes successfully, 'git show' should show the new commit
+you've just made, and gitk should show the new tree structure with your new
+commit at the top.
+
+
+APPLYING PATCHES
+----------------
+
+If you have a patch file you wish to apply, you can do that with:
+
+	git apply <patch-file>
+
+This will make the changes specified by the patch, but it won't register any of
+the changes and won't record any of the metadata that might be in the patch
+file, such as authorship, description or attribution.  That has to be done
+manually as if you'd made the changes yourself.
+
+
+APPLYING FORMATTED PATCHES
+--------------------------
+
+Sometimes you may wish to incorporate a patch that someone has emailed you.
+You could use the 'patch' or 'git apply' programs and then set up the commit
+information manually, but if someone has sent you an appropriately formatted
+message - perhaps in an email - you can have GIT import the metadata from the
+message rather than you having to type it manually.
+
+If someone has given you an email or appropriately formatted patch file, the
+following command can import it:
+
+	git am <patch-file>
+
+If successful, this will automatically register all added, altered and removed
+files and commit the changes for you.  The commit message will be concocted
+from the description and email headers (From: and Subject: for instance).  If
+you want to add your own sign-off to the bottom of the commit message whilst
+you're at it, you can add a '-s' flag:
+
+	git am -s <patch-file>
+
+You may find it convenient to edit unformatted patches to make it possible to
+use 'git am' rather than 'git apply'.
+
+
+INCORPORATING GIT TREES
+-----------------------
+
+And sometimes, rather than sending you patches, people may attempt to
+contribute changes to you that are contained within GIT trees and you may wish
+to incorporate these into your development tree.
+
+To do this, the following command will work:
+
+	git pull %CONTRIB_REPO %CONTRIB_BRANCH
+
+where %CONTRIB_REPO is the URL of a repository and %CONTRIB_BRANCH is the name
+of the branch within that repository (usually this will be 'master').
+
+If successful, this will either just stack the pulled changes directly on top
+of your tree (assuming the contributed tree is based on the head of your tree)
+or it will automatically produce a merge commit indicating that the resulting
+tree is a union of the changes in your tree and the contributed tree.
+
+If unsuccessful due to conflicting changes, you'll need to perform the merge
+manually and perform the commit yourself.  See the "Manually merging failed
+fetches" section.
+
+An example of the command line you might use is:
+
+	git pull git://git.infradead.org/mtd-2.6.git master
+
+which will pull the master branch of the upstream MTD tree into the GIT tree you're
+currently in.
+
+
+==============================
+AMENDING AND REVERTING CHANGES
+==============================
+
+There will be times when you make a mistake in your changes, and you find that
+you either want to amend them, or you want to discard them entirely.  GIT
+provides a number of tools to do this.
+
+If you make a mistake in changes you haven't yet committed, you can just edit
+them again with your text editor, or if you'd prefer to discard all the changes
+you made to a particular file, you can do:
+
+	git checkout <filename>
+
+This will just wipe away the changes that you've made and restore the file to
+the state it has recorded for it as part of the topmost commit.
+
+
+AMENDING COMMITTED CHANGES
+--------------------------
+
+If you've committed some changes and you realise that those changes are
+incorrect, you can amend them without precisely making a whole new commit -
+provided you haven't committed anything else on top of them.
+
+To do this, you make your changes, run git add and git rm as normal, and then
+do:
+
+	git commit --amend
+
+This will replace the topmost commit with a similar commit that includes the
+amendments.  The old commit will be displaced from the tree and will not appear
+again.
+
+Changes that are buried beneath further commits unfortunately have to be
+altered by making a new commit with the amendments, unless you wish to discard
+all the commits down to the one that needs amending, and then apply them all
+again.
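
As a rough illustration of the amend step, the following sandbox script commits a file with a typo and then amends it. The repository name, file name and identity are hypothetical; the point is only that the topmost commit is replaced rather than a second one being added.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

# First attempt, containing a mistake.
echo "helo world" >greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
old_id=$(git rev-parse HEAD)

# Fix the file, stage it, and fold the fix into the topmost commit.
echo "hello world" >greeting.txt
git add greeting.txt
git commit -q --amend -m "Add greeting"
new_id=$(git rev-parse HEAD)

commit_count=$(git rev-list --count HEAD)
echo "old commit: $old_id"
echo "new commit: $new_id"
echo "commits on branch: $commit_count"
```

The branch still holds a single commit afterwards, but under a different commit ID, which is why amending is best kept to commits nobody else has pulled yet.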
+
+
+DISCARDING COMMITTED CHANGES
+----------------------------
+
+Upon occasion, you'll want to discard one or more commits entirely from the top
+of your tree.  To do this you need to find the ID of the latest commit that you
+want to keep.  Everything from the commit after that to the current commit will
+be discarded.
+
+You can find the commit ID in a number of ways.  Firstly, you can use 'git log'
+to look back through the commits.  The commit ID is shown as something like:
+
+	commit 6c34bc2976b30dc8b56392c020e25bae1f363cab
+
+Secondly, you can use gitk: select the commit of interest; the commit ID
+appears in the box labelled "SHA1 ID".
+
+You can then perform the discard with the following command:
+
+	git reset --hard %COMMIT_ID
+
+Using the above commit ID as an example, you could do:
+
+	git reset --hard 6c34bc2976b30dc8b56392c020e25bae1f363cab
+
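
A small sandbox run of the discard procedure might look like the following. The three commits and their file names are invented for the example; HEAD~2 is simply a symbolic way of naming the commit to keep instead of pasting its full ID.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

# Three commits, each adding one file.
for n in 1 2 3; do
    echo "change $n" >"file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
done

# The latest commit we want to keep is the first one.
keep=$(git rev-parse HEAD~2)

# Discard everything after it; the working tree is rewound as well.
git reset -q --hard "$keep"

remaining=$(git rev-list --count HEAD)
echo "HEAD is now $(git rev-parse HEAD), $remaining commit(s) remain"
```

Because --hard also rewrites the checked-out files, any uncommitted edits are lost along with the discarded commits.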
+
+REVERTING COMMITTED CHANGES
+---------------------------
+
+And sometimes you'll want to revert changes that you've committed, but that are
+now buried beneath other commits.  Short of discarding and reapplying commits,
+you have to apply a reverse patch:
+
+	git show %COMMIT_ID | patch -p1 -R
+
+and then commit it.  Both the original application and the reversion will be
+retained by GIT.
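
git also has a dedicated command, git revert, which builds the reverse patch for a commit and commits it in one step. The sketch below uses it on a buried commit in a scratch repository; the file names and commit messages are made up, and --no-edit is only there to keep the run non-interactive.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

echo base >a.txt
git add a.txt
git commit -q -m "base"

# The change we will later regret.
echo mistake >b.txt
git add b.txt
git commit -q -m "add b.txt (a mistake)"
bad=$(git rev-parse HEAD)

# More work buries the mistake beneath another commit.
echo more >>a.txt
git commit -q -a -m "later work"

# Apply the reverse of the bad commit as a new commit on top.
git revert --no-edit "$bad" >/dev/null

commit_count=$(git rev-list --count HEAD)
echo "history now has $commit_count commits"
git log --format='%s'
```

Both the original commit and its reversal stay in the history, and the later work on a.txt is untouched.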
+
+
+==============================
+PUBLISHING CHANGES BY GIT TREE
+==============================
+
+Now that you've got a tree and have mangled it in unspeakable ways, you
+probably want to donate the glory of your works back to the community - usually
+with an eye to getting your changes pulled into an upstream maintainer's
+repository.  Your upstream maintainer may then push your changes on to their
+upstream maintainer, until it ends up in the ultimate upstream repository
+(Linus's linux-2.6 tree in the case of the Linux kernel).
+
+You could, of course, just push patches to the upstream maintainer, be that
+Linus or one of his cronies in the case of the Linux kernel, or some other
+person in the case of some other project.
+
+GIT, however, leans strongly towards another option.  If you can get access to
+a computer that is accessible by way of the internet, you might be able to set
+up a public GIT tree upon it and ask an appropriate upstream maintainer to pull
+from that.
+
+That computer, however, may not be particularly convenient for developing on:
+it may be remote from where you're working, for example, perhaps even on a
+different continent - so you'll probably want to have two trees: a remote,
+public, published tree, and a local private tree where you can break stuff at
+will.  I'm going to assume the two trees approach.
+
+
+SETTING UP
+----------
+
+First of all, you'll need to set up your two trees.  There are a number of
+steps to go through to do this:
+
+ (1) Find somewhere that's accessible by the internet (%REMOTE_BOX) that you
+     have SSH access to, and set up a public GIT tree that's a clone of the
+     upstream tree you want to use as a base:
+
+	ssh %REMOTE_BOX
+	cd /my/git/trees
+	git clone -n --bare %UPSTREAM_REPO %MY_DIR
+
+     Where %UPSTREAM_REPO is something like:
+
+	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+
+     This will create a directory called /my/git/trees/%MY_DIR that contains a
+     bare GIT repository to which you can upload your changes.  There will be
+     no checked out files here, and everything that would usually be in the
+     .git directory is in the top directory instead.
+
+     If your tree is on the same box as the tree you want to fork, you can
+     tell GIT to use that rather than going to the internet:
+
+	git clone -l -s -n --bare %UPSTREAM_DIR %MY_DIR
+
+     For example, I might wish to set up a tree to publish NOMMU changes so
+     that they're available through git.kernel.org.  To that end, I would do:
+
+	ssh master.kernel.org
+	cd /pub/scm/linux/kernel/git/dhowells
+	git clone -l -n -s --bare \
+		/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-2.6-nommu
+
+
+ (2) You should set the description on your public repository:
+
+	echo %DESCRIPTION >%MY_DIR/description
+
+     For example:
+
+	echo "NOMMU development" >linux-2.6-nommu/description
+
+     This will be published through the GIT web interface if one is set up, and
+     so can be viewed by going to the appropriate URL.  For instance:
+
+	http://git.kernel.org/?p=linux/kernel/git/dhowells/linux-2.6-nommu.git
+
+
+ (3) Now go to the work machine on which you'll be doing your development.
+     You'll need to create a local fork of your public GIT repository.  You can
+     do this by:
+
+	git clone ssh://%REMOTE_BOX/my/git/trees/%MY_DIR %DEVEL_DIR
+
+     This will create a checked-out GIT tree in a directory (%DEVEL_DIR) that
+     you can later use for development.  If you have a local mirror of the
+     upstream tree that you're using as a base, you can tell git to use the
+     objects from that to save space:
+
+	git clone --reference %LOCAL_UPSTREAM_MIRROR \
+		ssh://%REMOTE_BOX/my/git/trees/%MY_DIR \
+		%DEVEL_DIR
+
+     [!] NOTE: You must use ssh: and not git: to clone your tree because you
+     	       need to be able to push back (write) to your public tree.
+
+     To continue my example, I have a local mirror of Linus's kernel, regularly
+     updated by cron, and so to make my local NOMMU development tree, I would
+     do:
+
+	git clone --reference /warthog/git/linux-2.6 \
+		ssh://master.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu.git \
+		linux-2.6-nommu
+
+
+ (4) Now you need to set up your local GIT tree to make it possible to (a)
+     update your development tree by pulling in the upstream tree, and (b)
+     publish your changes by pushing them to your public tree.
+
+	cd %DEVEL_DIR
+
+     Tell your repository where to find the upstream tree:
+
+	git remote add %UPSTREAM %UPSTREAM_REPO
+
+     where %UPSTREAM is the name by which you want to refer to the upstream
+     repository in git fetch and git pull.  For Linus's upstream kernel, you
+     might wish to use 'linus' for example.
+
+     In my example, I did the following to pull Linus's tree into branches of
+     my tree:
+
+	cd linux-2.6-nommu
+	git remote add linus \
+		git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+
+     Looking in .git/config, I now see sections that look like this:
+
+	[remote "origin"]
+		url = ssh://master.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu.git
+		fetch = +refs/heads/*:refs/remotes/origin/*
+	[branch "master"]
+		remote = origin
+		merge = refs/heads/master
+	[remote "linus"]
+		url = git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+		fetch = +refs/heads/*:refs/remotes/linus/*
+
+
+ (5) You should now be able to update your development tree from the upstream
+     repository to make sure that works:
+
+	git fetch -v %UPSTREAM
+
+     In my case, that's:
+
+	git fetch -v linus
+
+     If you've just created the repository, it'll probably just say that things
+     are up to date:
+
+	From git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
+	 = [up to date]      master     -> linus/master
+
+     [!] NOTE: I cannot determine a way of making "git pull linus" work without
+     	 setting branch.master.remote to 'linus'.
+
+
+ (6) And then you should be able to publish your development tree by pushing it
+     to your public tree, thus allowing the rest of the world to see your
+     changes.
+
+	git push -v origin
+
+
+ (7) Finally you should be able to pull your published tree back into your
+     development tree, and it should just say that it's up to date:
+
+	git pull -v
+
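
The whole two-tree arrangement can be rehearsed on one machine before involving a real server. In this sketch, local directories stand in for the remote boxes: upstream plays the upstream repository, public.git the bare published tree, and devel the private development tree. All of those names, plus the remote name 'linus' and the identity, are placeholders.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the upstream repository.
git init -q upstream
(cd upstream && echo base >file.txt && git add file.txt &&
    git commit -q -m "upstream base")
branch=$(git -C upstream symbolic-ref --short HEAD)

# Step (1): the bare public tree, cloned from upstream.
git clone -q --bare upstream public.git

# Step (3): the private development tree, cloned from the public tree.
git clone -q public.git devel
cd devel

# Steps (4) and (5): register upstream under a short name and fetch from it.
git remote add linus "$sandbox/upstream"
git fetch -q linus

# Step (6): commit something and publish it to the public tree.
echo "my work" >>file.txt
git commit -q -a -m "my change"
git push -q origin "$branch"

public_head=$(git --git-dir="$sandbox/public.git" rev-parse "$branch")
devel_head=$(git rev-parse HEAD)
echo "public tree: $public_head"
echo "devel tree:  $devel_head"
```

After the push the public tree and the development tree report the same head commit, while refs/remotes/linus/ tracks the upstream separately.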
+
+UPDATING YOUR DEVELOPMENT TREE
+------------------------------
+
+Okay: so you've got your tree, and you've made changes to it, and now Linus has
+gone and dumped five thousand patches into his tree, making the base for your
+changes obsolete.  You need to update your tree and fix up your changes.
+
+If you haven't yet committed your changes, you'll have to siphon them off into
+a file:
+
+	git diff >a.diff
+
+and deapply them:
+
+	patch -p1 -R <a.diff
+
+You can then update your tree from the upstream tree with no fear of a conflict
+(assuming you don't also have changes that you have committed).  Once you've
+updated your tree, you can reapply your changes:
+
+	patch -p1 <a.diff
+
+And then fix up the rejects with your favourite editor and a few choice curses.
+
+
+To actually update your tree, you can do the following:
+
+	git fetch %UPSTREAM
+
+In my example, that'd be:
+
+	git fetch linus
+
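+Note that git fetch only downloads the upstream commits and updates the
+remote-tracking branch (%UPSTREAM/%UPSTREAM_BRANCH); the merge itself is done
+by git pull, or by running git merge on that branch after the fetch.  If you
+have committed changes, the merge will attempt to combine them with the
+upstream changes, but you may still need to fix them up.  If everything went
+smoothly this will automatically commit a merge on top of the tree and set the
+HEAD pointer to that.  This merge will point at your last tree and the tree you
+just merged from upstream, and will indicate that the resulting tree is a
+combination of both.  Of course, you shouldn't assume it will still compile,
+let alone still work...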
+
+If you do need to fix them up, refer to the "Manually merging failed fetches"
+section for guidance.
+
+You can view the merge that git pull committed by:
+
+	git show
+
+And you can view the tree structure at that point with the gitk command.
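
To make the update sequence concrete, here is a sandbox sketch in which the development tree and the upstream tree each gain a commit. It assumes the fetch and the merge are run as two separate steps (git fetch followed by git merge on the remote-tracking branch); the repository names, file names and the remote name 'linus' are placeholders.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

git init -q upstream
(cd upstream && echo base >file.txt && git add file.txt &&
    git commit -q -m "upstream base")
branch=$(git -C upstream symbolic-ref --short HEAD)

git clone -q upstream devel
cd devel
git remote add linus "$sandbox/upstream"

# A committed local change...
echo mine >mine.txt
git add mine.txt
git commit -q -m "my change"

# ...while upstream moves on underneath it.
(cd ../upstream && echo theirs >theirs.txt && git add theirs.txt &&
    git commit -q -m "upstream change")

before=$(git rev-parse HEAD)
git fetch -q linus
after_fetch=$(git rev-parse HEAD)      # the fetch leaves HEAD where it was

git merge -q --no-edit "linus/$branch"
merge_words=$(git rev-list --parents -n 1 HEAD | wc -w)

echo "HEAD moved by fetch: $([ "$before" = "$after_fetch" ] && echo no || echo yes)"
echo "IDs on HEAD line after merge: $merge_words"
```

The merge commit lists two parents: your last commit and the upstream head that was just fetched.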
+
+
+PUBLISHING YOUR CHANGES
+-----------------------
+
+Finally, you're in a position to make your changes available.  Firstly, you
+have to commit them to your development tree (as mentioned previously) and then
+you have to make them available to the rest of the world.  To do that, simply
+run:
+
+	git push
+
+which will apply the changes to your public tree.  If you have web access to
+your git tree, these will eventually become visible through there.
+
+You may then have to tell your upstream maintainer what you'd like them to pull
+from your tree.  The standard way to do this is to do:
+
+	git request-pull %BASE_ID %MY_REPO >/tmp/request.txt
+
+where %BASE_ID is the head of the tree on which your changes are based, and
+%MY_REPO is the public URL of your repository.  If you have your development
+git tree configured to know where the upstream remote repository is, then if
+you've ever done 'git fetch' you should have a branch for it, named something
+like "%UPSTREAM/%UPSTREAM_BRANCH" where %UPSTREAM is the name you gave to 'git
+remote' and %UPSTREAM_BRANCH is the upstream branch on which you've based your
+development (almost certainly 'master').
+
+This command will generate a list of all the patches between %BASE_ID and the
+head of your tree that you are asking to be pulled.
+
+In my example, I can do:
+
+	git request-pull linus/master \
+		git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu.git \
+		>/tmp/request.txt
+
+You should then edit /tmp/request.txt to include a description of what you're
+trying to achieve with these patches, and then mail the whole file to the
+upstream maintainer.
+
+[!] NOTE: It may take some time for the git push to take full effect.  Before
+    that time is up, git request-pull may give spurious warnings and the text
+    it produces may say that the branch is unverified.
+
+
+===============================
+MANUALLY MERGING FAILED FETCHES
+===============================
+
+Occasionally, when you pull someone else's tree in to your repository, either
+because the base needs updating or because you're incorporating stuff from a
+contributor, the merge will fail due to conflicts between the changes you have
+made in your tree, and the changes you're importing.
+
+GIT will try and automatically merge where possible, but it can't always manage
+it.  In such cases you have to unlimber your text editor and fix it manually.
+
+GIT will report the files that need merging during the git fetch/git pull:
+
+	CONFLICT (content): Merge conflict in drivers/char/tty_audit.c
+
+and they can also be determined by looking in ".git/MERGE_MSG".
+
+GIT will interpolate markers into the affected files, along with both versions
+of the code:
+
+	<<<<<<< HEAD:drivers/char/tty_audit.c
+					 tsk->pid, uid, loginuid, sessionid,
+	=======
+					 tsk->pid, tsk->uid, loginuid, sessionid,
+	>>>>>>> b3985e2bf6ce51ae943208af4bd336287fb34ed6:drivers/char/tty_audit.c
+
+The first section (<<<<<<< to =======) is the version from your tree, the
+second section (======= to >>>>>>>) is the version from the tree being
+imported.  The markers must be removed, and the conflicting code resolved down
+to the appropriate final version.
+
+
+Once that is done, git add (or git rm) must be called on the changed files so
+that git commit knows to include them in the new head.  It works exactly like
+changing files normally (as per the "Making changes" section), except that GIT
+has stored extra data that will go into the merge commit when git commit
+creates it.
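
The conflict-and-resolve cycle can be tried safely in a scratch repository. In this sketch a local branch called contrib stands in for the tree being imported, and the file name, branch name and identity are all invented; the resolution step simply overwrites the file, where in practice you would edit it by hand.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

echo "original line" >code.txt
git add code.txt
git commit -q -m "base"
branch=$(git symbolic-ref --short HEAD)

# The contributor and you both change the same line.
git checkout -q -b contrib
echo "contributor's version" >code.txt
git commit -q -a -m "contrib change"
git checkout -q "$branch"
echo "my version" >code.txt
git commit -q -a -m "my change"

# The merge stops with a conflict, so record its exit status instead of dying.
merge_status=0
git merge contrib >/dev/null 2>&1 || merge_status=$?

# List the files still needing attention and count the conflict markers.
conflicted=$(git diff --name-only --diff-filter=U)
markers=$(grep -c '^<<<<<<< ' code.txt)
echo "merge exit status: $merge_status, conflicted: $conflicted"

# Resolve (here by replacing the file), then add and commit as usual.
echo "agreed final version" >code.txt
git add code.txt
git commit -q --no-edit
merge_words=$(git rev-list --parents -n 1 HEAD | wc -w)
```

The final commit is still recorded as a merge with two parents, even though its content was settled by hand.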
+
+
+=============
+LOCATING BUGS
+=============
+
+There will be times when the program you've built malfunctions.  It happens now
+and then even to the best of projects.  Sometimes you can easily locate the bug
+by looking at the symptoms and the debugging output and then eyeballing the
+code, and sometimes you can't.
+
+For very big projects such as the Linux kernel, finding a bug that someone else
+has inadvertently introduced can be very hard, but GIT allows you to take
+advantage of the fact that the changes are introduced a bit at a time with
+clear boundaries (commits) to make life a bit easier.
+
+
+BISECTION
+---------
+
+What you really want to be able to do is to isolate the commit that's causing
+the malfunction, but with automation support so that you don't have to trace
+the commit tree yourself.  GIT has a tool to do this: git bisect.
+
+The way this works is to take two points in the tree: one at which you know the
+program malfunctions, and one at which you know it doesn't, and then chop its
+way through the tree to locate the failing commit.
+
+To illustrate this:
+
+ (1) Assume that you're dealing with the kernel, and that you find that after
+     Linus's merge window, 2.6.25-rc1 does not boot for you, but you know that
+     2.6.24 did prior to the window.
+
+     Firstly you have to start your search and describe the bounds (the working
+     and non-working points).  This is done with the following commands:
+
+	git bisect start [%BAD_COMMIT [%GOOD_COMMIT]]
+	git bisect bad [%BAD_COMMIT]
+	git bisect good [%GOOD_COMMIT]
+
+     where %BAD_COMMIT and %GOOD_COMMIT are optional commit object IDs or
+     symbolic representations thereof.  The 'bad' command is unnecessary if
+     %BAD_COMMIT is given to 'start', and the 'good' command is not required if
+     %GOOD_COMMIT is given to 'start'.
+
+     So, in the example we're looking at, you could do:
+
+	git bisect start
+	git bisect bad v2.6.25-rc1
+	git bisect good v2.6.24
+
+     or:
+
+	git bisect start v2.6.25-rc1
+	git bisect good v2.6.24
+
+     or:
+
+	git bisect start v2.6.25-rc1 v2.6.24
+
+	[!] NOTE: This is using a symbolic tag 'v2.6.24' to refer to the last
+     	    commit before 2.6.24 was declared.
+
+
+     However, if 2.6.25-rc1 is currently at the head of your tree, you can
+     do:
+
+	git bisect start
+	git bisect bad
+
+     to indicate that this malfunctioned, or you could do this in a single
+     command:
+
+	git bisect start HEAD
+
+     to start bisection _and_ indicate that the HEAD revision is bad.
+
+
+     Alternatively, if you're at a point where the program _does_ work, you can
+     pass either HEAD or no parameter to the 'good' bisection command, or pass
+     HEAD as the %GOOD_COMMIT parameter to the 'start' bisection command.
+
+
+ (2) Now GIT will rumble through the commits between the two points you have
+     declared, and set the current HEAD of the repository to a point that
+     approximates midway between the two:
+
+	warthog>git bisect start v2.6.25-rc1 v2.6.24
+	Bisecting: 4814 revisions left to test after this
+	[d2e626f45cc450c00f5f98a89b8b4c4ac3c9bf5f] x86: add PAGE_KERNEL_EXEC_NOCACHE
+
+     and then it will check out the sources to reflect their state at this point.
+
+
+ (3) You should now attempt to compile this and test it.  If the test succeeds,
+     you should run the command:
+
+	git bisect good
+
+     If the test fails, run the command:
+
+	git bisect bad
+
+     These will tell GIT to binary chop the commits between either the current
+     point and the good end or the current point and the bad end to find a new
+     commit to test:
+
+	warthog>git bisect bad
+	Bisecting: 2406 revisions left to test after this
+	[fb46990dba94866462e90623e183d02ec591cf8f] [NETFILTER]: nf_queue: remove unnecessary hook existance check
+	warthog>git bisect good
+	Bisecting: 1203 revisions left to test after this
+	[936722922f6d2366378de606a40c14f96915474d] [IPV4] fib_trie: compute size when needed
+
+     As when bisection started, GIT will set the current HEAD pointer and
+     then check out the sources.  You should repeat step (3).
+
+     If the commit is broken for you and the compile fails, run the command:
+
+	git bisect skip
+
+     this will cause the bisection algorithm to move onto the next commit in
+     the hope that this one will be better:
+
+	warthog>git bisect skip
+	Bisecting: 1203 revisions left to test after this
+	[1328042e268c936189f15eba5bd9a5a4605a8581] [IPV4] fib_trie: use hash list
+
+     this will change the HEAD pointer and check out the sources.  Repeat step
+     (3).
+
+
+ (4) Eventually, after you've tested a number of different commits, GIT will
+     tell you that it has narrowed the problem down to either a single commit,
+     or if there were compile errors that got in the way, a range of commits:
+
+	warthog>git bisect bad
+	e3ac5298159c5286cef86f0865d4fa6a606bd391 is first bad commit
+	commit e3ac5298159c5286cef86f0865d4fa6a606bd391
+	Author: Patrick McHardy <kaber@trash.net>
+	Date:   Wed Dec 5 01:23:57 2007 -0800
+
+	    [NETFILTER]: nf_queue: make queue_handler const
+
+	    Signed-off-by: Patrick McHardy <kaber@trash.net>
+	    Signed-off-by: David S. Miller <davem@davemloft.net>
+	...
+
+
+ (5) At any time during the bisection process, you can use:
+
+	git show
+
+     to examine the commit currently selected for testing, and:
+
+	git bisect log
+
+     to view the log of information provided by you through git bisect start,
+     good and bad, and:
+
+	git bisect visualize
+
+     to start up the gitk program to show you a graphical view of the current
+     good-to-bad range of commits as narrowed down by bisection.
+
+
+ (6) You should then end the bisection process by:
+
+	git bisect reset
+
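
When the test can be scripted, the good/bad loop can be handed to git bisect run, which repeats step (3) with a command of your choosing: exit status 0 marks a commit good, most non-zero statuses mark it bad. The sketch below is a toy: ten invented commits tagged c1 to c10, with a line reading "bug" introduced in the seventh, and a grep standing in for a real build-and-test script.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

# Ten commits; the seventh introduces the "bug".
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -eq 7 ]; then
        echo "bug" >>prog.txt
    else
        echo "line $i" >>prog.txt
    fi
    git add prog.txt
    git commit -q -m "commit $i"
    git tag "c$i"
    i=$((i + 1))
done

# HEAD is known bad, c1 is known good.
git bisect start HEAD c1 >/dev/null

# The test command: succeed (good) only if the bug line is absent.
git bisect run sh -c '! grep -q bug prog.txt' >/dev/null

# When bisection finishes, refs/bisect/bad names the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $(git log -1 --format=%s "$first_bad")"
```

For a kernel-sized project the test command would be a script that builds and boots the result, but the bisection bookkeeping is the same.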
+
+BLAME
+-----
+
+Now imagine that rather than indulging in bisection you've found a bug by
+simply looking at the code: who do you tell about it?  You could look at the
+banner comment at the top of the file to look for names and email addresses,
+and you could also look in the kernel MAINTAINERS file or its equivalent, but
+the person you really want to harangue is whoever made the change...
+
+There's a very useful GIT tool to help determine this:
+
+	git blame <file>
+
+also known as:
+
+	git annotate <file>
+
+which will give you a list of lines in a source file against who changed them
+last and in what commit.  You may find that your favourite editor has a
+facility to run this for you (Emacs has vc-annotate, bound to C-x v g, for
+example).
+
+Running git blame on the kernel's README file, for example, might show:
+
+	warthog>git blame README
+	620034c8 (Jesper Juhl                    2006-12-07 00:45:58 +0100   1)         Linux kernel release 2.6.xx <http://kernel.org/>
+	^1da177e (Linus Torvalds                 2005-04-16 15:20:36 -0700   2)
+	^1da177e (Linus Torvalds                 2005-04-16 15:20:36 -0700   3) These are the release notes for Linux version 2.6.  Read them carefully,
+	...
+
+The hex number that occurs first on the line is a truncated commit object ID,
+and this can be passed to git-show (remove the '^' symbol first, if given).
+
+	warthog>git show 1da177e
+	commit 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2
+	Author: Linus Torvalds <torvalds@ppc970.osdl.org>
+	Date:   Sat Apr 16 15:20:36 2005 -0700
+	...
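
For scripting, git blame can also emit a machine-readable form with --porcelain, and -L restricts it to a line range. The sketch below builds a two-commit history with invented authors (Alice and Bob) and an invented file, then looks up which commit last touched each line.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

# Alice writes both lines.
printf 'line one\nline two\n' >notes.txt
git add notes.txt
GIT_AUTHOR_NAME="Alice" git commit -q -m "Add notes"
first=$(git rev-parse HEAD)

# Bob later edits only the second line.
printf 'line one\nline two, edited\n' >notes.txt
GIT_AUTHOR_NAME="Bob" git commit -q -a -m "Edit second line"
second=$(git rev-parse HEAD)

# Human-readable view: one row per line, with commit, author and date.
git blame notes.txt

# Porcelain view: the first field of the first row is the full commit ID.
line1=$(git blame --porcelain -L 1,1 notes.txt | head -n 1 | cut -d ' ' -f 1)
line2=$(git blame --porcelain -L 2,2 notes.txt | head -n 1 | cut -d ' ' -f 1)
echo "line 1 last changed in $line1"
echo "line 2 last changed in $line2"
```

Line one is still attributed to the first commit, while line two points at the later edit, which is the commit you would then pass to git show.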


* Re: Unable to index file
From: Linus Torvalds @ 2008-12-12 18:15 UTC (permalink / raw)
  To: Ramon Tayag; +Cc: git
In-Reply-To: <alpine.LFD.2.00.0812120956050.3340@localhost.localdomain>



On Fri, 12 Dec 2008, Linus Torvalds wrote:
> 
> Now, admittedly git is probably being really annoyingly anal about this 
> all, and we probably should loosen the restrictions on it a bit, but I'd 
> like to know why it happens. I cannot recall this having been reported 
> before, so it's some specific filesystem or OS that causes this, I think.

Anyway, the "loosen the symlink lstat() requirements" patch would likely 
look something like this. I can't really test it, though, since I only 
have filesystems that have matching lstat()/readlink() sizes.

		Linus
---
 sha1_file.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 0e021c5..222c793 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2522,9 +2522,9 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
 
 int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object)
 {
-	int fd;
+	int fd, len;
 	char *target;
-	size_t len;
+	size_t bufsize;
 
 	switch (st->st_mode & S_IFMT) {
 	case S_IFREG:
@@ -2537,9 +2537,10 @@ int index_path(unsigned char *sha1, const char *path, struct stat *st, int write
 				     path);
 		break;
 	case S_IFLNK:
-		len = xsize_t(st->st_size);
-		target = xmalloc(len + 1);
-		if (readlink(path, target, len + 1) != st->st_size) {
+		bufsize = 1+xsize_t(st->st_size);
+		target = xmalloc(bufsize);
+		len = readlink(path, target, bufsize);
+		if (len < 0) {
 			char *errstr = strerror(errno);
 			free(target);
 			return error("readlink(\"%s\"): %s", path,


* Re: Unable to index file
From: Linus Torvalds @ 2008-12-12 18:07 UTC (permalink / raw)
  To: Ramon Tayag; +Cc: git
In-Reply-To: <f25d5ad20812120647m646698d7t9849c8ccb08c465e@mail.gmail.com>

On Fri, 12 Dec 2008, Ramon Tayag wrote:
> 
> I've come across a problem that I don't believe lies in Rails.  You
> needn't be familiar, I think, with Rails to see what's wrong.
> 
> I can't seem to add the files that are in
> http://dev.rubyonrails.org/archive/rails_edge.zip
> 
> 1) Unpack the zip
> 2) Initialize a git repo inside the folder that was unpacked
> 3) git add .
> 
> See the errors.. :o http://pastie.org/337571

What platform/filesystem is this?

Git is rather particular about symlinks, and it looks like your platform 
does something odd, and that makes git unhappy about your symlink.

In particular:

	ls -l vendor/rails/actionpack/test/fixtures/layout_tests/layouts/ 
	...
	lrwxrwxrwx 1 root root 48 2008-12-12 18:22 symlinked -> ../../symlink_parent

notice how the symlink content is "../../symlink_parent", but then take a 
look at the _size_ of the symlink: 48 bytes.

Git expects the lstat() information to match the return from readlink(), 
and it doesn't.

For exact details, see "index_path()" in sha1_file.c:

        case S_IFLNK:   
                len = xsize_t(st->st_size);
                target = xmalloc(len + 1);
                if (readlink(path, target, len + 1) != st->st_size) {
                        char *errstr = strerror(errno);

ie we consider it an error if we get less than st_size characters back 
from readlink().
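
A quick way to check whether a given filesystem shows this mismatch is to compare the size that ls reports for the link itself with the length of the string readlink returns. This is only a sketch run in a scratch directory: the link name and target are copied from the listing above, and it assumes the fifth column of `ls -ln` is the lstat() size, as it is on common systems.

```shell
#!/bin/sh
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Recreate the symlink from the report (the target need not exist).
ln -s ../../symlink_parent symlinked

# Size of the link itself as reported by lstat(), via the 5th column of ls -ln.
lstat_size=$(ls -ln symlinked | awk '{ print $5 }')

# Length of the link target as returned by readlink.
target=$(readlink symlinked)
target_len=${#target}

echo "lstat size: $lstat_size, readlink length: $target_len"
if [ "$lstat_size" -eq "$target_len" ]; then
    echo "sizes agree; git's check would pass here"
else
    echo "sizes differ; this filesystem would trip git's check"
fi
```

Running the same comparison on the unpacked symlink from the report would show whether the 48-byte size comes from the filesystem or from how the archive was extracted.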

Now, admittedly git is probably being really annoyingly anal about this 
all, and we probably should loosen the restrictions on it a bit, but I'd 
like to know why it happens. I cannot recall this having been reported 
before, so it's some specific filesystem or OS that causes this, I think.

		Linus


* diff -b / -w and empty diffs
From: Lars Noschinski @ 2008-12-12 17:57 UTC (permalink / raw)
  To: git

Hello!

If the difference between two files is only whitespace, "git diff -b"
leads to diffs just consisting of the "diff" and "index" lines. I would
like an option to suppress those files in the diff output because it
breaks "git diff -b > patch; git apply patch" workflows (and often I'm
just interested in "real" changes).

Is there any reason why such behaviour is not implemented yet (besides
the fact that nobody cared to do it)?
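
The situation can be reproduced with a one-line file, as a sketch. The file name and contents are invented, and whether the bare "diff" and "index" lines still appear in the -b output may depend on the git version in use, so the script only counts hunks rather than asserting on the headers.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global git configuration.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo

echo "int x = 1;" >code.c
git add code.c
git commit -q -m "Add code"

# Change only the amount of whitespace between tokens.
echo "int  x  =  1;" >code.c

# Count hunk headers with and without -b (grep -c exits 1 on zero matches).
plain_hunks=$(git diff | grep -c '^@@' || true)
b_hunks=$(git diff -b | grep -c '^@@' || true)
echo "hunks without -b: $plain_hunks, with -b: $b_hunks"

# Show whatever -b still prints for this file.
echo "--- output of git diff -b ---"
git diff -b
```

If the last command prints header lines with no hunk beneath them, that is the output that trips up the "git diff -b > patch; git apply patch" workflow.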


* [patch] documentation: Explain how to free up space after filter-branch
From: Thomas Jarosch @ 2008-12-12 17:42 UTC (permalink / raw)
  To: git; +Cc: Björn Steinbrink

Explain how to free up space after filter-branch.
Thanks to Björn Steinbrink for pointing me in the right direction.

Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>

diff --git a/Documentation/git-filter-branch.txt b/Documentation/git-filter-branch.txt
index fed6de6..1432380 100644
--- a/Documentation/git-filter-branch.txt
+++ b/Documentation/git-filter-branch.txt
@@ -319,6 +319,18 @@ git filter-branch --index-filter \
 	 mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
 ---------------------------------------------------------------
 
+Free up the space in .git if the rewritten version is correct
+by deleting refs/original and pruning the reflog:
+
+----------------------------------------------------
+git for-each-ref --format='%(refname)' refs/original |
+	xargs -i git update-ref -d {}
+
+git reflog expire --expire=0 --all
+git repack -a -d --depth=250 --window=250
+git prune
+----------------------------------------------------
+
 
 Author
 ------


* Re: [PATCH 2/3 (edit v2)] gitweb: Cache $parent_commit info in git_blame()
From: Jakub Narebski @ 2008-12-12 17:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nanako Shiraishi, git, Luben Tuikov
In-Reply-To: <7vr64e9jq6.fsf@gitster.siamese.dyndns.org>

On Fri, 12 Dec 2008, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:

> > Only commit message has changed.
> 
> Which is a bit unnice, because it will conflict with the original [3/3]
> that I queued already (with a pair of fixes, including but not limited to
> the one you sent "Oops, it should have been like this" for).
> 
> I can hand wiggle the patch to make it apply, but I'd prefer if I did not
> have to do this every time I receive a patch.

I'm sorry about that; I forgot to reorder the patches so that the stack
reads: original 1/3, 3/3, 2/3 (I should have used 'stg float' for that).

> I think the conflict was trivial (just a single s/rev/short_rev/) and I
> did not make a silly mistake when I fixed it up, but please check the
> result on 'pu' after I push the results out.

I did the reordering and compared gitweb from the top of the reordered
stack of patches with gitweb from the top of the 'pu' branch; the only
difference in the area touched by the git_blame improvements series is
one comment I added in v2 of 3/3.

Thank you for your work.
-- 
Jakub Narebski
Poland


* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
From: Nguyen Thai Ngoc Duy @ 2008-12-12 16:54 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Junio C Hamano, Daniel Barkalow, Shawn O. Pearce,
	Johannes Schindelin, git
In-Reply-To: <4942952E.1060706@viscovery.net>

On 12/12/08, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Nguyen Thai Ngoc Duy schrieb:
>
> > On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>  >>  So "git grep -e frotz Documentation/", whether you only check out
>  >>  Documentation or the whole tree, should grep only in Documentation area,
>  >>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>  >>  to have a sparse checkout.  By definition, a sparse checkout has no
>  >>  modifications outside the checkout area, so whenever grep wants to look
>  >>  for strings outside the checkout area it should pretend as if the same
>  >>  content as what the index records is in the work tree.  This is consistent
>  >>  with the way how "git diff" in a sparsely checked out work tree should
>  >>  behave.
>  >
>  > Assume someone is using sparse checkout with KDE git repository. They
>  > sparse-checkout kdeutils module and do "git grep -e foo". I would
>  > expect that the command only searches in kdeutils only (and is the
>  > current behavior).
>
>
> But what if the same person notices a #define in a kdeutils header file
>  and wants to know whether it is unused in order to remove it:
>
>     $ git grep FOO
>     kdeutils/foo.h:#define FOO bar

"git grep --cached FOO" ?

>  Conclusion from this output: "It's only defined, but not used anywhere."
>  But this conclusion is not necessarily correct because FOO could be used
>  outside kdeutils.
>
>  So, no, "git grep" should disregard the checkout area.
>
>  -- Hannes
>


-- 
Duy


* Re: Clarifying "invalid tag signature file" error message from git filter-branch (and others)
From: Jakub Narebski @ 2008-12-12 16:53 UTC (permalink / raw)
  To: Jim Meyering; +Cc: James Youngman, Brandon Casey, git
In-Reply-To: <87zlj1hd0r.fsf@rho.meyering.net>

Jim Meyering <jim@meyering.net> writes:

> I used parsecvs, probably with git-master from the date of
> the initial conversion (check the archives for actual date).
> That was long enough ago that it was almost certainly before
> git-mktag learned to be more strict about its inputs.
> 
> James, since you're about to rewrite the history, you may want to
> start that process from a freshly-cvs-to-git-converted repository.
> 
> I'm not very happy about using parsecvs (considering it's not
> really being maintained, afaik), so if the git crowd
> can recommend something better, I'm all ears.

The page you might want to consult is

  http://git.or.cz/gitwiki/InterfacesFrontendsAndTools

Listed there are git-cvsimport, which uses cvsps to extract patchsets, is
part of git, and is as far as I know the only tool that allows incremental
import; parsecvs, which requires access to the *,v files (this is what you
use now); and cvs2svn (cvs2git), which has learned the fast-import format
and can be used to import CVS repositories quickly, but for which
incremental import (difficult as it is) is only planned, AFAIK.

So I would recommend trying cvs2svn / cvs2git.
-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
From: Johannes Sixt @ 2008-12-12 16:45 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Daniel Barkalow, Shawn O. Pearce,
	Johannes Schindelin, git
In-Reply-To: <fcaeb9bf0812120813m2949e36ar7905d5688b8f6ecb@mail.gmail.com>

Nguyen Thai Ngoc Duy schrieb:
> On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>>  So "git grep -e frotz Documentation/", whether you only check out
>>  Documentation or the whole tree, should grep only in Documentation area,
>>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>>  to have a sparse checkout.  By definition, a sparse checkout has no
>>  modifications outside the checkout area, so whenever grep wants to look
>>  for strings outside the checkout area it should pretend as if the same
>>  content as what the index records is in the work tree.  This is consistent
>>  with the way how "git diff" in a sparsely checked out work tree should
>>  behave.
> 
> Assume someone is using sparse checkout with KDE git repository. They
> sparse-checkout kdeutils module and do "git grep -e foo". I would
> expect that the command only searches in kdeutils only (and is the
> current behavior).

But what if the same person notices a #define in a kdeutils header file
and wants to know whether it is unused in order to remove it:

    $ git grep FOO
    kdeutils/foo.h:#define FOO bar

Conclusion from this output: "It's only defined, but not used anywhere."
But this conclusion is not necessarily correct because FOO could be used
outside kdeutils.

So, no, "git grep" should disregard the checkout area.

-- Hannes


* Re: Clarifying "invalid tag signature file" error message from git filter-branch (and others)
From: Brandon Casey @ 2008-12-12 16:44 UTC (permalink / raw)
  To: Jim Meyering; +Cc: James Youngman, git
In-Reply-To: <87zlj1hd0r.fsf@rho.meyering.net>

Jim Meyering wrote:
> "James Youngman" <jay@gnu.org> wrote:
>> On Thu, Dec 11, 2008 at 11:13 PM, Brandon Casey <casey@nrlssc.navy.mil> wrote:

>>> What tool was used to convert this repository to git? It should be corrected
>>> to produce valid annotated tags. Especially if it is a tool within git.
>> I don't know, Jim Meyering will know though, so I CC'ed him.
> 
> I used parsecvs, probably with git-master from the date of
> the initial conversion (check the archives for actual date).
> That was long enough ago that it was almost certainly before
> git-mktag learned to be more strict about its inputs.
> 
> James, since you're about to rewrite the history, you may want to
> start that process from a freshly-cvs-to-git-converted repository.
> 
> I'm not very happy about using cvsparse (considering it's not
> really being maintained, afaik), so if the git crowd
> can recommend something better, I'm all ears.

I've only used git-cvsimport. AFAIK it creates light-weight tags in
git rather than annotated tags.

It also uses an unmaintained tool: cvsps. There are some additional
patches in a git repository somewhere that fix a few known problems.

You could try that, James.

-brandon


* Re: Clarifying "invalid tag signature file" error message from git filter-branch (and others)
From: Brandon Casey @ 2008-12-12 16:21 UTC (permalink / raw)
  To: James Youngman; +Cc: git, Jim Meyering
In-Reply-To: <c5df85930812111559p287ea6afk54a9759302288d5e@mail.gmail.com>

James Youngman wrote:
> On Thu, Dec 11, 2008 at 11:13 PM, Brandon Casey <casey@nrlssc.navy.mil> wrote:
> 
>>> Before conversion:
>>> $ git cat-file tag FINDUTILS-4_1-10
>>> object ce25eb352de8dc53a9a7805ba9efc1c9215d28c2
>>> type commit
>>> tag FINDUTILS-4_1-10
>>> tagger Kevin Dalley
>> The tagger field is missing an email address, a timestamp, and a timezone. It
>> should look something like:
>>
>>  tagger Kevin Dalley <kevin.dalley@somewhere.com> 1229036026 -0800
>>
>> git-mktag prevents improperly formatted tags from being created by checking
>> that these fields exist and are well formed.
>>
>> If you know the correct values for the missing fields, then you could
> 
> Yes for the email address.      But as for the timestamp, it's not in
> the tag file; that only contains the sha1.
> There is a timestamp in the object being tagged, is that the timestamp
> you are talking about?

Yes and no. I meant that if you knew the "real" timestamp, possibly by
extracting it from the original repository, then you can use that.
Otherwise yes, as a workaround, use the timestamp in the object being
tagged.

> $ git show --pretty=raw  ce25eb352de8dc53a9a7805ba9efc1c9215d28c2
> commit ce25eb352de8dc53a9a7805ba9efc1c9215d28c2
> tree 752cca144d39bc55d05fbe304752b274ba22641c
> parent 9a998755249b0c8c47e8657cff712fa506aa30fc
> author Kevin Dalley <kevin@seti.org> 830638152 +0000
> committer Kevin Dalley <kevin@seti.org> 830638152 +0000

The committer information should be used, though in this repository it will
probably always be the same as the author.
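
As a rough illustration of that workaround, the sketch below derives a
tagger line from the committer line of a raw commit header. It is written
in Java only for illustration; the names TaggerFromCommitter and taggerLine
are hypothetical and are not part of git or JGit.

```java
import java.util.Optional;

// Hypothetical helper, not part of git or JGit: derives a "tagger" line
// from the "committer" line of a raw commit header.
final class TaggerFromCommitter {
    // Returns the tagger line, or empty if the header has no committer line.
    static Optional<String> taggerLine(String rawCommitHeader) {
        for (String line : rawCommitHeader.split("\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header; the log message follows
            }
            if (line.startsWith("committer ")) {
                return Optional.of("tagger " + line.substring("committer ".length()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String header = String.join("\n",
                "tree 752cca144d39bc55d05fbe304752b274ba22641c",
                "parent 9a998755249b0c8c47e8657cff712fa506aa30fc",
                "author Kevin Dalley <kevin@seti.org> 830638152 +0000",
                "committer Kevin Dalley <kevin@seti.org> 830638152 +0000",
                "",
                "    *** empty log message ***");
        System.out.println(taggerLine(header).orElse("no committer line found"));
    }
}
```

The resulting line would still have to be placed into a complete tag object
(object, type, tag, tagger) before git-mktag accepts it.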

-brandon


* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
From: Nguyen Thai Ngoc Duy @ 2008-12-12 16:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, Shawn O. Pearce, Johannes Schindelin, git
In-Reply-To: <7vy6ym9nm8.fsf@gitster.siamese.dyndns.org>

On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>  So "git grep -e frotz Documentation/", whether you only check out
>  Documentation or the whole tree, should grep only in Documentation area,
>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>  to have a sparse checkout.  By definition, a sparse checkout has no
>  modifications outside the checkout area, so whenever grep wants to look
>  for strings outside the checkout area it should pretend as if the same
>  content as what the index records is in the work tree.  This is consistent
>  with the way how "git diff" in a sparsely checked out work tree should
>  behave.

Assume someone is using sparse checkout with the KDE git repository. They
sparse-checkout the kdeutils module and run "git grep -e foo". I would
expect the command to search only in kdeutils (which is the
current behavior).
-- 
Duy


* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
From: Nguyen Thai Ngoc Duy @ 2008-12-12 16:08 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git
In-Reply-To: <alpine.LNX.1.00.0812111520490.19665@iabervon.org>

On 12/12/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  > Well, if you set core.defaultsparse properly, those files should
>  > appear/disappear as you wish (and as of now if you define your
>  > checkout area with "git checkout --{include-,exclude-,}sparse" then
>  > core.defaultsparse should be updated accordingly). I don't say
>  > core.defaultsparse is perfect.
>
>
> Right, so in order to get reasonable behavior, the user must use
>  --{include,exclude}-sparse. I think that this should be the *default*
>  behavior, and probably the *only porcelain-supported* behavior, because
>  otherwise it's confusing.

It's pretty hard (or intrusive) to enforce such behaviour. How about
showing files that do not match core.defaultsparse in "git status"
along with instructions on how to add them to core.defaultsparse? That
way people can keep it consistent, with less modification to the
current code.
-- 
Duy


* Re: Clarifying "invalid tag signature file" error message from git filter-branch (and others)
From: James Youngman @ 2008-12-12 16:05 UTC (permalink / raw)
  To: Jim Meyering; +Cc: Brandon Casey, git
In-Reply-To: <87zlj1hd0r.fsf@rho.meyering.net>

On Fri, Dec 12, 2008 at 11:02 AM, Jim Meyering <jim@meyering.net> wrote:
> "James Youngman" <jay@gnu.org> wrote:
>> On Thu, Dec 11, 2008 at 11:13 PM, Brandon Casey <casey@nrlssc.navy.mil> wrote:
>>
>>>> Before conversion:
>>>> $ git cat-file tag FINDUTILS-4_1-10
>>>> object ce25eb352de8dc53a9a7805ba9efc1c9215d28c2
>>>> type commit
>>>> tag FINDUTILS-4_1-10
>>>> tagger Kevin Dalley
>>>
>>> The tagger field is missing an email address, a timestamp, and a timezone. It
>>> should look something like:
>>>
>>>  tagger Kevin Dalley <kevin.dalley@somewhere.com> 1229036026 -0800
>>>
>>> git-mktag prevents improperly formatted tags from being created by checking
>>> that these fields exist and are well formed.
>>>
>>> If you know the correct values for the missing fields, then you could
>>
>> Yes for the email address.      But as for the timestamp, it's not in
>> the tag file; that only contains the sha1.
>> There is a timestamp in the object being tagged, is that the timestamp
>> you are talking about?
>>
>> $ git show --pretty=raw  ce25eb352de8dc53a9a7805ba9efc1c9215d28c2
>> commit ce25eb352de8dc53a9a7805ba9efc1c9215d28c2
>> tree 752cca144d39bc55d05fbe304752b274ba22641c
>> parent 9a998755249b0c8c47e8657cff712fa506aa30fc
>> author Kevin Dalley <kevin@seti.org> 830638152 +0000
>> committer Kevin Dalley <kevin@seti.org> 830638152 +0000
>>
>>     *** empty log message ***
>>
>> diff --git a/debian.Changelog b/debian.Changelog
>> index e3541eb..d0cd295 100644
>> --- a/debian.Changelog
>> +++ b/debian.Changelog
>> @@ -1,5 +1,7 @@
>>  Sat Apr 27 12:29:06 1996  Kevin Dalley
>> <kevin@aplysia.iway.aimnet.com (Kevin Dalley)>
>>
>> +       * find.info, find.info-1, find.info-2: updated to match find.texi
>> +
>>         * debian.rules (debian): update debian revision to 10
>>
>>         * getline.c (getstr): verify that nchars_avail is *really* greater
>>
>>
>>
>>
>>
>>> recreate the tags before doing the filter-branch. If they are unknown, it
>>> seems valid enough to use the values from the commit that the tag points
>>> to.
>>>
>>> i.e.
>>>
>>>  tagger Kevin Dalley <kevin@seti.org> 830638152 -0000
>>>
>>> What tool was used to convert this repository to git? It should be corrected
>>> to produce valid annotated tags. Especially if it is a tool within git.
>>
>> I don't know, Jim Meyering will know though, so I CC'ed him.
>
> I used parsecvs, probably with git-master from the date of
> the initial conversion (check the archives for actual date).
> That was long enough ago that it was almost certainly before
> git-mktag learned to be more strict about its inputs.
>
> James, since you're about to rewrite the history, you may want to
> start that process from a freshly-cvs-to-git-converted repository.

Maybe, but then afaik CVS tags don't have timestamps, so some of the
data that git-mktag seems to want doesn't exist anyway.

But until we know the answer to the next question, I don't think we
know how we would generate such a freshly-converted repository.

> I'm not very happy about using cvsparse (considering it's not
> really being maintained, afaik), so if the git crowd
> can recommend something better, I'm all ears.

Thanks,
James.


* Re: [JGIT PATCH 03/15] Add IntList as a more efficient representation of List<Integer>
From: Sverre Rabbelier @ 2008-12-12 15:50 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git
In-Reply-To: <20081212154115.GO32487@spearce.org>

On Fri, Dec 12, 2008 at 16:41, Shawn O. Pearce <spearce@spearce.org> wrote:
> If you'd like to send a patch to change it, I'll apply it.  But I
> don't think its worth my time to make this toString() more efficient.

I mainly mentioned it because it's in a class meant to be more efficient
than what Java ships with, but I agree with your reasoning that this
toString is not part of what needs to be optimized.

> Other areas of JGit I do try to micro-optimize, because they are
> right smack in the middle of the critical paths.

Hehe, I very much agree with not optimizing prematurely and, if you do
optimize, with going for it all the way.

> E.g. look at ObjectId.equals(byte[],int,byte[],int). I hand-unrolled
> the memcmp loop because the JIT on x86 does *soooo* much better
> when the code is spelled out:

<code snipped>

Kind of sad that you have to write this kind of code if you want good
performance, ah well, perhaps someday... (import java.lang.optimized
;) ).

> This block is in the critical path for any tree diff code, in
> particular for a "git log -- a/" sort of operation.  Its used
> to compare the SHA-1s from two different tree records to see if
> they differ.  Not unrolling this was a huge penalty.

I reckon that is done a lot :). A shame the JRE can't do that kind of
optimization for you, e.g., if you do:
for(int i = 0; i < constant; i++) {
  some_code;
}

-- 
Cheers,

Sverre Rabbelier


* [PATCH 1/2] cvsserver: add option to configure commit message
From: Fabian Emmes @ 2008-12-12 15:24 UTC (permalink / raw)
  To: git; +Cc: gitster, Fabian Emmes, Lars Noschinski

cvsserver annotates each commit message with "via git-CVS emulator". This is
made configurable via gitcvs.commitmsgannotation.

Signed-off-by: Fabian Emmes <fabian.emmes@rwth-aachen.de>
Signed-off-by: Lars Noschinski <lars@public.noschinski.de>
---
 Documentation/config.txt |    4 ++++
 git-cvsserver.perl       |    8 +++++++-
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b233fe5..ee937fe 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -723,6 +723,10 @@ gc.rerereunresolved::
 	kept for this many days when 'git-rerere gc' is run.
 	The default is 15 days.  See linkgit:git-rerere[1].
 
+gitcvs.commitmsgannotation::
+	Append this string to each commit message. Set to empty string
+	to disable this feature. Defaults to "via git-CVS emulator".
+
 gitcvs.enabled::
 	Whether the CVS server interface is enabled for this repository.
 	See linkgit:git-cvsserver[1].
diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index b0a805c..cbcaeb4 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -1358,7 +1358,13 @@ sub req_ci
     # write our commit message out if we have one ...
     my ( $msg_fh, $msg_filename ) = tempfile( DIR => $TEMP_DIR );
     print $msg_fh $state->{opt}{m};# if ( exists ( $state->{opt}{m} ) );
-    print $msg_fh "\n\nvia git-CVS emulator\n";
+    if ( defined ( $cfg->{gitcvs}{commitmsgannotation} ) ) {
+        if ($cfg->{gitcvs}{commitmsgannotation} !~ /^\s*$/ ) {
+            print $msg_fh "\n\n".$cfg->{gitcvs}{commitmsgannotation}."\n"
+        }
+    } else {
+        print $msg_fh "\n\nvia git-CVS emulator\n";
+    }
     close $msg_fh;
 
     my $commithash = `git-commit-tree $treehash -p $parenthash < $msg_filename`;
-- 
1.6.1.rc2.20.gde0d
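
Read on its own, the new branch in req_ci amounts to a three-way decision.
The sketch below restates it in Java purely for illustration; the names
CommitAnnotation and annotate are hypothetical, and a null argument is an
assumption standing in for gitcvs.commitmsgannotation being unset.

```java
// Hypothetical restatement of the patch's logic, not code from git-cvsserver.
final class CommitAnnotation {
    static final String DEFAULT_ANNOTATION = "via git-CVS emulator";

    // configured == null models the config key being unset (an assumption
    // of this sketch; the Perl code tests defined(...) instead).
    static String annotate(String message, String configured) {
        if (configured == null) {
            // Key unset: keep the historical default annotation.
            return message + "\n\n" + DEFAULT_ANNOTATION + "\n";
        }
        if (configured.matches("\\s*")) {
            // Empty or whitespace-only value: annotation disabled.
            return message;
        }
        return message + "\n\n" + configured + "\n";
    }

    public static void main(String[] args) {
        System.out.print(annotate("Fix typo", null));
        System.out.print(annotate("Fix typo", "imported through CVS"));
    }
}
```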


* [PATCH 2/2] cvsserver: change generation of CVS author names
From: Fabian Emmes @ 2008-12-12 15:24 UTC (permalink / raw)
  To: git; +Cc: gitster, Fabian Emmes, Lars Noschinski
In-Reply-To: <1229095449-24755-1-git-send-email-fabian.emmes@rwth-aachen.de>

The CVS username is generated from the local part of the email address.
We take the whole local part but restrict the character set to the
Portable Filename Character Set, which is used for Unix login names
according to Single Unix Specification v3.

Signed-off-by: Fabian Emmes <fabian.emmes@rwth-aachen.de>
Signed-off-by: Lars Noschinski <lars@public.noschinski.de>
---
 git-cvsserver.perl |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/git-cvsserver.perl b/git-cvsserver.perl
index cbcaeb4..fef7faf 100755
--- a/git-cvsserver.perl
+++ b/git-cvsserver.perl
@@ -2533,12 +2533,18 @@ sub open_blob_or_die
     return $fh;
 }
 
-# Generate a CVS author name from Git author information, by taking
-# the first eight characters of the user part of the email address.
+# Generate a CVS author name from Git author information, by taking the local
+# part of the email address and replacing characters not in the Portable
+# Filename Character Set (see IEEE Std 1003.1-2001, 3.276) by underscores. CVS
+# Login names are Unix login names, which should be restricted to this
+# character set.
 sub cvs_author
 {
     my $author_line = shift;
-    (my $author) = $author_line =~ /<([^>@]{1,8})/;
+    (my $author) = $author_line =~ /<([^@>]*)/;
+
+    $author =~ s/[^-a-zA-Z0-9_.]/_/g;
+    $author =~ s/^-/_/;
 
     $author;
 }
-- 
1.6.1.rc2.20.gde0d
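
To make the effect of the new cvs_author easier to see, here is the same
mapping sketched in Java. CvsAuthor and cvsAuthor are hypothetical names;
returning an empty string when no '<' is found is an assumption of this
sketch, since the Perl version leaves $author undefined in that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical Java rendering of the patched cvs_author, for illustration.
final class CvsAuthor {
    // Everything after the first '<' up to (not including) '@' or '>'.
    private static final Pattern LOCAL_PART = Pattern.compile("<([^@>]*)");

    static String cvsAuthor(String authorLine) {
        Matcher m = LOCAL_PART.matcher(authorLine);
        if (!m.find()) {
            return ""; // assumption: no address present in the author line
        }
        // Replace characters outside the Portable Filename Character Set.
        String author = m.group(1).replaceAll("[^a-zA-Z0-9_.\\-]", "_");
        // A login name should not start with a hyphen.
        return author.replaceFirst("^-", "_");
    }

    public static void main(String[] args) {
        System.out.println(cvsAuthor(
                "Fabian Emmes <fabian.emmes@rwth-aachen.de> 1229095449 +0100"));
        System.out.println(cvsAuthor("J Doe <-j+doe@example.org> 0 +0000"));
    }
}
```

Unlike the old behaviour, the local part is no longer cut to eight
characters, so long local parts now map to distinct CVS names.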


* Re: [JGIT PATCH 03/15] Add IntList as a more efficient representation of List<Integer>
From: Shawn O. Pearce @ 2008-12-12 15:41 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: git
In-Reply-To: <bd6139dc0812120733o7c828532qbcd78c46a321fe6b@mail.gmail.com>

Sverre Rabbelier <srabbelier@gmail.com> wrote:
> On Fri, Dec 12, 2008 at 16:15, Shawn O. Pearce <spearce@spearce.org> wrote:
> > Hmm, yea, good point.  But I don't care too much about the toString()
> > in this case, its meant as a debugging aid and not something one
> > would rely upon.  Hence I didn't think it was worth testing for the
> > empty list, writing the first entry, then doing a loop for [1,count).
> 
> Fair enough :).

If you'd like to send a patch to change it, I'll apply it.  But I
don't think it's worth my time to make this toString() more efficient.

Other areas of JGit I do try to micro-optimize, because they are
right smack in the middle of the critical paths.

E.g. look at ObjectId.equals(byte[],int,byte[],int). I hand-unrolled
the memcmp loop because the JIT on x86 does *soooo* much better
when the code is spelled out:

	public static boolean equals(final byte[] firstBuffer, final int fi,
			final byte[] secondBuffer, final int si) {
		return firstBuffer[fi] == secondBuffer[si]
				&& firstBuffer[fi + 1] == secondBuffer[si + 1]
				&& firstBuffer[fi + 2] == secondBuffer[si + 2]
				&& firstBuffer[fi + 3] == secondBuffer[si + 3]
				&& firstBuffer[fi + 4] == secondBuffer[si + 4]
				&& firstBuffer[fi + 5] == secondBuffer[si + 5]
				&& firstBuffer[fi + 6] == secondBuffer[si + 6]
				&& firstBuffer[fi + 7] == secondBuffer[si + 7]
				&& firstBuffer[fi + 8] == secondBuffer[si + 8]
				&& firstBuffer[fi + 9] == secondBuffer[si + 9]
				&& firstBuffer[fi + 10] == secondBuffer[si + 10]
				&& firstBuffer[fi + 11] == secondBuffer[si + 11]
				&& firstBuffer[fi + 12] == secondBuffer[si + 12]
				&& firstBuffer[fi + 13] == secondBuffer[si + 13]
				&& firstBuffer[fi + 14] == secondBuffer[si + 14]
				&& firstBuffer[fi + 15] == secondBuffer[si + 15]
				&& firstBuffer[fi + 16] == secondBuffer[si + 16]
				&& firstBuffer[fi + 17] == secondBuffer[si + 17]
				&& firstBuffer[fi + 18] == secondBuffer[si + 18]
				&& firstBuffer[fi + 19] == secondBuffer[si + 19];
	}

This block is in the critical path for any tree diff code, in
particular for a "git log -- a/" sort of operation.  It's used
to compare the SHA-1s from two different tree records to see if
they differ.  Not unrolling this was a huge penalty.
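
For comparison, the straightforward loop that the unrolled method replaces
might look like the sketch below. ObjectIdCompare and equalsLoop are
hypothetical names used only here, not JGit API; the sketch assumes 20-byte
SHA-1 ids as in the method above and makes no claim about relative speed on
any particular JVM.

```java
// Hypothetical illustration, not JGit code: a loop-based comparison of two
// 20-byte SHA-1 ids stored at given offsets in two buffers.
final class ObjectIdCompare {
    static final int ID_LENGTH = 20; // raw SHA-1 length in bytes

    static boolean equalsLoop(byte[] firstBuffer, int fi,
            byte[] secondBuffer, int si) {
        for (int i = 0; i < ID_LENGTH; i++) {
            if (firstBuffer[fi + i] != secondBuffer[si + i]) {
                return false; // stop at the first differing byte
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] a = new byte[24];
        byte[] b = new byte[24];
        for (int i = 0; i < ID_LENGTH; i++) {
            a[2 + i] = (byte) i; // id stored at offset 2 in a
            b[4 + i] = (byte) i; // same id stored at offset 4 in b
        }
        System.out.println(equalsLoop(a, 2, b, 4)); // prints "true"
        b[23] ^= 1; // change the last byte of the id in b
        System.out.println(equalsLoop(a, 2, b, 4)); // prints "false"
    }
}
```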

-- 
Shawn.


* Re: [JGIT PATCH 03/15] Add IntList as a more efficient representation of List<Integer>
From: Sverre Rabbelier @ 2008-12-12 15:33 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git
In-Reply-To: <20081212151533.GM32487@spearce.org>

On Fri, Dec 12, 2008 at 16:15, Shawn O. Pearce <spearce@spearce.org> wrote:
> Hmm, yea, good point.  But I don't care too much about the toString()
> in this case, its meant as a debugging aid and not something one
> would rely upon.  Hence I didn't think it was worth testing for the
> empty list, writing the first entry, then doing a loop for [1,count).

Fair enough :).

-- 
Cheers,

Sverre Rabbelier


* Re: [JGIT PATCH] Fix typos in comments / testcase output
From: Shawn O. Pearce @ 2008-12-12 15:30 UTC (permalink / raw)
  To: Mike Ralphson; +Cc: git, Mike Ralphson, Robin Rosenberg
In-Reply-To: <1229079357-19167-1-git-send-email-mike@abacus.co.uk>

Mike Ralphson <mike@abacus.co.uk> wrote:
> Signed-off-by: Mike Ralphson <mike@abacus.co.uk>

Thanks, I've applied this change, and the README correction you
noted but didn't send a patch for.  :-)

> Is it me, or is the actual maintainer of JGIT/EGIT not actually
> mentioned anywhere? Apologies if I've got the to and cc round
> the wrong way 8-)

Robin and I run JGit and EGit as a dual-maintainer approach.
We both have write access to the master repository and we apply
each other's patches rather than push directly ourselves.  It
helps keep us from cutting corners.

Although I just broke that rule by pushing my own patch to README
and my own patch to SUBMITTING_PATCHES to address the other points
you raised, but these are two auxiliary documents that we don't
pay much attention to, hence they have fallen into disarray... :-\

-- 
Shawn.


* Re: [JGIT PATCH 03/15] Add IntList as a more efficient representation of List<Integer>
From: Shawn O. Pearce @ 2008-12-12 15:15 UTC (permalink / raw)
  To: sverre; +Cc: git
In-Reply-To: <bd6139dc0812120243y2b1a3dddu4975162114280e17@mail.gmail.com>

Sverre Rabbelier <alturin@gmail.com> wrote:
> On Fri, Dec 12, 2008 at 03:46, Shawn O. Pearce <spearce@spearce.org> wrote:
> > +       public String toString() {
> > +               final StringBuilder r = new StringBuilder();
> > +               r.append('[');
> > +               for (int i = 0; i < count; i++) {
> > +                       if (i > 0)
> > +                               r.append(", ");
> > +                       r.append(entries[i]);
> > +               }
> > +               r.append(']');
> > +               return r.toString();
> > +       }
> > +}
> 
> If you care about speed in your toString at all, pull the if statement
> out of there. A friend of mine did a small benchmark once, and it was
> _a lot_ slower to do the if in the for loop. I reckon you don't
> though, but just in case ;).

Hmm, yea, good point.  But I don't care too much about the toString()
in this case, it's meant as a debugging aid and not something one
would rely upon.  Hence I didn't think it was worth testing for the
empty list, writing the first entry, then doing a loop for [1,count).
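
The alternative being discussed, with the separator test hoisted out of the
loop, might look roughly like this. IntListFormat and format are
hypothetical stand-ins that take the IntList fields entries and count as
parameters; this is a sketch, not the code in the patch.

```java
// Hypothetical stand-in for IntList.toString(): the "is this the first
// entry?" test is done once, before the loop, instead of on every pass.
final class IntListFormat {
    static String format(int[] entries, int count) {
        final StringBuilder r = new StringBuilder();
        r.append('[');
        if (count > 0) {
            r.append(entries[0]); // first entry, no separator
            for (int i = 1; i < count; i++) {
                r.append(", ").append(entries[i]);
            }
        }
        r.append(']');
        return r.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(new int[] { 1, 2, 3 }, 3)); // prints "[1, 2, 3]"
        System.out.println(format(new int[8], 0)); // prints "[]"
    }
}
```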

-- 
Shawn.


* Re: Saving patches from this list
From: Shawn O. Pearce @ 2008-12-12 15:14 UTC (permalink / raw)
  To: Mike Ralphson, Stefan Näwe; +Cc: git, Johannes Sixt, Junio C Hamano
In-Reply-To: <e2b179460812120107t74a4a8e3y1654233fe2870ac7@mail.gmail.com>

Mike Ralphson <mike.ralphson@gmail.com> wrote:
> 2008/12/12 Stefan Näwe <stefan.naewe+git@gmail.com>
> > > Stefan Näwe schrieb:
> > > > What's the best way to get patches sent to this list in a form suitable
> > > > for 'git am' without subscribing to this list ?

If you find the article on the web with gmane, add '/raw' onto the
end of the direct link URL.  E.g. to get:

  http://article.gmane.org/gmane.comp.version-control.git/102874

use:

  curl http://article.gmane.org/gmane.comp.version-control.git/102874/raw | git am 

> Junio's blog[1] shows he's looking at patchwork. Personally I think it
> would be fantastic to have a public patchwork server available. It
> might avoid the chicken and egg problem in that it's currently easier
> (for some people) to get hold of a patch to play with / review only
> after it's accepted.

One of the things I want to do with Gerrit 2 is teach it to read a
mailing list and convert patches it receives into temporary branches
that can be fetched over git://, and also create records in its web
database so reviews can be done on the web interface, then let it
CC the list back with a proper In-Reply-To when comments are posted
on the web to a change it received by email.

IOW, I want to make Gerrit 2 useful to the git community to monitor
patch state without changing our current email based workflow.
But I'm still a good two or three months from being able to do that.
Android's workflow is higher priority to me right now.

-- 
Shawn.


* Re: help needed: Splitting a git repository after subversion migration
From: Björn Steinbrink @ 2008-12-12 14:49 UTC (permalink / raw)
  To: Thomas Jarosch; +Cc: Michael J Gruber, git
In-Reply-To: <200812121522.38791.thomas.jarosch@intra2net.com>

On 2008.12.12 15:22:15 +0100, Thomas Jarosch wrote:
> On Thursday, 11. December 2008 09:10:09 you wrote:
> > > Now I'll manually check the history of the tags/ and branches/ folder
> > > for more funny tags and write down the revision. If I understood
> > > the git-svn man page correctly, I should be able to specifiy
> > > revision ranges it's going to import. I'll try to skip the broken tags.
> >
> > As long as the breakage only involves branches/tags that are completely
> > useless, it's probably a lot easier to just delete them afterwards.
> >
> > And if you accidently added changes to a tag, after it was created, it's
> > also easier to manually tag to right version in git, and just forgetting
> > about the additional commit.
> >
> > And for a bunch of other cases, rebase -i/filter-branch are probably
> > also better options ;-)
> >
> > Skipping revisions in a git-svn import sounds rather annoying and
> > error-prone.
> 
> Sounds very reasonable. When I'm done filtering with filter-branch,
> the original commits are still stored in "refs/originals" and the reflogs.
> What's the best way to get rid of those to free up the space?

See the "purging unwanted history" thread:

http://n2.nabble.com/purging-unwanted-history-td1507638.html

The commands there (starting with the "git for-each-ref") should clean
out all the pre-filter-branch stuff.

> A nice way to find the corresponding commit for a file can be found here: 
> http://stackoverflow.com/questions/223678/git-which-commit-has-this-blob

Yeah, I think something similar (or even the same?) is in the git wiki
somewhere. I never had any use for it though ;-)

Björn

