Git development
 help / color / mirror / Atom feed
* [PATCH 1/3] Move CodingGuidelines to ./technical/coding-guidelines.txt and SubmittingPatches to ./technical/submitting-patches.txt
From: Thomas Ackermann @ 2012-12-30 10:31 UTC (permalink / raw)
  To: th.acker, git
In-Reply-To: <1023165134.213650.1356863340563.JavaMail.ngmail@webmail06.arcor-online.net>


Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 Documentation/{CodingGuidelines => technical/coding-guidelines.txt}   | 0
 Documentation/{SubmittingPatches => technical/submitting-patches.txt} | 0
 2 files changed, 0 insertions(+), 0 deletions(-)
 rename Documentation/{CodingGuidelines => technical/coding-guidelines.txt} (100%)
 rename Documentation/{SubmittingPatches => technical/submitting-patches.txt} (100%)

diff --git a/Documentation/CodingGuidelines b/Documentation/technical/coding-guidelines.txt
similarity index 100%
rename from Documentation/CodingGuidelines
rename to Documentation/technical/coding-guidelines.txt
diff --git a/Documentation/SubmittingPatches b/Documentation/technical/submitting-patches.txt
similarity index 100%
rename from Documentation/SubmittingPatches
rename to Documentation/technical/submitting-patches.txt
-- 
1.8.0.msysgit.0


---
Thomas

^ permalink raw reply

* [PATCH 0/3] Move CodingGuidelines and SubmittingPatches to ./Documentation/technical
From: Thomas Ackermann @ 2012-12-30 10:29 UTC (permalink / raw)
  To: th.acker, git


CodingGuidelines and SubmittingPatches are IMHO a little bit hidden in ./Documentation
and with respect to their content should be better placed in ./Documentation/technical.

[PATCH 1/3] Move CodingGuidelines to ./technical/coding-guidelines.txt and SubmittingPatches to ./technical/submitting-patches.txt
[PATCH 2/3] Add coding-guidelines.txt and submitting-patches.txt to ./Documentation/Makefile
[PATCH 3/3] Convert coding-guidelines.txt and submitting-patches.txt to asciidoc


---
Thomas

^ permalink raw reply

* Re: Fail to push over HTTP with MySQL authentication (Apache2)
From: Davide Baldini @ 2012-12-30  0:27 UTC (permalink / raw)
  To: git
In-Reply-To: <20121229210128.GB21058@sigill.intra.peff.net>

On 12/29/12 22:01, Jeff King wrote:

> However, before trying to investigate that avenue, have you considered
> using git's smart-http backend instead of WebDAV? It's significantly
> more efficient. You can get details and example apache configuration
> from "git help http-backend".

Thank you for the suggestions, I've been able to have the setup running.
I abandoned the horrible WebDAV in favour of Smart HTTP, and it's been
quite straightforward to configure and run successfully.

There is some documentation on setting up DAV here:
> 
>   https://github.com/git/git/blob/master/Documentation/howto/setup-git-server-over-http.txt
> 
> but I have no idea if it is up-to-date or not.

No it's not up-to-date. It should refer to Apache versions <2.0 .
Some of its config directives are inexistent in modern Apaches.

- Davide

^ permalink raw reply

* [RFC/PATCH] gitk: Visualize a merge commit with a right-click in gitk
From: Jason Holden @ 2012-12-30  0:16 UTC (permalink / raw)
  To: git; +Cc: paulus, Jason Holden

When first doing a merge in git-gui, the "Visualize Merge" button is
quite helpful to visualize the changes due to a merge.
But once the merge is complete, there's not a similarly convenient
way to recreate that merge view in gitk.

This commit adds to gitk the ability to right-click on a merge commit and
bring up a new gitk window displaying only those commits involved in
the merge.

When right-clicking on a non-merge commit, this option is grayed out.  This
patch also supports correct visualization of octopus merges

Signed-off-by: Jason Holden <jason.k.holden.swdev@gmail.com>
---
 gitk | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/gitk b/gitk
index 379582a..17e1fcb 100755
--- a/gitk
+++ b/gitk
@@ -2551,6 +2551,7 @@ proc makewindow {} {
 	{mc "Compare with marked commit" command compare_commits}
 	{mc "Diff this -> marked commit" command {diffvsmark 0}}
 	{mc "Diff marked commit -> this" command {diffvsmark 1}}
+	{mc "Visualize this merge" command visualize_merge}
     }
     $rowctxmenu configure -tearoff 0
 
@@ -2590,6 +2591,31 @@ proc makewindow {} {
     $diff_menu configure -tearoff 0
 }
 
+# Return the number of parents for a given sha1 id
+proc get_numparents_from_id {id} {
+    global parentlist
+    set row [rowofcommit $id]
+    return [llength [lindex $parentlist $row]]
+}
+
+proc visualize_merge {} {
+    global parents currentid parentlist
+    global rowmenuid
+
+    set num_parents [get_numparents_from_id $rowmenuid]
+    set row [rowofcommit $rowmenuid]
+
+    if {$num_parents >= 2} {
+	set revlist $rowmenuid
+	for { set i 1 } {$i < $num_parents} {incr i} {
+
+	    set revlist "$revlist [lindex $parentlist $row 0]..[lindex $parentlist $row $i] $rowmenuid"
+	}
+	
+	eval exec gitk $revlist
+    }
+}
+
 # Windows sends all mouse wheel events to the current focused window, not
 # the one where the mouse hovers, so bind those events here and redirect
 # to the correct window
@@ -8577,6 +8603,13 @@ proc rowmenu {x y id} {
 	$menu entryconfigure 9 -state $mstate
 	$menu entryconfigure 10 -state $mstate
 	$menu entryconfigure 11 -state $mstate
+
+	# Disable visualize-merge on only one parent
+	if {[get_numparents_from_id $id] == 1} {
+	    $menu entryconfigure 15 -state disabled
+	} else {
+	    $menu entryconfigure 15 -state normal
+	}
     } else {
 	set menu $fakerowmenu
     }
-- 
1.8.1.rc3.28.g0ab5d1f

^ permalink raw reply related

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
From: Martin Fick @ 2012-12-29 22:18 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano
In-Reply-To: <20121229081021.GC15408@sigill.intra.peff.net>

Jeff King <peff@peff.net> wrote:

>On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:
>> My idea is based on using filenames to store sha1s instead of 
>> file contents.  To do this, the sha1 one of a ref would be 
>> stored in a file in a directory named after the loose ref.  I 
>> believe this would then make it possible to have lockless 
>> atomic ref updates by renaming the file.
>> 
>> To more fully illustrate the idea, imagine that any file 
>> (except for the null file) in the directory will represent the 
>> value of the ref with its name, then the following 
>> transitions can represent atomic state changes to a refs 
>> value and existence:
>
>Hmm. So basically you are relying on atomic rename() to move the value
>around within a directory, rather than using write to move it around
>within a file. Atomic rename is usually something we have on local
>filesystems (and I think we rely on it elsewhere). Though I would not
>be
>surprised if it is not atomic on all networked filesystems (though it
>is
>on NFS, at least).

Yes.  I assume this is OK because doesn't git already rely on atomic renames?  For example to rename the new packed-refs file to unlock it?

...

>> 3) To create a ref, it must be renamed from the null file (sha 
>> 0000...) to the new value just as if it were being updated 
>> from any other value, but there is one extra condition: 
>> before renaming the null file, a full directory scan must be 
>> done to ensure that the null file is the only file in the 
>> directory (this condition exists because creating the 
>> directory and null file cannot be atomic unless the filesystem 
>> supports atomic directory renames, an expectation git does 
>> not currently make).  I am not sure how this compares to 
>> today's approach, but including the setup costs (described 
>> below), I suspect it is slower.
>
>Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
>create the correct sha1 file?

But then a process could mkdir and die leaving a stale empty dir with no reliable recovery mechanism.


Unfortunately, I think I see another flaw though! :( I should have known that I cannot separate an important check from its state transitioning action.  The following could happen:

 A does mkdir
 A creates null file
 A checks dir -> no other files 
 B checks dir -> no other files
 A renames null file to abcd
 C creates second null file 
 B renames second null file to defg

One way to fix this is to rely on directory renames, but I believe this is something git does not want to require of every FS? If we did, we could Change #3 to be:

3) To create a ref, it must be renamed from the null file (sha 0000...) to the new value just as if it were being updated from any other value. (No more scan)

Then, with reliable directory renames, a process could do what you suggested to a temporary directory, mkdir + create null file, then rename the temporary dir to refname.  This would prevent duplicate null files.  With a grace period, the temporary dirs could be cleaned up in case a process dies before the rename.  This is your approach with reliable recovery.


>> I don't know how this new scheme could be made to work with 
>> the current scheme, it seems like perhaps new git releases 
>> could be made to understand both the old and the new, and a 
>> config option could be used to tell it which method to write 
>> new refs with.  Since in this new scheme ref directory names 
>> would conflict with old ref filenames, this would likely 
>> prevent both schemes from erroneously being used 
>> simultaneously (so they shouldn't corrupt each other), except 
>> for the fact that refs can be nested in directories which 
>> confuses things a bit.  I am not sure what a good solution to 
>> this is?
>
>I think you would need to bump core.repositoryformatversion, and just
>never let old versions of git access the repository directly. Not the
>end of the world, but it certainly increases deployment effort. If we
>were going to do that, it would probably make sense to think about
>solving the D/F conflict issues at the same time (i.e., start calling
>"refs/heads/foo" in the filesystem "refs.d/heads.d/foo.ref" so that it
>cannot conflict with "refs.d/heads.d/foo.d/bar.ref").

Wouldn't you want to use a non legal ref character instead of dot? And without locks, we free up more of the ref namespace too I think? (Refs could end in ".lock")

-Martin

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora Forum

^ permalink raw reply

* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
From: Martin Fick @ 2012-12-29 21:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git, Junio C Hamano, Shawn Pearce
In-Reply-To: <20121229081200.GD15408@sigill.intra.peff.net>

Jeff King <peff@peff.net> wrote:

>On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:
>
>> Hmm, actually I believe that with a small modification to the 
>> semantics described here it would be possible to make multi 
>> repo/branch commits work.   Simply allow the ref filename to 
>> be locked by a transaction by appending the transaction ID to 
>> the filename.  So if transaction 123 wants to lock master 
>> which points currently to abcde, then it will move 
>> master/abcde to master/abcde_123.  If transaction 123 is 
>> designed so that any process can commit/complete/abort it 
>> without requiring any locks which can go stale, then this ref 
>> lock will never go stale either (easy as long as it writes 
>> all its proposed updates somewhere upfront and has atomic 
>> semantics for starting, committing and aborting).  On commit, 
>> the ref lock gets updated to its new value: master/newsha and 
>> on abort it gets unlocked: master/abcde.
>
>Hmm. I thought our goal was to avoid locks? Isn't this just locking by
>another name?

It is a lock, but it is a lock with an owner: the transaction.  If the transaction has reliable recovery semantics, then the lock will be recoverable also.  This is possible if we have lock ownership (the transaction) which does not exist today for the ref locks.  With good lock ownership we gain the ability to reliably delete locks for a specific owner without the risk of deleting the lock when held by another owner (putting the owner in the filename is "good", while putting the owner in the filecontents is not).   Lastly, for reliable recovery of stale locks we need the ability to determine when an owner has abandoned a lock.  I believe that the transaction semantics laid out below give this.


>I guess your point is to have no locks in the "normal" case, and have
>locked transactions as an optional add-on?

Basically.  If we design the transaction into the git semantics we could ensure that it is recoverable and we should not need to expose these reflocks outside of the transaction APIs.

To illustrate a simple transaction approach (borrowing some of Shawn's ideas), we could designate a directory to hold transaction files *1.  To prepare a transaction: write a list of repo:ref:oldvalue:newvalue to a file named id.new (in a stable sorted order based on repo:ref to prevent deadlocks).  This is not a state change and thus this file could be deleted by any process at anytime (preferably after a long grace period).

If file renames are atomic on the filesystem holding the transaction files then 1, 2, 3 below will be atomic state changes.  It does not matter who performs state transitions 2 or 3.  It does not matter who implements the work following any of the 3 transitions, many processes could attempt the work in parallel (so could a human).
 
1) To start the transaction, rename the id.new file to id.  If the rename fails, start over if desired/still possible.  On success, ref locks for each entry should be acquired in listed order (to prevent deadlocks), using transaction id and oldvalue.  It is never legal to unlock a ref in this state (because a block could cause the unlock to be delayed until the commit phase).  However, it is legal for any process to transition to abort at any time from this state, perhaps because of a failure to acquire a lock (held by another transaction), and definitely if a ref has changed (is no longer oldvalue).

2) To abort the transaction, rename the id file to id.abort.  This should only ever fail if commit was achieved first.  Once in this state, any process may/should unlock any ref locks belonging to this transaction id.  Once all refs are unlocked, id.abort may be deleted (it could be deleted earlier, but then cleanup will take longer).

3) To commit the transaction, rename the file to id.commit.  This should only ever fail if abort was achieved first. This transition should never be done until every listed ref is locked by the current transaction id.  Once in this phase, all refs may/should be moved to their new values and unlocked by any process. Once all refs are unlocked, id.commit may be deleted. 

Since any process attempting any of the work in these transactions could block at any time for an indefinite amount of time, these processes may wake after the transaction is aborted or comitted and the transaction files are cleaned up.  I believe that in these cases the only actions which could succeed by these waking processes is the ref locking action.  All such abandoned ref locks may/should be unlocked by any process.  This last rule means that no transaction ids should ever be reused,

-Martin


*1 We may want to adapt the simple model illustrated above to use git mechanisms such as refs to hold transaction info instead of files in a directory, and git submodule files to hold the list of refs to update.  

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora Forum

^ permalink raw reply

* Re: Lockless Refs?
From: Martin Fick @ 2012-12-29 21:15 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: Michael Haggerty, git, Shawn Pearce
In-Reply-To: <20121229081657.GE15408@sigill.intra.peff.net>



Jeff King <peff@peff.net> wrote:

>On Fri, Dec 28, 2012 at 09:15:52AM -0800, Junio C Hamano wrote:
>
>> Martin Fick <mfick@codeaurora.org> writes:
>> 
>> > Hmm, actually I believe that with a small modification to the 
>> > semantics described here it would be possible to make multi 
>> > repo/branch commits work....
>> >
>> > Shawn talked about adding multi repo/branch transaction 
>> > semantics to jgit, this might be something that git wants to 
>> > support also at some point?
>> 
>> Shawn may have talked about it and you may have listened to it, but
>> others wouldn't have any idea what kind of "multi repo/branch
>> transaction" you are talking about.  Is it about "I want to push
>> this ref to that repo and push this other ref to that other repo",
>> in what situation will it be used/useful, what are the failure
>> modes, what are failure tolerances by the expected use cases, ...?
>> 
>> Care to explain?
>
>I cannot speak for Martin, but I am assuming the point is to atomically
>update 2 (or more) refs on the same repo. That is, if I have a branch
>"refs/heads/foo" and a ref pointing to meta-information (say, notes
>about commits in foo, in "refs/notes/meta/foo"), I would want to "git
>push" them, and only update them if _both_ will succeed, and otherwise
>fail and update nothing.

My use case was cross repo/branch dependencies in Gerrit (which do not yet exist). Users want to be able to define several changes (destined for different project/branches) which can only be merged together.  If one change cannot be merged, the others should fail too.  The solutions we can think of generally need to hold ref locks while acquiring more ref locks, this drastically increases the opportunities for stale locks over the simple "lock, check, update, unlock" mode which git locks are currently used for.

I was perhaps making too big of a leap to assume that there would be other non Gerrit uses cases for this?  I assumed that other git projects which are spread across several git repos would need this? But maybe this simply wouldn't be practical with other git server solutions?

-Martin

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora Forum

^ permalink raw reply

* Re: Fail to push over HTTP with MySQL authentication (Apache2)
From: Jeff King @ 2012-12-29 21:01 UTC (permalink / raw)
  To: Davide Baldini; +Cc: git
In-Reply-To: <50DF4A78.5000206@gmail.com>

On Sat, Dec 29, 2012 at 08:54:32PM +0100, Davide Baldini wrote:

> SETUP:
> -----
> [...]
> Git repository has been configured as:
>     cd /var/www/public/GT_rulesets/GT00.git
>     git init --bare
>     mv hooks/post-update.sample hooks/post-update
>     git update server-info
>     chmode 777 /var/www/public/GT_rulesets/GT00.git  # for testing.

Should this last line be a "chmod -R"? Git init will create many
subdirectories, and you want to make sure they are all writable for
push.

> _   Git:
>         Git can fetch the repository without problems:
>         git clone http://username:password@greatturn.org:8081/GT00.git
> 
> Pushing the locally fetched repository back to the remote one doesn't
> work:
> [...]
> > 87.19.240.177 - - [29/Dec/2012:16:43:26 +0100] "PROPFIND /GT00.git/ HTTP/1.1" 401 767 "-" "git/1.7.8.3"

If fetch is working and push is not, I'd suspect WebDAV configuration
problems (and indeed, your credentials seem fine, but the PROPFIND is
returning a 401). Fetch works over stock HTTP and does not use WebDAV at
all. There is some documentation on setting up DAV here:

  https://github.com/git/git/blob/master/Documentation/howto/setup-git-server-over-http.txt

but I have no idea if it is up-to-date or not.

However, before trying to investigate that avenue, have you considered
using git's smart-http backend instead of WebDAV? It's significantly
more efficient. You can get details and example apache configuration
from "git help http-backend".

-Peff

^ permalink raw reply

* [BUG] two-way read-tree can write null sha1s into index
From: Jeff King @ 2012-12-29 20:51 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Junio C Hamano
In-Reply-To: <20121229110541.GA1408@sigill.intra.peff.net>

On Sat, Dec 29, 2012 at 06:05:41AM -0500, Jeff King wrote:

>   [clear state from last run]
>   $ rm -rf .git/rebase-apply
>   $ git reset --hard
> 
>   [apply the patch; we get a conflict]
>   $ git am -3sc queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
> 
>   [now run just the read-tree from "am --abort"]
>   $ git.compile read-tree --reset -u HEAD ORIG_HEAD
>   warning: cache entry has null sha1: sound/usb/midi.c
> 
>   [and now check our index]
>   $ git ls-files -s sound/usb/midi.c
>   100644 0000000000000000000000000000000000000000 0 sound/usb/midi.c
> 
>   [yes, this index is bogus]
>   $ git write-tree
>   error: invalid object 100644 0000000000000000000000000000000000000000 for 'sound/usb/midi.c'
>   fatal: git-write-tree: error building trees
> 
> So I think this check may actually be finding a real bug. I have seen
> these null sha1s in the wild, but I was never able to track down the
> actual cause. Maybe this will give us a clue. Now we just need to work
> backwards and figure out who is putting it in the in-memory index and
> why.

I made some progress on this, but I'd like a sanity check from others
(especially Junio). As far as I can tell, this is a bug in read-tree.

When we call "read-tree --reset -u HEAD ORIG_HEAD", the first thing we
do with the index is call read_cache_unmerged. Originally that would
read the index, leaving aside any unmerged entries. However, as of
d1a43f2 (reset --hard/read-tree --reset -u: remove unmerged new paths,
2008-10-15), it actually creates a new cache entry. This serves as a
placeholder, so that we later know to update the working tree.

However, we later noticed that the sha1 of that unmerged entry was
just copied from some higher stage, leaving you with random content in
the index.  That was fixed by e11d7b5 ("reset --merge": fix unmerged
case, 2009-12-31), which instead puts the null sha1 into the newly
created entry, and sets a CE_CONFLICTED flag. At the same time, it
teaches the unpack-trees machinery to pay attention to this flag, so
that oneway_merge throws away the current value.

However, it did not update the code paths for  twoway_merge, which is
where we end up in the read-tree above. We notice that the HEAD and
ORIG_HEAD versions are the same, and say "oh, we can just reuse the
current version". But that's not true. The current version is bogus.

So I think we need to update twoway_merge to recognize unmerged entries,
which gives us two options:

  1. Reject the merge.

  2. Throw away the current unmerged entry in favor of the "new" entry
     (when old and new are the same, of course; otherwise we would
     reject).

I think (2) is the right thing. It fixes the entry of the bogus sha1
into the index, _and_ it solves the problem that "git am --abort" leaves
the conflicted entry as a modification. It should just go away. But
maybe I am forgetting some other case where read-tree should be more
conservative, and (1) is a safer choice.

Something like this patch:

diff --git a/unpack-trees.c b/unpack-trees.c
index 6d96366..e06e01f 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1746,14 +1746,19 @@ int twoway_merge(struct cache_entry **src, struct unpack_trees_options *o)
 		newtree = NULL;
 
 	if (current) {
-		if ((!oldtree && !newtree) || /* 4 and 5 */
-		    (!oldtree && newtree &&
-		     same(current, newtree)) || /* 6 and 7 */
-		    (oldtree && newtree &&
-		     same(oldtree, newtree)) || /* 14 and 15 */
-		    (oldtree && newtree &&
-		     !same(oldtree, newtree) && /* 18 and 19 */
-		     same(current, newtree))) {
+		if (current->ce_flags & CE_CONFLICTED) {
+			if (same(oldtree, newtree))
+				return merged_entry(newtree, current, o);
+			return o->gently ? -1 : reject_merge(current, o);
+		}
+		else if ((!oldtree && !newtree) || /* 4 and 5 */
+			 (!oldtree && newtree &&
+			  same(current, newtree)) || /* 6 and 7 */
+			 (oldtree && newtree &&
+			  same(oldtree, newtree)) || /* 14 and 15 */
+			 (oldtree && newtree &&
+			  !same(oldtree, newtree) && /* 18 and 19 */
+			  same(current, newtree))) {
 			return keep_entry(current, o);
 		}
 		else if (oldtree && !newtree && same(current, oldtree)) {

I suspect threeway_merge may need a similar update, but I haven't looked
too carefully yet.

-Peff

^ permalink raw reply related

* Fail to push over HTTP with MySQL authentication (Apache2)
From: Davide Baldini @ 2012-12-29 19:54 UTC (permalink / raw)
  To: git

Hi,

I'm not able to setup a public Git repository over plain HTTP with
MySQL authentication.
Both HTTP and authentication are provided by Apache2.

SETUP:
-----

This setup is performed on Debian 6.0.4.

Apache2 (v. 2.2), with modules:
    auth_mysql
    WebDAV

Git (v. 1.7.8.3)
Git repository location:
    local, for webserver: /var/www/public/GT_rulesets/GT00.git
    public, for Git:      http://greatturn.org:8081/GT00.git

Git repository has been configured as:
    cd /var/www/public/GT_rulesets/GT00.git
    git init --bare
    mv hooks/post-update.sample hooks/post-update
    git update server-info
    chmode 777 /var/www/public/GT_rulesets/GT00.git  # for testing.


FACTS:
-----

The Apache side of my setup seems to work:
_   HTTP, MySQL authentication:
        I point Iceweasel to http://greatturn.org:8081/ .
        It asks for authentication; I authenticate with a username/
        password pair taken from MySQL database (which doesn't exist as
        a system user); It works, and I can see all the content of
        the git repository "GT00.git".
_   WebDAV:
        I point Konqueror to webdav://greatturn.org:8081/ .
        Works exactly as previous point.
_   Git:
        Git can fetch the repository without problems:
        git clone http://username:password@greatturn.org:8081/GT00.git

Pushing the locally fetched repository back to the remote one doesn't
work:
    "git push http://greatturn.org:8081/GT00.git master"
    asks for username and password:
        > Username for 'greatturn.org:8081':
        > Password for 'greatturn.org:8081':

    I enter my credentials, then git outputs the following and exits:
        > error: Cannot access URL http://greatturn.org:8081/GT00.git/,
return code 22
        > fatal: git-http-push failed

    On Apache's access.log, git produces all and no more than the
    following:
        > 87.19.240.177 - - [29/Dec/2012:16:43:22 +0100] "GET /GT00.git
/info/refs?service=git-receive-pack HTTP/1.1" 401 767 "-"
"git/1.7.8.3"
        > 87.19.240.177 - - [29/Dec/2012:16:43:26 +0100] "GET
/GT00.git/info/refs?service=git-receive-pack HTTP/1.1" 401 767 "-"
"git/1.7.8.3"
        > 87.19.240.177 - davide [29/Dec/2012:16:43:26 +0100] "GET
/GT00.git/info/refs?service=git-receive-pack HTTP/1.1" 200 233 "-"
"git/1.7.8.3"
        > 87.19.240.177 - davide [29/Dec/2012:16:43:26 +0100] "GET
/GT00.git/HEAD HTTP/1.1" 200 258 "-" "git/1.7.8.3"
        > 87.19.240.177 - - [29/Dec/2012:16:43:26 +0100] "PROPFIND
/GT00.git/ HTTP/1.1" 401 767 "-" "git/1.7.8.3"

^ permalink raw reply

* Re: [PATCH 1/4] hooks: Add function to check if a hook exists
From: Junio C Hamano @ 2012-12-29 16:54 UTC (permalink / raw)
  To: Aaron Schrab; +Cc: git
In-Reply-To: <20121229145032.GB3789@pug.qqx.org>

Aaron Schrab <aaron@schrab.com> writes:

> Since I'm going to be changing the interface for this hook in v2 of
> the series so that it will be more complicated than can be readily
> addressed with the run_hook() API (and will have use a fixed number of
> arguments anyway) I'll be dropping the run_hook_argv() function.

Just to make sure there is no misunderstanding (sorry for sending
the message without finishing it with this clarification at the end
in the first place).  I didn't mean that converting all of the
existing callers must come earlier than introducing a new hook
invoker.

I just wanted to make sure that we are aware that we are adding to
our technical debt, if we are adding another that is also
specialized; as the proposed interface looked sufficiently generic,
it would be the ideal one to make _other_ ones thin wrappers around
it to unify the various codepaths.

Thanks.

^ permalink raw reply

* Re: [PATCH 0/4] pre-push hook support
From: Junio C Hamano @ 2012-12-29 16:48 UTC (permalink / raw)
  To: Aaron Schrab; +Cc: git
In-Reply-To: <20121229145025.GA3789@pug.qqx.org>

Aaron Schrab <aaron@schrab.com> writes:

> At 18:01 -0800 28 Dec 2012, Junio C Hamano <gitster@pobox.com> wrote:
>>Will it be "all-or-none", or "I'll allow these but not those"?
>
> Currently it just uses the exit code to communicate that back, so it's
> all-or-none.  I think I'll keep that in the updated version as well.

Thanks; that sounds sensible.

^ permalink raw reply

* Re: Hold your fire, please
From: Adam Spiers @ 2012-12-29 15:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vehi9hgz3.fsf@alter.siamese.dyndns.org>

On Fri, Dec 28, 2012 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Primarily in order to force me concentrate on the releng for the
>> upcoming release, and also to encourage contributors to focus on
>> finding and fixing any last minute regressions (rather than
>> distracting others by showing publicly scratching their itches), I
>> won't be queuing any patch that is not a regression fix, at least
>> for the next few days.  I may not even review them.
>>
>> Thanks.
>
> Just as a friendly reminder, I am reposting this to remind people.

Ah, I thought this period had elapsed already.  I assume this applies
to check-ignore, in which case how long should I hold off before
submitting v4?

>> And have a nice holiday if you are in areas where it is a holiday
>> season ;-)

You too :-)

^ permalink raw reply

* Re: [PATCH 1/4] hooks: Add function to check if a hook exists
From: Aaron Schrab @ 2012-12-29 14:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vwqw1fw5a.fsf@alter.siamese.dyndns.org>

At 18:08 -0800 28 Dec 2012, Junio C Hamano <gitster@pobox.com> wrote:
>Aaron Schrab <aaron@schrab.com> writes:
>
>> Create find_hook() function to determine if a given hook exists and is
>> executable.  If it is the path to the script will be returned, otherwise
>> NULL is returned.
>
>Sounds like a sensible thing to do.  To make sure the API is also
>sensible, all the existing hooks should be updated to use this API,
>no?

I'd been trying to keep the changes limited.  I'll see about modifying 
the existing places that run hooks in v2 of the series.

>> This is in support for an upcoming run_hook_argv() function which will
>> expect the full path to the hook script as the first element in the
>> argv_array.
>
>There is currently a public function called run_hook() that squats
>on the good name with a kludgy API that is too specific to using
>separate index file.  Back when it was a private helper in the
>implementation of "git commit", it was perfectly fine, but it was
>exported without giving much thought on the API.
>
>If you are introducing a new run_hook_* function, give it a generic
>enough API that lets all the existing hook callers to use it.  I
>would imagine that the API requirement may be modelled after
>run_command() API so that we can pass argv[] and tweak the hook's
>environ[], as well as feeding its stdin and possibly reading from
>its stdout.  That would be very useful.

I think the attraction of the run_hook() API is its simplicity.  It's 
currently a fairly thin wrapper around the run_command() API.  I suspect 
that if the run_hook() API were made generic enough to support all of 
the existing hook callers it would greatly complicate the existing calls 
to run_hook() while not providing much benefit to hook callers which 
can't currently use it beyond what run_command() offers.

Since I'm going to be changing the interface for this hook in v2 of the 
series so that it will be more complicated than can be readily addressed 
with the run_hook() API (and will have use a fixed number of arguments 
anyway) I'll be dropping the run_hook_argv() function.

^ permalink raw reply

* Re: [PATCH 0/4] pre-push hook support
From: Aaron Schrab @ 2012-12-29 14:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1ue9hb06.fsf@alter.siamese.dyndns.org>

At 18:01 -0800 28 Dec 2012, Junio C Hamano <gitster@pobox.com> wrote:
>One lesson we learned long time ago while doing hooks is to avoid
>unbound number of command line arguments and instead feed them from
>the standard input.  I think this should do the same.

Good point.  I had been trying to keep the interface for this hook as 
close as possible to the ones for other client-side hooks on the theory 
that less development effort may go into those than for server-side 
hooks.  But thinking on that more I certainly see that this could easily 
run into limits on argument length on some systems; especially when it's 
likely that each of those arguments is likely to be over 100 bytes long.

I'll work on an updated version which sends the variable length 
information over a pipe, using the command-line arguments only to pass 
the remote name and URL.

>How does the hook communicate its decision to the calling Git?
>
>Will it be "all-or-none", or "I'll allow these but not those"?

Currently it just uses the exit code to communicate that back, so it's 
all-or-none.  I think I'll keep that in the updated version as well.

A future enhancement could modify the protocol to support reading from 
the hook's stdout the names of remote refs which are to be rejected, I 
think that just having the option for all-or-nothing is a good starting 
point.

^ permalink raw reply

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jeff King @ 2012-12-29 11:05 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20121229103430.GG18903@elie.Belkin>

On Sat, Dec 29, 2012 at 02:34:30AM -0800, Jonathan Nieder wrote:

> >>   $ git am --abort
> >>   Unstaged changes after reset:
> >>   M       sound/usb/midi.c
> >
> > What does your index look like afterwards? Does it have a null sha1 in
> > it (check "ls-files -s")?
> 
> $ git diff-index --abbrev HEAD
> :100644 100644 eeefbce3873c... 000000000000... M	sound/usb/midi.c
> $ git ls-files --abbrev -s sound/usb/midi.c
> 100644 eeefbce3873c 0	sound/usb/midi.c

Hmm. It looks like "am --abort" overwrites the index again after the
read-tree which complains.  If I downgrade the error in write_index to a
warning, like this:

diff --git a/read-cache.c b/read-cache.c
index fda78bc..70a6d86 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1797,7 +1797,7 @@ int write_index(struct index_state *istate, int newfd)
 		if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
 			ce_smudge_racily_clean_entry(ce);
 		if (is_null_sha1(ce->sha1))
-			return error("cache entry has null sha1: %s", ce->name);
+			warning("cache entry has null sha1: %s", ce->name);
 		if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
 			return -1;
 	}

and then just run this:

  [clear state from last run]
  $ rm -rf .git/rebase-apply
  $ git reset --hard

  [apply the patch; we get a conflict]
  $ git am -3sc queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch

  [now run just the read-tree from "am --abort"]
  $ git.compile read-tree --reset -u HEAD ORIG_HEAD
  warning: cache entry has null sha1: sound/usb/midi.c

  [and now check our index]
  $ git ls-files -s sound/usb/midi.c
  100644 0000000000000000000000000000000000000000 0 sound/usb/midi.c

  [yes, this index is bogus]
  $ git write-tree
  error: invalid object 100644 0000000000000000000000000000000000000000 for 'sound/usb/midi.c'
  fatal: git-write-tree: error building trees

So I think this check may actually be finding a real bug. I have seen
these null sha1s in the wild, but I was never able to track down the
actual cause. Maybe this will give us a clue. Now we just need to work
backwards and figure out who is putting it in the in-memory index and
why.

I'll try to work on it tomorrow, but please don't let that stop you if
you want to keep digging in the meantime.

-Peff

^ permalink raw reply related

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jonathan Nieder @ 2012-12-29 10:51 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20121229104247.GA30283@sigill.intra.peff.net>

Jeff King wrote:

> Hrm. But your output does not say there is a conflict. It says you have
> a local modification and it does not try the merge:

That's probably operator error on my part when gathering output to
paste into the email.

In other words, nothing to see there. :)  Sorry for the confusion.

[...]
> If I try to apply it, I get a real conflict:
[...]
> Although running "git am --abort" after that does seem to produce the
> "cache entry has null sha1" error.

Yep, that is what my report should have said.

'night,
Jonathan

^ permalink raw reply

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jeff King @ 2012-12-29 10:42 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20121229103430.GG18903@elie.Belkin>

On Sat, Dec 29, 2012 at 02:34:30AM -0800, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > I can't reproduce here. I can checkout v3.2.35, and I guess that the
> > patch you are applying comes from f5f1654, but I don't know your
> > local modification to sound/usb/midi.c.
> 
> No local modification.  The unstaged change after "git am --abort" to
> recover from a conflicted git am is a longstanding bug (at least a
> couple of years old).
> 
> The patch creating conflicts is
> 
> 	queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
> 
> from git://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-3.2.y-queue.git

Hrm. But your output does not say there is a conflict. It says you have
a local modification and it does not try the merge:

> $ git am -3sc /tmp/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
> Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
> Using index info to reconstruct a base tree...
> Falling back to patching base and 3-way merge...
> error: Your local changes to the following files would be overwritten by merge:
>       sound/usb/midi.c
> Please, commit your changes or stash them before you can merge.
> Aborting

If I try to apply it, I get a real conflict:

  $ git checkout v3.2.35
  $ git am -3sc linux-3.2.y-queue/queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
  Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
  Using index info to reconstruct a base tree...
  M       sound/usb/midi.c
  Falling back to patching base and 3-way merge...
  Auto-merging sound/usb/midi.c
  CONFLICT (content): Merge conflict in sound/usb/midi.c

Although running "git am --abort" after that does seem to produce the
"cache entry has null sha1" error. So I can start investigating from
there.

-Peff

^ permalink raw reply

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jonathan Nieder @ 2012-12-29 10:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20121229102707.GA26730@sigill.intra.peff.net>

Jeff King wrote:

> Can you give more details?

$ GIT_TRACE=1 git am --abort
trace: exec: 'git-am' '--abort'
trace: run_command: 'git-am' '--abort'
trace: built-in: git 'rev-parse' '--parseopt' '--' '--abort'
trace: built-in: git 'rev-parse' '--git-dir'
trace: built-in: git 'rev-parse' '--show-prefix'
trace: built-in: git 'rev-parse' '--show-toplevel'
trace: built-in: git 'var' 'GIT_COMMITTER_IDENT'
trace: built-in: git 'rev-parse' '--verify' '-q' 'HEAD'
trace: built-in: git 'config' '--bool' '--get' 'am.keepcr'
trace: built-in: git 'rerere' 'clear'
trace: built-in: git 'rev-parse' '--verify' '-q' 'HEAD'
trace: built-in: git 'read-tree' '--reset' '-u' 'HEAD' 'ORIG_HEAD'
error: cache entry has null sha1: sound/usb/midi.c
fatal: unable to write new index file
trace: built-in: git 'reset' 'ORIG_HEAD'
Unstaged changes after reset:
M       sound/usb/midi.c

^ permalink raw reply

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jonathan Nieder @ 2012-12-29 10:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20121229102707.GA26730@sigill.intra.peff.net>

Jeff King wrote:

> I can't reproduce here. I can checkout v3.2.35, and I guess that the
> patch you are applying comes from f5f1654, but I don't know your
> local modification to sound/usb/midi.c.

No local modification.  The unstaged change after "git am --abort" to
recover from a conflicted git am is a longstanding bug (at least a
couple of years old).

The patch creating conflicts is

	queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch

from git://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-3.2.y-queue.git

[...]
>>   $ git am --abort
>>   Unstaged changes after reset:
>>   M       sound/usb/midi.c
>
> What does your index look like afterwards? Does it have a null sha1 in
> it (check "ls-files -s")?

$ git diff-index --abbrev HEAD
:100644 100644 eeefbce3873c... 000000000000... M	sound/usb/midi.c
$ git ls-files --abbrev -s sound/usb/midi.c
100644 eeefbce3873c 0	sound/usb/midi.c

^ permalink raw reply

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jeff King @ 2012-12-29 10:27 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20121229100130.GA31497@elie.Belkin>

On Sat, Dec 29, 2012 at 02:03:46AM -0800, Jonathan Nieder wrote:

> > --- a/read-cache.c
> > +++ b/read-cache.c
> > @@ -1800,6 +1800,8 @@ int write_index(struct index_state *istate, int newfd)
> >  			continue;
> >  		if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
> >  			ce_smudge_racily_clean_entry(ce);
> > +		if (is_null_sha1(ce->sha1))
> > +			return error("cache entry has null sha1: %s", ce->name);
> >  		if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
> >  			return -1;
> 
> Quick heads up: this is tripping for me in the "git am --abort"
> codepath:

Thanks. The intent was that this should never happen, and we are
protecting against bogus sha1s slipping into the index. So either we
have found a bug in "am --abort", or the assumption that it should never
happen was wrong. :)

>   $ cd ~/src/linux
>   $ git checkout v3.2.35
>   $ git am -3sc /tmp/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch 
>   Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
>   Using index info to reconstruct a base tree...
>   Falling back to patching base and 3-way merge...
>   error: Your local changes to the following files would be overwritten by merge:
> 	sound/usb/midi.c
>   Please, commit your changes or stash them before you can merge.
>   Aborting
>   Failed to merge in the changes.
>   Patch failed at 0001 ALSA: usb-audio: Fix missing autopm for MIDI input
>   When you have resolved this problem run "git am --resolved".
>   If you would prefer to skip this patch, instead run "git am --skip".
>   To restore the original branch and stop patching run "git am --abort".
>   $ git am --abort
>   error: cache entry has null sha1: sound/usb/midi.c
>   fatal: unable to write new index file
>   Unstaged changes after reset:
>   M       sound/usb/midi.c
>   $

I can't reproduce here. I can checkout v3.2.35, and I guess that the
patch you are applying comes from f5f1654, but I don't know your
local modification to sound/usb/midi.c. If I just make a fake
modification, I get this:

  $ git checkout v3.2.35
  $ echo fake >sound/usb/midi.c
  $ git format-patch --stdout -1 f5f1654 | git am -3 ~/patch
  [same errors as you]
  $ git am --abort
  Unstaged changes after reset:
  M       sound/usb/midi.c

So I wonder if there is something in the way your working tree is
modified. Can you give more details?

> Reproducible using v1.8.1-rc3 and master.  Bisects to 4337b585 (do not
> write null sha1s to on-disk index, 2012-07-28).  For comparison, older
> gits produced
> 
>   $ git am --abort
>   Unstaged changes after reset:
>   M       sound/usb/midi.c

What does your index look like afterwards? Does it have a null sha1 in
it (check "ls-files -s")?

-Peff

^ permalink raw reply

* Re: [PATCH 2/3] do not write null sha1s to on-disk index
From: Jonathan Nieder @ 2012-12-29 10:03 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Jens Lehmann, Junio C Hamano
In-Reply-To: <20120728150524.GB25269@sigill.intra.peff.net>

Hi Peff,

Jeff King wrote:

> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1800,6 +1800,8 @@ int write_index(struct index_state *istate, int newfd)
>  			continue;
>  		if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
>  			ce_smudge_racily_clean_entry(ce);
> +		if (is_null_sha1(ce->sha1))
> +			return error("cache entry has null sha1: %s", ce->name);
>  		if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
>  			return -1;

Quick heads up: this is tripping for me in the "git am --abort"
codepath:

  $ cd ~/src/linux
  $ git checkout v3.2.35
  $ git am -3sc /tmp/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch 
  Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
  Using index info to reconstruct a base tree...
  Falling back to patching base and 3-way merge...
  error: Your local changes to the following files would be overwritten by merge:
	sound/usb/midi.c
  Please, commit your changes or stash them before you can merge.
  Aborting
  Failed to merge in the changes.
  Patch failed at 0001 ALSA: usb-audio: Fix missing autopm for MIDI input
  When you have resolved this problem run "git am --resolved".
  If you would prefer to skip this patch, instead run "git am --skip".
  To restore the original branch and stop patching run "git am --abort".
  $ git am --abort
  error: cache entry has null sha1: sound/usb/midi.c
  fatal: unable to write new index file
  Unstaged changes after reset:
  M       sound/usb/midi.c
  $

Reproducible using v1.8.1-rc3 and master.  Bisects to 4337b585 (do not
write null sha1s to on-disk index, 2012-07-28).  For comparison, older
gits produced

  $ git am --abort
  Unstaged changes after reset:
  M       sound/usb/midi.c

which is also not great but at least didn't involve any obviously
invalid behavior.  Known problem?

Thanks,
Jonathan

^ permalink raw reply

* Re: [RFC] pack-objects: compression level for non-blobs
From: Jeff King @ 2012-12-29  9:48 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: David Michael Barr, Git Mailing List
In-Reply-To: <20121229090558.GA31291@sigill.intra.peff.net>

On Sat, Dec 29, 2012 at 04:05:58AM -0500, Jeff King wrote:

> On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote:
> 
> > > I think I tried the partial decompression for commit header and it did
> > > not help much (or I misremember it, not so sure).
> > 
> > I'll see if I can dig up the reference, as it was something I was going
> > to look at next.
> 
> I tried the simple patch below, but it actually made things slower!  I'm
> assuming it is because the streaming setup is not micro-optimized very
> well. A custom read_sha1_until_blank_line() could probably do better.

Something like the patch below, which does speed things up. But not
nearly as much as I'd hoped:

  [before]
  $ best-of-five git rev-list --count --all
  real    0m4.197s
  user    0m4.112s
  sys     0m0.072s

  [after]
  $ best-of-five git rev-list --count --all
  real    0m3.782s
  user    0m3.708s
  sys     0m0.064s

Only about a 10% speedup (versus ~75% with uncompressed commits).

---
diff --git a/cache.h b/cache.h
index 18fdd18..a494d3b 100644
--- a/cache.h
+++ b/cache.h
@@ -724,6 +724,7 @@ int offset_1st_component(const char *path);
 
 /* object replacement */
 #define READ_SHA1_FILE_REPLACE 1
+#define READ_SHA1_FILE_HEADER 2
 extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag);
 static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *type, unsigned long *size)
 {
@@ -1059,7 +1060,7 @@ extern int is_pack_valid(struct packed_git *);
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t);
 extern off_t find_pack_entry_one(const unsigned char *, struct packed_git *);
 extern int is_pack_valid(struct packed_git *);
-extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
+extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *, int);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
diff --git a/commit.c b/commit.c
index e8eb0ae..0a088dc 100644
--- a/commit.c
+++ b/commit.c
@@ -312,12 +312,13 @@ int parse_commit(struct commit *item)
 	void *buffer;
 	unsigned long size;
 	int ret;
+	int flags = save_commit_buffer ? 0 : READ_SHA1_FILE_HEADER;
 
 	if (!item)
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	buffer = read_sha1_file(item->object.sha1, &type, &size);
+	buffer = read_sha1_file_extended(item->object.sha1, &type, &size, flags);
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));
diff --git a/fast-import.c b/fast-import.c
index c2a814e..a140d57 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1303,7 +1303,7 @@ static void *gfi_unpack_entry(
 		 */
 		p->pack_size = pack_size + 20;
 	}
-	return unpack_entry(p, oe->idx.offset, &type, sizep);
+	return unpack_entry(p, oe->idx.offset, &type, sizep, 0);
 }
 
 static const char *get_mode(const char *str, uint16_t *modep)
diff --git a/pack-check.c b/pack-check.c
index 63a595c..e4a43c0 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -116,7 +116,7 @@ static int verify_packfile(struct packed_git *p,
 					    sha1_to_hex(entries[i].sha1),
 					    p->pack_name, (uintmax_t)offset);
 		}
-		data = unpack_entry(p, entries[i].offset, &type, &size);
+		data = unpack_entry(p, entries[i].offset, &type, &size, 0);
 		if (!data)
 			err = error("cannot unpack %s from %s at offset %"PRIuMAX"",
 				    sha1_to_hex(entries[i].sha1), p->pack_name,
diff --git a/sha1_file.c b/sha1_file.c
index 40b2329..1f1f31a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1469,16 +1469,19 @@ static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type
 	return *hdr ? -1 : type_from_string(type);
 }
 
-static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1)
+static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1, int stop_at_blank)
 {
 	int ret;
 	git_zstream stream;
-	char hdr[8192];
+	char hdr[512];
 
 	ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr));
 	if (ret < Z_OK || (*type = parse_sha1_header(hdr, size)) < 0)
 		return NULL;
-
+	if (stop_at_blank && strstr(hdr, "\n\n")) {
+		*size = strlen(hdr);
+		return xstrdup(hdr);
+	}
 	return unpack_sha1_rest(&stream, hdr, *size, sha1);
 }
 
@@ -1667,8 +1670,11 @@ static void *unpack_compressed_entry(struct packed_git *p,
 static void *unpack_compressed_entry(struct packed_git *p,
 				    struct pack_window **w_curs,
 				    off_t curpos,
-				    unsigned long size)
+				    unsigned long *sizep,
+				    int stop_at_blank)
 {
+	static const int chunk_size = 256;
+	unsigned long size = *sizep;
 	int st;
 	git_zstream stream;
 	unsigned char *buffer, *in;
@@ -1676,15 +1682,27 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer = xmallocz(size);
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size + 1;
+
+	if (stop_at_blank)
+		stream.avail_out = chunk_size;
+	else
+		stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (!stream.avail_out)
-			break; /* the payload is larger than it should be */
+		if (!stream.avail_out) {
+			if (!stop_at_blank)
+				break; /* the payload is larger than it should be */
+			if (memmem(buffer, chunk_size, "\n\n", 2)) {
+				git_inflate_end(&stream);
+				*sizep = chunk_size;
+				return buffer;
+			}
+			stream.avail_out = size + 1 - chunk_size;
+		}
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);
@@ -1731,7 +1749,8 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
 }
 
 static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
-	unsigned long *base_size, enum object_type *type, int keep_cache)
+	unsigned long *base_size, enum object_type *type, int keep_cache,
+	int stop_at_blank)
 {
 	void *ret;
 	unsigned long hash = pack_entry_hash(p, base_offset);
@@ -1739,9 +1758,9 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
 
 	ret = ent->data;
 	if (!ret || ent->p != p || ent->base_offset != base_offset)
-		return unpack_entry(p, base_offset, type, base_size);
+		return unpack_entry(p, base_offset, type, base_size, stop_at_blank);
 
-	if (!keep_cache) {
+	if (!stop_at_blank && !keep_cache) {
 		ent->data = NULL;
 		ent->lru.next->prev = ent->lru.prev;
 		ent->lru.prev->next = ent->lru.next;
@@ -1810,7 +1829,7 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 }
 
 static void *read_object(const unsigned char *sha1, enum object_type *type,
-			 unsigned long *size);
+			 unsigned long *size, int stop_at_blank);
 
 static void *unpack_delta_entry(struct packed_git *p,
 				struct pack_window **w_curs,
@@ -1832,7 +1851,7 @@ static void *unpack_delta_entry(struct packed_git *p,
 		return NULL;
 	}
 	unuse_pack(w_curs);
-	base = cache_or_unpack_entry(p, base_offset, &base_size, type, 0);
+	base = cache_or_unpack_entry(p, base_offset, &base_size, type, 0, 0);
 	if (!base) {
 		/*
 		 * We're probably in deep shit, but let's try to fetch
@@ -1851,12 +1870,12 @@ static void *unpack_delta_entry(struct packed_git *p,
 		      sha1_to_hex(base_sha1), (uintmax_t)base_offset,
 		      p->pack_name);
 		mark_bad_packed_object(p, base_sha1);
-		base = read_object(base_sha1, type, &base_size);
+		base = read_object(base_sha1, type, &base_size, 0);
 		if (!base)
 			return NULL;
 	}
 
-	delta_data = unpack_compressed_entry(p, w_curs, curpos, delta_size);
+	delta_data = unpack_compressed_entry(p, w_curs, curpos, &delta_size, 0);
 	if (!delta_data) {
 		error("failed to unpack compressed delta "
 		      "at offset %"PRIuMAX" from %s",
@@ -1895,7 +1914,8 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 int do_check_packed_object_crc;
 
 void *unpack_entry(struct packed_git *p, off_t obj_offset,
-		   enum object_type *type, unsigned long *sizep)
+		   enum object_type *type, unsigned long *sizep,
+		   int stop_at_blank)
 {
 	struct pack_window *w_curs = NULL;
 	off_t curpos = obj_offset;
@@ -1929,7 +1949,8 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 	case OBJ_TREE:
 	case OBJ_BLOB:
 	case OBJ_TAG:
-		data = unpack_compressed_entry(p, &w_curs, curpos, *sizep);
+		data = unpack_compressed_entry(p, &w_curs, curpos, sizep,
+					       stop_at_blank);
 		break;
 	default:
 		data = NULL;
@@ -2208,14 +2229,15 @@ static void *read_packed_sha1(const unsigned char *sha1,
 }
 
 static void *read_packed_sha1(const unsigned char *sha1,
-			      enum object_type *type, unsigned long *size)
+			      enum object_type *type, unsigned long *size,
+			      int stop_at_blank)
 {
 	struct pack_entry e;
 	void *data;
 
 	if (!find_pack_entry(sha1, &e))
 		return NULL;
-	data = cache_or_unpack_entry(e.p, e.offset, size, type, 1);
+	data = cache_or_unpack_entry(e.p, e.offset, size, type, 1, stop_at_blank);
 	if (!data) {
 		/*
 		 * We're probably in deep shit, but let's try to fetch
@@ -2226,7 +2248,7 @@ static void *read_packed_sha1(const unsigned char *sha1,
 		error("failed to read object %s at offset %"PRIuMAX" from %s",
 		      sha1_to_hex(sha1), (uintmax_t)e.offset, e.p->pack_name);
 		mark_bad_packed_object(e.p, sha1);
-		data = read_object(sha1, type, size);
+		data = read_object(sha1, type, size, stop_at_blank);
 	}
 	return data;
 }
@@ -2255,7 +2277,7 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 }
 
 static void *read_object(const unsigned char *sha1, enum object_type *type,
-			 unsigned long *size)
+			 unsigned long *size, int stop_at_blank)
 {
 	unsigned long mapsize;
 	void *map, *buf;
@@ -2268,17 +2290,18 @@ static void *read_object(const unsigned char *sha1, enum object_type *type,
 		return xmemdupz(co->buf, co->size);
 	}
 
-	buf = read_packed_sha1(sha1, type, size);
+	buf = read_packed_sha1(sha1, type, size, stop_at_blank);
 	if (buf)
 		return buf;
 	map = map_sha1_file(sha1, &mapsize);
 	if (map) {
-		buf = unpack_sha1_file(map, mapsize, type, size, sha1);
+		buf = unpack_sha1_file(map, mapsize, type, size, sha1,
+				       stop_at_blank);
 		munmap(map, mapsize);
 		return buf;
 	}
 	reprepare_packed_git();
-	return read_packed_sha1(sha1, type, size);
+	return read_packed_sha1(sha1, type, size, stop_at_blank);
 }
 
 /*
@@ -2296,9 +2319,10 @@ void *read_sha1_file_extended(const unsigned char *sha1,
 	const struct packed_git *p;
 	const unsigned char *repl = (flag & READ_SHA1_FILE_REPLACE)
 		? lookup_replace_object(sha1) : sha1;
+	int stop_at_blank = !!(flag & READ_SHA1_FILE_HEADER);
 
 	errno = 0;
-	data = read_object(repl, type, size);
+	data = read_object(repl, type, size, stop_at_blank);
 	if (data)
 		return data;
 
@@ -2597,7 +2621,7 @@ int force_object_loose(const unsigned char *sha1, time_t mtime)
 
 	if (has_loose_object(sha1))
 		return 0;
-	buf = read_packed_sha1(sha1, &type, &len);
+	buf = read_packed_sha1(sha1, &type, &len, 0);
 	if (!buf)
 		return error("cannot read sha1_file for %s", sha1_to_hex(sha1));
 	hdrlen = sprintf(hdr, "%s %lu", typename(type), len) + 1;

^ permalink raw reply related

* Re: [RFC] pack-objects: compression level for non-blobs
From: Jeff King @ 2012-12-29  9:05 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: David Michael Barr, Git Mailing List
In-Reply-To: <20121229052747.GA14928@sigill.intra.peff.net>

On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote:

> > I think I tried the partial decompression for commit header and it did
> > not help much (or I misremember it, not so sure).
> 
> I'll see if I can dig up the reference, as it was something I was going
> to look at next.

I tried the simple patch below, but it actually made things slower!  I'm
assuming it is because the streaming setup is not micro-optimized very
well. A custom read_sha1_until_blank_line() could probably do better.

diff --git a/commit.c b/commit.c
index e8eb0ae..efd6c06 100644
--- a/commit.c
+++ b/commit.c
@@ -8,6 +8,7 @@
 #include "notes.h"
 #include "gpg-interface.h"
 #include "mergesort.h"
+#include "streaming.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -306,6 +307,39 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s
 	return 0;
 }
 
+static void *read_commit_header(const unsigned char *sha1,
+				enum object_type *type,
+				unsigned long *size)
+{
+	static const int chunk_size = 256;
+	struct strbuf buf = STRBUF_INIT;
+	struct git_istream *st;
+
+	st = open_istream(sha1, type, size, NULL);
+	if (!st)
+		return NULL;
+	while (1) {
+		size_t start = buf.len;
+		ssize_t readlen;
+
+		strbuf_grow(&buf, chunk_size);
+		readlen = read_istream(st, buf.buf + start, chunk_size);
+		buf.buf[start + readlen + 1] = '\0';
+		buf.len += readlen;
+
+		if (readlen < 0) {
+			close_istream(st);
+			strbuf_release(&buf);
+			return NULL;
+		}
+		if (!readlen || strstr(buf.buf + start, "\n\n"))
+			break;
+	}
+
+	close_istream(st);
+	return strbuf_detach(&buf, size);
+}
+
 int parse_commit(struct commit *item)
 {
 	enum object_type type;
@@ -317,7 +351,11 @@ int parse_commit(struct commit *item)
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	buffer = read_sha1_file(item->object.sha1, &type, &size);
+
+	if (!save_commit_buffer)
+		buffer = read_commit_header(item->object.sha1, &type, &size);
+	else
+		buffer = read_sha1_file(item->object.sha1, &type, &size);
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));

^ permalink raw reply related

* Re: Lockless Refs?
From: Jeff King @ 2012-12-29  8:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Martin Fick, Michael Haggerty, git, Shawn Pearce
In-Reply-To: <7vhan6jdx3.fsf@alter.siamese.dyndns.org>

On Fri, Dec 28, 2012 at 09:15:52AM -0800, Junio C Hamano wrote:

> Martin Fick <mfick@codeaurora.org> writes:
> 
> > Hmm, actually I believe that with a small modification to the 
> > semantics described here it would be possible to make multi 
> > repo/branch commits work....
> >
> > Shawn talked about adding multi repo/branch transaction 
> > semantics to jgit, this might be something that git wants to 
> > support also at some point?
> 
> Shawn may have talked about it and you may have listened to it, but
> others wouldn't have any idea what kind of "multi repo/branch
> transaction" you are talking about.  Is it about "I want to push
> this ref to that repo and push this other ref to that other repo",
> in what situation will it be used/useful, what are the failure
> modes, what are failure tolerances by the expected use cases, ...?
> 
> Care to explain?

I cannot speak for Martin, but I am assuming the point is to atomically
update 2 (or more) refs on the same repo. That is, if I have a branch
"refs/heads/foo" and a ref pointing to meta-information (say, notes
about commits in foo, in "refs/notes/meta/foo"), I would want to "git
push" them, and only update them if _both_ will succeed, and otherwise
fail and update nothing.

I think Shawn mentioned this at the last GitTogether as a stumbling
block for pushing more of Gerrit's meta-information as refs over the git
protocol. But I might be mis-remembering.

-Peff

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox