Git development
* Re: [RFC] Add --create-cache to repack
From: Jay Soffian @ 2011-01-28 19:15 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Shawn Pearce, git, Junio C Hamano, Nicolas Pitre, John Hawley
In-Reply-To: <4D42E1E3.4060808@viscovery.net>

On Fri, Jan 28, 2011 at 10:33 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Let's define a ref hierarchy, refs/cache-pack, that names the cache pack
> tips. A cache pack would be generated for each ref found in that
> hierarchy. Then these commits are under user control even on github,
> because you can just push the refs. Junio would perhaps choose a release
> tag, and corresponding commits in the man and html histories. The choice
> would not be completely automatic, though.

This is just for bare repos, right? Why not just use HEAD?

j.


* Re: [RFC] Add --create-cache to repack
From: Shawn Pearce @ 2011-01-28 19:15 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Johannes Sixt, git, Junio C Hamano, John Hawley
In-Reply-To: <alpine.LFD.2.00.1101281304270.8580@xanadu.home>

On Fri, Jan 28, 2011 at 10:46, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Fri, 28 Jan 2011, Shawn Pearce wrote:
>
>> This started because I was looking for a way to speed up clones coming
>> from a JGit server.  Cloning the linux-2.6 repository is painful,
...
>> Later I realized, we can get rid of that cached list of objects and
>> just use the pack itself.
...
> Playing my old record again... I know.  But pack v4 should solve a big
> part of this enumeration cost.

I've said the same thing for years myself.  As much as it would be
nice to fix some of the decompression costs with pack v2/v3, v2/v3 is
very common in the wild, and a new pack encoding is going to be a
fairly complex thing to get added to C Git.  And pack v4 doesn't
eliminate the enumeration, it just makes it faster.

> So that's the idea.  Keep the exact same functionality as we have now,
> without any need for cache management, but making the data structure in
> a form that should improve object enumeration by some magnitude.

That's what I also liked about my --create-cache flag.  It's keeping
the same data we already have, in the same format we already have it
in.  We're just making a more explicit statement that everything in
some pack is about as tightly compressed as it ever would be for a
client, and it isn't going to change anytime soon.  Thus we might as
well tag it with .keep to prevent repack from mucking with it, and we
can take advantage of this to serve the pack to clients very fast.

Over breakfast this morning I made the point to Junio that with the
cached pack and a slight network protocol change (enabled by a
capability of course) we could stop using pkt-line framing when
sending the cached pack part of the stream, and just send the pack
directly down the socket.  That changes the clone of a 400 MB project
like linux-2.6 from being a lot of user space stuff, to just being a
sendfile() call for the bulk of the content.  I think we can just hand
off the major streaming to the kernel.  (Part of the protocol change
is we would need to use multiple SHA-1 checksums in the stream, so we
don't have to re-checksum the existing cached pack.)
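
For context, a minimal sketch of the pkt-line framing that would be skipped here: each packet is prefixed with four hex digits giving its total length, header included. The pkt_line helper name is made up for illustration, and the length count assumes an ASCII payload.

```shell
# Hypothetical helper: frame one payload as a pkt-line.
pkt_line() {
	# The length counts the 4-byte header plus the payload.
	# ${#1} counts characters, so this assumes single-byte (ASCII) data.
	printf '%04x%s' $(( ${#1} + 4 )) "$1"
}

pkt_line 'hello'	# prints 0009hello
echo
printf '0000'		# a flush packet carries no payload
echo
```

Handing the cached pack straight to the socket would avoid this per-chunk framing for the bulk of the data.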


I love the idea of some of the concepts in pack v4.  I really do.  But
this sounds a lot simpler to implement, and it lets us completely
eliminate a massive amount of server processing (even under pack v4
you still have object enumeration), in exchange for what might be a
few extra MBs on the wire to the client due to slightly less good
deltas and the use of REF_DELTA in the thin pack used for the most
recent objects.  I don't envision this being used on projects smaller
than git.git itself; if you can gc --aggressive the whole thing in a
minute, the cached pack is probably pointless.  But if you have 400+
MB, you want that to be network bound, and have almost no CPU impact
on the server.

Plus we can safely do byte range requests for resumable clone within
the cached pack part of the stream.  And when pack v4 comes along, we
can use this same strategy for an equally large pack v4 pack.

-- 
Shawn.


* Re: [PATCH] merge: default to @{upstream}
From: Felipe Contreras @ 2011-01-28 18:46 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Drew Northup, git
In-Reply-To: <20110128175609.GA27118@burratino>

On Fri, Jan 28, 2011 at 7:56 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Other nits: documentation?  tests?  The rest of cmd_merge does not
> rely on argv[argc] being NULL, but it might make sense to set argv[1]
> to NULL anyway for futureproofing.

Sure, I need to add documentation and tests. I should probably have
sent this as 'RFC'.

Anyway, I don't think we can set argv[1] to NULL, because it's
possible that this is "char *argv[1]", so that would crash. The only
thing the standard ensures is that the last element is NULL, i.e.
argv[argc] == NULL, and therefore we can override it, as long as the
rest of the code checks argc instead of looking for NULL. AFAIK that
is the case in the whole git code, and certainly in builtin/merge.c
AFAICS.

Cheers.

-- 
Felipe Contreras


* Re: [RFC] Add --create-cache to repack
From: Nicolas Pitre @ 2011-01-28 18:46 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Johannes Sixt, git, Junio C Hamano, John Hawley
In-Reply-To: <AANLkTim+AUY9SdeAFfkny2_a3qQ9SCDLUHR3s9Q3M98u@mail.gmail.com>

On Fri, 28 Jan 2011, Shawn Pearce wrote:

> This started because I was looking for a way to speed up clones coming
> from a JGit server.  Cloning the linux-2.6 repository is painful, it
> takes a long time to enumerate the 1.8 million objects.  So I tried
> adding a cached list of objects reachable from a given commit, which
> speeds up the enumeration phase, but JGit still needs to allocate all
> of the working set to track those objects, then go find them in packs
> and slice out each compressed form and reformat the headers on the
> wire.  It's a lot of redundant work when your kernel repository has
> 360MB of data that you know a client needs if they have asked for your
> master branch with no "have" set.
> 
> Later I realized, we can get rid of that cached list of objects and
> just use the pack itself.  It's far cleaner, as there is no redundant
> cache.  But either way (object list or pack) it's a bit of a challenge
> to automatically identify the right starting points to use.  Linus
> Torvalds' linux-2.6 repository is the perfect case for the RFC I
> posted: it's one branch with all of the history, and it never rewinds.
> But maybe Linus is just very unique in this world.  :-)

Playing my old record again... I know.  But pack v4 should solve a big 
part of this enumeration cost.

I've changed the format slightly again in my WIP branch.  The idea is to:

1) Have a non compressed yet still really dense representation for tree 
   objects;

2) do the same thing for the first part of commit objects, and only 
   deflate the free form text part.

There is nothing new here.  However, it should be possible to:

3) replace all SHA1 references by an offset into the pack file directly,
   just like we do for OFS_DELTA objects.  If the SHA1 is actually
   needed then we can obtain it with a reverse lookup of the object's
   offset in the pack index file, but in practice that is not required
   that often.

So walking the history graph and enumerating objects would require 
nothing more than simply following straight pointers in the pack data in 
99% of the cases.  No object decompression, no memory buffer 
allocation/deallocation to perform that decompression, no string parsing 
in the tree object case, etc. Only cross pack references would require a 
full SHA1 based lookup like we do now.

I still have to sit down and figure out the implications of this,
especially with forward references, meaning that the offset might have
to be an object index so as to allow for variable length encoding, and
also to make sure index-pack can reconstruct the pack index.  But that
would only be an indirect lookup which shouldn't be significantly costly.

So that's the idea.  Keep the exact same functionality as we have now, 
without any need for cache management, but making the data structure in 
a form that should improve object enumeration by some magnitude.


Nicolas


* Re: [RFC] Add --create-cache to repack
From: Shawn Pearce @ 2011-01-28 18:22 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Junio C Hamano, Nicolas Pitre, John Hawley
In-Reply-To: <4D42E1E3.4060808@viscovery.net>

On Fri, Jan 28, 2011 at 07:33, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Am 1/28/2011 15:37, schrieb Shawn Pearce:
>> A simple fix is to use --heads --tags by default like I do here, but
>> make the actual parameters we feed to rev-list configurable.  A
>> repository owner could select only the master branch as input to
>> rev-list, making it less likely the topic branches would be
>> considered.  Unfortunately that requires direct access to the
>> repository.  It fails for a site like GitHub, where you don't manage
>> the repository at all.
>
> Let's define a ref hierarchy, refs/cache-pack, that names the cache pack
> tips. A cache pack would be generated for each ref found in that
> hierarchy. Then these commits are under user control even on github,
> because you can just push the refs. Junio would perhaps choose a release
> tag, and corresponding commits in the man and html histories. The choice
> would not be completely automatic, though.

This is a good idea.  Perhaps we go slightly further and say:

  refs/cache-pack/name-without-slash

    This packs into its own pack file, as a single tip.

  refs/cache-pack/group/a
  refs/cache-pack/group/b

   These pack into a pack file together.

If you have direct repository access, you can also just make one of
these a symbolic reference to a branch, e.g. refs/heads/master, and
then periodic `git repack --create-cache` invocations would pick up
the latest point.
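
A minimal sketch of the symbolic-ref setup, run in a scratch repository. The refs/cache-pack hierarchy and --create-cache are only proposals in this thread, not existing git features, and the ref name refs/cache-pack/master is an assumption for illustration.

```shell
# Scratch repository with one commit on master.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
echo hello >file && git add file && git commit -q -m initial

# Proposed convention (hypothetical): a cache-pack tip that follows master.
git symbolic-ref refs/cache-pack/master refs/heads/master

# The symbolic ref resolves to whatever master currently points at,
# so both names print the same commit ID.
git rev-parse refs/cache-pack/master refs/heads/master

# Without direct repository access one would push a ref instead, e.g.:
#   git push origin master:refs/cache-pack/master
```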

-- 
Shawn.


* Re: [PATCH] merge: default to @{upstream}
From: Jonathan Nieder @ 2011-01-28 17:56 UTC (permalink / raw)
  To: Drew Northup; +Cc: Felipe Contreras, git
In-Reply-To: <1296233099.12855.14.camel@drew-northup.unet.maine.edu>

Drew Northup wrote:
> On Fri, 2011-01-28 at 18:17 +0200, Felipe Contreras wrote:

>> So 'git merge' is 'git merge @{upstream}' instead of 'git merge -h';
>> it's better to do something useful.
[...]
>> --- a/builtin/merge.c
>> +++ b/builtin/merge.c
>> @@ -983,9 +983,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>>  	if (!allow_fast_forward && fast_forward_only)
>>  		die("You cannot combine --no-ff with --ff-only.");
>>  
>> -	if (!argc)
>> -		usage_with_options(builtin_merge_usage,
>> -			builtin_merge_options);
>> +	if (!argc) {
>> +		/* argv[argc] should be NULL, so we can hijack it */
>> +		argv[0] = "@{u}";
[...]
> Honestly, I'd prefer that this NOT be merged in. When I mess up the
> command line I am typing I don't want some sort of hidden magic to kick
> in--I want it to tell me that I did something stupid by printing out the
> help message.

I generally have some sympathy for that point of view (especially
given the "because we can" commit message).  In this case, can you
think of an example where one would type "git merge" without a
branchname argument and expect it to do something else?

 - Never used "git merge" before, trying it for the first time.
   In this case, I think merging from upstream is a good behavior,
   relatively consistent with "git pull".

 - In a script trying to do an octopus, in the special case of no
   extra parents.  Rough plumbing equivalent:

	set --
	git merge-recursive HEAD -- HEAD "$@"
	tree=$(git write-tree)
	git fetch . "$@" 2>/dev/null
	git fmt-merge-msg <.git/FETCH_HEAD |
	git commit-tree $tree \
		$(git merge-base --independent HEAD "$@" | sed 's/^/-p /')

   The porcelain "git fetch" is used to populate .git/FETCH_HEAD (the
   fmt-merge-msg manual doesn't explain any other way).  Maybe it
   would be better to write the log message by hand.  In any case, the
   porcelain "git fetch" defeats us --- "git fetch ." means to fetch
   the default refspec instead of no branches.

   It might be nice to have a better way to format merge messages
   (like "git fmt-merge-msg --refspec "$@"?) so we can give better
   advice to the author of the script that uses "git merge $@" and
   already breaks in the no-heads case.

 - Started typing a "git merge" command with lots of switches and
   forgot to type the branch name.  It might help to print some output
   'defaulting to branch <foo>' so the operator can notice the
   mistake.

Other nits: documentation?  tests?  The rest of cmd_merge does not
rely on argv[argc] being NULL, but it might make sense to set argv[1]
to NULL anyway for futureproofing.

Thanks.
Jonathan


* Re: [PATCH] merge: default to @{upstream}
From: Felipe Contreras @ 2011-01-28 17:53 UTC (permalink / raw)
  To: Drew Northup; +Cc: git, Jonathan Nieder
In-Reply-To: <1296233099.12855.14.camel@drew-northup.unet.maine.edu>

On Fri, Jan 28, 2011 at 6:44 PM, Drew Northup <drew.northup@maine.edu> wrote:
> Honestly, I'd prefer that this NOT be merged in. When I mess up the
> command line I am typing I don't want some sort of hidden magic to kick
> in--I want it to tell me that I did something stupid by printing out the
> help message. This is standard to a large number of commands that by
> default expect a certain number of operands and I don't see any good
> reason why git merge should be any different.

git checkout (defaults to HEAD)
git diff (defaults to HEAD)
git fetch (defaults to origin)
git format-patch (defaults to HEAD)
git log (defaults to HEAD)
git pull (defaults to origin)
git show (defaults to HEAD)

How is this different from 'git pull'? If you are not sure about the
'git merge' command, then type 'git help merge' instead, just as you
would if you were not sure about the 'git pull' command. If you type
either of the two, a merge would happen, and you can revert it easily
with 'git reset --hard HEAD^'.
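
A minimal sketch of the explicit form and the undo, in scratch repositories (the directory and file names are made up for illustration). Note that HEAD^ names the pre-merge commit only when a real merge commit was created; after a fast-forward, ORIG_HEAD is the safer way back.

```shell
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q upstream
(
	cd upstream &&
	git symbolic-ref HEAD refs/heads/master &&
	echo one >file && git add file && git commit -q -m one
)
git clone -q upstream work

# New work appears both upstream and locally, so the merge is not a
# fast-forward.
(cd upstream && echo two >remote && git add remote && git commit -q -m two)
cd work
echo three >local && git add local && git commit -q -m three
before=$(git rev-parse HEAD)

git fetch -q
git merge -q --no-edit '@{upstream}'	# the spelled-out form of the proposed default

git reset -q --hard HEAD^		# undo: back to the pre-merge commit
```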

-- 
Felipe Contreras


* Re: [PATCH v4] fast-import: treat filemodify with empty tree as delete
From: Jonathan Nieder @ 2011-01-28 17:13 UTC (permalink / raw)
  To: Peter Baumann
  Cc: Junio C Hamano, Sverre Rabbelier, Git Mailing List,
	Ramkumar Ramachandra, Shawn O. Pearce, David Barr
In-Reply-To: <20110127204649.GB19378@m62s10.vlinux.de>

Peter Baumann wrote:
> On Thu, Jan 27, 2011 at 01:48:45PM -0600, Jonathan Nieder wrote:
>>> On Thu, Jan 27, 2011 at 12:07:49AM -0600, Jonathan Nieder wrote:

>>>> +	 empty_tree=$(git mktree </dev/null) &&
[...]
>>                               unless we hardcode the object name
>> (which I prefer not to do).
>
> Why not? It *is* already hardcoded in the GIT source code (see
> grep -a1 EMPTY cache.h output).

I think it is okay for the git implementation to rely on an
implementation detail. ;-)  Likewise, t0000 checks that the empty tree
has id 4b825dc6.  Meanwhile I would like to see people's scripts
and other tests using the $(git mktree </dev/null) form, since it is
more self-explanatory and avoids hardcoding an implementation detail.

Of course this is not an absolute thing.
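
A minimal sketch of the self-explanatory form, run in a scratch repository. In a SHA-1 repository the printed ID starts with 4b825dc6; the script itself never needs to know that.

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q

# Ask git for the empty tree's ID instead of hardcoding it.
empty_tree=$(git mktree </dev/null)
echo "$empty_tree"

git cat-file -t "$empty_tree"	# tree
git ls-tree "$empty_tree"	# no output: the tree has no entries
```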

Hope that helps,
Jonathan

Further reading: t0000-basic.sh --help:

	Note that this test *deliberately* hard-codes many expected object
	IDs.  When object ID computation changes, like in the previous case of
	swapping compression and hashing order, the person who is making the
	modification *should* take notice and update the test vectors here.

"Tips for Writing Tests" in t/README:

	However, other tests that simply rely on basic parts of the core
	GIT working properly should not have that level of intimate
	knowledge of the core GIT internals.  If all the test scripts
	hardcoded the object IDs like t0000-basic.sh does, that defeats
	the purpose of t0000-basic.sh, which is to isolate that level of
	validation in one place.  Your test also ends up needing
	updating when such a change to the internal happens, so do _not_
	do it and leave the low level of validation to t0000-basic.sh.


* Re: Permissions and authorisations in git repository
From: Harry Johnson @ 2011-01-28 17:01 UTC (permalink / raw)
  To: Victor Engmark; +Cc: git
In-Reply-To: <4D42AEB9.3020008@terreactive.ch>

Git is independent of access control mechanisms to some degree.
However, if you use ssh and share a repository with other users, you
will want to look into the core.sharedRepository config setting. It took
me a while to discover this.

Check 'git help config' and search for shared for the details.
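
A minimal sketch of both ways to turn the setting on, using scratch repositories (the names project.git and existing.git are made up for illustration). As far as I know, changing the value on an existing repository does not fix up files that were created earlier, so those may still need a chmod/chgrp by hand.

```shell
tmp=$(mktemp -d) && cd "$tmp"

# New repository: let git set up group-writable permissions from the start.
git init -q --bare --shared=group project.git

# Existing repository: set the config value by hand.
git init -q --bare existing.git
git --git-dir=existing.git config core.sharedRepository group
git --git-dir=existing.git config core.sharedRepository	# prints: group
```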

HTH,
-Harry

On Fri, Jan 28, 2011 at 6:55 AM, Victor Engmark
<victor.engmark@terreactive.ch> wrote:
> On 01/28/2011 12:41 PM, vikram2rhyme wrote:
>>
>> Hello friends
>> I am wondering if there is any permission and authorization control
>> over a git repository. I have gone through the git manual but there is
>> no discussion of it. I searched the internet but hardly found anything.
>> Please help me if there is any permission control in a distributed
>> environment in a git repository.
>
> Git is independent of access control mechanisms - You can use whatever you
> want. For example, you could use the filesystem read/write permissions on a
> directory to control local access, or SSH permissions to allow remote
> access. See for example GitHub, which uses different protocols for different
> levels of access.
>
> HTH,
> --
> Victor Engmark


* Re: [PATCH] merge: default to @{upstream}
From: Drew Northup @ 2011-01-28 16:44 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: git, Jonathan Nieder
In-Reply-To: <1296231457-18780-1-git-send-email-felipe.contreras@gmail.com>


On Fri, 2011-01-28 at 18:17 +0200, Felipe Contreras wrote:
> So 'git merge' is 'git merge @{upstream}' instead of 'git merge -h';
> it's better to do something useful.
> 
> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> ---
>  builtin/merge.c |    8 +++++---
>  1 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 42fff38..f23d669 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -983,9 +983,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>  	if (!allow_fast_forward && fast_forward_only)
>  		die("You cannot combine --no-ff with --ff-only.");
>  
> -	if (!argc)
> -		usage_with_options(builtin_merge_usage,
> -			builtin_merge_options);
> +	if (!argc) {
> +		/* argv[argc] should be NULL, so we can hijack it */
> +		argv[0] = "@{u}";
> +		argc = 1;
> +	}
>  
>  	/*
>  	 * This could be traditional "merge <msg> HEAD <commit>..."  and

Honestly, I'd prefer that this NOT be merged in. When I mess up the
command line I am typing I don't want some sort of hidden magic to kick
in--I want it to tell me that I did something stupid by printing out the
help message. This is standard to a large number of commands that by
default expect a certain number of operands and I don't see any good
reason why git merge should be any different.

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Project- vs. Package-Level Branching in Git
From: in-gitvger @ 2011-01-28 16:39 UTC (permalink / raw)
  To: Thomas Hauk; +Cc: Ævar Arnfjörð Bjarmason, git
In-Reply-To: <15B7CA2E-C584-4563-B9E3-D80861CD9565@shaggyfrog.com>


In message <15B7CA2E-C584-4563-B9E3-D80861CD9565@shaggyfrog.com>,
Thomas Hauk writes:

    On Jan 27, 2011, at 12:53 PM, Ævar Arnfjörð Bjarmason wrote:
    > You'll be much better off if you have project-specific repositories.

    But how often do you have a project that has no external or
    internal dependencies on any other packages or libraries? Any
    project I've ever done, big or small, has relied on some existing
    codebase. Imagine a project that uses liba and libb, which both
    reference libc. To use Git, I'd have to have copies of libc
    existing in three repositories, and copies of liba and libb in two
    repositories each. What a nightmare... and that's only a trivial
    hypothetical example.

Including libs in the superproject is the subtree merge method.  It
certainly works and provides inband commit exploration (since one repo
can see all commits), but it is inconvenient to update and even harder to
export changes back to share with other liba users.  It may also cause
the repo to be inconveniently large.  Arranging for the correct
commits to be on the differently named branches (between the
subproject and the superproject) is also not convenient.

git-submodule is the normal approach for the problem you have.  There
is a strong binding from each commit in the superproject to commits in
the subprojects.  What is inconvenient is managing what branch you
need to check out on the subproject in order to get or make the right
changes in the right place.  It is also annoying if you are performing
active development on the subprojects since you continually have to
update the superproject and then recheckout the correct branches on
the subproject.
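
A minimal sketch of that strong binding, using scratch repositories (the names libc and project are made up for illustration). The protocol.file.allow override is only there because the sketch clones from a local path, which recent git versions refuse for submodules by default.

```shell
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# A stand-in for the shared library's repository.
git init -q libc
(cd libc && echo 'int c;' >c.c && git add c.c && git commit -q -m 'libc: initial')

# The superproject records the library as a submodule.
git init -q project
cd project
git -c protocol.file.allow=always submodule add "$tmp/libc" libc
git commit -q -m 'Add libc as a submodule'

# Mode 160000 is a gitlink: the superproject commit pins one exact libc commit.
git ls-tree HEAD libc
```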

Another solution is gitslave (http://gitslave.sf.net).  This provides
a loose binding from the superproject to the subprojects which is very
convenient if you are doing active development on all of the
subprojects.  Specifically there is only a strong binding when you tag
(since you tag the superproject and all subprojects at the same time).
Generally, however, you check out the same branch/tag on all repositories
at the same time, which obviously does not match your preferred usage,
except it gave me an idea.  Specifically, you could have your own
local master bare repositories for those packages and an orthogonal
naming scheme for branches and tags.  So the project foo might have
branch foo-2.0, and liba, libb, and libc would all have that branch
as well.  When you want to update libb, someone with a repo that tracks
both the true upstream master and the local upstream master would fetch
from the true upstream, merge the changes from the correct branch into
foo-2.0, and then push to the local upstream master.  Everyone else
would then just `gits pull` and would get the changes.

Of course this concept for a local master would work for submodules as
well, depending on whether you want the tight binding and
update/change annoyance or the loose binding and easier
update/changes.

					-Seth Robertson


* Re: Project- vs. Package-Level Branching in Git
From: Marc Branchaud @ 2011-01-28 16:20 UTC (permalink / raw)
  To: Thomas Hauk; +Cc: Ævar Arnfjörð Bjarmason, git
In-Reply-To: <15B7CA2E-C584-4563-B9E3-D80861CD9565@shaggyfrog.com>

On 11-01-27 06:22 PM, Thomas Hauk wrote:
> On Jan 27, 2011, at 12:53 PM, Ævar Arnfjörð Bjarmason wrote:
>> You'll be much better off if you have project-specific repositories.
> 
> But how often do you have a project that has no external or internal
> dependencies on any other packages or libraries? Any project I've ever
> done, big or small, has relied on some existing codebase. Imagine a
> project that uses liba and libb, which both reference libc. To use Git,
> I'd have to have copies of libc existing in three repositories, and copies
> of liba and libb in two repositories each. What a nightmare... and that's
> only a trivial hypothetical example.

Let me try to give you a more positive response...  :)

Where I work, our products rely on a vast amount of code from different
sources, not just various internal and external libraries but also the entire
FreeBSD tree as well as assorted forks of different Linux kernel versions
(because different customers have their own tweaked kernels that we need to
work with).

We use git in a variety of ways to manage all this.  We rely a lot on git's
submodule feature, but not exclusively.

For most external code, including the Linux kernel forks, we usually set up
an internal git mirror of whatever public repository the code has.  So each
external code base has its own git repository, and we attach those
repositories to our main repo using submodules.

This works fairly well, especially because we don't update the external
sources very often.  Working with submodules takes a bit of getting used to,
but it works nicely when the different pieces are reasonably independent, and
this is usually the case for external libraries.

For our internal libraries, we just keep them all in the main repository.
Our internal code does have some inter-dependencies, so it's convenient to
track them all together.  Note that this doesn't prevent internal libraries
from evolving independently -- even though a branch applies to the whole
repo, the branch's _topic_ can just be about one specific library.  Anyone
who needs to use the library's updated code can merge that branch into their
own, or base their work on that branch.  Eventually the library's branch gets
merged back into the mainline branch and everyone gets to use the updated code.

In addition to all that, we have a different way of working with the FreeBSD
code base.  This was put together a few years ago, and I would do it
differently now, but I'll describe it to give you an idea of what else is
possible with git.

The FreeBSD code tree lives in a subdirectory of our main repo.  It's not a
submodule or anything fancy, it's just the code.  We've modified the FreeBSD
code quite a bit, but we also keep it up to date with changes the FreeBSD
guys make.  We put all of their changes in a separate branch in our main
repo, and a script keeps that branch up to date.  When we're ready, we merge
the FreeBSD changes into our mainline branch.

Keeping that FreeBSD branch up to date is a bit involved.  The branch
actually reflects a single branch in the FreeBSD public subversion
repository.  We use "git svn" to maintain a git mirror of that subversion
branch, and what our script does is compare the latest FreeBSD subversion
revision number in our main repo to the one in the mirror ("git svn" records
the subversion revision numbers in the git commits it creates).  When the
script finds that the main repo is out of date, it extracts the patches for
each individual subversion commit and applies them as git commits in our main
repo's FreeBSD branch.  (Any git veterans who've read this far are probably
cringing right now...)

Anyway, my point is that git provides a lot of flexibility to let you work
with your code base in many different ways, but none of them are how
subversion or perforce work.  Coming from those tools, you have to shift your
mindset a bit to make the best use of git.  That can be frustrating, and I
won't say that git's model is The One True Way, but I've found that what feel
like limitations from the perspective of other tools usually turn out to be
relatively inconsequential.

		M.


* [PATCH] merge: default to @{upstream}
From: Felipe Contreras @ 2011-01-28 16:17 UTC (permalink / raw)
  To: git; +Cc: Jonathan Nieder, Felipe Contreras

So 'git merge' is 'git merge @{upstream}' instead of 'git merge -h';
it's better to do something useful.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 builtin/merge.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 42fff38..f23d669 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -983,9 +983,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	if (!allow_fast_forward && fast_forward_only)
 		die("You cannot combine --no-ff with --ff-only.");
 
-	if (!argc)
-		usage_with_options(builtin_merge_usage,
-			builtin_merge_options);
+	if (!argc) {
+		/* argv[argc] should be NULL, so we can hijack it */
+		argv[0] = "@{u}";
+		argc = 1;
+	}
 
 	/*
 	 * This could be traditional "merge <msg> HEAD <commit>..."  and
-- 
1.7.4.rc3


* Re: Why git tags are there in git?
From: Konstantin Khomoutov @ 2011-01-28 16:06 UTC (permalink / raw)
  To: vikram2rhyme; +Cc: git
In-Reply-To: <1296214676536-5969544.post@n2.nabble.com>

On Fri, 28 Jan 2011 03:37:56 -0800 (PST)
vikram2rhyme <vikram2rhyme@gmail.com> wrote:

> I am wondering why tags exist in git. As they are just pointers to
> commits, we can refer to those commits by SHA sum alone, so why
> tagging? Moreover, a commit can be tagged more than once, which
> results in multiple tags pointing to the same point in the history.
> Is this a design flaw?

To add to what other commenters said, I should note that beyond regular
tags which just provide human-readable names to objects, there exist
so-called "annotated tags" which are tags with descriptive messages
contained in them, which can be used as a side-channel to provide
additional information for commits, and also to digitally sign a line
of history to which such a tag is attached.

Also note that a tag can be attached to any object in a Git database,
not necessarily to a commit. This can be occasionally useful.
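
A minimal sketch of the difference, run in a scratch repository (the tag names are made up for illustration):

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
echo hello >file && git add file && git commit -q -m initial

git tag light				# regular tag: just a name for the commit
git tag -a v1.0 -m 'First release'	# annotated tag: an object with its own message

git cat-file -t light	# commit
git cat-file -t v1.0	# tag

# A tag can name any object, here the blob behind "file".
git tag blob-tag "$(git rev-parse HEAD:file)"
git cat-file -t blob-tag	# blob
```

Signing works the same way as annotating, with 'git tag -s' in place of 'git tag -a', provided a GPG key is configured.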


* Re: [RFC] Add --create-cache to repack
From: Johannes Sixt @ 2011-01-28 15:33 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git, Junio C Hamano, Nicolas Pitre, John Hawley
In-Reply-To: <AANLkTim+AUY9SdeAFfkny2_a3qQ9SCDLUHR3s9Q3M98u@mail.gmail.com>

Am 1/28/2011 15:37, schrieb Shawn Pearce:
> A simple fix is to use --heads --tags by default like I do here, but
> make the actual parameters we feed to rev-list configurable.  A
> repository owner could select only the master branch as input to
> rev-list, making it less likely the topic branches would be
> considered.  Unfortunately that requires direct access to the
> repository.  It fails for a site like GitHub, where you don't manage
> the repository at all.

Let's define a ref hierarchy, refs/cache-pack, that names the cache pack
tips. A cache pack would be generated for each ref found in that
hierarchy. Then these commits are under user control even on github,
because you can just push the refs. Junio would perhaps choose a release
tag, and corresponding commits in the man and html histories. The choice
would not be completely automatic, though.

-- Hannes


* Re: [PATCH] git-p4: Corrected typo.
From: Thomas Berg @ 2011-01-28 15:19 UTC (permalink / raw)
  To: Vitor Antunes; +Cc: git
In-Reply-To: <AANLkTimQhFzEXr=T9F8TJzTeWwKroTt_BG87RtQCLivv@mail.gmail.com>

Hi,

On Fri, Jan 28, 2011 at 12:35 AM, Vitor Antunes <vitor.hda@gmail.com> wrote:
> Hi everyone,
>
> Could anyone comment on the 3 patches I sent (this being the last one)?
>
[...]
> On Thu, Nov 25, 2010 at 1:26 AM, Vitor Antunes <vitor.hda@gmail.com> wrote:
>> ---
>>  contrib/fast-import/git-p4 |    2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/contrib/fast-import/git-p4 b/contrib/fast-import/git-p4
>> index 0ea3a44..a466847 100755
>> --- a/contrib/fast-import/git-p4
>> +++ b/contrib/fast-import/git-p4
>> @@ -618,7 +618,7 @@ class P4Submit(Command):
>>         if len(detectRenames) > 0:
>>             diffOpts = "-M%s" % detectRenames
>>         else:
>> -            diffOpts = ("", "-M")[self.detectRenames]
>> +            diffOpts = ("", "-M")[self.detectRename]
>>

This appears to me to be a bugfix for one of the other patches you
sent, is that right?

If so, maybe you could squash it with the previous patch and re-send
it all to the list?

My other comments for now are:
- you have forgotten to sign off on the patches
- commit messages are normally in imperative rather than past tense
(see Documentation/SubmittingPatches in git)

- In your first patch you wrote:
> The detectRenames option should be set to the desired threshold value.
I'm not sure what threshold value you refer to here, and what values
you can set it to. Am I missing something?
(I'm not very familiar with git rename detection options)
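
For reference, a minimal sketch of how the quoted hunk picks the diff
option, assuming the threshold is a similarity value of the kind that
"git diff -M" accepts (for example "50%"). The function name
rename_diff_opts and its parameters are made up for illustration and are
not part of git-p4:

```python
def rename_diff_opts(detect_renames_cfg, detect_rename_flag):
    """Mirror the quoted hunk: a non-empty config value wins over the flag."""
    if len(detect_renames_cfg) > 0:
        # A configured threshold is appended to -M, e.g. "50%" gives "-M50%".
        return "-M%s" % detect_renames_cfg
    # Indexing a 2-tuple with a bool: False selects "", True selects "-M".
    return ("", "-M")[detect_rename_flag]


print(rename_diff_opts("50%", False))
print(rename_diff_opts("", True))
```

With an empty config value the boolean flag alone decides whether plain
-M is passed.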

I'm a git-p4 user, so I can test your changes and look a bit more at
your code. Having someone verify it could help get the patches
applied.

Thanks for improving git-p4!

Cheers,
Thomas Berg

^ permalink raw reply

* Re: Why git tags are there in git?
From: Gabriel Filion @ 2011-01-28 15:03 UTC (permalink / raw)
  To: "L. Alberto Giménez"; +Cc: vikram2rhyme, git
In-Reply-To: <4D42B623.5060709@sysvalve.es>

On 11-01-28 07:27 AM, "L. Alberto Giménez" wrote:
> On 28/01/2011 12:37, vikram2rhyme wrote:
>>
>> Hello friends
>> I am wondering why tags are there in git. As they are just pointers to
>> commits, we can refer to those commits by SHA sum alone, so why tag them?
> 
> Hi, I tend to find "release-v1" easier than 2cff0e391ab127ae...
> 

In general, tags are used for marking a point in time. That marker won't
move, whereas branches can move with time.

For example, in most projects tags are used to make it easier to refer
to commits that mark official releases. In git.git (git's own
repository), the tag v1.7.3.5 points to the release with the same number.
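
A toy model of the difference, as a sketch only: the ToyRepo class and
the commit ids c1, c2, ... are invented for illustration and say nothing
about how git actually stores refs.

```python
class ToyRepo:
    """Toy model of refs: branches move on commit, tags stay put."""

    def __init__(self):
        self.branches = {"master": None}
        self.tags = {}
        self._count = 0

    def commit(self, branch="master"):
        self._count += 1
        commit_id = "c%d" % self._count
        self.branches[branch] = commit_id  # the branch moves forward
        return commit_id

    def tag(self, name, commit_id):
        self.tags[name] = commit_id  # the tag stays where it was put


repo = ToyRepo()
release = repo.commit()
repo.tag("v1.7.3.5", release)
repo.commit()
repo.commit()
print(repo.tags["v1.7.3.5"], repo.branches["master"])
```

After two more commits the branch has advanced while the tag still names
the release commit.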

-- 
Gabriel Filion

^ permalink raw reply

* Re: [PATCH v3] Sanity-check config variable names
From: Libor Pechacek @ 2011-01-28 14:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <7voc72ge4j.fsf@alter.siamese.dyndns.org>

On Thu 27-01-11 14:45:16, Junio C Hamano wrote:
> Libor Pechacek <lpechacek@suse.cz> writes:
> > Fixed the typo and return values from get_value and git_config_set_multivar.
> > We have changed git_config_parse_key to return negative values on error, but
> > forgot to negate the numbers when returning them as exit codes.
> 
> Earlier get_value() returned:
> 
>  -1: when it saw an uncompilable regexp (either in key or value);
>   0: when a value was available (under --all) or unique (without --all);
>   1: when the requested variable is missing; and
>   2: when the requested variable is multivalued under --all.

Fixed one part with the last change and broke the other one.  Thanks for
catching it.  The same return value for "invalid key" and "invalid regex" is OK
for me.

BTW is it OK to exit(-1);?  The return value of get_value() gets propagated to
the process exit status.  At the same time shell uses values >128 to indicate
that the process was terminated by a signal.
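
A small sketch of what the parent process sees, assuming a POSIX system
where only the low 8 bits of the exit code are reported. The helper name
status_seen_by_shell is hypothetical:

```python
def status_seen_by_shell(exit_code):
    """Low 8 bits of the value passed to exit(), as reported on POSIX."""
    return exit_code & 0xFF


for code in (-1, 0, 1, 2):
    print(code, "->", status_seen_by_shell(code))
```

Under that assumption exit(-1) is reported as 255, which lands in the
range above 128 that the shell also uses for signals.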

[...]
> When moving an existing code segment around like this, I would not mind to
> see style fixes thrown in to the patch, as long as the moved code is small
> enough.  Perhaps like this:

I've added the style fix into the patch.

[...]
> > +/* Auxiliary function to sanity-check and split the key into the section
> 
> 	/*
>        * Style. We write our multi-line comments
> 	 * like this.
>        */

Fixed.

> > +int git_config_parse_key(const char *key, char **store_key, int *baselen_)
[...]
> 
> Does it make sense for this function to be prepared to get called with
> NULL in store_key like this (and in the remainder of your patch)?

No, I wrote it unnecessarily generic.  Removed the excess code.  Thanks for
pointing it out.
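
For readers following along, the kind of check being discussed can be
sketched in Python. This is not the C code from the patch. It assumes
these rules: the section and the variable name may contain only
alphanumerics and '-', the variable name starts with a letter, both are
lowercased, and the subsection is kept verbatim. The name
parse_config_key and its return shape are made up for illustration:

```python
import re


def parse_config_key(key):
    """Return (normalized_key, baselen); raise ValueError on a bad key.

    baselen is the offset of the last dot, i.e. the length of the
    "section[.subsection]" part.
    """
    last_dot = key.rfind(".")
    if last_dot <= 0:
        raise ValueError("key does not contain a section: %s" % key)
    if last_dot == len(key) - 1:
        raise ValueError("key does not contain variable name: %s" % key)

    first_dot = key.find(".")
    section = key[:first_dot]
    subsection = key[first_dot:last_dot]  # "" or ".sub", kept verbatim
    name = key[last_dot + 1:]

    if not re.fullmatch(r"[A-Za-z0-9-]+", section):
        raise ValueError("invalid section: %s" % key)
    if "\n" in subsection:
        raise ValueError("invalid subsection: %s" % key)
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9-]*", name):
        raise ValueError("invalid variable name: %s" % key)

    return section.lower() + subsection + "." + name.lower(), last_dot


print(parse_config_key("Core.Name"))
print(parse_config_key("branch.My.Topic.Merge"))
```

A bare "name", as in the quoted test, has no dot and is rejected by the
first check.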

[...]
> > +	test_must_fail git -c name=value config core.name
> >  '
> 
> Don't you want to make sure that your sanity check triggers in new tests?

Added a few tests after getting familiar with the test suite.

Libor
-- 
Libor Pechacek
SUSE L3 Team, Prague

^ permalink raw reply

* Re: add command clean to git help
From: redstun @ 2011-01-28 14:49 UTC (permalink / raw)
  To: git
In-Reply-To: <AANLkTimnqoNudu66Y+a2R_ttk9ghw7Z-eL1AtcpK=4HB@mail.gmail.com>

Ah, with git --help help I found that 'git help -a' shows all
available commands, probably by default 'git help' should display a
tip about this command for finding all git commands?

Thanks

On Fri, Jan 28, 2011 at 5:43 PM, redstun <redstun@gmail.com> wrote:
> Just realized in git help the command clean is not mentioned, yet
>
> #git --version
> git version 1.7.3.3
>

^ permalink raw reply

* Re: [RFC] Add --create-cache to repack
From: Shawn Pearce @ 2011-01-28 14:37 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git, Junio C Hamano, Nicolas Pitre, John Hawley
In-Reply-To: <4D42878E.2020502@viscovery.net>

On Fri, Jan 28, 2011 at 01:08, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Am 1/28/2011 9:06, schrieb Shawn O. Pearce:
>> A cache pack is all objects reachable from a single commit that is
>> part of the project's stable history and won't disappear, and is
>> accessible to all readers of the repository.  By containing only that
>> commit and its contents, if the commit is reached from a reference we
>> know immediately that the entire pack is also reachable.  To help
>> ensure this is true, the --create-cache flag looks for a commit along
>> refs/heads and refs/tags that is at least 1 month old, working under
>> the assumption that a commit this old won't be rebased or pruned.
>
> In one of my repositories, I have two stable branches and a good score of
> topic branches of various ages (a few hours up to two years 8). The topic
> branches will either be dropped eventually, or rebased.
>
> What are the odds that this choice of a tip commit picks one that is in a
> topic branch? Or is there no point in using --create-cache in a repository
> like this?

Argh, you are right.  It's quite likely this would pick a topic
branch... and that isn't really what is desired.

My original concept here was for distribution point repositories,
which are less likely to have these topic branches that will rebase
and disappear.  Though git.git has one called "pu".  *sigh*

A simple fix is to use --heads --tags by default like I do here, but
make the actual parameters we feed to rev-list configurable.  A
repository owner could select only the master branch as input to
rev-list, making it less likely the topic branches would be
considered.  Unfortunately that requires direct access to the
repository.  It fails for a site like GitHub, where you don't manage
the repository at all.
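
The selection rule with a configurable ref list could be sketched like
this. It is a toy model over in-memory data, not the repack code from the
RFC: pick_cache_tip, the (commit, timestamp) history lists and the 30-day
default are assumptions for illustration.

```python
DAY = 24 * 60 * 60


def pick_cache_tip(histories, allowed_refs, now, min_age_days=30):
    """Return (ref, commit) for the newest commit that is old enough.

    histories maps a ref name to [(commit, unix_time), ...], newest first.
    Only refs listed in allowed_refs are considered.
    """
    cutoff = now - min_age_days * DAY
    best = None
    for ref in allowed_refs:
        for commit, when in histories.get(ref, []):
            if when <= cutoff:
                if best is None or when > best[2]:
                    best = (ref, commit, when)
                break  # anything further back on this ref is older still
    return None if best is None else best[:2]


now = 100 * DAY
histories = {
    "refs/heads/master": [("m3", 99 * DAY), ("m2", 60 * DAY), ("m1", 10 * DAY)],
    "refs/heads/topic": [("t1", 65 * DAY), ("m1", 10 * DAY)],
}
print(pick_cache_tip(histories, ["refs/heads/master", "refs/heads/topic"], now))
print(pick_cache_tip(histories, ["refs/heads/master"], now))
```

With both refs as input the topic branch commit t1 wins, which is the
problem raised above. Restricting the input to master picks m2 instead.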

git.git also is problematic because of the man, html and todo
branches.  Branches that are disconnected from the main history but
are very small (e.g. todo) might be selected instead and create a
nearly useless cache file.  Fortunately disconnected branches could
each have their own cache file (with only the inode overhead of having
an additional 3 files per disconnected branch), and pack-objects could
concat all of those packs together when sending.  It's just a challenge
to identify these branches and keep them from being used for that main
project pack.


This started because I was looking for a way to speed up clones coming
from a JGit server.  Cloning the linux-2.6 repository is painful; it
takes a long time to enumerate the 1.8 million objects.  So I tried
adding a cached list of objects reachable from a given commit, which
speeds up the enumeration phase, but JGit still needs to allocate all
of the working set to track those objects, then go find them in packs
and slice out each compressed form and reformat the headers on the
wire.  It's a lot of redundant work when your kernel repository has
360MB of data that you know a client needs if they have asked for your
master branch with no "have" set.

Later I realized, we can get rid of that cached list of objects and
just use the pack itself.  It's far cleaner, as there is no redundant
cache.  But either way (object list or pack) it's a bit of a challenge
to automatically identify the right starting points to use.  Linus
Torvalds' linux-2.6 repository is the perfect case for the RFC I
posted; it's one branch with all of the history, and it never rewinds.
But maybe Linus is just unique in this world.  :-)

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] Fix wrong xhtml option to highlight
From: Jakub Narebski @ 2011-01-28 12:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Drew Northup, Jochen Schmitt, Adam Tkac, git
In-Reply-To: <7vvd1agoii.fsf@alter.siamese.dyndns.org>

On Thu, 27 Jan 2011, Junio C Hamano wrote:
> Drew Northup <drew.northup@maine.edu> writes:
> 
>>>> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
>>>> index 1025c2f..b662420 100755
>>>> --- a/gitweb/gitweb.perl
>>>> +++ b/gitweb/gitweb.perl
>>>> @@ -3468,7 +3468,7 @@ sub run_highlighter {
>>>>  	close $fd;
>>>>  	open $fd, quote_command(git_cmd(), "cat-file", "blob", $hash)." | ".
>>>>  	          quote_command($highlight_bin).
>>>> -	          " --xhtml --fragment --syntax $syntax |"
>>>> +	          " -xhtml --fragment --syntax $syntax |"
>>> 
>>> Curious.
>>> 
>>> Does the command take double-dash for the fragment and syntax options but
>>> a single dash for the xhtml option?  Really...
>>> 
>>> A few top hits returned by Google for "highlight manual page" tells me
>>> otherwise.
>>
>> Certainly appears to be the case that "--xhtml" is the option in Ubuntu
>> 10.04.1 LTS. 
>>
>> Jochen,
>> Did you mean "-X" (which sets the same option)?
> 
> The current proposal is to drop --xhtml and let highlight default to HTML.
> 
> Honestly speaking, I don't like the approach very much; it would have been
> much better if highlight had a single way that is supported throughout its
> versions to specify the output format.  But it appears that there isn't,
> and relying on and hoping for its default to stay HTML is the best we
> could do, if we plan to support highlight 2.4.something or older.
> 
> The copy of U10.04 I have has highlight 2.12, and according to its manual
> pages, -X, --xhtml, and --out-format=xhtml mean the same thing.  HTML is
> the default.
> 
> The change-log at www.andre-simon.de indicates that --out-format has
> become the preferred method and the short options like -X and -H are not
> supported in recent versions (3.0 beta and newer).
> 
> But as Jakub mentioned, 2.4.5 did not have --out-format; it was only in
> 3.0 beta that -O was redefined to mean --out-format and in old versions
> the short option meant something else.

Well, we can always require highlight >= 2.12, or whatever version
introduced --out-format option.

> 
> What a mess...
> 
> The next time we introduce a new dependency, we really should try hard to
> assess the stability and maturity of that dependency.  In hindsight, I
> think "highlight" was probably a bit too premature to be depended upon.

By the way, the idea was to make it possible to configure another highlighter,
but I went with what I knew to work, i.e. Andre Simon's "highlight".
I think it would be fairly easy to make it configurable via the existing
$highlight_bin and a to-be-introduced @highlight_args gitweb configuration
variable.
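
The shape of that idea can be sketched in Python (gitweb itself is Perl,
so this is only a model). The function highlight_command and its default
argument list are hypothetical. The --fragment, --syntax and
--out-format=xhtml options are the ones mentioned in this thread:

```python
def highlight_command(highlight_bin, syntax, highlight_args=("--fragment",)):
    """Build the argv for the highlighter filter from configurable parts."""
    return [highlight_bin] + list(highlight_args) + ["--syntax", syntax]


# Default: no output-format option, relying on the tool's HTML default.
print(highlight_command("highlight", "perl"))

# A site with a newer highlight could ask for XHTML explicitly.
print(highlight_command("highlight", "perl", ["--out-format=xhtml", "--fragment"]))
```

Keeping the format option out of the default list matches the current
proposal of dropping --xhtml, while leaving room for a per-site override.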

There are three possible ways to do syntax highlighting in gitweb:
filter, Perl module, or via JavaScript.  An alternative to "highlight"
as a filter could be GNU source-highlight... if not for the fact that
it doesn't seem to support an equivalent of the "highlight" --fragment option,
i.e. a way to exclude the prolog and <pre><tt> wrappers.

-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: Why git tags are there in git?
From: "L. Alberto Giménez" @ 2011-01-28 12:27 UTC (permalink / raw)
  To: vikram2rhyme; +Cc: git
In-Reply-To: <1296214676536-5969544.post@n2.nabble.com>

On 28/01/2011 12:37, vikram2rhyme wrote:
>
> Hello friends
> I am wondering why tags are there in git. As they are just pointers to
> commits, we can refer to those commits by SHA sum alone, so why tag them?

Hi, I tend to find "release-v1" easier than 2cff0e391ab127ae...

Regards,
L. Alberto Giménez

^ permalink raw reply

* Re: Permissions and authorisations in git repository
From: Konstantin Khomoutov @ 2011-01-28 12:06 UTC (permalink / raw)
  To: vikram2rhyme; +Cc: git
In-Reply-To: <1296214884133-5969556.post@n2.nabble.com>

On Fri, 28 Jan 2011 03:41:24 -0800 (PST)
vikram2rhyme <vikram2rhyme@gmail.com> wrote:

> I am wondering if there are any permission and authorization control
> over git repository.
[...]

In the simplest case -- r/w access via SSH -- those who know the
login/password or possess the necessary private key have (full) access
to the repository. The whole repository can also be made accessible
read-only via the Git protocol. Together these give a simple split
between read and write access.
If more fine-grained control is needed, third-party tools exist:
  gitolite: https://github.com/sitaramc/gitolite
  gitosis: http://swik.net/gitosis

Note that as Git does not suffer from a centralised VCS syndrome of
having a single repository shared by everyone involved, the problem
you're facing might not exist at all: every developer or a group of
related developers maintains their own repository and a "central"
repository (in whatever sense you're willing to put into it) is owned
by a special person or a group of persons.

^ permalink raw reply

* Re: Permissions and authorisations in git repository
From: Victor Engmark @ 2011-01-28 11:55 UTC (permalink / raw)
  To: git
In-Reply-To: <10431381.57687.1296214887819.JavaMail.trustmail@mail1.terreactive.ch>

On 01/28/2011 12:41 PM, vikram2rhyme wrote:
>
> Hello friends
> I am wondering if there are any permission and authorization controls for
> git repositories. I have gone through the git manual but there is no
> discussion of it. I searched the internet but hardly found anything.
> Please help me if there is any permission control for git repositories
> in a distributed environment.

Git is independent of access control mechanisms: you can use whatever
you want. For example, you could use the filesystem read/write 
permissions on a directory to control local access, or SSH permissions 
to allow remote access. See for example GitHub, which uses different 
protocols for different levels of access.

HTH,
-- 
Victor Engmark

^ permalink raw reply

* Permissions and authorisations in git repository
From: vikram2rhyme @ 2011-01-28 11:41 UTC (permalink / raw)
  To: git


Hello friends
I am wondering if there are any permission and authorization controls for
git repositories. I have gone through the git manual but there is no
discussion of it. I searched the internet but hardly found anything.
Please help me if there is any permission control for git repositories
in a distributed environment.

^ permalink raw reply

