* Re: [PATCH] fast-export: Allow pruned-references in mark file
From: Antoine Pelisse @ 2012-11-26 13:23 UTC (permalink / raw)
To: Felipe Contreras, Junio C Hamano; +Cc: git
In-Reply-To: <CAMP44s0iSkqcOW0YsD=Jm_=x1tuoRbFQ+EbVvkROa_yY2-WFcA@mail.gmail.com>
On Mon, Nov 26, 2012 at 12:37 PM, Felipe Contreras
<felipe.contreras@gmail.com> wrote:
> On Mon, Nov 26, 2012 at 5:03 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Is this a safe and sane thing to do, and if so why? Could you
>> describe that in the log message here?
> Why would fast-export try to export something that was pruned? Doesn't
> that mean it wasn't reachable?
Hello Junio,
Hello Felipe,
Actually the issue happened while using Felipe's branch with his
git-remote-hg. Everything was going fine until I ran git gc (or did it
run automatically, I don't remember), which pruned unreachable
objects. Of course some of the branches I had pushed to the hg remote
had been changed (most likely rebased). Those references no longer exist
in the repository (cleaned by gc), but they still exist in the
mark file, as they were exported earlier. Thus the failure when git
fast-export reads the mark file.
Then, is it safe?
Updating the last_idnum as I do in the patch doesn't work because
if the reference is the last, the number is going to be overwritten
in the next run.
From git's point of view, I guess it is fine. The file is fully read at
the beginning of fast-export and fully written at the end.
The issue is more for git-remote-hg, which keeps track of
matches between git marks and hg commits. The marks are going to
change and be overridden. It will most likely need to read the mark
file to see if a ref has changed, and update its dictionary.
One of the solutions I'm thinking of is to update the mark file
with marks of newly exported objects instead of recreating it,
and leave obsolete references in the file. But of course that is
not acceptable.
Cheers,
Antoine
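
[Editor's sketch] The last_idnum problem described above can be illustrated with a small stand-alone model. This is not the fast-export code: `load_marks`, `dump_marks` and the in-memory `existing` set are hypothetical names, and the only thing taken from git is the marks file layout of one `:<idnum> <sha1>` pair per line. The sketch assumes stale marks are skipped on import and that the file is fully rewritten on export.

```python
def load_marks(lines, object_exists):
    """Parse ':<idnum> <sha1>' lines, skipping marks whose object is gone."""
    marks = {}
    last_idnum = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        mark, sha1 = line.split()
        idnum = int(mark.lstrip(":"))
        # Track the highest number seen, even for stale entries,
        # so that this run does not hand the same number out again.
        last_idnum = max(last_idnum, idnum)
        if object_exists(sha1):
            marks[idnum] = sha1
    return marks, last_idnum


def dump_marks(marks):
    """Write the marks back out; stale entries are simply not there."""
    return [f":{idnum} {sha1}" for idnum, sha1 in sorted(marks.items())]


# Hypothetical object store: the commit behind mark :3 was pruned.
existing = {"a" * 40, "b" * 40}
first_run = [f":1 {'a' * 40}", f":2 {'b' * 40}", f":3 {'c' * 40}"]

marks, last_idnum = load_marks(first_run, existing.__contains__)
rewritten = dump_marks(marks)

# The next run only sees the rewritten file, so it no longer knows
# that number 3 was ever used and is free to assign it to a new object.
_, next_last_idnum = load_marks(rewritten, existing.__contains__)
print(last_idnum, next_last_idnum)
```

Under these assumptions the first run remembers the stale number, but the second run does not, which is why a tool that maps mark numbers to foreign commits would see a number change meaning.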
* Re: Python extension commands in git - request for policy change
From: Felipe Contreras @ 2012-11-26 13:17 UTC (permalink / raw)
To: esr; +Cc: Michael Haggerty, git
In-Reply-To: <20121125221126.GB6937@thyrsus.com>
On Sun, Nov 25, 2012 at 11:11 PM, Eric S. Raymond <esr@thyrsus.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com>:
>> And are you going to be around to spot them? It seems my patches for
>> git-remote-hg slipped by your watch, because it seems they use stuff
>> specific to python 2.7.
>
> The dev group hasn't decided (in whatever way it decides these
> things) to require 2.6 yet. When and if it does, I will volunteer my
> services as a Python expert to audit the in-tree Python code for 2.6
> conformance and assist the developers in backporting if required.
> I will also make myself available to audit future submissions.
What dev group?
> I think you know who I am. Junio and the other senior devs certainly
> know where to find me. I've been making promises like this, and
> *keeping* them, for decades. Please stop wasting our time with
> petulant display.
All right, you don't want feedback, fine.
If you need me I'll be rewriting python code to ruby.
Cheers.
--
Felipe Contreras
* Re: Python extension commands in git - request for policy change
From: Felipe Contreras @ 2012-11-26 13:11 UTC (permalink / raw)
To: esr; +Cc: Nguyen Thai Ngoc Duy, git, msysGit
In-Reply-To: <20121125215635.GA6937@thyrsus.com>
On Sun, Nov 25, 2012 at 10:56 PM, Eric S. Raymond <esr@thyrsus.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com>:
>> And gitk is an integral part of git. But if you have different
>> numbers, what are they?
>
> I looked at the Makefile. I saw that there are shell variables that collect
> C commands, shell command, Perl commands, and Python commands. There are no
> collections of other commands.
I suppose you are talking about BUILT_INS, and SCRIPT_FOO, but tcl
scripts don't need no SCRIPT_FOO stuff, because they don't need to be
regenerated, in fact, I don't think the shell scripts need to be
regenerated, but that's not important, what is important is this:
all::
ifndef NO_TCLTK
	$(QUIET_SUBDIR0)git-gui $(QUIET_SUBDIR1) gitexecdir='$(gitexec_instdir_SQ)' all
	$(QUIET_SUBDIR0)gitk-git $(QUIET_SUBDIR1) all
endif
Why are you ignoring that?
> That makes them the top languages in the universe we are concerned about
According to whom?
I find it very curious how you are arguing for a change in the status
quo to move more towards python, and the basis you are using for
choosing python, and not ruby, or other scripting language is
precisely the status quo. However, the only scripts using python are
these:
SCRIPT_PYTHON += git-remote-testgit.py
SCRIPT_PYTHON += git-p4.py
I already re-wrote git-remote-testgit in bash, and it's dubious
whether or not git-remote-testpy (the new name for this old test) will
fulfill any service. That means 43% of the current python code is
gone.
And what happens if I rewrite git-p4 in ruby? Would you then argue
that ruby is the way to go because 1% of the *current* code-base uses
it?
Interestingly, according to Wikipedia git is written in: C, Bourne
Shell, Tcl, Perl. That seems to be the case.
> Please don't waste further time on quibbling. We all know that gitk is
> an uncomfortable special case and that the project would be far better
> off, maintainability-wise, if it were successfully ported to one of these
> other languages.
As I said, gitk is integral to the git experience. Of course you are
free to disagree, but according to the last user survey 57% of the
responders used some kind of graphical tool (e.g. gitk, tig). How many
use gitk, and how many use something else, we don't really know, but
what we know is that gitk is distributed *by default*.
Nobody is arguing that gitk should not be distributed by default, just
like nobody is arguing that git-p4 shouldn't, but we *know* very few
people use git-p4 (1% according to the survey), and we can reasonably
assume that many more use gitk.
You cannot have your cake and eat it at the same time. If you use the
amount of code as a measure, then you have to agree that Tcl/Tk is a
way bigger language than python in the mainline git world. If not,
then by all means, show us the numbers. But you can't say "the
important languages are A, B, and D, C doesn't count because I don't
like it, and E doesn't count either because we should draw the line at
three", that seems awfully convenient to push your agenda.
And I don't agree that the project would be better off with something
else, if it was, somebody would have proposed an alternative by now,
but there aren't any. I have tried gitg, and giggle, and I have even
contributed to them, but they are just not as good and useful as plain
old gitk, I always keep coming back.
gitk:
* is blazing fast to start
* doesn't have a lot of dependencies: just tcl/tk
* works on Windows without a major fuss
* is actively maintained
* shows a proper graph (unlike gitg or giggle)
Now, show me an alternative that fulfills all these points. And I'm
pretty sure you won't find one, because if you did, it would have been
already proposed for mainline git... there isn't any. And if you did,
we would start with oh, but it's GTK+, or it's Qt, and how do you make
it work on Windows. No, gitk is just fine, and works great.
Tcl/Tk might not be your cup of tea, and indeed it's rather unmodern,
but that only tells you what an awful job the modern toolkits have done
with regards to portability and flexibility.
You were arguing for portability, well, Tcl/Tk works on all platforms,
here, have a look, there's no other tool that fulfills this:
http://www.mediawiki.org/wiki/Git/Graphical_User_Interfaces
> Trying to catch me out by triumphantly pointing at gitk is...juvenile.
Isn't that what you are doing by triumphantly pointing at git-p4?
Sorry, if you want to cut the line for the languages that git should
use right now at three, then python is out. Maybe in a couple of
years. Maybe. But I doubt it.
Cheers.
--
Felipe Contreras
* Re: commit gone after merge - how to debug?
From: Tomas Carnecky @ 2012-11-26 13:10 UTC (permalink / raw)
To: Igor Lautar, git
In-Reply-To: <CAO1Khk_eugH--wp3s-gr4HTvuRyL=SaWHWtEXCRZ_Ak7+s5U=w@mail.gmail.com>
On Mon, 26 Nov 2012 14:06:09 +0100, Igor Lautar <igor.lautar@gmail.com> wrote:
> git log <file modified by commit>
> - commit NOT shown in file history any more and file does not have this change
does `git log --full-history <file modified by commit>` show the commit?
* commit gone after merge - how to debug?
From: Igor Lautar @ 2012-11-26 13:06 UTC (permalink / raw)
To: git
Hi,
This looks really weird and I cannot explain why it occurs.
Setup is as follows:
- origin
- mirror
- local clone
Reference repository is origin from where builds are done etc.
Parallel to that, we keep a mirror that is synced manually
(fetch/merge/push).
I do this from my local clone (which is mostly just tracking origin
and mirror, no local branches).
What happened is that after a merge of mirror/master into local
master, a commit (that also exists on origin/master) is lost.
Lost as in:
pre-merge:
git log <file modified by commit>
- commit shown in history
git merge mirror/master
- no conflicts
git log <file modified by commit>
- commit NOT shown in file history any more and file does not have this change
Doing git log shows commit as being present in repository history. One
interesting point is that one of the parents is previous merge commit
of same branches.
Unfortunately, I cannot open up repository for public access, but
would appreciate any pointers how to debug this.
git fsck finds some dangling blobs/commits, but no other
warnings/errors, I can clone repo just fine, everything seems in
order.
How can I debug what the merge is doing?
git version 1.7.12.1 on mac:
Darwin 12.2.0 Darwin Kernel Version 12.2.0: Sat Aug 25 00:48:52 PDT
2012; root:xnu-2050.18.24~1/RELEASE_X86_64 x86_64
Regards,
Igor
PS. please keep me in CC, I'm not on list
* Re: Can I zip a git repo without losing anything?
From: Konstantin Khomoutov @ 2012-11-26 13:04 UTC (permalink / raw)
To: Carl Smith; +Cc: git
In-Reply-To: <CAP-uhDcQg0BuEdHDTa6qVqLCeB-LE1GZtEqHgY_j1--XodsDKw@mail.gmail.com>
On Mon, 26 Nov 2012 04:55:10 +0000
Carl Smith <carl.input@gmail.com> wrote:
> After suggesting using zip files to move our projects around, I was
> told that you can not zip a git repo without losing all the history.
> This didn't make sense to me, but two people told me the same thing,
> so I wasn't sure. I think they may have been confusing the zipped file
> you can download from GitHub with a zipped git repo.
>
> If someone could put me straight on this, I'd really appreciate it.
To amend the already provided answer -- if you need to move repos
around using the sneakernet, the tool you should probably use is
`git bundle`.
* Re: [RFC/PATCH] Option to revert order of parents in merge commit
From: Kacper Kornet @ 2012-11-26 12:42 UTC (permalink / raw)
To: Junio C Hamano, git
In-Reply-To: <7vfw3zzoye.fsf@alter.siamese.dyndns.org>
On Fri, Nov 23, 2012 at 06:58:49PM -0800, Junio C Hamano wrote:
> Kacper Kornet <draenog@pld-linux.org> writes:
> > The following patch is an attempt to implement this idea.
> I think "revert" is a wrong word (implying you have already done
> something and you are trying to defeat the effect of that old
> something), and you meant to say "reverse" (i.e. the opposite of
> normal) or something.
You are right. Probably transpose is the best description of what the
patch really does.
> I am unsure about the usefulness of this, though.
> After completing a topic on branch A, you would merge it to your own
> copy of the integration branch (e.g. 'master') and try to push,
> which may be rejected due to non-fast-forwardness:
> $ git checkout master
> $ git merge A
> $ git push
> At that point, if you _care_ about the merge parent order, you could
> do this (still on 'master'):
> $ git fetch origin
> $ git reset --hard origin/master
> $ git merge A
> $ test test test
> $ git push
> With --reverse-parents, it would become:
> $ git pull --reverse-parents
> $ test test test
> $ git push
> which certainly is shorter and looks simpler. The workflow however
> would encourage people to work directly on the master branch, which
> is a bit of downside.
Our developers work mainly on master branches. The project consists of
many thousands independent git repositories, and at the given time a
developer usually wants to make only one commit in the given repository
and push his changes upstream. So he usually doesn't care to make a
branch. Then after a failed push, one needs to add creation and removal
of a temporary branch (see the commit message of the suggested patch).
The possibility to do git pull --reverse-parents would make life
easier in this case.
> Is there any interaction between this "pull --reverse-parents"
> change and possible conflict resolution when the command stops and
> asks the user for help? For example, whom should "--ours" and "-X
> ours" refer to? Us, or the upstream?
The change of order of parents happens at the very last moment, so
"ours" in merge options is local version and "theirs" upstream.
--
Kacper Kornet
* Re: [RFC] pack-objects: compression level for non-blobs
From: David Michael Barr @ 2012-11-26 12:35 UTC (permalink / raw)
To: Git Mailing List; +Cc: David Michael Barr
In-Reply-To: <1353911154-23495-1-git-send-email-b@rr-dav.id.au>
> Add config pack.graphcompression similar to pack.compression.
> Applies to non-blob objects and if unspecified falls back to pack.compression.
>
> We may identify objects compressed with level 0 by their leading bytes.
> Use this to force recompression when the source and target levels mismatch.
> Limit its application to when the config pack.graphcompression is set.
>
> Signed-off-by: David Michael Barr <b@rr-dav.id.au>
> ---
> builtin/pack-objects.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 45 insertions(+), 4 deletions(-)
>
> I started working on this just before taking a vacation,
> so it's been a little while coming.
>
> The intent is to allow selective recompression of pack data.
> For small objects/deltas the overhead of deflate is significant.
> This may improve read performance for the object graph.
>
> I ran some unscientific experiments with the chromium repository.
> With pack.graphcompression = 0, there was a 2.7% increase in pack size.
> I saw a 35% improvement with cold caches and 43% otherwise on git log --raw.
I neglected to mention that this is a WIP. I get failures with certain repositories:
fatal: delta size changed
--
David Michael Barr
* Re: [PATCH v5 15/15] fast-export: don't handle uninteresting refs
From: Felipe Contreras @ 2012-11-26 12:16 UTC (permalink / raw)
To: Junio C Hamano
Cc: Jeff King, Jonathan Nieder, git, Johannes Schindelin, Max Horn,
Sverre Rabbelier, Brandon Casey, Brandon Casey, Ilari Liusvaara,
Pete Wyckoff, Ben Walton, Matthieu Moy, Julian Phillips
In-Reply-To: <7v7gp9udsl.fsf@alter.siamese.dyndns.org>
On Mon, Nov 26, 2012 at 6:35 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> On Wed, Nov 21, 2012 at 8:48 PM, Jeff King <peff@peff.net> wrote:
>> ...
>> I would like to understand what that even means. What behavior is
>> currently broken?
>
> I do not know if this is the same as what Peff was referring to, but
> I found this message in the discussion thread during my absence.
>
> From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Subject: Re: [PATCH v3 4/4] fast-export: make sure refs are updated properly
> Date: Fri, 2 Nov 2012 16:17:14 +0100 (CET)
> Message-ID: <alpine.DEB.1.00.1211021612320.7256@s15462909.onlinehome-server.info>
>
> (which is $gmane/208946) that says:
>
> Note that
>
> $ git branch foo master~1
> $ git fast-export foo master~1..master
>
> still does not update the "foo" ref, but a partial fix is better
> than no fix.
First of all, do we agree that this patch does not change the
situation for this command? If so, I don't see why that would be
relevant while discussing this patch series.
Second, this is what I get:
% git log --decorate --oneline foo master~1..master
8c7a786 (tag: v1.8.0, master) Git 1.8.0
Notice that 'foo' is not there? It's not there because we explicitly
stated that we didn't want it there.
And what do you expect that command to do with 'foo'? To throw a
'reset refs/heads/foo'? To what commit? There is no mark for that
commit. 'reset :0'? That doesn't help anybody. No, that command is not
broken, it works as expected.
Notice the situation would be different with 'git fast-export
--import-marks=marks foo master~1..master', because if there's a mark
for foo, *now* we can do something about it. This particular patch
series doesn't, but the next one does.
Cheers.
--
Felipe Contreras
* Re: [PATCH] fast-export: Allow pruned-references in mark file
From: Felipe Contreras @ 2012-11-26 11:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Antoine Pelisse, git
In-Reply-To: <7vd2z1xb6c.fsf@alter.siamese.dyndns.org>
On Mon, Nov 26, 2012 at 5:03 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Antoine Pelisse <apelisse@gmail.com> writes:
>
>> fast-export can fail because of some pruned-reference when importing a
>> mark file.
>>
>> The problem happens in the following scenario:
>>
>> $ git fast-export --export-marks=MARKS master
>> (rewrite master)
>> $ git prune
>> $ git fast-export --import-marks=MARKS master
>>
>> This might fail if some references have been removed by prune
>> because some marks will refer to non-existing commits.
>>
>> Let's warn when we have a mark for a commit we don't know.
>> Also, increment the last_idnum before, so we don't override
>> the mark.
>
> Is this a safe and sane thing to do, and if so why? Could you
> describe that in the log message here?
Why would fast-export try to export something that was pruned? Doesn't
that mean it wasn't reachable?
Essentially, if 'git rev-list $foo' can't possibly export this pruned
object, why would 'git fast-export $foo'?
Cheers.
--
Felipe Contreras
* Re: [PATCH v3 0/7] New remote-bzr remote helper
From: Felipe Contreras @ 2012-11-26 11:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jeff King, Johannes Schindelin
In-Reply-To: <7v4nkdxawx.fsf@alter.siamese.dyndns.org>
On Mon, Nov 26, 2012 at 5:09 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Felipe Contreras <felipe.contreras@gmail.com> writes:
>
>> On Sun, Nov 11, 2012 at 3:19 PM, Felipe Contreras
>> <felipe.contreras@gmail.com> wrote:
>>> This is a re-roll of the previous series to add support to fetch and push
>>> special modes, and refactor some related code.
>>
>> It seems this one got forgotten, I only see v2 in pu.
>
> Oops; I think that fell through the cracks during the maintainer
> hand-off. As the previous one has already been cooking in 'next'
> for a week or so, I would appreciate if you send incremental updates
> to fix or enhance what is in there.
Yes, that's what I have planned for the next patches, as I already did
for remote-hg, but the changes in remote-bzr were a bit bigger.
--
Felipe Contreras
* Re: [PATCH] Fix bash completion when `egrep` is aliased to `egrep --color=always`
From: Frans Klaver @ 2012-11-26 11:30 UTC (permalink / raw)
To: Adam Tkac; +Cc: Marc Khouzam, git
In-Reply-To: <20121126112352.GA4481@redhat.com>
On Mon, Nov 26, 2012 at 12:23 PM, Adam Tkac <atkac@redhat.com> wrote:
> Good idea, thanks. Improved patch is attached.
It is customary on this list to mail the patches, rather than attaching
them, so people can review your changes in-line. You can read more
about it in git/Documentation/SubmittingPatches.
Cheers,
Frans
* Re: [PATCH] Fix bash completion when `egrep` is aliased to `egrep --color=always`
From: Adam Tkac @ 2012-11-26 11:23 UTC (permalink / raw)
To: Marc Khouzam; +Cc: git
In-Reply-To: <CAFj1UpG6H3bpoa7xbqpH6Hyb6pwqE_CCgP6iT36D-ELvtVi4wA@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 597 bytes --]
On Thu, Nov 22, 2012 at 02:55:21PM -0500, Marc Khouzam wrote:
> On Thu, Nov 22, 2012 at 10:41 AM, Adam Tkac <atkac@redhat.com> wrote:
> > Hello all,
> >
> > attached patch fixes bash completion when `egrep` is aliased to `egrep --color=always`.
>
> To avoid any aliases, it may be better to use
> \egrep
Good idea, thanks. Improved patch is attached.
Regards, Adam
>
> This could be worthwhile for all utilities used by the script.
>
> Just a thought.
>
> Marc
>
>
> >
> > Comments are welcomed.
> >
> > Regards, Adam
> >
> > --
> > Adam Tkac, Red Hat, Inc.
--
Adam Tkac, Red Hat, Inc.
[-- Attachment #2: 0001-If-egrep-is-aliased-temporary-disable-it-in-bash.com.patch --]
[-- Type: text/plain, Size: 979 bytes --]
>From 255192296cd175fddcac2647447a66a0ca55b855 Mon Sep 17 00:00:00 2001
From: Adam Tkac <atkac@redhat.com>
Date: Thu, 22 Nov 2012 16:34:58 +0100
Subject: [PATCH] If `egrep` is aliased, temporary disable it in
bash.completion
Originally reported as https://bugzilla.redhat.com/show_bug.cgi?id=863780
Signed-off-by: Adam Tkac <atkac@redhat.com>
Signed-off-by: Holger Arnold <holgerar@gmail.com>
---
contrib/completion/git-completion.bash | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 0960acc..79073c2 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -565,7 +565,7 @@ __git_complete_strategy ()
__git_list_all_commands ()
{
local i IFS=" "$'\n'
- for i in $(git help -a|egrep '^ [a-zA-Z0-9]')
+ for i in $(git help -a| \egrep '^ [a-zA-Z0-9]')
do
case $i in
*--*) : helper pattern;;
--
1.8.0
* Re: Python extension commands in git - request for policy change
From: Andreas Ericsson @ 2012-11-26 11:05 UTC (permalink / raw)
To: esr; +Cc: Felipe Contreras, git
In-Reply-To: <20121125224443.GC6937@thyrsus.com>
On 11/25/2012 11:44 PM, Eric S. Raymond wrote:
> Felipe Contreras <felipe.contreras@gmail.com>:
>> According to the results of the last survey, our users do care about
>> performance, so I don't think there's anything excessive about it. Are
>> there any hidden costs in maintenance problems? I don't think so.
>
> Then you're either pretending or very naive. Three decades of
> experience as a C programmer tells me that C code at any volume is a
> *serious* maintenance problem relative to almost any language with
> GC. Prudent architects confine it as much as possible.
>
Prudent architects also avoid rewrites as much as possible, since it's
inevitable that bugs will be introduced that have been fixed in the
"official" version.
Personally, I think if you'd left your suggestion on "It would be great
to have guidelines for python scripts. I propose 2.6 as the minimum
required python version" and left it at that, there would have been
very little disagreement. The suggestion that things should be rewritten
in python for some spurious long-term savings seems mostly designed to
refuel everyone's favourite flamethrower, and you know as well as I do
that it just won't happen unless there's at least a chance of some
substantial technical benefits from doing so.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
* Re: Requirements for integrating a new git subcommand
From: Peter Krefting @ 2012-11-26 11:01 UTC (permalink / raw)
To: Eric S. Raymond; +Cc: git
In-Reply-To: <20121123153541.GA20097@thyrsus.com>
Eric S. Raymond:
> and (b) include the removal of import-directories.perl in my
> integration patch.
Yes, please.
--
\\// Peter - http://www.softwolves.pp.se/
* Re: Multiple threads of compression
From: Thorsten Glaser @ 2012-11-26 9:59 UTC (permalink / raw)
To: Brandon Casey; +Cc: git@vger.kernel.org
In-Reply-To: <CA+sFfMerMWnJwiBz0=MxO0gn8B2M8g12mt5VzZaRj591oMPVow@mail.gmail.com>
Brandon Casey dixit:
>The number of threads that pack uses can be configured in the global
>or system gitconfig file by setting pack.threads.
[…]
>The other setting you should probably look at is pack.windowMemory
>which should help you control the amount of memory git uses while
>packing. Also look at core.packedGitWindowSize and
>core.packedGitLimit if your repository is really large.
OK, thanks a lot!
I can’t really say much about the repositories beforehand
because it’s a generic code hosting platform, several instances
of which we run at my employer’s place (I also run one privately
now), and which is also run by e.g. Debian. But I’ll try to figure
out some somewhat sensible defaults.
>Running 'git gc' with --aggressive should be as safe as running it
>without --aggressive.
OK, thanks.
>But, you should think about whether you really need to run it more
>than once, or at all. When you use --aggressive, git will perform the
[…]
Great explanation!
I think that I’d want to run it once, after the repository has
been begun to be used (probably not correct English but you know
what I want to say), but have to figure out a way to do so… but
I’ll just leave out the --aggressive from the cronjob then.
Much appreciated,
//mirabilos
--
Sometimes they [people] care too much: pretty printers [and syntax highligh-
ting, d.A.] mechanically produce pretty output that accentuates irrelevant
detail in the program, which is as sensible as putting all the prepositions
in English text in bold font. -- Rob Pike in "Notes on Programming in C"
* Re: [PATCH] diff: Fixes shortstat number of files
From: Antoine Pelisse @ 2012-11-26 9:10 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Linus Torvalds
In-Reply-To: <7vhaodxctq.fsf@alter.siamese.dyndns.org>
Indeed stat seems to be broken on master by commit 74faaa16 from Linus Torvalds.
There are three separate issues here:
- unmerged files are marked as "interesting" in stat and probably
shouldn't, with some patch like this:
     data->is_interesting = p->status != 0;
     if (!one || !two) {
         data->is_unmerged = 1;
+        data->is_interesting = 0;
         return;
     }
By the way, I don't get the point of this code then:
    else if (data->files[i]->is_unmerged) {
        fprintf(options->file, "%s", line_prefix);
        show_name(options->file, prefix, name, len);
        fprintf(options->file, " Unmerged\n");
        continue;
    }
and
    if (file->is_unmerged) {
        /* "Unmerged" is 8 characters */
        bin_width = bin_width < 8 ? 8 : bin_width;
        continue;
    }
Are we ever supposed to print that? I feel like it could be removed.
- Unmerged files are not filtered out in shortstat, thus counted
twice (addressed by the patch)
- no file has ever been filtered out of numstat, and probably should
be, the way it's done in stat. That is with something like this:
    if (!data->files[i]->is_interesting &&
        (added + deleted == 0)) {
        continue;
    }
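
[Editor's sketch] The double counting in the second item can be modelled outside of git. The following is only an illustration under stated assumptions: `FileStat` and `shortstat` are hypothetical names, not the diff.c structures, and the model assumes (as the forwarded reply below also observes) that an unmerged path appears as two entries, one flagged unmerged and one carrying the actual line counts.

```python
from dataclasses import dataclass


@dataclass
class FileStat:
    name: str
    added: int = 0
    deleted: int = 0
    is_unmerged: bool = False


def shortstat(files):
    """Return (files changed, insertions, deletions), skipping unmerged entries."""
    counted = [f for f in files if not f.is_unmerged]
    return (
        len(counted),
        sum(f.added for f in counted),
        sum(f.deleted for f in counted),
    )


# Assumed queue: a.c is unmerged, so it shows up twice.
queue = [
    FileStat("a.c", is_unmerged=True),
    FileStat("a.c", added=3, deleted=1),
    FileStat("b.c", added=2),
]
print(len(queue), shortstat(queue))
```

Counting every entry in the queue would report three files for two paths; filtering on the unmerged flag brings the file count back in line with the paths actually touched.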
Cheers,
Antoine Pelisse
---------- Forwarded message ----------
From: Junio C Hamano <gitster@pobox.com>
Date: Mon, Nov 26, 2012 at 4:28 AM
Subject: Re: [PATCH] diff: Fixes shortstat number of files
To: Antoine Pelisse <apelisse@gmail.com>
Cc: git@vger.kernel.org
Antoine Pelisse <apelisse@gmail.com> writes:
> Subject: Re: [PATCH] diff: Fixes shortstat number of files
Please replace "Fixes" with "Fix" at least (because our log messages
are written as if a patch is giving an order to the codebase, iow,
in imperative mood), but we would prefer to see a concrete
description on what is fixed, when we can. And in this case, I
think we can, perhaps:
diff: do not count unmerged paths twice in --shortstat/--numstat
or something.
> There is a discrepancy between the last line of `git diff --stat`
> and `git diff --shortstat` in case of a merge.
> The unmerged files are actually counted twice, thus doubling the
> value of "file changed".
I think the current 'master' and upward is broken with respect to
this; I am consistently getting two entries for unmerged paths
across --stat, --shortstat and --numstat options (iow, not just
shortstat and numstat but the '--stat' seems to be broken as well).
Thanks.
* Re: Python extension commands in git - request for policy change
From: Krzysztof Mazur @ 2012-11-26 8:32 UTC (permalink / raw)
To: Sitaram Chamarty; +Cc: esr, git
In-Reply-To: <CAMK1S_g2jpa+VqnuzhNaBNkC5bJHwbEy1iP-=sG29FFKmjTjpw@mail.gmail.com>
On Mon, Nov 26, 2012 at 10:40:00AM +0530, Sitaram Chamarty wrote:
> On Mon, Nov 26, 2012 at 4:17 AM, Eric S. Raymond <esr@thyrsus.com> wrote:
> > Krzysztof Mazur <krzysiek@podlesie.net>:
> >> What about embedded systems? git is also useful there. C and shell is
> >> everywhere, python is not.
> >
> > Supposing this is true (and I question it with regard to shell) if you
> > tell me how you live without gitk and the Perl pieces I'll play that
> > right back at you as your answer.
>
> gitk is unlikely to be used on an embedded system, the perl pieces more so.
Currently even perl is used only for a few very high level commands that
are not really needed there. I think that python is ok for pieces
that use perl now, but I think that it shouldn't be used for
basic porcelain commands. I also don't think that we should prefer
python over other languages and especially I don't think that some
existing code should be rewritten to python.
Even if python is really better, I think that the natural migration is
much better.
>
> I have never understood why people complain about readability in perl.
> Just because it uses the ascii charset a bit more? You expect a
> mathematician or indeed any scientist to use special symbols to mean
> special things, why not programmers?
>
Because some perl programmers really create write-only code, but the
maintainer can just reject that code. It's not the language that
creates non-readable code, but the programmer. I think that the perl
code in git is readable, at least in the parts I've seen.
Krzysiek
* Re: [PATCH] Third try at documenting command integration requirements.
From: Perry Hutchison @ 2012-11-26 8:07 UTC (permalink / raw)
To: esr; +Cc: git
In-Reply-To: <20121126053557.E56434065F@snark.thyrsus.com>
esr@thyrsus.com (Eric S. Raymond) wrote:
> This document contains no new policies or proposals; it attempts
> to document established practices and interface requirements.
...
> +4. If your command has any dependency on a a particular version of
^^^
typo. (granted this is an extreme nit)
* Re: [PATCH 00/11] alternative unify appending of sob
From: Nguyen Thai Ngoc Duy @ 2012-11-26 7:56 UTC (permalink / raw)
To: Brandon Casey; +Cc: git, gitster, Brandon Casey
In-Reply-To: <1353894359-6733-1-git-send-email-drafnel@gmail.com>
On Mon, Nov 26, 2012 at 8:45 AM, Brandon Casey <drafnel@gmail.com> wrote:
> From: Brandon Casey <bcasey@nvidia.com>
>
> I hate to have the battle of the patches, but I kinda prefer the
> append_signoff code in sequencer.c over the code in log-tree.c as a
> foundation to build on.
>
> So, this series is similar to Duy's "unification" series, but it goes in the
> opposite direction and builds on top of sequencer.c and additionally adds the
> elements from my original series to treat the "(cherry picked from..." line
> added by 'cherry-pick -x' in the same way that other footer elements are
> treated (after addressing Junio's comments about rfc2822 continuation lines
> and duplicate s-o-b's).
>
> I've integrated Duy's series with a few minor tweaks. I added a couple
> of additional tests to t4014 and corrected one of the tests which had
> incorrect behavior. I think his sign-off's should still be valid, so I
> kept them in. Sorry that I've been slow, and now the two of us are stepping
> on each other's toes, but Duy please take a look and let me know if there's
> anything you dislike.
I'm still not sure whether format-patch should follow cherry-pick's
rule when appending the sob. If it does, it probably makes more sense to
fix the sequencer.c code and then delete the log-tree.c code (rather than
fix log-tree.c and then delete it). I see that your changes pass all the
new tests I added for format-patch, so sequencer.c is probably good
enough and the log-tree.c changes are probably not needed. Feel free to
take over the series :-)
--
Duy
^ permalink raw reply
* [RFC] pack-objects: compression level for non-blobs
From: David Michael Barr @ 2012-11-26 6:25 UTC (permalink / raw)
To: Git Mailing List; +Cc: David Michael Barr
Add config pack.graphcompression similar to pack.compression.
Applies to non-blob objects and if unspecified falls back to pack.compression.
We may identify objects compressed with level 0 by their leading bytes.
Use this to force recompression when the source and target levels mismatch.
Limit its application to when the config pack.graphcompression is set.
Signed-off-by: David Michael Barr <b@rr-dav.id.au>
---
builtin/pack-objects.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 45 insertions(+), 4 deletions(-)
I started working on this just before taking a vacation,
so it's been a little while coming.
The intent is to allow selective recompression of pack data.
For small objects/deltas the overhead of deflate is significant.
This may improve read performance for the object graph.
I ran some unscientific experiments with the chromium repository.
With pack.graphcompression = 0, there was a 2.7% increase in pack size.
I saw a 35% improvement with cold caches and 43% otherwise on git log --raw.
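For anyone who wants to try the leading-bytes check outside pack-objects, here is a minimal Python sketch (not part of the patch). It assumes the data is a plain zlib stream: bytes 0-1 are the zlib header and byte 2 starts the first deflate block, whose BTYPE field (bits 1-2) is zero for a stored block. The name looks_deflated and the sample payload are made up for illustration.

```python
import zlib


def looks_deflated(stream):
    """Mirror the patch's test: a non-zero BTYPE in byte 2 means Huffman-coded data."""
    if len(stream) < 3:
        return False
    # Bit 0 of byte 2 is BFINAL; bits 1-2 are BTYPE (00 = stored, i.e. level 0).
    return bool(stream[2] & 0x6)


# Hypothetical payload: repetitive text, so level 6 will not fall back to stored blocks.
payload = b"tree 0123456789abcdef\n" * 8
stored = zlib.compress(payload, 0)
packed = zlib.compress(payload, 6)
print(looks_deflated(stored), looks_deflated(packed))
```

Note that zlib may also emit stored blocks at non-zero levels when the input does not compress, so this is a heuristic rather than a reliable way to recover the level.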
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..9518daf 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -40,6 +40,7 @@ struct object_entry {
unsigned long z_delta_size; /* delta data size (compressed) */
unsigned int hash; /* name hint hash */
enum object_type type;
+ enum object_type actual_type;
enum object_type in_pack_type; /* could be delta */
unsigned char in_pack_header_size;
unsigned char preferred_base; /* we do not pack this, but is available
@@ -81,6 +82,8 @@ static int num_preferred_base;
static struct progress *progress_state;
static int pack_compression_level = Z_DEFAULT_COMPRESSION;
static int pack_compression_seen;
+static int pack_graph_compression_level = Z_DEFAULT_COMPRESSION;
+static int pack_graph_compression_seen;
static unsigned long delta_cache_size = 0;
static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
@@ -125,14 +128,14 @@ static void *get_delta(struct object_entry *entry)
return delta_buf;
}
-static unsigned long do_compress(void **pptr, unsigned long size)
+static unsigned long do_compress(void **pptr, unsigned long size, int level)
{
git_zstream stream;
void *in, *out;
unsigned long maxsize;
memset(&stream, 0, sizeof(stream));
- git_deflate_init(&stream, pack_compression_level);
+ git_deflate_init(&stream, level);
maxsize = git_deflate_bound(&stream, size);
in = *pptr;
@@ -191,6 +194,18 @@ static unsigned long write_large_blob_data(struct git_istream *st, struct sha1fi
return olen;
}
+static int check_pack_compressed(struct packed_git *p,
+ struct pack_window **w_curs,
+ off_t offset)
+{
+ unsigned long avail;
+ int compressed = 0;
+ unsigned char *in = use_pack(p, w_curs, offset, &avail);
+ if (avail >= 3)
+ compressed = !!(in[2] & 0x6);
+ return compressed;
+}
+
/*
* we are going to reuse the existing object data as is. make
* sure it is not corrupt.
@@ -240,6 +255,8 @@ static void copy_pack_data(struct sha1file *f,
}
}
+#define compression_level(type) ((type) && (type) != OBJ_BLOB ? pack_graph_compression_level : pack_compression_level)
+
/* Return 0 if we will bust the pack-size limit */
static unsigned long write_no_reuse_object(struct sha1file *f, struct object_entry *entry,
unsigned long limit, int usable_delta)
@@ -286,7 +303,7 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
else if (entry->z_delta_size)
datalen = entry->z_delta_size;
else
- datalen = do_compress(&buf, size);
+ datalen = do_compress(&buf, size, compression_level(entry->actual_type));
/*
* The object header is a byte of 'type' followed by zero or
@@ -379,6 +396,13 @@ static unsigned long write_reuse_object(struct sha1file *f, struct object_entry
offset += entry->in_pack_header_size;
datalen -= entry->in_pack_header_size;
+ if (!pack_to_stdout &&
+ pack_graph_compression_seen &&
+ check_pack_compressed(p, &w_curs, offset) != !!compression_level(entry->actual_type)) {
+ unuse_pack(&w_curs);
+ return write_no_reuse_object(f, entry, limit, usable_delta);
+ }
+
if (!pack_to_stdout && p->index_version == 1 &&
check_pack_inflate(p, &w_curs, offset, datalen, entry->size)) {
error("corrupt packed object for %s", sha1_to_hex(entry->idx.sha1));
@@ -955,6 +979,8 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
memset(entry, 0, sizeof(*entry));
hashcpy(entry->idx.sha1, sha1);
entry->hash = hash;
+ if (pack_graph_compression_seen)
+ entry->actual_type = sha1_object_info(sha1, NULL);
if (type)
entry->type = type;
if (exclude)
@@ -1758,7 +1784,8 @@ static void find_deltas(struct object_entry **list, unsigned *list_size,
*/
if (entry->delta_data && !pack_to_stdout) {
entry->z_delta_size = do_compress(&entry->delta_data,
- entry->delta_size);
+ entry->delta_size,
+ compression_level(entry->actual_type));
cache_lock();
delta_cache_size -= entry->delta_size;
delta_cache_size += entry->z_delta_size;
@@ -2159,6 +2186,16 @@ static int git_pack_config(const char *k, const char *v, void *cb)
pack_idx_opts.version);
return 0;
}
+ if (!strcmp(k, "pack.graphcompression")) {
+ int level = git_config_int(k, v);
+ if (level == -1)
+ level = Z_DEFAULT_COMPRESSION;
+ else if (level < 0 || level > Z_BEST_COMPRESSION)
+ die("bad pack graph compression level %d", level);
+ pack_graph_compression_level = level;
+ pack_graph_compression_seen = 1;
+ return 0;
+ }
return git_default_config(k, v, cb);
}
@@ -2519,6 +2556,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
argc = parse_options(argc, argv, prefix, pack_objects_options,
pack_usage, 0);
+ /* Fall back after option parsing to catch --compression */
+ if (!pack_graph_compression_seen)
+ pack_graph_compression_level = pack_compression_level;
+
if (argc) {
base_name = argv[0];
argc--;
--
1.8.0
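The level selection in this patch can be summarised with a small Python sketch. It is only a model of the logic, under these assumptions: object types are plain strings (None stands for the unknown type 0), plain int() stands in for git_config_int, and the names parse_level and effective_level are invented here.

```python
Z_DEFAULT_COMPRESSION = -1  # same values as in zlib.h
Z_BEST_COMPRESSION = 9


def parse_level(value):
    """Validate a pack.graphcompression value the way the config hunk does."""
    level = int(value)
    if level == -1:
        return Z_DEFAULT_COMPRESSION
    if level < 0 or level > Z_BEST_COMPRESSION:
        raise ValueError("bad pack graph compression level %d" % level)
    return level


def effective_level(obj_type, pack_level, graph_level=None):
    """Pick the level for one object; graph_level=None means the key was not set."""
    if graph_level is None:
        # Fall back to pack.compression (or --compression) when unset.
        graph_level = pack_level
    # Only known non-blob types (commits, trees, tags) use the graph level.
    if obj_type is not None and obj_type != "blob":
        return graph_level
    return pack_level


print(effective_level("tree", 6, parse_level("0")))
print(effective_level("blob", 6, parse_level("0")))
```

With pack.compression at 6 and pack.graphcompression at 0, trees and commits are stored uncompressed while blobs keep level 6.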
^ permalink raw reply related
* [PATCH] Third try at documenting command integration requirements.
From: Eric S. Raymond @ 2012-11-26 5:35 UTC (permalink / raw)
To: git
This document contains no new policies or proposals; it attempts
to document established practices and interface requirements.
Signed-off-by: Eric S. Raymond <esr@thyrsus.com>
---
Documentation/technical/api-command.txt | 91 +++++++++++++++++++++++++++++++
1 file changed, 91 insertions(+)
create mode 100644 Documentation/technical/api-command.txt
diff --git a/Documentation/technical/api-command.txt b/Documentation/technical/api-command.txt
new file mode 100644
index 0000000..c1c1afb
--- /dev/null
+++ b/Documentation/technical/api-command.txt
@@ -0,0 +1,91 @@
+= Integrating new subcommands =
+
+This is how-to documentation for people who want to add extension
+commands to git. It should be read alongside api-builtin.txt.
+
+== Runtime environment ==
+
+git subcommands are standalone executables that live in the git
+execution directory, normally /usr/lib/git-core. The git executable itself
+is a thin wrapper that sets GIT_DIR and passes command-line arguments
+to the subcommand.
+
+(If "git foo" is not found in the git exec path, the wrapper
+will look in the rest of your $PATH for it. Thus, it's possible
+to write local git extensions that don't live in system space.)
+
+== Implementation languages ==
+
+Most subcommands are written in C or shell. A few are written in
+Perl. A tiny minority are written in Python.
+
+While we strongly encourage coding in portable C for portability, these
+specific scripting languages are also acceptable. We won't accept more
+without a very strong technical case, as we don't want to broaden the
+git suite's required dependencies.
+
+Python is fine for import utilities, surgical tools, remote helpers
+and other code at the edges of the git suite - but it should not yet
+be used for core functions. This may change in the future; the problem
+is that we need better Python integration in the git Windows installer
+before we can be confident people in that environment won't
+experience an unacceptably large loss of capability.
+
+C commands are normally written as single modules, named after the
+command, that link a collection of functions called libgit. Thus,
+your command 'git-foo' would normally be implemented as a single
+"git-foo.c"; this organization makes it easy for people reading the
+code to find things.
+
+See the CodingGuidelines document for other guidance on what we consider
+good practice in C and shell, and api-builtin.txt for the support
+functions available to built-in commands written in C.
+
+== What every extension command needs ==
+
+You must have a man page, written in asciidoc (this is what git help
+followed by your subcommand name will display). Be aware that there is
+a local asciidoc configuration and macros which you should use. It's
+often helpful to start by cloning an existing page and replacing the
+text content.
+
+You must have a test, written to report in TAP (Test Anything Protocol).
+Tests are executables (usually shell scripts) that live in the 't'
+subdirectory of the tree. Each test name begins with 't' and a sequence
+number that controls where in the test sequence it will be executed;
+conventionally the rest of the name stem is that of the command
+being tested.
+
+Read the file t/README to learn more about the conventions to be used
+in writing tests, and the test support library.
+
+== Integrating a command ==
+
+Here are the things you need to do when you want to merge a new
+subcommand into the git tree.
+
+0. Don't forget to sign off your patch!
+
+1. Append your command name to one of the variables BUILTIN_OBJS,
+EXTRA_PROGRAMS, SCRIPT_SH, SCRIPT_PERL or SCRIPT_PYTHON.
+
+2. Drop its test in the t directory.
+
+3. If your command is implemented in an interpreted language with a
+p-code intermediate form, make sure .gitignore in the main directory
+includes a pattern entry that ignores such files. Python .pyc and
+.pyo files will already be covered.
+
+4. If your command has any dependency on a a particular version of
+your language, document it in the INSTALL file.
+
+5. There is a file command-list.txt in the distribution main directory
+that categorizes commands by type, so they can be listed in appropriate
+subsections in the documentation's summary command list. Add an entry
+for yours. To understand the categories, look at git-commands.txt
+in the main directory.
+
+6. When your patch is merged, remind the maintainer to add something
+about it in the RelNotes file.
+
+That's all there is to it.
--
1.7.9.5
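The lookup order described under "Runtime environment" (exec path first, then the rest of $PATH) can be sketched in Python as follows. This is an illustration only, not how the git wrapper is implemented: it ignores builtins and aliases, and find_subcommand plus the throwaway directories are assumptions made for the demo.

```python
import os
import stat
import tempfile


def find_subcommand(name, exec_path, search_path):
    """Return the first executable named git-<name>, trying the exec path first."""
    for directory in [exec_path] + search_path.split(os.pathsep):
        candidate = os.path.join(directory, "git-" + name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Demo: temporary directories stand in for the exec path and a $PATH entry.
exec_dir = tempfile.mkdtemp()
local_dir = tempfile.mkdtemp()
script = os.path.join(local_dir, "git-foo")
with open(script, "w") as handle:
    handle.write("#!/bin/sh\necho foo\n")
os.chmod(script, stat.S_IRWXU)  # owner read/write/execute

print(find_subcommand("foo", exec_dir, local_dir))
```

Here "git-foo" is absent from the stand-in exec path, so the local copy is found, which is the case the parenthetical about local extensions describes.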
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
Whether the authorities be invaders or merely local tyrants, the
effect of such [gun control] laws is to place the individual at the
mercy of the state, unable to resist.
-- Robert Anson Heinlein, 1949
^ permalink raw reply related
* Re: [PATCH v5 15/15] fast-export: don't handle uninteresting refs
From: Junio C Hamano @ 2012-11-26 5:35 UTC (permalink / raw)
To: Felipe Contreras
Cc: Jeff King, Jonathan Nieder, git, Johannes Schindelin, Max Horn,
Sverre Rabbelier, Brandon Casey, Brandon Casey, Ilari Liusvaara,
Pete Wyckoff, Ben Walton, Matthieu Moy, Julian Phillips
In-Reply-To: <CAMP44s2B2_htR8LFbHk99WaNUcaYJCxVJPdRdj5VQ0k+fB9NOg@mail.gmail.com>
Felipe Contreras <felipe.contreras@gmail.com> writes:
> On Wed, Nov 21, 2012 at 8:48 PM, Jeff King <peff@peff.net> wrote:
> ...
> I would like to understand what that even means. What behavior is
> currently broken?
I do not know if this is the same as what Peff was referring to, but
I found this message in the discussion thread during my absence.
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH v3 4/4] fast-export: make sure refs are updated properly
Date: Fri, 2 Nov 2012 16:17:14 +0100 (CET)
Message-ID: <alpine.DEB.1.00.1211021612320.7256@s15462909.onlinehome-server.info>
(which is $gmane/208946) that says:
Note that
$ git branch foo master~1
$ git fast-export foo master~1..master
still does not update the "foo" ref, but a partial fix is better
than no fix.
^ permalink raw reply
* Re: [PATCH] Add documentation on how to integrate commands.
From: Eric S. Raymond @ 2012-11-26 5:25 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <7vy5hpvukk.fsf@alter.siamese.dyndns.org>
Junio C Hamano <gitster@pobox.com>:
> As the first sentence in this paragraph does not make it clear
> enough that you are defining a new term "git execution directory",
> "execution directory" here may be misleading and can easily be
> mistaken as if we look something in the directory where the user
> runs "git" in. We usually call it "exec path".
Fixed.
> Actually, we tend to avoid Python dependency for anything important
> and allow it only on fringes; people who lack Python environment are
> not missing much, and we would want to keep it that way until the
> situation on the Windows front changes.
Added:
Python is fine for import utilities, surgical tools, remote helpers
and other code at the edges of the git suite - but it should not yet
be used for core functions. This may change in the future; the problem
is that we need better Python integration in the git Windows installer
before we can be confident people in that environment won't
experience an unacceptably large loss of capability.
I will also take this as a partial resolution of the related policy
thread; the issue can perhaps be revisited once the Windows port's
Python support is in a good state.
I will submit for separate consideration a patch proposing the following
new guidelines:
1. Python code SHOULD NOT require an interpreter version newer than 2.6.
2. Python code SHOULD check the interpreter version and exit gracefully
with an explanation if it detects that its dependency cannot be satisfied.
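A check along the lines of guideline 2 might look like the sketch below. The 2.6 floor comes from guideline 1; the function name, the message wording and the exit status are assumptions. The code avoids newer syntax on purpose, since the file has to parse under an old interpreter before the check can run at all.

```python
import sys

MINIMUM = (2, 6)  # floor taken from guideline 1


def check_interpreter(version_info=None, minimum=MINIMUM):
    """Return an error message if the interpreter is too old, else None."""
    current = tuple((version_info or sys.version_info)[:2])
    if current < minimum:
        return ("error: this command needs Python %d.%d or newer "
                "(found %d.%d)" % (minimum + current))
    return None


if __name__ == "__main__":
    message = check_interpreter()
    if message:
        sys.stderr.write(message + "\n")
        sys.exit(1)
```

Calling check_interpreter((2, 4)) returns the explanatory message rather than raising, so the caller decides how to exit.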
> I would prefer to see this sentence not call libgit.a a "library".
> We primarily use libgit.a to let linker pick necessary object files
> without us having to list object files for non-builtin command
> implementations and it is not designed to be used by other people.
Fixed. I now refer to it as a "collection of functions".
> And when sending a patch in, do not forget to sign off your patches
> ;-)
Added. I will submit a third time with a signoff. :-)
--
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
^ permalink raw reply
* Re: Python extension commands in git - request for policy change
From: Sitaram Chamarty @ 2012-11-26 5:10 UTC (permalink / raw)
To: esr; +Cc: Krzysztof Mazur, git
In-Reply-To: <20121125224728.GD6937@thyrsus.com>
On Mon, Nov 26, 2012 at 4:17 AM, Eric S. Raymond <esr@thyrsus.com> wrote:
> Krzysztof Mazur <krzysiek@podlesie.net>:
>> What about embedded systems? git is also useful there. C and shell is
>> everywhere, python is not.
>
> Supposing this is true (and I question it with regard to shell) if you
> tell me how you live without gitk and the Perl pieces I'll play that
> right back at you as your answer.
gitk is unlikely to be used on an embedded system; the perl pieces even more so.
I have never understood why people complain about readability in perl.
Just because it uses the ascii charset a bit more? You expect a
mathematician or indeed any scientist to use special symbols to mean
special things, why not programmers?
Perhaps people should be forced to use COBOL for a few years (like I
did, a long while ago) to appreciate brevity.
^ permalink raw reply