Git development
 help / color / mirror / Atom feed
* Re: why still no empty directory support in git
From: Asheesh Laroia @ 2009-01-02 21:31 UTC (permalink / raw)
  To: Git Mailing List
In-Reply-To: <alpine.DEB.1.00.0901021954410.30769@pacific.mpi-cbg.de>

On Fri, 2 Jan 2009, Johannes Schindelin wrote:

> Hi,

Hi

*wipes the egg off his face*

> On Thu, 1 Jan 2009, Jeff King wrote:
>
>> On Tue, Dec 30, 2008 at 03:58:46AM -0500, Asheesh Laroia wrote:
>>
>>> So, let's say I take your suggestion.
>>>
>>> $ touch ~/Maildir/new/.exists
>>> $ git add ~/Maildir/new/.exists && git commit -m "La di da"
>>>
>>> Now a spec-compliant Maildir user agent will attempt to deliver this new
>>> "email message" of zero bytes into the mail spool and assign it a message
>>> UID.  Doing so will remove it from Maildir/new.
>>
>> No. The maildir spec says:
>>
>>   A unique name can be anything that doesn't contain a colon (or slash)
>>   and doesn't start with a dot.

Oops.  I never actually tried this...

> For the record, I am using Git to manage my mails, and never had any 
> problems after installing a hook which marks new empty directories with 
> .gitignore.

I'll give that a shot, and my apologies for the noise on the list with 
regard to this particular example.

I do still believe that git shouldn't rmdir() empty directories behind the 
user's back, but with this particular use case gone I'm no longer as 
adamant as before.

My apologies for not having tested this earlier; I will test it shortly, 
but there's every reason to think that Johannes and Jeff are right!

-- Asheesh.

-- 
It's interesting to think that many quite distinguished people have
bodies similar to yours.

^ permalink raw reply

* Re: how to track the history of a line in a file
From: Jeff King @ 2009-01-02 21:26 UTC (permalink / raw)
  To: david; +Cc: git
In-Reply-To: <alpine.DEB.1.10.0901021405460.21567@asgard.lang.hm>

On Fri, Jan 02, 2009 at 02:13:32PM -0800, david@lang.hm wrote:

> I have a need to setup a repository where I'm storing config files, and I  
> need to be able to search the history of a particular line, not just when  
> the last edit of the line was (which is what I see from git blame)

As you figured out, the "manual" way is to just keep reblaming from the
parent of each blame. Recent versions of "git gui blame" have a "reblame
from parent" option in the context menu which makes this a lot less
painful.

> 57f8f7b6 (Linus Torvalds 2008-10-23 20:06:52 -0700 3) SUBLEVEL = 28
>
> what I would want it to show would be a list of the commits that have  
> changed this line.

The tricky thing here is what is "this line"? Using the line number
isn't right, since it will change based on other content coming in and
out of the file. You can keep drilling down by reblaming parent commits,
but remember that each time you do that you are manually looking at the
content and saying "Oh, this is the line I am still interested in." So I
a script would have to correlate the old version and new version of the
line and realize how to follow the "interesting" thing.

In your case, I think you want to see any commit in Makefile which
changed a line with SUBLEVEL in it. Which is maybe easiest done as:

  git log -z -p Makefile |
    perl -0ne 'print if /\n[+-]SUBLEVEL/' |
    tr '\0' '\n'

and is pretty fast. But obviously we're leveraging some content-specific
knowledge about what's in the Makefile.

-Peff

^ permalink raw reply

* how to track the history of a line in a file
From: david @ 2009-01-02 22:13 UTC (permalink / raw)
  To: git

I have a need to setup a repository where I'm storing config files, and I 
need to be able to search the history of a particular line, not just when 
the last edit of the line was (which is what I see from git blame)

I'm not seeing a obvious way to do this, am I missing something or does it 
need a non-obvious approach?

for example, if I do

git blame -L /SUBLEVEL/,+1 -M Makefile

on the linux kernel it currently shows

57f8f7b6 (Linus Torvalds 2008-10-23 20:06:52 -0700 3) SUBLEVEL = 28

what I would want it to show would be a list of the commits that have 
changed this line.

It looks like I can write a script to do this

git blame -L /SUBLEVEL/,+1 -M Makefile 57f8f7b6^
6e86841d (Linus Torvalds 2008-07-28 19:40:31 -0700 3) SUBLEVEL = 27
git blame -L /SUBLEVEL/,+1 -M Makefile 6e86841d^
2ddcca36 (Linus Torvalds 2008-05-03 11:59:44 -0700 3) SUBLEVEL = 26

etc.

is there a better way to do this?

David Lang

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Jeff King @ 2009-01-02 20:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Clemens Buchacher, Adeodato Simó,
	Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <20090102195053.GA10876@coredump.intra.peff.net>

On Fri, Jan 02, 2009 at 02:50:53PM -0500, Jeff King wrote:

> For example, f83b9ba209's commit message indicates that it moves the
> "--format-patch" paragraph. Which is what "git diff" shows. Patience
> diff shows it as moving other text _around_ that paragraph.

Here's another interesting one: d592b315. The commit removes dashes
from git commands in test scripts. Git says:

        echo "tag-one-line" >expect &&
-       git-tag -l | grep "^tag-one-line" >actual &&
+       git tag -l | grep "^tag-one-line" >actual &&
        test_cmp expect actual &&
-       git-tag -n0 -l | grep "^tag-one-line" >actual &&
+       git tag -n0 -l | grep "^tag-one-line" >actual &&
        test_cmp expect actual &&
-       git-tag -n0 -l tag-one-line >actual &&
+       git tag -n0 -l tag-one-line >actual &&
        test_cmp expect actual &&

whereas patience says:

        echo "tag-one-line" >expect &&
-       git-tag -l | grep "^tag-one-line" >actual &&
-       test_cmp expect actual &&
-       git-tag -n0 -l | grep "^tag-one-line" >actual &&
-       test_cmp expect actual &&
-       git-tag -n0 -l tag-one-line >actual &&
+       git tag -l | grep "^tag-one-line" >actual &&
+       test_cmp expect actual &&
+       git tag -n0 -l | grep "^tag-one-line" >actual &&
+       test_cmp expect actual &&
+       git tag -n0 -l tag-one-line >actual &&
        test_cmp expect actual &&

which is exactly what patience is advertised to do: it's treating the
non-unique lines as uninteresting markers. But in this case they _are_
interesting, and I think the git output is more readable. And this is a
case where your "weight lines by length instead of uniqueness"
suggestion would perform better, I think.

-Peff

^ permalink raw reply

* Re: [CLEANUP PATCH RESEND] git wrapper: Make while loop more reader-friendly
From: Boyd Stephen Smith Jr. @ 2009-01-02 20:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, gitster
In-Reply-To: <alpine.DEB.1.00.0901021947170.30769@pacific.mpi-cbg.de>

[-- Attachment #1: Type: text/plain, Size: 1335 bytes --]

On Friday 2009 January 02 12:49:53 Johannes Schindelin wrote:
> > > -	do
> > > -		--slash;
> > > -	while (cmd <= slash && !is_dir_sep(*slash));
> > > +	while (cmd <= slash && !is_dir_sep(*slash))
> > > +		slash--;
> >
> > I prefer the one-liner:
> > for (; cmd <= slash && !is_dir_sep(*slash); --slash);
>
> As I mentioned in the commit message: readability is something to be
> cherished and worshipped.

It's also subjective.  I think my one-line is more readable than your two 
lines which is only slightly more readable than the original 3 lines.  Or is 
there some objective readability metric that of which I'm just not aware?

I also think that the lack of braces around your body on a separate line makes 
it harder to read and easier to break, but I understand that is the git 
coding style.

> For your pleasure, I will not go into details about the motions my bowels
> went through when I looked at those three lines.  Or your single line, for
> that matter.

Please do, although privately if you like.  I really don't see the problem the 
patch is trying to fix.
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss@iguanasuicide.net                     ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.net/                      \_/     

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* "git reset --hard" == "git checkout HEAD" == "git reset --hard HEAD" ???
From: chris @ 2009-01-02 19:57 UTC (permalink / raw)
  To: git

Does "git reset --hard" == "git checkout HEAD" == "git reset --hard HEAD" ???

It seems we have 2 ways to blow away work we haven't checked in yet then right?

chris

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Jeff King @ 2009-01-02 19:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Clemens Buchacher, Adeodato Simó,
	Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <20090102193904.GB9129@coredump.intra.peff.net>

On Fri, Jan 02, 2009 at 02:39:04PM -0500, Jeff King wrote:

> If you just want to see the results on some real-world cases (and don't
> care about measuring performance), try installing bzr and using their
> patiencediff test program as a GIT_EXTERNAL_DIFF.
> 
> On Debian, it's:
> 
>   $ sudo apt-get install bzr
>   $ cat >$HOME/patience <<'EOF'
>     #!/bin/sh
>     exec python /usr/share/pyshared/bzrlib/patiencediff.py "$2" "$5"
>     EOF
>   $ chmod 755 patience
>   $ GIT_EXTERNAL_DIFF=$HOME/patience git diff

For added fun, try this:

-- >8 --
#!/bin/sh

canonical() {
  grep '^[ +-]' | egrep -v '^(---|\+\+\+)'
}

git rev-list --no-merges HEAD | while read rev; do
  git diff-tree -p $rev^ $rev | canonical >git.out
  GIT_EXTERNAL_DIFF=$HOME/patience git diff $rev^ $rev | canonical >bzr.out
  echo "`diff -U0 git.out bzr.out | wc -l` $rev"
done
-- 8< --

I'm running it on git.git now. It looks like both algorithms return the
same patch for most cases. Some of the differences are interesting,
though.

For example, f83b9ba209's commit message indicates that it moves the
"--format-patch" paragraph. Which is what "git diff" shows. Patience
diff shows it as moving other text _around_ that paragraph.

-Peff

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Jeff King @ 2009-01-02 19:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Clemens Buchacher, Adeodato Simó,
	Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.LFD.2.00.0901021050450.5086@localhost.localdomain>

On Fri, Jan 02, 2009 at 11:03:07AM -0800, Linus Torvalds wrote:

> Well, it's also the test-case in the very first hit on google for 
> "patience diff" (with the quotes).
> 
> In fact, it's the _only_ one I ever found ;)

If you just want to see the results on some real-world cases (and don't
care about measuring performance), try installing bzr and using their
patiencediff test program as a GIT_EXTERNAL_DIFF.

On Debian, it's:

  $ sudo apt-get install bzr
  $ cat >$HOME/patience <<'EOF'
    #!/bin/sh
    exec python /usr/share/pyshared/bzrlib/patiencediff.py "$2" "$5"
    EOF
  $ chmod 755 patience
  $ GIT_EXTERNAL_DIFF=$HOME/patience git diff

Other distributions presumably install patiencediff.py somewhere
similar (or you can maybe even pull it from the source, but presumably
you have to install some other bzr modules, too).

-Peff

^ permalink raw reply

* Re: [PATCH 2/2] gitweb: support hiding projects from user-visible lists
From: Jakub Narebski @ 2009-01-02 19:33 UTC (permalink / raw)
  To: Matt McCutchen; +Cc: git
In-Reply-To: <1230082831.2971.45.camel@localhost>

On Wed, 2008-12-24, Matt McCutchen wrote:
> On Sat, 2008-12-13 at 14:02 -0800, Jakub Narebski wrote:
> >
> > Cannot you do this with new $export_auth_hook gitweb configuration
> > variable, added by Alexander Gavrilov in 
> >    dd7f5f1 (gitweb: Add a per-repository authorization hook.)
> > It is used in check_export_ok subroutine, and is is checked also when
> > getting list of project from file
> > 
> > From gitweb/INSTALL
[...]
> >     For example, if you use mod_perl to run the script, and have dumb
> >     http protocol authentication configured for your repositories, you
> >     can use the following hook to allow access only if the user is
> >     authorized to read the files:
[...]
 
> $export_auth_hook would work, and it would have the nice (but not
> essential) feature of including private projects in the list shown to
> suitably authenticated users.  The only problem is that my Web host
> doesn't support mod_perl.  Is there a practical way to accomplish the
> same thing as the above example in a CGI script?  I would like to avoid
> reimplementing Apache authentication-checking functionality if at all
> possible.

I know it is written that the example code is for mod_perl, but I
don't think it is mod_perl specific; have you checked if it works
for you? I assume that you use Apache, and have Apache Perl bindings
installed...

-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Johannes Schindelin @ 2009-01-02 19:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Clemens Buchacher, Adeodato Simó, Pierre Habouzit, davidel,
	Francis Galiegue, Git ML
In-Reply-To: <alpine.LFD.2.00.0901021050450.5086@localhost.localdomain>

Hi,

On Fri, 2 Jan 2009, Linus Torvalds wrote:

> On Fri, 2 Jan 2009, Johannes Schindelin wrote:
> 
> > And the worst part: one can only _guess_ what motivated patience diff.  
> > I imagine it came from the observation that function headers are 
> > unique, and that you usually want to preserve as much context around 
> > them.
> 
> Well, I do like the notion of giving more weight to unique lines - I think 
> it makes sense. That said, I suspect it would make almost as much sense to 
> give more weight simply to _longer_ lines, and I suspect the standard 
> Myers' algorithm could possibly be simply extended to take line size into 
> account when calculating the weights.

I think that it makes more sense with the common unique lines, because 
they are _unambiguously_ common.  That is, if possible, we would like to 
keep them as common lines.

BTW this also opens the door for more automatic code movement detection.

As for the longer lines: what exactly would you want to put "weight" on?  
The edit distance is the number of plusses and minusses, and this is the 
thing that actually is pretty critical for the performance: the larger the 
distance, the longer it takes.  So if you want to put a different "weight" 
on a line, i.e. something else than a "1", you are risking a substantially 
worse performance.

And I am still not convinced that longer lines should get more weight. A 
line starting with "exit:" can be much more important than 8 tabs followed 
by a curly bracket.

> Because the problem with diffs for C doesn't really tend to be as much 
> about non-unique lines as about just _trivial_ lines that are mostly empty 
> or contain just braces etc. Those are quite arguably almost totally 
> worthless for equality testing.

Oh, but then we get very C specific.  Let's not go there.

> And btw, don't get me wrong - I don't think there is anything wrong with 
> the patience diff. I think it's a worthy thing to try, and I'm not at 
> all arguing against it.

I never took you to be opposed.  I myself mainly implemented it because I 
wanted to play with it, and have something more useful than what I found 
in the whole wide web.

> However, I do think that the people arguing for it often do so based on 
> less-than-very-logical arguments, and it's entirely possible that other 
> approaches are better (eg the "weight by size" thing rather than "weight 
> by uniqueness").

Actually, I think it is very possible that merging hunks if there are not 
enough alnums between them could make a whole lot of sense.

But IIRC it was shot down and replaced by the not-so-specific-to-C 
algorithm to merge hunks when the output is shorter than having separate 
hunks.

> The thing about unique lines is that there are no guarantees at all that 
> they even exist, so a uniqueness-based thing will always have to fall 
> back on anything else. That, to me, implies that the whole notion is 
> somewhat mis-designed: it's clearly not a generic concept.
> 
> (In contrast, taking the length of the matching lines into account would 
> not have that kind of bad special case)

However, we know that humans like to start from the unique features they 
see, and continue from there.

And while it is quite possible that it is the wrong approach (see all that 
security theater at the airport, and some nice rational analyses about the 
cost/benefit equations there), imitating humans is still the thing that 
will convince most humans.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Johannes Schindelin @ 2009-01-02 19:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.LFD.2.00.0901021045290.5086@localhost.localdomain>

Hi,

On Fri, 2 Jan 2009, Linus Torvalds wrote:

> On Fri, 2 Jan 2009, Johannes Schindelin wrote:
> > 
> > BTW the "-p" is not necessary with "show", indeed, you cannot even 
> > switch it off.
> 
> I was just switching back-and-forth between "git log" and "git show" so 
> the -p came from just that, and is not necessary.
> 
> And you _can_ suppress the patch generation - use "-s".

Ah, another thing learnt.

Thanks,
Dscho

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Linus Torvalds @ 2009-01-02 19:03 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Clemens Buchacher, Adeodato Simó, Pierre Habouzit, davidel,
	Francis Galiegue, Git ML
In-Reply-To: <alpine.DEB.1.00.0901021918100.30769@pacific.mpi-cbg.de>



On Fri, 2 Jan 2009, Johannes Schindelin wrote:
> 
> FWIW it's the test case in the commit introducing the --patience option.

Well, it's also the test-case in the very first hit on google for 
"patience diff" (with the quotes).

In fact, it's the _only_ one I ever found ;)

> And the worst part: one can only _guess_ what motivated patience diff.  I 
> imagine it came from the observation that function headers are unique, and 
> that you usually want to preserve as much context around them.

Well, I do like the notion of giving more weight to unique lines - I think 
it makes sense. That said, I suspect it would make almost as much sense to 
give more weight simply to _longer_ lines, and I suspect the standard 
Myers' algorithm could possibly be simply extended to take line size into 
account when calculating the weights.

Because the problem with diffs for C doesn't really tend to be as much 
about non-unique lines as about just _trivial_ lines that are mostly empty 
or contain just braces etc. Those are quite arguably almost totally 
worthless for equality testing.

And btw, don't get me wrong - I don't think there is anything wrong with 
the patience diff. I think it's a worthy thing to try, and I'm not at all 
arguing against it. However, I do think that the people arguing for it 
often do so based on less-than-very-logical arguments, and it's entirely 
possible that other approaches are better (eg the "weight by size" thing 
rather than "weight by uniqueness").

The thing about unique lines is that there are no guarantees at all that 
they even exist, so a uniqueness-based thing will always have to fall back 
on anything else. That, to me, implies that the whole notion is somewhat 
mis-designed: it's clearly not a generic concept.

(In contrast, taking the length of the matching lines into account would 
not have that kind of bad special case)

			Linus

^ permalink raw reply

* Re: [PATCH v2 1/3] rebase: learn to rebase root commit
From: Johannes Schindelin @ 2009-01-02 18:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git
In-Reply-To: <7v4p0iivwh.fsf@gitster.siamese.dyndns.org>

Hi,

On Thu, 1 Jan 2009, Junio C Hamano wrote:

> Thomas Rast <trast@student.ethz.ch> writes:
> 
> > Teach git-rebase a new option --root, which instructs it to rebase the
> > entire history leading up to <branch>.
> >
> > The main use-case is with git-svn: suppose you start hacking (perhaps
> > offline) on a new project, but later notice you want to commit this
> > work to SVN.  You will have to rebase the entire history, including
> > the root commit, on a (possibly empty) commit coming from git-svn, to
> > establish a history connection.  This previously had to be done by
> > cherry-picking the root commit manually.
> 
> I like what this series tries to do.  Using the --root option is probably
> a more natural way to do what people often do with the "add graft and
> filter-branch the whole history once" procedure.
> 
> But it somewhat feels sad if the "main" use-case for this is to start your
> project in git and then migrate away by feeding your history to subversion
> ;-).

FWIW I had a single case where I could have used something like this 
myself, in my whole life.  It was when I started to write 
git-edit-patch-series.sh in its own repository, only to realize at the end 
that I should have started it in a topic branch in my git.git tree.

Ciao,
Dscho

^ permalink raw reply

* Re: why still no empty directory support in git
From: Johannes Schindelin @ 2009-01-02 18:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Asheesh Laroia, Git Mailing List
In-Reply-To: <20090101200651.GB6536@coredump.intra.peff.net>

Hi,

On Thu, 1 Jan 2009, Jeff King wrote:

> On Tue, Dec 30, 2008 at 03:58:46AM -0500, Asheesh Laroia wrote:
> 
> > So, let's say I take your suggestion.
> >
> > $ touch ~/Maildir/new/.exists
> > $ git add ~/Maildir/new/.exists && git commit -m "La di da"
> >
> > Now a spec-compliant Maildir user agent will attempt to deliver this new  
> > "email message" of zero bytes into the mail spool and assign it a message  
> > UID.  Doing so will remove it from Maildir/new.
> 
> No. The maildir spec says:
> 
>   A unique name can be anything that doesn't contain a colon (or slash)
>   and doesn't start with a dot.
>      -- http://cr.yp.to/proto/maildir.html
> 
> where a "unique name" is the filename used for a message. In practice,
> every maildir implementation I have seen ignores files starting with a
> dot. Do you have one that doesn't?

For the record, I am using Git to manage my mails, and never had any 
problems after installing a hook which marks new empty directories with 
.gitignore.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Jeff King @ 2009-01-02 18:51 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.1.00.0901021914420.30769@pacific.mpi-cbg.de>

On Fri, Jan 02, 2009 at 07:17:34PM +0100, Johannes Schindelin wrote:

> BTW the "-p" is not necessary with "show", indeed, you cannot even switch 
> it off.

Half true:

  $ git grep -A1 '"-s"' diff.c
  diff.c: else if (!strcmp(arg, "-s"))
  diff.c-         options->output_format |= DIFF_FORMAT_NO_OUTPUT;

-Peff

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Adeodato Simó @ 2009-01-02 18:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Pierre Habouzit, davidel, Francis Galiegue,
	Git ML
In-Reply-To: <alpine.LFD.2.00.0901011747010.5086@localhost.localdomain>

* Linus Torvalds [Thu, 01 Jan 2009 17:56:13 -0800]:

> On Thu, 1 Jan 2009, Adeodato Simó wrote:

> > For me, the cases where I find patience output to be of substantial
> > higher readability are those involving a rewrite of several consecutive
> > paragraphs (i.e., lines of code separated by blank lines). Compare:

> I don't think that's a "patience diff" issue.

Ah, I see.

> That's simply an issue of merging consecutive diff fragments together if 
> they are close-by, and that's independent of the actual diff algorithm 
> itself.

> > I'll note that in this particular case, `git diff` yielded the very same
> > results with or without --patience. I don't know why that is, Johannes?
> > I'll also note that /usr/bin/diff produces (in this case) something
> > closer to patience than to git.

> See above - I really don't think this has anything to do with "patience vs 
> non-patience". It's more akin to the things we do for our merge conflict 
> markers: if we have two merge conflicts next to each other, with just a 
> couple of lines in between, we coalesce the merge conflicts into one 
> larger one instead.

> We don't do that for regular diffs - they're always kept minimal (ok, not 
> really minimal, but as close to minimal as the algorithm finds them).

> See commit f407f14deaa14ebddd0d27238523ced8eca74393 for the git merge 
> conflict merging. We _could_ do similar things for regular diffs. It's 
> sometimes useful, sometimes not.

Independently of patience diff, then, I'd very much support changes to
improve the diff I pasted.

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
Debugging is twice as hard as writing the code in the first place. Therefore,
if you write the code as cleverly as possible, you are, by definition, not
smart enough to debug it.
                -- Brian W. Kernighan

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Linus Torvalds @ 2009-01-02 18:49 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.DEB.1.00.0901021914420.30769@pacific.mpi-cbg.de>



On Fri, 2 Jan 2009, Johannes Schindelin wrote:
> 
> BTW the "-p" is not necessary with "show", indeed, you cannot even switch 
> it off.

I was just switching back-and-forth between "git log" and "git show" so 
the -p came from just that, and is not necessary.

And you _can_ suppress the patch generation - use "-s".

			Linus

^ permalink raw reply

* Re: [CLEANUP PATCH RESEND] git wrapper: Make while loop more reader-friendly
From: Johannes Schindelin @ 2009-01-02 18:49 UTC (permalink / raw)
  To: Boyd Stephen Smith Jr.; +Cc: git, gitster
In-Reply-To: <200901021228.07599.bss@iguanasuicide.net>

Hi,

On Fri, 2 Jan 2009, Boyd Stephen Smith Jr. wrote:

> On Friday 2009 January 02 12:07:52 you wrote:
> > It is not a good practice to prefer performance over readability in
> > something as performance uncritical as finding the trailing slash
> > of argv[0].
> >
> > So avoid head-scratching by making the loop user-readable, and not
> > hyper-performance-optimized.
> 
> > -	do
> > -		--slash;
> > -	while (cmd <= slash && !is_dir_sep(*slash));
> > +	while (cmd <= slash && !is_dir_sep(*slash))
> > +		slash--;
> 
> What confused people?  The predecrement or the do/while?  Should people that 
> don't understand one of those be hacking on git?
> 
> That said, I'm not opposed to the patch.  It is easier on the eyes, though I 
> prefer the one-liner:
> for (; cmd <= slash && !is_dir_sep(*slash); --slash);

As I mentioned in the commit message: readability is something to be 
cherished and worshipped.

For your pleasure, I will not go into details about the motions my bowels 
went through when I looked at those three lines.  Or your single line, for 
that matter.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Johannes Schindelin @ 2009-01-02 18:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Clemens Buchacher, Adeodato Simó, Pierre Habouzit, davidel,
	Francis Galiegue, Git ML
In-Reply-To: <alpine.LFD.2.00.0901020833000.5086@localhost.localdomain>

Hi,

On Fri, 2 Jan 2009, Linus Torvalds wrote:


> On Fri, 2 Jan 2009, Clemens Buchacher wrote:
> >
> > Only two choices, and I still get it wrong. The diffs should be 
> > labeled the other way around, of course.
> 
> Yes, this one is a real patience diff change, but it's also the same one 
> that I've seen in the google fanboi findings. What google did _not_ show 
> was any real-life examples, or anybody doing any critical analysis.

FWIW it's the test case in the commit introducing the --patience option.

> So I was hoping for something else than a single "in this case patience 
> diff works really well". I was hoping to see what it does in real life.

I will dig out a real-world example where I _know_ patience diff would 
have helped.  (I remember that I rewrote a pretty large diff which was 
sent on this list, only to understand what it actually does, and I am 
pretty certain this is a good real-world showcase.)

But yes, I agree, the thing does not matter _all_ that much in reality.

The case where I expect a real improvement is when you modify a function 
and insert a function just before it, and Myers' algorithm matches mainly 
empty lines and lines ending in curly brackets.

In other words: something I tried to tackle with ee95ec5d(xdl_merge(): 
introduce XDL_MERGE_ZEALOUS_ALNUM) for merges.

The typical look of such a diff is something like this:

-<... some function header ...>
+<... a completely different function header ...>
 {
-	<... variables ...>
+	<... other variables ...>
 
 	for (i = 0; i < 10; i++) {
-		<... some code ...>
+		<... some code doing something completely different ...>
	}
 
 	return 0;
 }

+<... the function header which was removed earlier ...>
+{
 	<... refactored _and also reindented_ code ...>

> And I haven't seen _any_ real critical analysis of it. Anywhere.

Neither have I.  Let alone something close to documentation.

For example, when the "patience diff algorithm" is explained, it looks 
more like a longest common sequence algorithm when the input is already 
sorted in the first item.

Further, there is no rigorous analysis of the runtime (I figured that the 
original runtime is O(nm) where "n" is the number of lines and "m" is the 
length of the maximal ordered sequence of common unique lines, and my 
implementation can only improve that to O(n log(m))).

This could be improved, I think, for the most common case where you have 
pretty long common _continuous_ sequences of unique lines, i.e. large 
ranges of lines that are identical.

The runtime is especially horrible in the light of the runtime of Myers' 
algorithm, which uses O(n d), where d is the edit distance, i.e. the 
number of lines added + number of lines removed.  (Note: in the real 
world, there are substantial speed ups for consecutive edits, i.e. line 
ranges where there are no common lines at all.)

Also, I am less than thrilled by the job the fanbois did coming up with 
"convincing" evidence: exactly as you pointed out, there are _no_ 
real-world examples where it helps.

And the worst part: one can only _guess_ what motivated patience diff.  I 
imagine it came from the observation that function headers are unique, and 
that you usually want to preserve as much context around them.

Ciao,
Dscho

^ permalink raw reply

* Re: [CLEANUP PATCH RESEND] git wrapper: Make while loop more reader-friendly
From: Boyd Stephen Smith Jr. @ 2009-01-02 18:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, gitster
In-Reply-To: <alpine.DEB.1.00.0901021240270.27818@racer>

[-- Attachment #1: Type: text/plain, Size: 969 bytes --]

On Friday 2009 January 02 12:07:52 you wrote:
> It is not a good practice to prefer performance over readability in
> something as performance uncritical as finding the trailing slash
> of argv[0].
>
> So avoid head-scratching by making the loop user-readable, and not
> hyper-performance-optimized.

> -	do
> -		--slash;
> -	while (cmd <= slash && !is_dir_sep(*slash));
> +	while (cmd <= slash && !is_dir_sep(*slash))
> +		slash--;

What confused people?  The predecrement or the do/while?  Should people that 
don't understand one of those be hacking on git?

That said, I'm not opposed to the patch.  It is easier on the eyes, though I 
prefer the one-liner:
for (; cmd <= slash && !is_dir_sep(*slash); --slash);
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss@iguanasuicide.net                     ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.net/                      \_/     

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply

* Re: [PATCH 0/3] Teach Git about the patience diff algorithm
From: Johannes Schindelin @ 2009-01-02 18:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Pierre Habouzit, davidel, Francis Galiegue, Git ML
In-Reply-To: <alpine.LFD.2.00.0901011151440.5086@localhost.localdomain>

Hi,

On Thu, 1 Jan 2009, Linus Torvalds wrote:

> On Thu, 1 Jan 2009, Linus Torvalds wrote:
> > 
> > So could we have some actual real data on it?
> 
> .. and some testing. I tried to get some limited data for the kernel 
> myself, by doing
> 
> 	git log --patience -p v2.6.28.. > ~/patience
> 
> but I just got a core-dump instead.
> 
> Pinpointing it to a specific commit shows a smaller failure case:
> 
> 	git show -p --patience 05d564fe00c05bf8ff93948057ca1acb5bc68e10
> 
> which might help you debug this.

Thanks.  I am on it.  valgrind finds an earlier place in 
xdl_change_compact() which I think is rather more sensible, but at the 
same time a bit worrisome, too, as I did not expect any errors _that_ 
late in the game (I did not touch that code).

BTW the "-p" is not necessary with "show", indeed, you cannot even switch 
it off.

Ciao,
Dscho

^ permalink raw reply

* [CLEANUP PATCH] show <tag>: reuse pp_user_info() instead of duplicating code
From: Johannes Schindelin @ 2009-01-02 18:08 UTC (permalink / raw)
  To: git, gitster


We used to extract the tagger information "by hand" in "git show <tag>",
but the function pp_user_info() already does that.  Even better:
it respects the commit_format and date_format specified by the user.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	Just a little cleanup, as I tripped over that part of Git's source.

 builtin-log.c |   21 ++++++---------------
 1 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/builtin-log.c b/builtin-log.c
index 99d1137..bc4e1e9 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -249,22 +249,13 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 
 static void show_tagger(char *buf, int len, struct rev_info *rev)
 {
-	char *email_end, *p;
-	unsigned long date;
-	int tz;
+	struct strbuf out = STRBUF_INIT;
 
-	email_end = memchr(buf, '>', len);
-	if (!email_end)
-		return;
-	p = ++email_end;
-	while (isspace(*p))
-		p++;
-	date = strtoul(p, &p, 10);
-	while (isspace(*p))
-		p++;
-	tz = (int)strtol(p, NULL, 10);
-	printf("Tagger: %.*s\nDate:   %s\n", (int)(email_end - buf), buf,
-	       show_date(date, tz, rev->date_mode));
+	pp_user_info("Tagger", rev->commit_format, &out, buf, rev->date_mode,
+		git_log_output_encoding ?
+		git_log_output_encoding: git_commit_encoding);
+	printf("%s\n", out.buf);
+	strbuf_release(&out);
 }
 
 static int show_object(const unsigned char *sha1, int show_tag_object,
-- 
1.6.1.rc3.224.g95ac9

^ permalink raw reply related

* [PATCH] bundle: allow rev-list options to exclude annotated tags
From: Johannes Schindelin @ 2009-01-02 18:08 UTC (permalink / raw)
  To: git, gitster


With options such as "--all --since=2.weeks.ago", annotated tags used to
be included, when they should have been excluded.  The reason is that we
heavily abuse the revision walker to determine what needs to be included
or excluded.  And the revision walker does not show tags at all (and
therefore never marks tags as uninteresting).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 bundle.c          |   32 ++++++++++++++++++++++++++++++++
 t/t5704-bundle.sh |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 0 deletions(-)
 create mode 100755 t/t5704-bundle.sh

diff --git a/bundle.c b/bundle.c
index daecd8e..4977962 100644
--- a/bundle.c
+++ b/bundle.c
@@ -167,6 +167,32 @@ int list_bundle_refs(struct bundle_header *header, int argc, const char **argv)
 	return list_refs(&header->references, argc, argv);
 }
 
+static int is_tag_in_date_range(struct object *tag, struct rev_info *revs)
+{
+	unsigned long size;
+	enum object_type type;
+	char *buf, *line, *lineend;
+	unsigned long date;
+
+	if (revs->max_age == -1 && revs->min_age == -1)
+		return 1;
+
+	buf = read_sha1_file(tag->sha1, &type, &size);
+	if (!buf)
+		return 1;
+	line = memmem(buf, size, "\ntagger ", 8);
+	if (!line++)
+		return 1;
+	lineend = memchr(line, buf + size - line, '\n');
+	line = memchr(line, lineend ? lineend - line : buf + size - line, '>');
+	if (!line++)
+		return 1;
+	date = strtoul(line, NULL, 10);
+	free(buf);
+	return (revs->max_age == -1 || revs->max_age < date) &&
+		(revs->min_age == -1 || revs->min_age > date);
+}
+
 int create_bundle(struct bundle_header *header, const char *path,
 		int argc, const char **argv)
 {
@@ -255,6 +281,12 @@ int create_bundle(struct bundle_header *header, const char *path,
 			flag = 0;
 		display_ref = (flag & REF_ISSYMREF) ? e->name : ref;
 
+		if (e->item->type == OBJ_TAG &&
+				!is_tag_in_date_range(e->item, &revs)) {
+			e->item->flags |= UNINTERESTING;
+			continue;
+		}
+
 		/*
 		 * Make sure the refs we wrote out is correct; --max-count and
 		 * other limiting options could have prevented all the tips
diff --git a/t/t5704-bundle.sh b/t/t5704-bundle.sh
new file mode 100755
index 0000000..a8f4419
--- /dev/null
+++ b/t/t5704-bundle.sh
@@ -0,0 +1,33 @@
+#!/bin/sh
+
+test_description='some bundle related tests'
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+
+	: > file &&
+	git add file &&
+	test_tick &&
+	git commit -m initial &&
+	test_tick &&
+	git tag -m tag tag &&
+	: > file2 &&
+	git add file2 &&
+	: > file3 &&
+	test_tick &&
+	git commit -m second &&
+	git add file3 &&
+	test_tick &&
+	git commit -m third
+
+'
+
+test_expect_success 'tags can be excluded by rev-list options' '
+
+	git bundle create bundle --all --since=7.Apr.2005.15:16:00.-0700 &&
+	git ls-remote bundle > output &&
+	! grep tag output
+
+'
+
+test_done
-- 
1.6.1.rc3.224.g95ac9

^ permalink raw reply related

* [RESEND PATCH] git add: do not add files from a submodule
From: Johannes Schindelin @ 2009-01-02 18:08 UTC (permalink / raw)
  To: git, gitster


It comes quite as a surprise to an unsuspecting Git user that calling
"git add submodule/file" (which is a mistake, alright) _removes_
the submodule in the index, and adds the file.  Instead, complain loudly.

While at it, be nice when the user said "git add submodule/" which is
most likely the consequence of tab-completion, and stage the submodule,
instead of trying to add the contents of that directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	Okay, so I let this slip for way too long.  AFAIR I waited for
	a few patches to go into 'next', and it was in an -rc cycle.

	Happily, the patches seem to be in, and we're out of -rc.

	Looking over the patch again, I do not find anything obviously
	wrong, but certainly somebody will.

 builtin-add.c              |   28 ++++++++++++++++++++++++++++
 t/t7400-submodule-basic.sh |   25 +++++++++++++++++++++++++
 2 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/builtin-add.c b/builtin-add.c
index 719de8b..ac98c83 100644
--- a/builtin-add.c
+++ b/builtin-add.c
@@ -68,6 +68,33 @@ static void prune_directory(struct dir_struct *dir, const char **pathspec, int p
         free(seen);
 }
 
+static void treat_gitlinks(const char **pathspec)
+{
+	int i;
+
+	if (!pathspec || !*pathspec)
+		return;
+
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+		if (S_ISGITLINK(ce->ce_mode)) {
+			int len = ce_namelen(ce), j;
+			for (j = 0; pathspec[j]; j++) {
+				int len2 = strlen(pathspec[j]);
+				if (len2 <= len || pathspec[j][len] != '/' ||
+				    memcmp(ce->name, pathspec[j], len))
+					continue;
+				if (len2 == len + 1)
+					/* strip trailing slash */
+					pathspec[j] = xstrndup(ce->name, len);
+				else
+					die ("Path '%s' is in submodule '%.*s'",
+						pathspec[j], len, ce->name);
+			}
+		}
+	}
+}
+
 static void fill_directory(struct dir_struct *dir, const char **pathspec,
 		int ignored_too)
 {
@@ -261,6 +288,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (read_cache() < 0)
 		die("index file corrupt");
+	treat_gitlinks(pathspec);
 
 	if (add_new_files)
 		/* This picks up the paths that are not tracked */
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index be73f7b..2ec7ac6 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -209,4 +209,29 @@ test_expect_success 'update --init' '
 
 '
 
+test_expect_success 'do not add files from a submodule' '
+
+	git reset --hard &&
+	test_must_fail git add init/a
+
+'
+
+test_expect_success 'gracefully add submodule with a trailing slash' '
+
+	git reset --hard &&
+	git commit -m "commit subproject" init &&
+	(cd init &&
+	 echo b > a) &&
+	git add init/ &&
+	git diff --exit-code --cached init &&
+	commit=$(cd init &&
+	 git commit -m update a >/dev/null &&
+	 git rev-parse HEAD) &&
+	git add init/ &&
+	test_must_fail git diff --exit-code --cached init &&
+	test $commit = $(git ls-files --stage |
+		sed -n "s/^160000 \([^ ]*\).*/\1/p")
+
+'
+
 test_done
-- 
1.6.1.rc3.224.g95ac9

^ permalink raw reply related

* [PATCH] bisect view: call gitk if Cygwin's SESSIONNAME variable is set
From: Johannes Schindelin @ 2009-01-02 18:08 UTC (permalink / raw)
  To: git, gitster


It seems that Cygwin sets the variable SESSIONNAME when an interactive
desktop session is running, and does not set it when you log in via ssh.

So we can use this variable to determine whether to run gitk or git log
in git bisect view.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 git-bisect.sh |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/git-bisect.sh b/git-bisect.sh
index 17a35f6..85db4ba 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -508,7 +508,7 @@ bisect_visualize() {
 
 	if test $# = 0
 	then
-		case "${DISPLAY+set}${MSYSTEM+set}${SECURITYSESSIONID+set}" in
+		case "${DISPLAY+set}${SESSIONNAME+set}${MSYSTEM+set}${SECURITYSESSIONID+set}" in
 		'')	set git log ;;
 		set*)	set gitk ;;
 		esac
-- 
1.6.1.rc3.224.g95ac9

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox