Git development
* Re: [PATCH v2] log: grep author/committer using mailmap
From: Antoine Pelisse @ 2012-12-28 18:00 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vwqw3l49z.fsf@alter.siamese.dyndns.org>

Actually, gprof seems to be unhappy about the number of calls to
strbuf_grow() in map_user() (25% of the time spent in map_user() is
spent in strbuf_grow()).

That probably comes from the repeated calls to strbuf_addch() when
lowercasing the email address.
At this point, we are also copying the '\0' for every char we add,
doubling the copy.
This may not make much of a difference, but it seems to be called 15
million times when running:
$ git log --author='Junio C Hamano' --use-mailmap

Maybe we should come up with another way to lowercase this email address after all.


* Re: Lockless Refs?
From: Junio C Hamano @ 2012-12-28 17:15 UTC
  To: Martin Fick; +Cc: Michael Haggerty, Jeff King, git, Shawn Pearce
In-Reply-To: <201212280750.14695.mfick@codeaurora.org>

Martin Fick <mfick@codeaurora.org> writes:

> Hmm, actually I believe that with a small modification to the 
> semantics described here it would be possible to make multi 
> repo/branch commits work....
>
> Shawn talked about adding multi repo/branch transaction 
> semantics to jgit, this might be something that git wants to 
> support also at some point?

Shawn may have talked about it and you may have listened to it, but
others wouldn't have any idea what kind of "multi repo/branch
transaction" you are talking about.  Is it about "I want to push
this ref to that repo and push this other ref to that other repo",
in what situation will it be used/useful, what are the failure
modes, what are failure tolerances by the expected use cases, ...?

Care to explain?


* Re: Lockless Refs?
From: Junio C Hamano @ 2012-12-28 16:58 UTC
  To: Martin Fick; +Cc: Michael Haggerty, Jeff King, git
In-Reply-To: <201212271611.52203.mfick@codeaurora.org>

Martin Fick <mfick@codeaurora.org> writes:

> 3) To create a ref, it must be renamed from the null file (sha 
> 0000...) to the new value just as if it were being updated 
> from any other value, but there is one extra condition: 
> before renaming the null file, a full directory scan must be 
> done to ensure that the null file is the only file in the 
> directory...

While you are scanning this directory to make sure it is empty, I am
contemplating creating the same ref with a different value.  You
finished checking but haven't created the null. I have also scanned,
created the null and renamed it to my value.  Now you try to create
the null, succeed, and then rename.  We won't know which of the two
non-null values is valid, but worse yet, I think one of them should
have failed in the first place.

Sounds like we would need some form of locking around here.  Is your
goal "no locks", or "fewer locks"?

> I don't know how this new scheme could be made to work with 
> the current scheme,...

It is much more important to know if/why yours is better than the
current scheme in the first place.  Without an analysis on how the
new scheme interacts with the packed refs and gives better
behaviour, that is kinda difficult.

I think transition plans can wait until that is done.  If it is not
even marginally better, we do not have to worry about transitioning
at all.  If it is only marginally better, the transition has to be
designed to have no impact on existing repositories.  If it is
vastly better, we might be able to afford a flag day.


* Re: git diff --ignore-space-at-eol issue
From: Antoine Pelisse @ 2012-12-28 16:46 UTC
  To: John Moon; +Cc: git
In-Reply-To: <BLU163-W40634B340214076467C88ECF360@phx.gbl>

> The --ignore-space-at-eol option is ignored when used in conjunction
> with --name-status.
> It works fine otherwise.

Indeed, the behavior of diff --stat and friends was corrected very
recently to make it more consistent across all options.
I don't know if the new behavior is exactly what you expected:

$ git diff --ignore-space-at-eol test.txt
$ git diff --ignore-space-at-eol --stat test.txt
 test.txt | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
$ git diff --ignore-space-at-eol --name-status test.txt
M       test.txt

The idea is that even though diff doesn't show any differences, stat,
shortstat, numstat and name-status report the file as being changed.
This has been available since v1.8.1-rc0.


* [PATCH] Add checks to Python scripts for version dependencies.
From: Eric S. Raymond @ 2012-12-28 16:43 UTC
  To: git

From: "Eric S. Raymond" <esr@thyrsus.com>
Date: Fri, 28 Dec 2012 11:40:59 -0500
Subject: [PATCH] Add checks to Python scripts for version dependencies.

---
 contrib/ciabot/ciabot.py           | 8 +++++++-
 contrib/fast-import/import-zips.py | 7 ++++++-
 contrib/hg-to-git/hg-to-git.py     | 5 +++++
 contrib/p4import/git-p4import.py   | 5 +++++
 contrib/svn-fe/svnrdump_sim.py     | 4 ++++
 git-p4.py                          | 8 +++++++-
 git-remote-testgit.py              | 5 +++++
 git_remote_helpers/git/__init__.py | 5 +++++
 8 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/contrib/ciabot/ciabot.py b/contrib/ciabot/ciabot.py
index bd24395..81c3ebd 100755
--- a/contrib/ciabot/ciabot.py
+++ b/contrib/ciabot/ciabot.py
@@ -47,7 +47,13 @@
 # we default to that.
 #
 
-import os, sys, commands, socket, urllib
+import sys
+if sys.hexversion < 0x02000000:
+	# The limiter is the xml.sax module
+        sys.stderr.write("ciabot.py: requires Python 2.0.0 or later.\n")
+        sys.exit(1)
+
+import os, commands, socket, urllib
 from xml.sax.saxutils import escape
 
 # Changeset URL prefix for your repo: when the commit ID is appended
diff --git a/contrib/fast-import/import-zips.py b/contrib/fast-import/import-zips.py
index 82f5ed3..b989941 100755
--- a/contrib/fast-import/import-zips.py
+++ b/contrib/fast-import/import-zips.py
@@ -9,10 +9,15 @@
 ##  git log --stat import-zips
 
 from os import popen, path
-from sys import argv, exit
+from sys import argv, exit, hexversion, stderr
 from time import mktime
 from zipfile import ZipFile
 
+if hexversion < 0x01060000:
+	# The limiter is the zipfile module
+        stderr.write("import-zips.py: requires Python 1.6.0 or later.\n")
+        exit(1)
+
 if len(argv) < 2:
 	print 'Usage:', argv[0], '<zipfile>...'
 	exit(1)
diff --git a/contrib/hg-to-git/hg-to-git.py b/contrib/hg-to-git/hg-to-git.py
index 046cb2b..232625a 100755
--- a/contrib/hg-to-git/hg-to-git.py
+++ b/contrib/hg-to-git/hg-to-git.py
@@ -23,6 +23,11 @@ import os, os.path, sys
 import tempfile, pickle, getopt
 import re
 
+if sys.hexversion < 0x02030000:
+   # The behavior of the pickle module changed significantly in 2.3
+   sys.stderr.write("hg-to-git.py: requires Python 2.3 or later.\n")
+   sys.exit(1)
+
 # Maps hg version -> git version
 hgvers = {}
 # List of children for each hg revision
diff --git a/contrib/p4import/git-p4import.py b/contrib/p4import/git-p4import.py
index b6e534b..593d6a0 100644
--- a/contrib/p4import/git-p4import.py
+++ b/contrib/p4import/git-p4import.py
@@ -14,6 +14,11 @@ import sys
 import time
 import getopt
 
+if sys.hexversion < 0x02020000:
+   # The behavior of the marshal module changed significantly in 2.2
+   sys.stderr.write("git-p4import.py: requires Python 2.2 or later.\n")
+   sys.exit(1)
+
 from signal import signal, \
    SIGPIPE, SIGINT, SIG_DFL, \
    default_int_handler
diff --git a/contrib/svn-fe/svnrdump_sim.py b/contrib/svn-fe/svnrdump_sim.py
index 1cfac4a..95a80ae 100755
--- a/contrib/svn-fe/svnrdump_sim.py
+++ b/contrib/svn-fe/svnrdump_sim.py
@@ -7,6 +7,10 @@ to the highest revision that should be available.
 """
 import sys, os
 
+if sys.hexversion < 0x02040000:
+	# The limiter is the ValueError() calls. This may be too conservative
+        sys.stderr.write("svnrdump-sim.py: requires Python 2.4 or later.\n")
+        sys.exit(1)
 
 def getrevlimit():
         var = 'SVNRMAX'
diff --git a/git-p4.py b/git-p4.py
index 551aec9..69f1452 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -8,7 +8,13 @@
 # License: MIT <http://www.opensource.org/licenses/mit-license.php>
 #
 
-import optparse, sys, os, marshal, subprocess, shelve
+import sys
+if sys.hexversion < 0x02040000:
+    # The limiter is the subprocess module
+    sys.stderr.write("git-p4: requires Python 2.4 or later.\n")
+    sys.exit(1)
+
+import optparse, os, marshal, subprocess, shelve
 import tempfile, getopt, os.path, time, platform
 import re, shutil
 
diff --git a/git-remote-testgit.py b/git-remote-testgit.py
index 5f3ebd2..91faabd 100644
--- a/git-remote-testgit.py
+++ b/git-remote-testgit.py
@@ -31,6 +31,11 @@ from git_remote_helpers.git.exporter import GitExporter
 from git_remote_helpers.git.importer import GitImporter
 from git_remote_helpers.git.non_local import NonLocalGit
 
+if sys.hexversion < 0x01050200:
+    # os.makedirs() is the limiter
+    sys.stderr.write("git-remote-testgit: requires Python 1.5.2 or later.\n")
+    sys.exit(1)
+
 def get_repo(alias, url):
     """Returns a git repository object initialized for usage.
     """
diff --git a/git_remote_helpers/git/__init__.py b/git_remote_helpers/git/__init__.py
index e69de29..1dbb1b0 100644
--- a/git_remote_helpers/git/__init__.py
+++ b/git_remote_helpers/git/__init__.py
@@ -0,0 +1,5 @@
+import sys
+if sys.hexversion < 0x02040000:
+    # The limiter is the subprocess module
+    sys.stderr.write("git_remote_helpers: requires Python 2.4 or later.\n")
+    sys.exit(1)
-- 
1.8.1.rc2
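
The checks in the patch above compare sys.hexversion against a packed
constant. A minimal sketch of how such constants are built, written for
a current Python rather than the Python 2 these scripts targeted;
version_floor and version_error are hypothetical helper names that do
not appear in the patch:

```python
import sys


def version_floor(major, minor, micro=0):
    """Pack major.minor.micro the way sys.hexversion does.

    The release-level and serial bits are left at zero, so every
    release of that version compares greater than or equal to the
    returned value.
    """
    return (major << 24) | (minor << 16) | (micro << 8)


def version_error(script, floor, current=None):
    """Return an error message if `current` is older than `floor`, else None."""
    if current is None:
        current = sys.hexversion
    if current >= floor:
        return None
    major = (floor >> 24) & 0xFF
    minor = (floor >> 16) & 0xFF
    micro = (floor >> 8) & 0xFF
    return "%s: requires Python %d.%d.%d or later.\n" % (
        script, major, minor, micro)


if __name__ == "__main__":
    # "example.py" is a placeholder script name.
    message = version_error("example.py", version_floor(2, 4))
    if message:
        sys.stderr.write(message)
        sys.exit(1)
```

Returning the message instead of exiting keeps the check testable; the
caller decides whether to write it to stderr and exit, as the patch does.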




-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

A ``decay in the social contract'' is detectable; there is a growing
feeling, particularly among middle-income taxpayers, that they are not
getting back, from society and government, their money's worth for
taxes paid. The tendency is for taxpayers to try to take more control
of their finances...	-- IRS Strategic Plan, (May 1984)


* [PATCH] Remove the suggestion to use parsecvs, which is currently broken.
From: Eric S. Raymond @ 2012-12-28 16:20 UTC
  To: git

The parsecvs code has been neglected for a long time, and the only
public version does not even build correctly.  I have been handed
control of the project and intend to fix this, but until I do it
cannot be recommended.

Also, the project URL given for Subversion needed to be updated
to follow their site move.
---
 Documentation/git-cvsimport.txt | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-cvsimport.txt b/Documentation/git-cvsimport.txt
index 98d9881..9d5353e 100644
--- a/Documentation/git-cvsimport.txt
+++ b/Documentation/git-cvsimport.txt
@@ -213,11 +213,9 @@ Problems related to tags:
 * Multiple tags on the same revision are not imported.
 
 If you suspect that any of these issues may apply to the repository you
-want to import consider using these alternative tools which proved to be
-more stable in practice:
+want to import, consider using cvs2git:
 
-* cvs2git (part of cvs2svn), `http://cvs2svn.tigris.org`
-* parsecvs, `http://cgit.freedesktop.org/~keithp/parsecvs`
+* cvs2git (part of cvs2svn), `http://subversion.apache.org/`
 
 GIT
 ---
-- 
1.8.1.rc2





* Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
From: Martin Fick @ 2012-12-28 14:50 UTC
  To: Michael Haggerty; +Cc: Jeff King, git, Junio C Hamano, Shawn Pearce
In-Reply-To: <201212271611.52203.mfick@codeaurora.org>

On Thursday, December 27, 2012 04:11:51 pm Martin Fick wrote:
> On Wednesday, December 26, 2012 01:24:39 am Michael Haggerty wrote:
> > ... lots of discussion about ref locking...
> 
> It concerns me that git uses any locking at all, even for
> refs since it has the potential to leave around stale
> locks.
> 
> For a single user repo this is not a big deal, the lock
> can always be cleaned up manually (and it is a rare
> occurrence). However, in a multi user server environment,
> possibly even from multiple hosts over a shared
> filesystem such as NFS, stale locks could lead to serious
> downtime and risky recovery (since it is currently hard
> to figure out if a lock really is stale).  Even though
> stale locks are probably rare even today in the larger
> shared repo case, as git scales to even larger shared
> repositories, this will eventually become more of a
> problem *1.  Naturally, this has me thinking that git
> should possibly consider moving towards a lockless design
> for refs in the long term.
> 
> I realize this is hard and that git needs to support many
> different filesystems with different semantics.  I had an
> idea I think may be close to a functional lockless design
> for loose refs (one piece at a time) that I thought I
> should propose, just to get the ball rolling, even if it
> is just going to be found to be flawed (I realize that
> history suggests that such schemes usually are).  I hope
> that it does not make use of any semantics which are not
> currently expected from git of filesystems.  I think it
> relies only on the ability to rename a file atomically,
> and the ability to scan the contents of a directory
> reliably to detect the "ordered" existence of files.
> 
> My idea is based on using filenames to store sha1s instead
> of file contents.  To do this, the sha1 of a ref
> would be stored in a file in a directory named after the
> loose ref.  I believe this would then make it possible to
> have lockless atomic ref updates by renaming the file.
> 
> To more fully illustrate the idea, imagine that any file
> (except for the null file) in the directory represents
> the value of the ref with its name.  Then the following
> transitions can represent atomic state changes to a ref's
> value and existence:
> 
> 1) To update the value from a known value to a new value
> atomically, simply rename the file to the new value.  This
> operation should only succeed if the file exists and is
> still named old value before the rename.  This should
> even be faster than today's approach, especially on
> remote filesystems since it would require only 1 round
> trip in the success case instead of 3!
> 
> 2) To delete the ref, simply delete the filename
> representing the current value of the ref.  This ensures
> that you are deleting the ref from a specific value.  I
> am not sure if git needs to be able to delete refs
> without knowing their values? If so, this would require
> reading the value and looping until the delete succeeds,
> this may be a bit slow for a constantly updated ref, but
> likely a rare situation (and not likely worse than trying
> to acquire the ref-lock today).  Overall, this again
> would likely be faster than today's approach.
> 
> 3) To create a ref, it must be renamed from the null file
> (sha 0000...) to the new value just as if it were being
> updated from any other value, but there is one extra
> condition: before renaming the null file, a full
> directory scan must be done to ensure that the null file
> is the only file in the directory (this condition exists
> because creating the directory and null file cannot be
> atomic unless the filesystem supports atomic directory
> renames, an expectation git does not currently make).  I
> am not sure how this compares to today's approach, but
> including the setup costs (described below), I suspect it
> is slower.
> 
> While this outlines the state changes, some additional
> operations may be needed to set up some starting conditions
> and to clean things up. But these operations could/should
> be performed by any process/thread and would not cause
> any state changes to the ref existence or value.  For
> example, when creating a ref, the ref directory would
> need to be created and the null file needs to be created.
> Whenever a null file is detected in the directory at the
> same time as another file, it should be deleted.
> Whenever the directory is empty, it may be deleted
> (perhaps after a grace period to reduce retries during
> ref creation unless the process just deleted the ref).
> 
> I don't know how this new scheme could be made to work
> with the current scheme, it seems like perhaps new git
> releases could be made to understand both the old and the
> new, and a config option could be used to tell it which
> method to write new refs with.  Since in this new scheme
> ref directory names would conflict with old ref
> filenames, this would likely prevent both schemes from
> erroneously being used simultaneously (so they shouldn't
> corrupt each other),
> except for the fact that refs can be nested in
> directories which confuses things a bit.  I am not sure
> what a good solution to this is?
> 
> What did I miss, where are my flaws?  Does anyone else
> share my concern for stale locks?  How could we similarly
> eliminate locks for the packed-refs file?
> 
> -Martin
> 
> 
> *1 We have been concerned with stale locks in the Gerrit
> community when trying to design atomic cross repository
> updates.  Of course, while a lockless solution eliminates
> stale locks, it might make it impossible to do atomic
> cross repository updates since all of our solutions so
> far need locks. :(

Hmm, actually I believe that with a small modification to the 
semantics described here it would be possible to make multi 
repo/branch commits work.   Simply allow the ref filename to 
be locked by a transaction by appending the transaction ID to 
the filename.  So if transaction 123 wants to lock master 
which points currently to abcde, then it will move 
master/abcde to master/abcde_123.  If transaction 123 is 
designed so that any process can commit/complete/abort it 
without requiring any locks which can go stale, then this ref 
lock will never go stale either (easy as long as it writes 
all its proposed updates somewhere upfront and has atomic 
semantics for starting, committing and aborting).  On commit, 
the ref lock gets updated to its new value: master/newsha and 
on abort it gets unlocked: master/abcde.

Shawn talked about adding multi repo/branch transaction 
semantics to jgit, this might be something that git wants to 
support also at some point?

-Martin
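
The rename-based update and the transaction-ID lock described above can
be sketched as follows. This is only an illustration under stated
assumptions, not code from git or jgit: the helper names (read_ref,
update_ref, lock_ref, commit_ref, abort_ref) are hypothetical, short
fake values stand in for sha1s, and the create case (null file plus
directory scan) is left out.

```python
import os
import tempfile


def read_ref(ref_dir):
    """Return the ref's value: the name of the single file in ref_dir."""
    names = os.listdir(ref_dir)
    return names[0] if len(names) == 1 else None


def update_ref(ref_dir, old, new):
    """Move the ref from `old` to `new`; fail if it is no longer `old`.

    Caveat: on POSIX, os.rename silently replaces an existing
    destination, so this sketch relies on at most one value file
    existing in the directory.
    """
    try:
        os.rename(os.path.join(ref_dir, old), os.path.join(ref_dir, new))
        return True
    except FileNotFoundError:
        return False


def lock_ref(ref_dir, value, txn_id):
    """Lock the ref for a transaction by appending its ID to the filename."""
    return update_ref(ref_dir, value, "%s_%s" % (value, txn_id))


def commit_ref(ref_dir, value, txn_id, new_value):
    """Commit: the locked name is renamed to the new value."""
    return update_ref(ref_dir, "%s_%s" % (value, txn_id), new_value)


def abort_ref(ref_dir, value, txn_id):
    """Abort: the locked name is renamed back to the original value."""
    return update_ref(ref_dir, "%s_%s" % (value, txn_id), value)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        master = os.path.join(top, "master")
        os.mkdir(master)
        open(os.path.join(master, "abcde"), "w").close()
        lock_ref(master, "abcde", 123)
        commit_ref(master, "abcde", 123, "newsha")
        print(read_ref(master))
```

While the ref is locked, a plain update_ref from the old value fails
because the file "abcde" no longer exists under that name, which is the
property the proposal depends on.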


* Re: [PATCH v2] wt-status: Show ignored files in untracked dirs
From: Antoine Pelisse @ 2012-12-28 14:05 UTC
  To: Jeff King; +Cc: Junio C Hamano, git
In-Reply-To: <CALWbr2xmtvchR4G37-FzzgScKe4p4RjLc7=Pg8d4K6SWO7tGAQ@mail.gmail.com>

Hey Peff,
I actually have an issue with the behavior we discussed (referenced as 1.A.)

Using the example from Michael's mail, I end up having this:
$ git status --porcelain --ignored
?? .gitignore
?? x
?? y/
!! x.ignore-me
!! y/

y/ is referred to as untracked, because it contains untracked files,
and then as ignored, because it contains ignored files.

Showing it twice doesn't feel right though, so I guess we should only
show "?? y/" with untracked=normal,
and "!! y/foo.ignore-me" when using untracked=all.

What do you think?




On Thu, Dec 27, 2012 at 6:35 PM, Antoine Pelisse <apelisse@gmail.com> wrote:
>> By "committed", I assume you meant that you have "dirA/unco" as an
>> untracked file, and "dirA/committed" as a file in the index?
>
> Of course,
>
>> Thanks for putting this together. I agree with the expected output in
>> each case, and I think this covers the cases we have seen (case 1 is
>> Michael's original report, case 2 is what I wrote in my mail, and case 3
>> is the one you just came up with). I can't think offhand of any others.
>
> Great, so I can build some tests reflecting those behaviors while
> waiting for more input


* Re: Installation Plan
From: Enrico Weigelt @ 2012-12-28 13:54 UTC
  To: Dennis Putnam; +Cc: git
In-Reply-To: <50D475A9.9020407@bellsouth.net>


> 7) Clone new repository for development and testing on Windows. (Do I
> need the shared drive any more?)

Not necessarily; it depends on how you connect your local repo to the
remote one (your central repo). I'd suggest using the ssh protocol: in this
case your Windows box will connect to the Linux box via ssh and do all
operations via ssh - no network filesystem required.

> 8) When a new version is ready for release, push commit to remote
> repository after which builds will use new code (I'm assuming the
> file copies happen automagically).

Yes, see the post-update hook (on the central repo side).
It is executed right after objects have been transferred and refs updated
(IOW: when your changes have finally made it into the central repo).
Note that the 'git push' operation will wait until that hook is finished.
So, if the build takes a while, you most likely want to do it asynchronously.
A nice way is letting the hook just add the new refs to some queue
(you can even use git refs for that) and have another process (in a loop
or via cron) poll for new queue entries and run the build.
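
A rough sketch of that queue idea, assuming a plain spool directory
rather than git refs as the queue. The function names enqueue and drain
and the "job-" file prefix are made up for this example. The
post-update hook receives the names of the updated refs as its
arguments, so the hook itself would only need to call enqueue with
sys.argv[1:] and return.

```python
import os
import tempfile


def enqueue(queue_dir, refs):
    """Hook side: record the updated refs and return immediately."""
    os.makedirs(queue_dir, exist_ok=True)
    # Write under a temporary name first, then rename, so the worker
    # never sees a half-written entry.
    fd, tmp_path = tempfile.mkstemp(dir=queue_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(refs) + "\n")
    unique = os.path.basename(tmp_path)[len(".tmp-"):]
    entry = os.path.join(queue_dir, "job-" + unique)
    os.rename(tmp_path, entry)
    return entry


def drain(queue_dir, build):
    """Worker side (loop or cron): call build(ref) for every queued ref."""
    processed = []
    for name in sorted(os.listdir(queue_dir)):
        if not name.startswith("job-"):
            continue  # skip entries that are still being written
        path = os.path.join(queue_dir, name)
        with open(path) as handle:
            refs = handle.read().split()
        for ref in refs:
            build(ref)
        os.remove(path)
        processed.extend(refs)
    return processed
```

Entries get random names here, so the order in which pushes are built is
not preserved; a real setup would probably want timestamps or a counter
in the entry name.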


cu
-- 
Mit freundlichen Grüßen / Kind regards 

Enrico Weigelt 
VNC - Virtual Network Consult GmbH 
Head Of Development 

Pariser Platz 4a, D-10117 Berlin
Tel.: +49 (30) 3464615-20
Mobile: +49 (151) 27565287
Fax: +49 (30) 3464615-59

enrico.weigelt@vnc.biz; www.vnc.de 


* Re: [PATCH 4/8] wildmatch: support "no FNM_PATHNAME" mode
From: Nguyen Thai Ngoc Duy @ 2012-12-28  7:15 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vwqw2k833.fsf@alter.siamese.dyndns.org>

On Fri, Dec 28, 2012 at 1:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>                       if (*++p == '*') {
>>                               const uchar *prev_p = p - 2;
>>                               while (*++p == '*') {}
>> -                             if ((prev_p == text || *prev_p == '/') ||
>> +                             if (!(flags & WM_PATHNAME))
>> +                                     /* without WM_PATHNAME, '*' == '**' */
>> +                                     special = 1;
>> +                             else if ((prev_p == text || *prev_p == '/') ||
>
> Not a new issue in this patch,

No, it's an issue from nd/wildmatch, 40bbee0 (wildmatch: adjust "**"
behavior - 2012-10-15).

> but here, "prev_p" points into the
> pattern string, two bytes before p, which is the byte before the
> "**" that we are looking at (which might be before the beginning of
> the pattern).  "text" is the string we are trying to match that
> pattern against.  How can these two pointers be compared to yield a
> meaningful value?

They can't. I wanted to check whether "**" is at the start of the
pattern (so no preceding '/' needed) and used the wrong pointer to
compare to. Funny, there is a test about this and it does not catch it
because prev_p accesses something before the pattern. Will fix.
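
The intended condition can be written out with indices instead of
pointers, which makes it obvious that the comparison has to be against
the start of the pattern and not the text. This is only a sketch of the
check as described in this thread, not the eventual fix;
double_star_is_special is a hypothetical name.

```python
def double_star_is_special(pattern, star):
    """Decide whether the run of '*' starting at index `star` acts as "**".

    `star` is the index of the first '*' of a run of two or more.
    The run is special when it is preceded by the start of the pattern
    or a '/', and followed by the end of the pattern, a '/', or an
    escaped '/' (backslash followed by '/').
    """
    end = star
    while end < len(pattern) and pattern[end] == "*":
        end += 1
    preceded = star == 0 or pattern[star - 1] == "/"
    followed = (end == len(pattern)
                or pattern[end] == "/"
                or pattern[end:end + 2] == "\\/")
    return preceded and followed
```

With an index, "at the start of the pattern" is simply star == 0, so
nothing before the pattern is ever read.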

>
>>                                   (*p == '\0' || *p == '/' ||
>>                                    (p[0] == '\\' && p[1] == '/'))) {
>
> OK.  "**/", "**" (end of pattern), and "**\/" are handled here.
>
> Do we have to worry about "**[/]" the same way, or a class never
> matches the directory separator, even if it is a singleton class
> that consists of '/' (which is fine by me)?  If so, is "\/" more or
> less like "[/]"?

This is a special case of "**" with FNM_PATHNAME on. With
FNM_PATHNAME, '[]' and '?' cannot match '/' so any patterns with '[/]'
match nothing. I think we don't need to worry about this case.
-- 
Duy


* Re: [PATCH 8/8] wildmatch: advance faster in <asterisk> + <literal> patterns
From: Nguyen Thai Ngoc Duy @ 2012-12-28  6:56 UTC
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vpq1uk82q.fsf@alter.siamese.dyndns.org>

On Fri, Dec 28, 2012 at 1:24 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> +                                     while ((t_ch = *text) != '\0' &&
>> +                                            (!(flags & WM_PATHNAME) || t_ch != '/')) {
>
> Why do we look at (flags & WM_PATHNAME) and not "special" here?

Because I was careless. Thanks for spotting it. I'll fix it and add
some more tests about **<literal> with WM_PATHNAME.
-- 
Duy


* Re: Find the starting point of a local branch
From: Martin von Zweigbergk @ 2012-12-28  6:38 UTC
  To: Woody Wu; +Cc: Seth Robertson, git
In-Reply-To: <20121228051514.GA4028@zuhnb712.ap.bm.net>

On Thu, Dec 27, 2012 at 9:15 PM, Woody Wu <narkewoody@gmail.com> wrote:
> On Mon, Dec 24, 2012 at 09:24:39AM -0800, Martin von Zweigbergk wrote:
>> On Sun, Dec 23, 2012 at 11:31 PM, Woody Wu <narkewoody@gmail.com> wrote:
>> >
>> > This is not working for me since I have more than one local branch that
>> > diverged from the master, and in fact, the branch in question
>> > diverged from another local branch.
>>
>> As Jeff mentions in a later message, "git pull --rebase" would
>> probably do what you want. It works with local branches too.
>>
>
> I think what 'git pull --rebase' would do is to fetch from the origin
> and do a 'git rebase'.

Not if the configured upstream is a local branch (see the
"branch.<name>.*" configuration variables). In that case it will just
rebase the local branch onto the new position of its upstream. If the
upstream is not configured, I believe you can still do "git pull
--rebase . <upstream branch>".

> On one hand, I don't understand 'git rebase' so
> much from the manual, on the other hand, I did not get the point why
> 'git rebase' has something to do with the thing I want to do (what I
> want is just to query some kind of history information).

I may have misunderstood, or incorrectly assumed, that you wanted
to rebase the commits on your branch. So why do you want to know?
(Please ignore me if this was answered elsewhere in the thread that I
might have missed.)

Anyway, to answer your question, you could use a method similar to
what "git pull --rebase" uses internally to figure out the branch
point:

git merge-base $(git rev-parse <branch>) $(git rev-list -g <upstream branch>)

Hope that helps
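
For scripting, the command above could be wrapped roughly like this.
It is just a sketch that shells out to the same three git commands; the
helper names git and branch_point are made up, and it assumes the
upstream branch has a reflog.

```python
import subprocess


def git(*args, cwd="."):
    """Run a git command and return its trimmed standard output."""
    result = subprocess.run(["git"] + list(args), cwd=cwd, check=True,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout.strip()


def branch_point(branch, upstream, cwd="."):
    """Find where `branch` left `upstream`, consulting upstream's reflog."""
    tip = git("rev-parse", branch, cwd=cwd)
    # Every commit the upstream branch has pointed at, newest first.
    candidates = git("rev-list", "-g", upstream, cwd=cwd).split()
    if not candidates:
        raise ValueError("no reflog entries for %s" % upstream)
    return git("merge-base", tip, *candidates, cwd=cwd)
```

Because the reflog is consulted, this can still find the branch point
after the upstream branch has been rewound, as long as the old position
has not expired from the reflog.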


* Re: [PATCH 8/8] wildmatch: advance faster in <asterisk> + <literal> patterns
From: Junio C Hamano @ 2012-12-28  6:24 UTC
  To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <1356163028-29967-9-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> compat, '*/*/*' on linux-2.6.git file list 2000 times, before:
> wildmatch 7s 985049us
> fnmatch   2s 735541us or 34.26% faster
>
> and after:
> wildmatch 4s 492549us
> fnmatch   0s 888263us or 19.77% slower
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  wildmatch.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/wildmatch.c b/wildmatch.c
> index 3794c4d..68b02e4 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -132,6 +132,27 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>  			while (1) {
>  				if (t_ch == '\0')
>  					break;
> +				/*
> +				 * Try to advance faster when an asterisk is
> +				 * followed by a literal. We know in this case
> +				 * that the string before the literal
> +				 * must belong to "*".
> +				 */
> +				if (!is_glob_special(*p)) {

So far, we have looked at "*x" or "**x" in the pattern, p points at
'x' (not an asterisk), and we have "text" to match.  For "text" to
match this pattern, the earlier part of it that is consumed to match
the asterisk must be followed by "x".  "special" tells us if we are
allowed to treat '/' as matching the asterisk.

> +					p_ch = *p;
> +					if ((flags & WM_CASEFOLD) && ISUPPER(p_ch))
> +						p_ch = tolower(p_ch);

That "x" in the example is picked up here and stored in "p_ch".
Let's skip over "text" and find that "x" in there.

> +					while ((t_ch = *text) != '\0' &&
> +					       (!(flags & WM_PATHNAME) || t_ch != '/')) {

Why do we look at (flags & WM_PATHNAME) and not "special" here?

> +						if ((flags & WM_CASEFOLD) && ISUPPER(t_ch))
> +							t_ch = tolower(t_ch);
> +						if (t_ch == p_ch)
> +							break;

Found it.

> +						text++;
> +					}
> +					if (t_ch != p_ch)
> +						return WM_NOMATCH;

If we did not find that "x", then "**x" or "*x" can never match.
OK.  And at this point "text" points at that "x" we found, and "p"
points at "x" after the asterisk in the pattern.

Looks good so far.  Thanks.

> +				}
>  				if ((matched = dowild(p, text,  flags)) != WM_NOMATCH) {
>  					if (!special || matched != WM_ABORT_TO_STARSTAR)
>  						return matched;


* Re: [PATCH 4/8] wildmatch: support "no FNM_PATHNAME" mode
From: Junio C Hamano @ 2012-12-28  6:24 UTC
  To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <1356163028-29967-5-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> diff --git a/wildmatch.c b/wildmatch.c
> index a79f97e..4fe1d65 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -77,14 +77,17 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
>  			continue;
>  		case '?':
>  			/* Match anything but '/'. */
> -			if (t_ch == '/')
> +			if ((flags & WM_PATHNAME) && t_ch == '/')
>  				return WM_NOMATCH;
>  			continue;
>  		case '*':
>  			if (*++p == '*') {
>  				const uchar *prev_p = p - 2;
>  				while (*++p == '*') {}
> -				if ((prev_p == text || *prev_p == '/') ||
> +				if (!(flags & WM_PATHNAME))
> +					/* without WM_PATHNAME, '*' == '**' */
> +					special = 1;
> +				else if ((prev_p == text || *prev_p == '/') ||

Not a new issue in this patch, but here, "prev_p" points into the
pattern string, two bytes before p, which is the byte before the
"**" that we are looking at (which might be before the beginning of
the pattern).  "text" is the string we are trying to match that
pattern against.  How can these two pointers be compared to yield a
meaningful value?

>  				    (*p == '\0' || *p == '/' ||
>  				     (p[0] == '\\' && p[1] == '/'))) {

OK.  "**/", "**" (end of pattern), and "**\/" are handled here.  

Do we have to worry about "**[/]" the same way, or a class never
matches the directory separator, even if it is a singleton class
that consists of '/' (which is fine by me)?  If so, is "\/" more or
less like "[/]"?


* Re: Find the starting point of a local branch
From: Woody Wu @ 2012-12-28  5:15 UTC
  To: Martin von Zweigbergk; +Cc: Seth Robertson, git
In-Reply-To: <CANiSa6iSYvLbp1s8h9pwi=P1m0QdZPqf06hAm+4muChgJUuj=g@mail.gmail.com>

On Mon, Dec 24, 2012 at 09:24:39AM -0800, Martin von Zweigbergk wrote:
> On Sun, Dec 23, 2012 at 11:31 PM, Woody Wu <narkewoody@gmail.com> wrote:
> > On Sun, Dec 23, 2012 at 11:09:58PM -0500, Seth Robertson wrote:
> >>
> >> In message <20121224035825.GA17203@zuhnb712>, Woody Wu writes:
> >>
> >>     How can I find out what's the starting reference point (a commit number
> >>     or tag name) of a locally created branch? I can use gitk to find out it
> >>     but this method is slow, I think there might be a command line to do it
> >>     quickly.
> >>
> >> The answer is more complex than you probably suspected.
> >>
> >> Technically, `git log --oneline mybranch | tail -n 1` will tell you
> >> the starting point of any branch.  But...I'm sure that isn't what you
> >> want to know.
> >>
> >> You want to know "what commit was I at when I typed `git branch
> >> mybranch`"?
> >
> > Yes, this is exactly what I want to know.
> >
> >>The problem is git doesn't record this information and
> >> doesn't have the slightest clue.
> >>
> >> But, you say, I can use `gitk` and see it.  See?  Right there.  That
> >> isn't (necessarily) the "starting point" of the branch, it is the
> >> place where your branch diverged from some other branch.  Git is
> >> actually quite able to tell you when the last time your branch
> >> diverged from some other branch.  `git merge-base mybranch master`
> >> will tell you this, and is probably the answer you were looking for.
> >
> > This does not work for me since I have more than one local branch that
> > diverged from the master, and in fact, the branch in question
> > diverged from another local branch.
> 
> As Jeff mentions in a later message, "git pull --rebase" would
> probably do what you want. It works with local branches too.
> 

I think what 'git pull --rebase' would do is fetch from the origin
and do a 'git rebase'.  On the one hand, I don't understand 'git rebase'
very well from the manual; on the other hand, I don't see what
'git rebase' has to do with what I want (I just want to query some
kind of history information).

I know my knowledge of git is still quite limited. I will keep studying
the man pages.


> I once tried to add the same cleverness that "git pull --rebase" has
> directly to "git rebase" [1], but there were several issues with those
> patches, one of which was performance ("git pull --rebase" can
> be equally slow, but since it often involves network, users probably
> rarely notice). I think it would be nice to at least add it as an
> option to "git rebase" some day. Until then, "git pull --rebase" works
> fine.
> 
>  [1] http://thread.gmane.org/gmane.comp.version-control.git/166710

-- 
woody
I can't go back to yesterday - because I was a different person then.

^ permalink raw reply

* [PATCH v2 9/9] Makefile: add USE_WILDMATCH to use wildmatch as fnmatch
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

This is similar to NO_FNMATCH but it uses wildmatch instead of
compat/fnmatch. This is an intermediate step to let wildmatch be used
as an fnmatch replacement by a wider audience before it replaces fnmatch
completely and compat/fnmatch is removed.

fnmatch in test-wildmatch is not impacted by this and is the only
place where NO_FNMATCH or NO_FNMATCH_CASEFOLD remains active when
USE_WILDMATCH is set.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Makefile          |  6 ++++++
 git-compat-util.h | 13 +++++++++++++
 test-wildmatch.c  |  3 +++
 3 files changed, 22 insertions(+)

diff --git a/Makefile b/Makefile
index bc868d1..24e2774 100644
--- a/Makefile
+++ b/Makefile
@@ -99,6 +99,9 @@ all::
 # Define NO_FNMATCH_CASEFOLD if your fnmatch function doesn't have the
 # FNM_CASEFOLD GNU extension.
 #
+# Define USE_WILDMATCH if you want to use Git's wildmatch
+# implementation as fnmatch
+#
 # Define NO_GECOS_IN_PWENT if you don't have pw_gecos in struct passwd
 # in the C library.
 #
@@ -1625,6 +1628,9 @@ ifdef NO_FNMATCH_CASEFOLD
 	COMPAT_OBJS += compat/fnmatch/fnmatch.o
 endif
 endif
+ifdef USE_WILDMATCH
+	COMPAT_CFLAGS += -DUSE_WILDMATCH
+endif
 ifdef NO_SETENV
 	COMPAT_CFLAGS += -DNO_SETENV
 	COMPAT_OBJS += compat/setenv.o
diff --git a/git-compat-util.h b/git-compat-util.h
index 02f48f6..b2c7638 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -106,7 +106,9 @@
 #include <sys/time.h>
 #include <time.h>
 #include <signal.h>
+#ifndef USE_WILDMATCH
 #include <fnmatch.h>
+#endif
 #include <assert.h>
 #include <regex.h>
 #include <utime.h>
@@ -238,6 +240,17 @@ extern char *gitbasename(char *);
 
 #include "compat/bswap.h"
 
+#ifdef USE_WILDMATCH
+#include "wildmatch.h"
+#define FNM_PATHNAME WM_PATHNAME
+#define FNM_CASEFOLD WM_CASEFOLD
+#define FNM_NOMATCH  WM_NOMATCH
+static inline int fnmatch(const char *pattern, const char *string, int flags)
+{
+	return wildmatch(pattern, string, flags, NULL);
+}
+#endif
+
 /* General helper functions */
 extern void vreportf(const char *prefix, const char *err, va_list params);
 extern void vwritef(int fd, const char *prefix, const char *err, va_list params);
diff --git a/test-wildmatch.c b/test-wildmatch.c
index ac86800..a3e2643 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -1,3 +1,6 @@
+#ifdef USE_WILDMATCH
+#undef USE_WILDMATCH  /* We need real fnmatch implementation here */
+#endif
 #include "cache.h"
 #include "wildmatch.h"
 
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 8/9] wildmatch: advance faster in <asterisk> + <literal> patterns
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

compat, '*/*/*' on linux-2.6.git file list 2000 times, before:
wildmatch 7s 985049us
fnmatch   2s 735541us or 34.26% faster

and after:
wildmatch 4s 492549us
fnmatch   0s 888263us or 19.77% slower

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t3070-wildmatch.sh |  6 ++++++
 wildmatch.c          | 21 +++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index 4cdb13b..dcbc8bc 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -207,6 +207,9 @@ match 0 x foo '*/*/*'
 match 0 x foo/bar '*/*/*'
 match 1 x foo/bba/arr '*/*/*'
 match 0 x foo/bb/aa/rr '*/*/*'
+match 1 x abcXdefXghi '*X*i'
+match 0 x ab/cXd/efXg/hi '*X*i'
+match 1 x ab/cXd/efXg/hi '*/*X*/*/*i'
 
 pathmatch 1 foo foo
 pathmatch 0 foo fo
@@ -226,5 +229,8 @@ pathmatch 0 foo '*/*/*'
 pathmatch 0 foo/bar '*/*/*'
 pathmatch 1 foo/bba/arr '*/*/*'
 pathmatch 1 foo/bb/aa/rr '*/*/*'
+pathmatch 1 abcXdefXghi '*X*i'
+pathmatch 1 ab/cXd/efXg/hi '*/*X*/*/*i'
+pathmatch 1 ab/cXd/efXg/hi '*Xg*i'
 
 test_done
diff --git a/wildmatch.c b/wildmatch.c
index f6d45d5..40eda08 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -132,6 +132,27 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 			while (1) {
 				if (t_ch == '\0')
 					break;
+				/*
+				 * Try to advance faster when an asterisk is
+				 * followed by a literal. We know in this case
+				 * that the string before the literal
+				 * must belong to "*".
+				 */
+				if (!is_glob_special(*p)) {
+					p_ch = *p;
+					if ((flags & WM_CASEFOLD) && ISUPPER(p_ch))
+						p_ch = tolower(p_ch);
+					while ((t_ch = *text) != '\0' &&
+					       (!(flags & WM_PATHNAME) || t_ch != '/')) {
+						if ((flags & WM_CASEFOLD) && ISUPPER(t_ch))
+							t_ch = tolower(t_ch);
+						if (t_ch == p_ch)
+							break;
+						text++;
+					}
+					if (t_ch != p_ch)
+						return WM_NOMATCH;
+				}
 				if ((matched = dowild(p, text, flags)) != WM_NOMATCH) {
 					if (!match_slash || matched != WM_ABORT_TO_STARSTAR)
 						return matched;
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 7/9] wildmatch: make a special case for "*/" with FNM_PATHNAME
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

Normally we need recursion for "*". In this case we know that it
matches everything until "/" so we can skip the recursion.
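
As a rough illustration of why no recursion is needed here, consider a
pattern made only of "*" components separated by "/" (such as '*/*/*')
under FNM_PATHNAME. The helper below is hypothetical and is not how
dowild() is structured; it only shows that each "*/" can be resolved
with a single strchr() to the next slash, and that a trailing "*"
matches only when no slash remains.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of wildmatch.c): match "text" against
 * a pattern consisting of "nstars" asterisks joined by slashes, e.g.
 * nstars == 3 stands for the pattern "*" "/" "*" "/" "*", with
 * pathname semantics.  Returns 1 on match, 0 otherwise.
 */
static int match_star_components(int nstars, const char *text)
{
	assert(nstars >= 1 && text != NULL);

	/* every "*" that is followed by "/" consumes one path component */
	while (--nstars > 0) {
		const char *slash = strchr(text, '/');

		if (!slash)
			return 0;
		text = slash + 1;
	}
	/* the trailing "*" must not cross another directory boundary */
	return strchr(text, '/') == NULL;
}
```

With nstars == 3 this agrees with the "match" results for '*/*/*' added
to t3070 in this patch: foo and foo/bar do not match, foo/bba/arr
matches, and foo/bb/aa/rr does not.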

glibc, '*/*/*' on linux-2.6.git file list 2000 times
before:
wildmatch 8s 74513us
fnmatch   1s 97042us or 13.59% faster
after:
wildmatch 3s 521862us
fnmatch   3s 488616us or 99.06% slower

Same test with compat/fnmatch:
wildmatch 8s 110763us
fnmatch   2s 980845us or 36.75% faster
wildmatch 3s 522156us
fnmatch   1s 544487us or 43.85% slower

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t3070-wildmatch.sh |  8 ++++++++
 wildmatch.c          | 12 ++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index dbfa903..4cdb13b 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -203,6 +203,10 @@ match 1 1 'XXX/adobe/courier/bold/o/normal//12/120/75/75/m/70/iso8859/1' 'XXX/*/
 match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/*/*/*/*/*/12/*/*/*/m/*/*/*'
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
+match 0 x foo '*/*/*'
+match 0 x foo/bar '*/*/*'
+match 1 x foo/bba/arr '*/*/*'
+match 0 x foo/bb/aa/rr '*/*/*'
 
 pathmatch 1 foo foo
 pathmatch 0 foo fo
@@ -218,5 +222,9 @@ pathmatch 0 foo/bba/arr 'foo/*z'
 pathmatch 0 foo/bba/arr 'foo/**z'
 pathmatch 1 foo/bar 'foo?bar'
 pathmatch 1 foo/bar 'foo[/]bar'
+pathmatch 0 foo '*/*/*'
+pathmatch 0 foo/bar '*/*/*'
+pathmatch 1 foo/bba/arr '*/*/*'
+pathmatch 1 foo/bb/aa/rr '*/*/*'
 
 test_done
diff --git a/wildmatch.c b/wildmatch.c
index 0c8edb8..f6d45d5 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -116,6 +116,18 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 						return WM_NOMATCH;
 				}
 				return WM_MATCH;
+			} else if (*p == '/' && (flags & WM_PATHNAME) && !match_slash) {
+				/*
+				 * an asterisk followed by a slash
+				 * with WM_PATHNAME matches the next
+				 * directory
+				 */
+				const char *slash = strchr((char*)text, '/');
+				if (!slash)
+					return WM_NOMATCH;
+				text = (const uchar*)slash;
+				/* the slash is consumed by the top-level for loop */
+				break;
 			}
 			while (1) {
 				if (t_ch == '\0')
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 6/9] test-wildmatch: add "perf" command to compare wildmatch and fnmatch
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

It takes a text file, a pattern, a number <n>, and an optional
"pathname" flag. Each line in the text file is matched against the
pattern <n> times. If "pathname" is given, FNM_PATHNAME is used.

test-wildmatch is built with -O2 and tested against glibc 2.14.1 (also
-O2) and compat/fnmatch. The input file is the linux-2.6.git file list.
<n> is 2000. The complete command list is at the end.

wildmatch is beaten in the following cases. Apparently it needs some
improvement in the FNM_PATHNAME case:

glibc, '*/*/*' with FNM_PATHNAME:
wildmatch 8s 1559us
fnmatch   1s 11877us or 12.65% faster

compat, '*/*/*' with FNM_PATHNAME:
wildmatch 7s 922458us
fnmatch   2s 905111us or 36.67% faster

compat, '*/*/*' without FNM_PATHNAME:
wildmatch 7s 264201us
fnmatch   2s 1897us or 27.56% faster

compat, '[a-z]*/[a-z]*/[a-z]*' with FNM_PATHNAME:
wildmatch 8s 742827us
fnmatch   0s 922943us or 10.56% faster

compat, '[a-z]*/[a-z]*/[a-z]*' without FNM_PATHNAME:
wildmatch 8s 284520us
fnmatch   0s 6936us or 0.08% faster
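
Going by the perf() code in this patch, the "fnmatch" line in each pair
does not appear to be an absolute time: it prints the difference from
the wildmatch time, followed by that difference as a percentage of the
wildmatch time. The sketch below shows that arithmetic under this
reading. The helper names are hypothetical, and it uses 64-bit
arithmetic where the patch uses uint32_t.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Elapsed microseconds between two gettimeofday() samples. */
static uint64_t elapsed_usec(const struct timeval *start,
			     const struct timeval *end)
{
	int64_t sec = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
	int64_t usec = (int64_t)end->tv_usec - (int64_t)start->tv_usec;

	return (uint64_t)(sec * 1000000 + usec);
}

/*
 * Difference between the fnmatch and wildmatch timings, as a
 * percentage of the wildmatch time.  Positive means fnmatch was
 * slower, negative means it was faster.
 */
static double relative_diff_percent(uint64_t wildmatch_usec,
				    uint64_t fnmatch_usec)
{
	assert(wildmatch_usec > 0);
	return ((double)fnmatch_usec - (double)wildmatch_usec) /
		(double)wildmatch_usec * 100.0;
}
```

For the first glibc case above, wildmatch took 8s 1559us and fnmatch
was faster by 1s 11877us, which is about 12.65% of the wildmatch time.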

The rest of the glibc numbers
-----------------------------

'Documentation/*'
wildmatch 1s 529479us
fnmatch   1s 98263us or 71.81% slower

'drivers/*'
wildmatch 1s 988288us
fnmatch   1s 192049us or 59.95% slower

'Documentation/*' pathname
wildmatch 1s 557507us
fnmatch   1s 93696us or 70.22% slower

'drivers/*' pathname
wildmatch 2s 161626us
fnmatch   1s 230372us or 56.92% slower

'[Dd]ocu[Mn]entation/*'
wildmatch 1s 776581us
fnmatch   1s 471693us or 82.84% slower

'[Dd]o?u[Mn]en?ati?n/*'
wildmatch 1s 770770us
fnmatch   1s 555727us or 87.86% slower

'[Dd]o?u[Mn]en?ati?n/*' pathname
wildmatch 1s 783507us
fnmatch   1s 537029us or 86.18% slower

'[A-Za-z][A-Za-z]??*'
wildmatch 4s 110386us
fnmatch   4s 926306us or 119.85% slower

'[A-Za-z][A-Za-z]??'
wildmatch 3s 918114us
fnmatch   3s 686175us or 94.08% slower

'[A-Za-z][A-Za-z]??*' pathname
wildmatch 4s 453746us
fnmatch   4s 955856us or 111.27% slower

'[A-Za-z][A-Za-z]??' pathname
wildmatch 3s 896646us
fnmatch   3s 733828us or 95.82% slower

'*/*/*'
wildmatch 7s 287985us
fnmatch   1s 74083us or 14.74% slower

'[a-z]*/[a-z]*/[a-z]*' pathname
wildmatch 8s 796659us
fnmatch   1s 568409us or 17.83% slower

'[a-z]*/[a-z]*/[a-z]*'
wildmatch 8s 316559us
fnmatch   3s 430652us or 41.25% slower

The rest of the compat numbers
------------------------------

'Documentation/*'
wildmatch 1s 520389us
fnmatch   0s 62579us or 4.12% slower

'drivers/*'
wildmatch 1s 955354us
fnmatch   0s 190109us or 9.72% slower

'Documentation/*' pathname
wildmatch 1s 561675us
fnmatch   0s 55336us or 3.54% slower

'drivers/*' pathname
wildmatch 2s 106100us
fnmatch   0s 219680us or 10.43% slower

'[Dd]ocu[Mn]entation/*'
wildmatch 1s 750810us
fnmatch   0s 542721us or 31.00% slower

'[Dd]o?u[Mn]en?ati?n/*'
wildmatch 1s 724791us
fnmatch   0s 538948us or 31.25% slower

'[Dd]o?u[Mn]en?ati?n/*' pathname
wildmatch 1s 731403us
fnmatch   0s 537474us or 31.04% slower

'[A-Za-z][A-Za-z]??*'
wildmatch 4s 28555us
fnmatch   1s 67297us or 26.49% slower

'[A-Za-z][A-Za-z]??'
wildmatch 3s 838279us
fnmatch   0s 880005us or 22.93% slower

'[A-Za-z][A-Za-z]??*' pathname
wildmatch 4s 379476us
fnmatch   1s 55643us or 24.10% slower

'[A-Za-z][A-Za-z]??' pathname
wildmatch 3s 830910us
fnmatch   0s 849699us or 22.18% slower

The following commands are used:

LANG=C ./test-wildmatch perf /tmp/filelist.txt 'Documentation/*' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt 'drivers/*' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt 'Documentation/*' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt 'drivers/*' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[Dd]ocu[Mn]entation/*' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[Dd]o?u[Mn]en?ati?n/*' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[Dd]o?u[Mn]en?ati?n/*' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[A-Za-z][A-Za-z]??*' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[A-Za-z][A-Za-z]??' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[A-Za-z][A-Za-z]??*' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[A-Za-z][A-Za-z]??' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt '*/*/*' 2000
LANG=C ./test-wildmatch perf /tmp/filelist.txt '*/*/*' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[a-z]*/[a-z]*/[a-z]*' 2000 pathname
LANG=C ./test-wildmatch perf /tmp/filelist.txt '[a-z]*/[a-z]*/[a-z]*' 2000

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 test-wildmatch.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/test-wildmatch.c b/test-wildmatch.c
index a5f4833..ac86800 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -1,9 +1,82 @@
 #include "cache.h"
 #include "wildmatch.h"
 
+static int perf(int ac, char **av)
+{
+	struct timeval tv1, tv2;
+	struct stat st;
+	int fd, i, n, flags1 = 0, flags2 = 0;
+	char *buffer, *p;
+	uint32_t usec1, usec2;
+	const char *lang;
+	const char *file = av[0];
+	const char *pattern = av[1];
+
+	lang = getenv("LANG");
+	if (lang && strcmp(lang, "C"))
+		die("Please test it on C locale.");
+
+	if ((fd = open(file, O_RDONLY)) == -1 || fstat(fd, &st))
+		die_errno("file open");
+
+	buffer = xmalloc(st.st_size + 2);
+	if (read(fd, buffer, st.st_size) != st.st_size)
+		die_errno("read");
+
+	buffer[st.st_size] = '\0';
+	buffer[st.st_size + 1] = '\0';
+	for (i = 0; i < st.st_size; i++)
+		if (buffer[i] == '\n')
+			buffer[i] = '\0';
+
+	n = atoi(av[2]);
+	if (av[3] && !strcmp(av[3], "pathname")) {
+		flags1 = WM_PATHNAME;
+		flags2 = FNM_PATHNAME;
+	}
+
+	gettimeofday(&tv1, NULL);
+	for (i = 0; i < n; i++) {
+		for (p = buffer; *p; p += strlen(p) + 1)
+			wildmatch(pattern, p, flags1, NULL);
+	}
+	gettimeofday(&tv2, NULL);
+
+	usec1 = (uint32_t)tv2.tv_sec * 1000000 + tv2.tv_usec;
+	usec1 -= (uint32_t)tv1.tv_sec * 1000000 + tv1.tv_usec;
+	printf("wildmatch %ds %dus\n",
+	       (int)(usec1 / 1000000),
+	       (int)(usec1 % 1000000));
+
+	gettimeofday(&tv1, NULL);
+	for (i = 0; i < n; i++) {
+		for (p = buffer; *p; p += strlen(p) + 1)
+			fnmatch(pattern, p, flags2);
+	}
+	gettimeofday(&tv2, NULL);
+
+	usec2 = (uint32_t)tv2.tv_sec * 1000000 + tv2.tv_usec;
+	usec2 -= (uint32_t)tv1.tv_sec * 1000000 + tv1.tv_usec;
+	if (usec2 > usec1)
+		printf("fnmatch   %ds %dus or %.2f%% slower\n",
+		       (int)((usec2 - usec1) / 1000000),
+		       (int)((usec2 - usec1) % 1000000),
+		       (float)(usec2 - usec1) / usec1 * 100);
+	else
+		printf("fnmatch   %ds %dus or %.2f%% faster\n",
+		       (int)((usec1 - usec2) / 1000000),
+		       (int)((usec1 - usec2) % 1000000),
+		       (float)(usec1 - usec2) / usec1 * 100);
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	int i;
+
+	if (!strcmp(argv[1], "perf"))
+		return perf(argc - 2, argv + 2);
+
 	for (i = 2; i < argc; i++) {
 		if (argv[i][0] == '/')
 			die("Forward slash is not allowed at the beginning of the\n"
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 5/9] wildmatch: support "no FNM_PATHNAME" mode
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

So far, wildmatch() has always honoured directory boundaries and there
was no way to turn it off. Make it behave more like fnmatch() by
requiring all callers that want the FNM_PATHNAME behaviour to pass
that in the equivalent flag WM_PATHNAME. Callers that do not specify
WM_PATHNAME will get wildcards like ? and * in their patterns matched
against '/', just like not passing FNM_PATHNAME to fnmatch().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c                |  2 +-
 t/t3070-wildmatch.sh | 27 +++++++++++++++++++++++++++
 test-wildmatch.c     |  6 ++++--
 wildmatch.c          | 13 +++++++++----
 wildmatch.h          |  1 +
 5 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 175a182..6ef0396 100644
--- a/dir.c
+++ b/dir.c
@@ -595,7 +595,7 @@ int match_pathname(const char *pathname, int pathlen,
 	}
 
 	return wildmatch(pattern, name,
-			 ignore_case ? WM_CASEFOLD : 0,
+			 WM_PATHNAME | (ignore_case ? WM_CASEFOLD : 0),
 			 NULL) == 0;
 }
 
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index d5bafef..dbfa903 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -29,6 +29,18 @@ match() {
     fi
 }
 
+pathmatch() {
+    if [ $1 = 1 ]; then
+	test_expect_success "pathmatch:    match '$2' '$3'" "
+	    test-wildmatch pathmatch '$2' '$3'
+	"
+    else
+	test_expect_success "pathmatch: no match '$2' '$3'" "
+	    ! test-wildmatch pathmatch '$2' '$3'
+	"
+    fi
+}
+
 # Basic wildmat features
 match 1 1 foo foo
 match 0 0 foo bar
@@ -192,4 +204,19 @@ match 0 0 'XXX/adobe/courier/bold/o/normal//12/120/75/75/X/70/iso8859/1' 'XXX/*/
 match 1 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txt' '**/*a*b*g*n*t'
 match 0 0 'abcd/abcdefg/abcdefghijk/abcdefghijklmnop.txtz' '**/*a*b*g*n*t'
 
+pathmatch 1 foo foo
+pathmatch 0 foo fo
+pathmatch 1 foo/bar foo/bar
+pathmatch 1 foo/bar 'foo/*'
+pathmatch 1 foo/bba/arr 'foo/*'
+pathmatch 1 foo/bba/arr 'foo/**'
+pathmatch 1 foo/bba/arr 'foo*'
+pathmatch 1 foo/bba/arr 'foo**'
+pathmatch 1 foo/bba/arr 'foo/*arr'
+pathmatch 1 foo/bba/arr 'foo/**arr'
+pathmatch 0 foo/bba/arr 'foo/*z'
+pathmatch 0 foo/bba/arr 'foo/**z'
+pathmatch 1 foo/bar 'foo?bar'
+pathmatch 1 foo/bar 'foo[/]bar'
+
 test_done
diff --git a/test-wildmatch.c b/test-wildmatch.c
index 4bb23b4..a5f4833 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -12,9 +12,11 @@ int main(int argc, char **argv)
 			argv[i] += 3;
 	}
 	if (!strcmp(argv[1], "wildmatch"))
-		return !!wildmatch(argv[3], argv[2], 0, NULL);
+		return !!wildmatch(argv[3], argv[2], WM_PATHNAME, NULL);
 	else if (!strcmp(argv[1], "iwildmatch"))
-		return !!wildmatch(argv[3], argv[2], WM_CASEFOLD, NULL);
+		return !!wildmatch(argv[3], argv[2], WM_PATHNAME | WM_CASEFOLD, NULL);
+	else if (!strcmp(argv[1], "pathmatch"))
+		return !!wildmatch(argv[3], argv[2], 0, NULL);
 	else if (!strcmp(argv[1], "fnmatch"))
 		return !!fnmatch(argv[3], argv[2], FNM_PATHNAME);
 	else
diff --git a/wildmatch.c b/wildmatch.c
index 68e4213..0c8edb8 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -77,14 +77,17 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 			continue;
 		case '?':
 			/* Match anything but '/'. */
-			if (t_ch == '/')
+			if ((flags & WM_PATHNAME) && t_ch == '/')
 				return WM_NOMATCH;
 			continue;
 		case '*':
 			if (*++p == '*') {
 				const uchar *prev_p = p - 2;
 				while (*++p == '*') {}
-				if ((prev_p == text || *prev_p == '/') ||
+				if (!(flags & WM_PATHNAME))
+					/* without WM_PATHNAME, '*' == '**' */
+					match_slash = 1;
+				else if ((prev_p == text || *prev_p == '/') ||
 				    (*p == '\0' || *p == '/' ||
 				     (p[0] == '\\' && p[1] == '/'))) {
 					/*
@@ -103,7 +106,8 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 				} else
 					return WM_ABORT_MALFORMED;
 			} else
-				match_slash = 0;
+				/* without WM_PATHNAME, '*' == '**' */
+				match_slash = flags & WM_PATHNAME ? 0 : 1;
 			if (*p == '\0') {
 				/* Trailing "**" matches everything.  Trailing "*" matches
 				 * only if there are no more slash characters. */
@@ -214,7 +218,8 @@ static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 				} else if (t_ch == p_ch)
 					matched = 1;
 			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
-			if (matched == negated || t_ch == '/')
+			if (matched == negated ||
+			    ((flags & WM_PATHNAME) && t_ch == '/'))
 				return WM_NOMATCH;
 			continue;
 		}
diff --git a/wildmatch.h b/wildmatch.h
index 1c814fd..4090c8f 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -2,6 +2,7 @@
 #define WILDMATCH_H
 
 #define WM_CASEFOLD 1
+#define WM_PATHNAME 2
 
 #define WM_ABORT_MALFORMED 2
 #define WM_NOMATCH 1
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 4/9] wildmatch: make dowild() take arbitrary flags
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 wildmatch.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/wildmatch.c b/wildmatch.c
index f9b6451..68e4213 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -52,7 +52,7 @@ typedef unsigned char uchar;
 #define ISXDIGIT(c) (ISASCII(c) && isxdigit(c))
 
 /* Match pattern "p" against "text" */
-static int dowild(const uchar *p, const uchar *text, int force_lower_case)
+static int dowild(const uchar *p, const uchar *text, unsigned int flags)
 {
 	uchar p_ch;
 
@@ -61,9 +61,9 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 		uchar t_ch, prev_ch;
 		if ((t_ch = *text) == '\0' && p_ch != '*')
 			return WM_ABORT_ALL;
-		if (force_lower_case && ISUPPER(t_ch))
+		if ((flags & WM_CASEFOLD) && ISUPPER(t_ch))
 			t_ch = tolower(t_ch);
-		if (force_lower_case && ISUPPER(p_ch))
+		if ((flags & WM_CASEFOLD) && ISUPPER(p_ch))
 			p_ch = tolower(p_ch);
 		switch (p_ch) {
 		case '\\':
@@ -97,7 +97,7 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 					 * both foo/bar and foo/a/bar.
 					 */
 					if (p[0] == '/' &&
-					    dowild(p + 1, text, force_lower_case) == WM_MATCH)
+					    dowild(p + 1, text, flags) == WM_MATCH)
 						return WM_MATCH;
 					match_slash = 1;
 				} else
@@ -116,7 +116,7 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 			while (1) {
 				if (t_ch == '\0')
 					break;
-				if ((matched = dowild(p, text,  force_lower_case)) != WM_NOMATCH) {
+				if ((matched = dowild(p, text, flags)) != WM_NOMATCH) {
 					if (!match_slash || matched != WM_ABORT_TO_STARSTAR)
 						return matched;
 				} else if (!match_slash && t_ch == '/')
@@ -227,6 +227,5 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 int wildmatch(const char *pattern, const char *text,
 	      unsigned int flags, struct wildopts *wo)
 {
-	return dowild((const uchar*)pattern, (const uchar*)text,
-		      flags & WM_CASEFOLD ? 1 :0);
+	return dowild((const uchar*)pattern, (const uchar*)text, flags);
 }
-- 
1.8.0.rc2.23.g1fb49df

^ permalink raw reply related

* [PATCH v2 3/9] wildmatch: rename constants and update prototype
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

- All exported constants now have a prefix WM_
- Do not rely on FNM_* constants, use the WM_ counterparts
- Remove TRUE and FALSE to follow Git's coding style
- While at it, turn flags type from int to unsigned int
- Add a (not yet used) argument to carry extra information
  so that we don't have to change the prototype again later
  when we need to pass other stuff to wildmatch

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c            |  3 +-
 test-wildmatch.c |  4 +--
 wildmatch.c      | 88 +++++++++++++++++++++++++++-----------------------------
 wildmatch.h      | 22 +++++++++-----
 4 files changed, 62 insertions(+), 55 deletions(-)

diff --git a/dir.c b/dir.c
index cb7328b..175a182 100644
--- a/dir.c
+++ b/dir.c
@@ -595,7 +595,8 @@ int match_pathname(const char *pathname, int pathlen,
 	}
 
 	return wildmatch(pattern, name,
-			 ignore_case ? FNM_CASEFOLD : 0) == 0;
+			 ignore_case ? WM_CASEFOLD : 0,
+			 NULL) == 0;
 }
 
 /* Scan the list and let the last match determine the fate.
diff --git a/test-wildmatch.c b/test-wildmatch.c
index e384c8e..4bb23b4 100644
--- a/test-wildmatch.c
+++ b/test-wildmatch.c
@@ -12,9 +12,9 @@ int main(int argc, char **argv)
 			argv[i] += 3;
 	}
 	if (!strcmp(argv[1], "wildmatch"))
-		return !!wildmatch(argv[3], argv[2], 0);
+		return !!wildmatch(argv[3], argv[2], 0, NULL);
 	else if (!strcmp(argv[1], "iwildmatch"))
-		return !!wildmatch(argv[3], argv[2], FNM_CASEFOLD);
+		return !!wildmatch(argv[3], argv[2], WM_CASEFOLD, NULL);
 	else if (!strcmp(argv[1], "fnmatch"))
 		return !!fnmatch(argv[3], argv[2], FNM_PATHNAME);
 	else
diff --git a/wildmatch.c b/wildmatch.c
index 8a58ad4..f9b6451 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -18,9 +18,6 @@ typedef unsigned char uchar;
 #define NEGATE_CLASS	'!'
 #define NEGATE_CLASS2	'^'
 
-#define FALSE 0
-#define TRUE 1
-
 #define CC_EQ(class, len, litmatch) ((len) == sizeof (litmatch)-1 \
 				    && *(class) == *(litmatch) \
 				    && strncmp((char*)class, litmatch, len) == 0)
@@ -63,7 +60,7 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 		int matched, match_slash, negated;
 		uchar t_ch, prev_ch;
 		if ((t_ch = *text) == '\0' && p_ch != '*')
-			return ABORT_ALL;
+			return WM_ABORT_ALL;
 		if (force_lower_case && ISUPPER(t_ch))
 			t_ch = tolower(t_ch);
 		if (force_lower_case && ISUPPER(p_ch))
@@ -76,12 +73,12 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 			/* FALLTHROUGH */
 		default:
 			if (t_ch != p_ch)
-				return NOMATCH;
+				return WM_NOMATCH;
 			continue;
 		case '?':
 			/* Match anything but '/'. */
 			if (t_ch == '/')
-				return NOMATCH;
+				return WM_NOMATCH;
 			continue;
 		case '*':
 			if (*++p == '*') {
@@ -100,135 +97,136 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 					 * both foo/bar and foo/a/bar.
 					 */
 					if (p[0] == '/' &&
-					    dowild(p + 1, text, force_lower_case) == MATCH)
-						return MATCH;
-					match_slash = TRUE;
+					    dowild(p + 1, text, force_lower_case) == WM_MATCH)
+						return WM_MATCH;
+					match_slash = 1;
 				} else
-					return ABORT_MALFORMED;
+					return WM_ABORT_MALFORMED;
 			} else
-				match_slash = FALSE;
+				match_slash = 0;
 			if (*p == '\0') {
 				/* Trailing "**" matches everything.  Trailing "*" matches
 				 * only if there are no more slash characters. */
 				if (!match_slash) {
 					if (strchr((char*)text, '/') != NULL)
-						return NOMATCH;
+						return WM_NOMATCH;
 				}
-				return MATCH;
+				return WM_MATCH;
 			}
 			while (1) {
 				if (t_ch == '\0')
 					break;
-				if ((matched = dowild(p, text,  force_lower_case)) != NOMATCH) {
-					if (!match_slash || matched != ABORT_TO_STARSTAR)
+				if ((matched = dowild(p, text,  force_lower_case)) != WM_NOMATCH) {
+					if (!match_slash || matched != WM_ABORT_TO_STARSTAR)
 						return matched;
 				} else if (!match_slash && t_ch == '/')
-					return ABORT_TO_STARSTAR;
+					return WM_ABORT_TO_STARSTAR;
 				t_ch = *++text;
 			}
-			return ABORT_ALL;
+			return WM_ABORT_ALL;
 		case '[':
 			p_ch = *++p;
 #ifdef NEGATE_CLASS2
 			if (p_ch == NEGATE_CLASS2)
 				p_ch = NEGATE_CLASS;
 #endif
-			/* Assign literal TRUE/FALSE because of "matched" comparison. */
-			negated = p_ch == NEGATE_CLASS? TRUE : FALSE;
+			/* Assign literal 1/0 because of "matched" comparison. */
+			negated = p_ch == NEGATE_CLASS ? 1 : 0;
 			if (negated) {
 				/* Inverted character class. */
 				p_ch = *++p;
 			}
 			prev_ch = 0;
-			matched = FALSE;
+			matched = 0;
 			do {
 				if (!p_ch)
-					return ABORT_ALL;
+					return WM_ABORT_ALL;
 				if (p_ch == '\\') {
 					p_ch = *++p;
 					if (!p_ch)
-						return ABORT_ALL;
+						return WM_ABORT_ALL;
 					if (t_ch == p_ch)
-						matched = TRUE;
+						matched = 1;
 				} else if (p_ch == '-' && prev_ch && p[1] && p[1] != ']') {
 					p_ch = *++p;
 					if (p_ch == '\\') {
 						p_ch = *++p;
 						if (!p_ch)
-							return ABORT_ALL;
+							return WM_ABORT_ALL;
 					}
 					if (t_ch <= p_ch && t_ch >= prev_ch)
-						matched = TRUE;
+						matched = 1;
 					p_ch = 0; /* This makes "prev_ch" get set to 0. */
 				} else if (p_ch == '[' && p[1] == ':') {
 					const uchar *s;
 					int i;
 					for (s = p += 2; (p_ch = *p) && p_ch != ']'; p++) {} /*SHARED ITERATOR*/
 					if (!p_ch)
-						return ABORT_ALL;
+						return WM_ABORT_ALL;
 					i = p - s - 1;
 					if (i < 0 || p[-1] != ':') {
 						/* Didn't find ":]", so treat like a normal set. */
 						p = s - 2;
 						p_ch = '[';
 						if (t_ch == p_ch)
-							matched = TRUE;
+							matched = 1;
 						continue;
 					}
 					if (CC_EQ(s,i, "alnum")) {
 						if (ISALNUM(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "alpha")) {
 						if (ISALPHA(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "blank")) {
 						if (ISBLANK(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "cntrl")) {
 						if (ISCNTRL(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "digit")) {
 						if (ISDIGIT(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "graph")) {
 						if (ISGRAPH(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "lower")) {
 						if (ISLOWER(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "print")) {
 						if (ISPRINT(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "punct")) {
 						if (ISPUNCT(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "space")) {
 						if (ISSPACE(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "upper")) {
 						if (ISUPPER(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else if (CC_EQ(s,i, "xdigit")) {
 						if (ISXDIGIT(t_ch))
-							matched = TRUE;
+							matched = 1;
 					} else /* malformed [:class:] string */
-						return ABORT_ALL;
+						return WM_ABORT_ALL;
 					p_ch = 0; /* This makes "prev_ch" get set to 0. */
 				} else if (t_ch == p_ch)
-					matched = TRUE;
+					matched = 1;
 			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
 			if (matched == negated || t_ch == '/')
-				return NOMATCH;
+				return WM_NOMATCH;
 			continue;
 		}
 	}
 
-	return *text ? NOMATCH : MATCH;
+	return *text ? WM_NOMATCH : WM_MATCH;
 }
 
 /* Match the "pattern" against the "text" string. */
-int wildmatch(const char *pattern, const char *text, int flags)
+int wildmatch(const char *pattern, const char *text,
+	      unsigned int flags, struct wildopts *wo)
 {
 	return dowild((const uchar*)pattern, (const uchar*)text,
-		      flags & FNM_CASEFOLD ? 1 :0);
+		      flags & WM_CASEFOLD ? 1 :0);
 }
diff --git a/wildmatch.h b/wildmatch.h
index 984a38c..1c814fd 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -1,9 +1,17 @@
-/* wildmatch.h */
+#ifndef WILDMATCH_H
+#define WILDMATCH_H
 
-#define ABORT_MALFORMED 2
-#define NOMATCH 1
-#define MATCH 0
-#define ABORT_ALL -1
-#define ABORT_TO_STARSTAR -2
+#define WM_CASEFOLD 1
 
-int wildmatch(const char *pattern, const char *text, int flags);
+#define WM_ABORT_MALFORMED 2
+#define WM_NOMATCH 1
+#define WM_MATCH 0
+#define WM_ABORT_ALL -1
+#define WM_ABORT_TO_STARSTAR -2
+
+struct wildopts;
+
+int wildmatch(const char *pattern, const char *text,
+	      unsigned int flags,
+	      struct wildopts *wo);
+#endif
-- 
1.8.0.rc2.23.g1fb49df


* [PATCH v2 2/9] wildmatch: replace variable 'special' with better named ones
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>

'special' is too generic a name and is used for two different purposes.
Replace it with 'match_slash', which indicates a "**" pattern, and
'negated', which marks the "[!...]" and "[^...]" character classes.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 wildmatch.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
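
For readers skimming the rename, the two roles that 'special' used to
play can be sketched in isolation. This is only an illustration under
stated assumptions, not code from wildmatch.c: the helper names
trailing_star_matches() and bracket_matches() are hypothetical, and the
bracket helper handles literal members only (no ranges, no [:class:]).

```c
#include <string.h>

/*
 * Hypothetical helper, not part of wildmatch.c.
 * Role 1 of the old 'special': match_slash is 1 for a trailing "**"
 * and 0 for a trailing "*".  A trailing "**" matches everything; a
 * trailing "*" matches only if no slash is left in the text.
 */
static int trailing_star_matches(int match_slash, const char *text)
{
	if (!match_slash && strchr(text, '/') != NULL)
		return 0;
	return 1;
}

/*
 * Hypothetical helper, not part of wildmatch.c.
 * Role 2 of the old 'special': negated is 1 for "[!...]" and "[^...]".
 * Simplified to literal members only.
 */
static int bracket_matches(const char *p, char ch)
{
	int matched = 0, negated;

	if (*p != '[')
		return 0;
	p++;
	negated = (*p == '!' || *p == '^') ? 1 : 0;
	if (negated)
		p++;
	for (; *p != '\0' && *p != ']'; p++)
		if (*p == ch)
			matched = 1;
	if (*p != ']')
		return 0; /* unterminated class, treated as no match here */
	/* Same test as in dowild(): equal values mean failure, and a
	 * slash is never matched by a character class. */
	if (matched == negated || ch == '/')
		return 0;
	return 1;
}
```

The "matched == negated" comparison is why both variables have to hold
exactly 0 or 1: any other truthy value would make the comparison wrong.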

diff --git a/wildmatch.c b/wildmatch.c
index 3972e26..8a58ad4 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -60,7 +60,7 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 	uchar p_ch;
 
 	for ( ; (p_ch = *p) != '\0'; text++, p++) {
-		int matched, special;
+		int matched, match_slash, negated;
 		uchar t_ch, prev_ch;
 		if ((t_ch = *text) == '\0' && p_ch != '*')
 			return ABORT_ALL;
@@ -102,15 +102,15 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 					if (p[0] == '/' &&
 					    dowild(p + 1, text, force_lower_case) == MATCH)
 						return MATCH;
-					special = TRUE;
+					match_slash = TRUE;
 				} else
 					return ABORT_MALFORMED;
 			} else
-				special = FALSE;
+				match_slash = FALSE;
 			if (*p == '\0') {
 				/* Trailing "**" matches everything.  Trailing "*" matches
 				 * only if there are no more slash characters. */
-				if (!special) {
+				if (!match_slash) {
 					if (strchr((char*)text, '/') != NULL)
 						return NOMATCH;
 				}
@@ -120,9 +120,9 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 				if (t_ch == '\0')
 					break;
 				if ((matched = dowild(p, text,  force_lower_case)) != NOMATCH) {
-					if (!special || matched != ABORT_TO_STARSTAR)
+					if (!match_slash || matched != ABORT_TO_STARSTAR)
 						return matched;
-				} else if (!special && t_ch == '/')
+				} else if (!match_slash && t_ch == '/')
 					return ABORT_TO_STARSTAR;
 				t_ch = *++text;
 			}
@@ -134,8 +134,8 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 				p_ch = NEGATE_CLASS;
 #endif
 			/* Assign literal TRUE/FALSE because of "matched" comparison. */
-			special = p_ch == NEGATE_CLASS? TRUE : FALSE;
-			if (special) {
+			negated = p_ch == NEGATE_CLASS? TRUE : FALSE;
+			if (negated) {
 				/* Inverted character class. */
 				p_ch = *++p;
 			}
@@ -217,7 +217,7 @@ static int dowild(const uchar *p, const uchar *text, int force_lower_case)
 				} else if (t_ch == p_ch)
 					matched = TRUE;
 			} while (prev_ch = p_ch, (p_ch = *++p) != ']');
-			if (matched == special || t_ch == '/')
+			if (matched == negated || t_ch == '/')
 				return NOMATCH;
 			continue;
 		}
-- 
1.8.0.rc2.23.g1fb49df


* [PATCH v2 1/9] compat/fnmatch: respect NO_FNMATCH* even on glibc
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy
In-Reply-To: <1356667854-8686-1-git-send-email-pclouds@gmail.com>


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 compat/fnmatch/fnmatch.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/compat/fnmatch/fnmatch.c b/compat/fnmatch/fnmatch.c
index 9473aed..6f7387d 100644
--- a/compat/fnmatch/fnmatch.c
+++ b/compat/fnmatch/fnmatch.c
@@ -55,7 +55,8 @@
    program understand `configure --with-gnu-libc' and omit the object files,
    it is simpler to just do this in the source for each such file.  */
 
-#if defined _LIBC || !defined __GNU_LIBRARY__
+#if defined NO_FNMATCH || defined NO_FNMATCH_CASEFOLD || \
+    defined _LIBC || !defined __GNU_LIBRARY__
 
 
 # if defined STDC_HEADERS || !defined isascii
-- 
1.8.0.rc2.23.g1fb49df


* [PATCH v2 0/9] fnmatch replacement step 1
From: Nguyễn Thái Ngọc Duy @ 2012-12-28  4:10 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

Changes in v2 (nothing big):

 - the 'special' variable in dowild() is replaced by two better-named
   ones, 'match_slash' and 'negated'
 - TRUE/FALSE are fixed in comments as well as in code in the rename patch
 - some tests are added for the "*/" and "*<literal>" optimizations
 - the USE_WILDMATCH patch is moved to the end of the series

Nguyễn Thái Ngọc Duy (9):
  compat/fnmatch: respect NO_FNMATCH* even on glibc
  wildmatch: replace variable 'special' with better named ones
  wildmatch: rename constants and update prototype
  wildmatch: make dowild() take arbitrary flags
  wildmatch: support "no FNM_PATHNAME" mode
  test-wildmatch: add "perf" command to compare wildmatch and fnmatch
  wildmatch: make a special case for "*/" with FNM_PATHNAME
  wildmatch: advance faster in <asterisk> + <literal> patterns
  Makefile: add USE_WILDMATCH to use wildmatch as fnmatch

 Makefile                 |   6 ++
 compat/fnmatch/fnmatch.c |   3 +-
 dir.c                    |   3 +-
 git-compat-util.h        |  13 +++++
 t/t3070-wildmatch.sh     |  41 +++++++++++++
 test-wildmatch.c         |  82 +++++++++++++++++++++++++-
 wildmatch.c              | 147 +++++++++++++++++++++++++++++------------------
 wildmatch.h              |  23 +++++---
 8 files changed, 251 insertions(+), 67 deletions(-)

-- 
1.8.0.rc2.23.g1fb49df


