* Re: Files with colons under Cygwin
2008-10-04 23:39 ` Dmitry Potapov
@ 2008-10-05 9:04 ` Alex Riesen
2008-10-05 9:14 ` Alex Riesen
` (3 subsequent siblings)
4 siblings, 0 replies; 19+ messages in thread
From: Alex Riesen @ 2008-10-05 9:04 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Giovanni Funchal, git, Shawn O. Pearce
2008/10/5 Dmitry Potapov <dpotapov@gmail.com>:
> On Thu, Oct 02, 2008 at 04:02:23PM +0200, Giovanni Funchal wrote:
>>
>> Cygwin does not allow files with colons, I think this is Windows stuff
>> one just can't avoid.
>
> At least, you cannot use colon in Win32 API. They say Windows "native"
> API has less restrictions over what symbols are not allowed in file
> names, but I guess it is still not allowed.
The part after colon in the file name specifies the identifier of an
alternative data stream (so there can be multiple data sets under one name).
Just another microsoft stupidity no one uses and knows about.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-04 23:39 ` Dmitry Potapov
2008-10-05 9:04 ` Alex Riesen
@ 2008-10-05 9:14 ` Alex Riesen
2008-10-05 19:51 ` Dmitry Potapov
2008-10-05 9:28 ` Giovanni Funchal
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Alex Riesen @ 2008-10-05 9:14 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Giovanni Funchal, git, Shawn O. Pearce
2008/10/5 Dmitry Potapov <dpotapov@gmail.com>:
> So, here is a patch. It basically disallow backslashes and colons in
> file names on Windows (whether it is MinGW or Cygwin).
With this and sparse checkout patch combined it maybe possible
to make Git work on these backward filesystems in a saner way:
just never checkout the names which the filesystems cannot support
on disk and mark them correspondingly in the index. If at all needed,
the user can fallback to git-show and git-update-index to see the data
and update them in repo. Assumption is that the cases are rare...
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-05 9:14 ` Alex Riesen
@ 2008-10-05 19:51 ` Dmitry Potapov
0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-10-05 19:51 UTC (permalink / raw)
To: Alex Riesen; +Cc: Giovanni Funchal, git, Shawn O. Pearce
On Sun, Oct 05, 2008 at 11:14:00AM +0200, Alex Riesen wrote:
> 2008/10/5 Dmitry Potapov <dpotapov@gmail.com>:
> > So, here is a patch. It basically disallow backslashes and colons in
> > file names on Windows (whether it is MinGW or Cygwin).
>
> With this and sparse checkout patch combined it maybe possible
> to make Git work on these backward filesystems in a saner way:
> just never checkout the names which the filesystems cannot support
> on disk and mark them correspondingly in the index.
Perhaps, using sparse checkout is a good idea for dealing with prohibit
characters, but the goal of my patch was a bit different -- to close a
security hole in checkout when files outside of the working directory
can be overwritten. In fact, the whole point of having verify_path() is
to prevent this from happening, and if it does not work properly on
Windows then this function should be corrected.
Dmitry
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-04 23:39 ` Dmitry Potapov
2008-10-05 9:04 ` Alex Riesen
2008-10-05 9:14 ` Alex Riesen
@ 2008-10-05 9:28 ` Giovanni Funchal
2008-10-06 6:54 ` Johannes Sixt
2008-10-07 2:05 ` Joshua Juran
4 siblings, 0 replies; 19+ messages in thread
From: Giovanni Funchal @ 2008-10-05 9:28 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: git, Shawn O. Pearce
> Strange... What version of Cygwin did you use? When I tried this with
> Cygwin 1.5.25, I got the following error:
>
> error: git checkout-index: unable to create file a:b (No medium found)
I'm on 1.5.25-15 on WinXP over a mounted network file system, but no
errors/warnings here...
Thanks for the clarifications and the patch,
-- Giovanni
On Sun, Oct 5, 2008 at 1:39 AM, Dmitry Potapov <dpotapov@gmail.com> wrote:
> On Thu, Oct 02, 2008 at 04:02:23PM +0200, Giovanni Funchal wrote:
>>
>> Cygwin does not allow files with colons, I think this is Windows stuff
>> one just can't avoid.
>
> At least, you cannot use colon in Win32 API. They say Windows "native"
> API has less restrictions over what symbols are not allowed in file
> names, but I guess it is still not allowed.
>
>> If you have files with colons in a git
>> repository and try pulling them on cygwin, the file is empty, its name
>> is truncated and the status is wrong.
>>
>> linux $ date > a:b
>> linux $ git init
>> linux $ git add a:b
>> linux $ git commit -m test
>> linux $ git push
>> cygwin $ git pull
>
> Strange... What version of Cygwin did you use? When I tried this with
> Cygwin 1.5.25, I got the following error:
>
> error: git checkout-index: unable to create file a:b (No medium found)
>
> Apparently, Git tried to create 'b' file on the drive 'a', and creating
> files outside of the working tree is not a very good thing to do from
> the security point of view, as it can easily overwrite anything in
> c:/windows/.
>
> So, here is a patch. It basically disallow backslashes and colons in
> file names on Windows (whether it is MinGW or Cygwin).
>
> I wonder if the problem exists on Mac OS X too. From what I heard, it
> does not treat ':' as a normal symbol. But I have no access to Mac OS X,
> so here is a patch for Windows only.
>
> -- >8 --
> From: Dmitry Potapov <dpotapov@gmail.com>
> Date: Sat, 4 Oct 2008 22:57:19 +0400
> Subject: [PATCH] correct verify_path for Windows
>
> Colon and backslash in names may be used on Windows to overwrite files
> outside of the working directory.
>
> Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
> ---
> read-cache.c | 10 ++++++++++
> 1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index 901064b..972592e 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -701,6 +701,16 @@ inside:
> }
> return 0;
> }
> +#if defined(_WIN32) || defined(__CYGWIN__)
> + /*
> + * There is a bunch of other characters that are not allowed
> + * in Win32 API, but the following two create a security hole
> + * by allowing to overwrite files outside of the working tree,
> + * therefore they are explicitly prohibited.
> + */
> + else if (c == ':' || c == '\\')
> + return 0;
> +#endif
> c = *path++;
> }
> }
> --
> 1.6.0.2.445.g1198
>
> -- >8 --
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-04 23:39 ` Dmitry Potapov
` (2 preceding siblings ...)
2008-10-05 9:28 ` Giovanni Funchal
@ 2008-10-06 6:54 ` Johannes Sixt
2008-10-07 0:53 ` Dmitry Potapov
2008-10-07 2:05 ` Joshua Juran
4 siblings, 1 reply; 19+ messages in thread
From: Johannes Sixt @ 2008-10-06 6:54 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Giovanni Funchal, git, Shawn O. Pearce
Dmitry Potapov schrieb:
> Subject: [PATCH] correct verify_path for Windows
>
> Colon and backslash in names may be used on Windows to overwrite files
> outside of the working directory.
>
> Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
> ---
> read-cache.c | 10 ++++++++++
> 1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/read-cache.c b/read-cache.c
> index 901064b..972592e 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -701,6 +701,16 @@ inside:
> }
> return 0;
> }
> +#if defined(_WIN32) || defined(__CYGWIN__)
> + /*
> + * There is a bunch of other characters that are not allowed
> + * in Win32 API, but the following two create a security hole
> + * by allowing to overwrite files outside of the working tree,
> + * therefore they are explicitly prohibited.
> + */
> + else if (c == ':' || c == '\\')
> + return 0;
> +#endif
> c = *path++;
> }
> }
IIUC, verify_path() checks paths that were found in the database or the
index. As such, it checks for the integrity of the database. And paths
with backslashes or colons certainly do not violate the database integrity.
More precisely, the exchange of path names between the index and tree
objects (both directions) should not do this new check, nor if a path is
added to the index. The check is only meaningful[*] when a path is read
from the index or a tree object and "applied" to the working directory.
Unfortunately, I think there are lots of places where this happens.
[*] I say "meaningful" and not "necessary" because the situation is just
like when you grab some random SoftwarePackage.tar.gz, and run ./configure
without looking first what it is going to do.
-- Hannes
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-06 6:54 ` Johannes Sixt
@ 2008-10-07 0:53 ` Dmitry Potapov
2008-10-07 6:13 ` Johannes Sixt
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Potapov @ 2008-10-07 0:53 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Giovanni Funchal, git, Shawn O. Pearce
On Mon, Oct 06, 2008 at 08:54:44AM +0200, Johannes Sixt wrote:
>
> IIUC, verify_path() checks paths that were found in the database or the
> index.
It also checks paths that was given to git add and other commands to
prevent an invalid path to enter to the index. If I am not mistaken,
if invalid path has entered in the index then it will be committed to
the database without any further checks.
> As such, it checks for the integrity of the database. And paths
> with backslashes or colons certainly do not violate the database integrity.
No, it has nothing to do with the database. You can run git fsck --full
on a repository that contains '..' or '.' or '.git', and there will be
no error. Having those names does not violate the database integrity,
as the database is concerned all names are just bytes separated by '/',
so having name '.' is not a problem for it. However, names '.' and '..'
have special meaning for the filesystem, and paths starting with .git/
have special meaning for Git repository. If you work in a bare repo
then those names are not a problem, but once you have a working tree,
you want make sure that there is nothing wrong with it.
>
> More precisely, the exchange of path names between the index and tree
> objects (both directions) should not do this new check, nor if a path is
> added to the index. The check is only meaningful[*] when a path is read
> from the index or a tree object and "applied" to the working directory.
> Unfortunately, I think there are lots of places where this happens.
>
> [*] I say "meaningful" and not "necessary" because the situation is just
> like when you grab some random SoftwarePackage.tar.gz, and run ./configure
> without looking first what it is going to do.
When I grab any tar, I can look at its context without myself of any
risk that some files can be overwritten on my file system. And when
I want to look at some remote git repository, I usually do:
git clone URL
If it can overwrite some files behind my back, it is security a hole.
BTW, it was the reason why the idea of allowing .gitconfig to be stored
in git repository (similar to .gitignore) was stroke down about a year ago
though it could help some clueless users...
On Linux (or other sane file systems), we have all required checks to
prevent that from happening, and they are places in verify_path, which
prevents malicious names entering into the index and thus to the file
system too. So, we should do all required checks on Windows too.
Now, I've realized that my checks were not sufficient (due to Windows
being case-insensitive), so I will add more checks and resend this
patch later.
Dmitry
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-07 0:53 ` Dmitry Potapov
@ 2008-10-07 6:13 ` Johannes Sixt
0 siblings, 0 replies; 19+ messages in thread
From: Johannes Sixt @ 2008-10-07 6:13 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Giovanni Funchal, git, Shawn O. Pearce
Dmitry Potapov schrieb:
> On Mon, Oct 06, 2008 at 08:54:44AM +0200, Johannes Sixt wrote:
>> [*] I say "meaningful" and not "necessary" because the situation is just
>> like when you grab some random SoftwarePackage.tar.gz, and run ./configure
>> without looking first what it is going to do.
>
> When I grab any tar, I can look at its context without myself of any
> risk that some files can be overwritten on my file system. And when
> I want to look at some remote git repository, I usually do:
>
> git clone URL
>
> If it can overwrite some files behind my back, it is security a hole.
Fair enough.
> On Linux (or other sane file systems), we have all required checks to
> prevent that from happening, and they are places in verify_path, which
> prevents malicious names entering into the index and thus to the file
> system too. So, we should do all required checks on Windows too.
I don't object the intention of your patch. But I cannot judge whether
verify_path() is the correct location to put the checks because I don't
know this part of the code. I leave the final word to others.
-- Hannes
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Files with colons under Cygwin
2008-10-04 23:39 ` Dmitry Potapov
` (3 preceding siblings ...)
2008-10-06 6:54 ` Johannes Sixt
@ 2008-10-07 2:05 ` Joshua Juran
2008-10-07 3:26 ` [PATCH v2] correct verify_path for Windows Dmitry Potapov
4 siblings, 1 reply; 19+ messages in thread
From: Joshua Juran @ 2008-10-07 2:05 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Giovanni Funchal, git, Shawn O. Pearce
On Oct 4, 2008, at 4:39 PM, Dmitry Potapov wrote:
> On Thu, Oct 02, 2008 at 04:02:23PM +0200, Giovanni Funchal wrote:
>>
>> Cygwin does not allow files with colons, I think this is Windows
>> stuff
>> one just can't avoid.
>
> I wonder if the problem exists on Mac OS X too. From what I heard, it
> does not treat ':' as a normal symbol.
The short answer is that Mac OS X's POSIX implementation works as
expected (internally replacing ':' with '/') without issue.
Furthermore, my POSIX-like environment Lamp (Lamp ain't Mac POSIX)
that runs on classic Mac OS (much in the manner of Cygwin) provides
the same Unix filing behavior, so Mac filing syntax isn't an issue in
running git on Mac OS 9. (I'm sure everybody's breathing a HUGE sigh
of relief at this news...)
Josh
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] correct verify_path for Windows
2008-10-07 2:05 ` Joshua Juran
@ 2008-10-07 3:26 ` Dmitry Potapov
2008-10-07 6:18 ` Johannes Sixt
2008-10-07 6:25 ` Alex Riesen
0 siblings, 2 replies; 19+ messages in thread
From: Dmitry Potapov @ 2008-10-07 3:26 UTC (permalink / raw)
To: Joshua Juran; +Cc: Giovanni Funchal, git, Shawn O. Pearce, Johannes Sixt
Colon and backslash in names may be used on Windows to overwrite files
outside of the working directory. Due to the file-system being case-
insensitive, .git can be written as any combination of upper and lower
characters, so we should check that too.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
In this version, I have added the check that files in .git/ will not
be overwritten by checkout. Overwriting such files as .git/config is
potentially exploitable.
Josh,
Does OS X need the same check below? I believe it has case-insensitive
filesystem, so it needs that too, but I am not sure what is the right
define should be used.
Thanks,
Dmitry
read-cache.c | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/read-cache.c b/read-cache.c
index aff6390..7f855ee 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -668,10 +668,19 @@ static int verify_dotfile(const char *rest)
* shares the path end test with the ".." case.
*/
case 'g':
+#if defined(_WIN32) || defined(__CYGWIN__)
+ /* On Windows, file names are case-insensitive */
+ case 'G':
+ if ((rest[1]|0x20) != 'i')
+ break;
+ if ((rest[2]|0x20) != 't')
+ break;
+#else
if (rest[1] != 'i')
break;
if (rest[2] != 't')
break;
+#endif
rest += 2;
/* fallthrough */
case '.':
@@ -703,6 +712,16 @@ inside:
}
return 0;
}
+#if defined(_WIN32) || defined(__CYGWIN__)
+ /*
+ * There is a bunch of other characters that are not allowed
+ * in Win32 API, but the following two create a security hole
+ * by allowing to overwrite files outside of the working tree,
+ * therefore they are explicitly prohibited.
+ */
+ else if (c == ':' || c == '\\')
+ return 0;
+#endif
c = *path++;
}
}
--
1.6.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-07 3:26 ` [PATCH v2] correct verify_path for Windows Dmitry Potapov
@ 2008-10-07 6:18 ` Johannes Sixt
2008-10-11 16:33 ` Dmitry Potapov
2008-10-07 6:25 ` Alex Riesen
1 sibling, 1 reply; 19+ messages in thread
From: Johannes Sixt @ 2008-10-07 6:18 UTC (permalink / raw)
To: Dmitry Potapov; +Cc: Joshua Juran, Giovanni Funchal, git, Shawn O. Pearce
Dmitry Potapov schrieb:
> +#if defined(_WIN32) || defined(__CYGWIN__)
I think that for consistency you should use __MINGW32__ instead of _WIN32.
> + /* On Windows, file names are case-insensitive */
> + case 'G':
> + if ((rest[1]|0x20) != 'i')
> + break;
> + if ((rest[2]|0x20) != 't')
> + break;
We have tolower().
-- Hannes
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-07 6:18 ` Johannes Sixt
@ 2008-10-11 16:33 ` Dmitry Potapov
2008-10-11 22:58 ` Alex Riesen
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Potapov @ 2008-10-11 16:33 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Joshua Juran, Giovanni Funchal, git, Shawn O. Pearce
On Tue, Oct 07, 2008 at 08:18:11AM +0200, Johannes Sixt wrote:
> Dmitry Potapov schrieb:
> > +#if defined(_WIN32) || defined(__CYGWIN__)
>
> I think that for consistency you should use __MINGW32__ instead of _WIN32.
I like Alex's suggestion more to use FILESYSTEM_CASEINSENSITIVE.
>
> > + /* On Windows, file names are case-insensitive */
> > + case 'G':
> > + if ((rest[1]|0x20) != 'i')
> > + break;
> > + if ((rest[2]|0x20) != 't')
> > + break;
>
> We have tolower().
I am aware of that, but I am not sure what we gain by using it. It seems
it makes only code bigger and slow. As to readability, I don't see much
improvement... Isn't obvious what this code does, especially with the
above comment?
Dmitry
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-11 16:33 ` Dmitry Potapov
@ 2008-10-11 22:58 ` Alex Riesen
2008-10-12 13:50 ` Dmitry Potapov
0 siblings, 1 reply; 19+ messages in thread
From: Alex Riesen @ 2008-10-11 22:58 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Johannes Sixt, Joshua Juran, Giovanni Funchal, git,
Shawn O. Pearce
2008/10/11 Dmitry Potapov <dpotapov@gmail.com>:
>> > + /* On Windows, file names are case-insensitive */
>> > + case 'G':
>> > + if ((rest[1]|0x20) != 'i')
>> > + break;
>> > + if ((rest[2]|0x20) != 't')
>> > + break;
>>
>> We have tolower().
>
> I am aware of that, but I am not sure what we gain by using it. It seems
> it makes only code bigger and slow.
It does? Care to look into git-compat-util.h?
> ... As to readability, I don't see much
> improvement... Isn't obvious what this code does, especially with the
> above comment?
You want to seriously argue that "a | 0x20" is as readable as "tolower(a)"?
For the years to come? With a person who does not even know what ASCII is?
Ok, I'm exaggerating. But the point is: it is not us who will be
reading the code.
And even if they do this just to remove Windows quirks it is well worth to
use a bit more of english language so that they don't need a second look.
As to comment: it is just additional info. It can't be checked by compiler
if you make and accidental typo in your code (like, for example, accidentally
putting an extra pipe in that expression, should happen to that emacs users
from time to time).
BTW, is it such a critical path? Can't the code be unified and do
without #ifdef?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-11 22:58 ` Alex Riesen
@ 2008-10-12 13:50 ` Dmitry Potapov
2008-10-12 18:18 ` Alex Riesen
0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Potapov @ 2008-10-12 13:50 UTC (permalink / raw)
To: Alex Riesen
Cc: Johannes Sixt, Joshua Juran, Giovanni Funchal, git,
Shawn O. Pearce
On Sun, Oct 12, 2008 at 12:58:52AM +0200, Alex Riesen wrote:
> 2008/10/11 Dmitry Potapov <dpotapov@gmail.com>:
> >> > + /* On Windows, file names are case-insensitive */
> >> > + case 'G':
> >> > + if ((rest[1]|0x20) != 'i')
> >> > + break;
> >> > + if ((rest[2]|0x20) != 't')
> >> > + break;
> >>
> >> We have tolower().
> >
> > I am aware of that, but I am not sure what we gain by using it. It seems
> > it makes only code bigger and slow.
>
> It does? Care to look into git-compat-util.h?
As a matter of fact, I did, and I see the following:
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
#define tolower(x) sane_case((unsigned char)(x), 0x20)
static inline int sane_case(int x, int high)
{
if (sane_istest(x, GIT_ALPHA))
x = (x & ~0x20) | high;
return x;
}
So, it looks like an extra look up and an extra comparison here.
>
> > ... As to readability, I don't see much
> > improvement... Isn't obvious what this code does, especially with the
> > above comment?
>
> You want to seriously argue that "a | 0x20" is as readable as "tolower(a)"?
> For the years to come? With a person who does not even know what ASCII is?
> Ok, I'm exaggerating. But the point is: it is not us who will be
> reading the code.
Obviously, for a person who don't know what ASCII is, tolower() will be
much easier to understand, but the question is what I can reasonable to
expect for a person reading this code later. A similar argument can be
made about adding extra parenthesis, i.e. instead of writing
if (a == b || c == d)
you should always write
if ((a == b) || (c == d))
because some people do not remember the priority of each operator.
(And I have seen such programmers who claim to have many experience of
writing in C, yet, they do not remember operator priority.)
For me, using tolower() does not make it more readable, but maybe I am
too old-fashion assuming that people are supposed to know at least basic
things about ASCII.
> BTW, is it such a critical path?
I am not sure whether it is critical or not. It is called for each
name in path. So, if you have a long path, it may be called quite a
few times per a single path. Also, some operation such 'git add' can
call verify_path() more than once (IIRC, it was called thrice per each
added file). But I have no numbers to tell whether it is noticeable or
not.
> Can't the code be unified and do without #ifdef?
It will impose a extra restriction on what file names people can use,
and I don't like extra restrictions for those who use sane file systems.
Dmitry
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-12 13:50 ` Dmitry Potapov
@ 2008-10-12 18:18 ` Alex Riesen
2008-10-13 6:00 ` Johannes Sixt
0 siblings, 1 reply; 19+ messages in thread
From: Alex Riesen @ 2008-10-12 18:18 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Johannes Sixt, Joshua Juran, Giovanni Funchal, git,
Shawn O. Pearce
Dmitry Potapov, Sun, Oct 12, 2008 15:50:48 +0200:
> On Sun, Oct 12, 2008 at 12:58:52AM +0200, Alex Riesen wrote:
> > 2008/10/11 Dmitry Potapov <dpotapov@gmail.com>:
> > >> > + /* On Windows, file names are case-insensitive */
> > >> > + case 'G':
> > >> > + if ((rest[1]|0x20) != 'i')
> > >> > + break;
> > >> > + if ((rest[2]|0x20) != 't')
> > >> > + break;
> > >>
> > >> We have tolower().
> > >
> > > I am aware of that, but I am not sure what we gain by using it. It seems
> > > it makes only code bigger and slow.
> >
> > It does? Care to look into git-compat-util.h?
>
> As a matter of fact, I did, and I see the following:
>
> #define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
> #define tolower(x) sane_case((unsigned char)(x), 0x20)
>
> static inline int sane_case(int x, int high)
> {
> if (sane_istest(x, GIT_ALPHA))
> x = (x & ~0x20) | high;
> return x;
> }
>
> So, it looks like an extra look up and an extra comparison here.
Does not look like much more code. But:
> > BTW, is it such a critical path?
>
> I am not sure whether it is critical or not. It is called for each
> name in path. So, if you have a long path, it may be called quite a
> few times per a single path. Also, some operation such 'git add' can
> call verify_path() more than once (IIRC, it was called thrice per each
> added file). But I have no numbers to tell whether it is noticeable or
> not.
I looked at the callers (briefly). Performance could be a problem: add
and checkout can work with real big file lists and long pathnames.
So ok, than. It is critical.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-12 18:18 ` Alex Riesen
@ 2008-10-13 6:00 ` Johannes Sixt
2008-10-13 6:18 ` Alex Riesen
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Sixt @ 2008-10-13 6:00 UTC (permalink / raw)
To: Alex Riesen
Cc: Dmitry Potapov, Joshua Juran, Giovanni Funchal, git,
Shawn O. Pearce
Alex Riesen schrieb:
> I looked at the callers (briefly). Performance could be a problem: add
> and checkout can work with real big file lists and long pathnames.
> So ok, than. It is critical.
You are kidding, aren't you? What you win by a few CPU instructions here
is dwarfed by the time that the stat() implementation requires. Dmitry,
please use the more readable tolower().
-- Hannes
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-13 6:00 ` Johannes Sixt
@ 2008-10-13 6:18 ` Alex Riesen
0 siblings, 0 replies; 19+ messages in thread
From: Alex Riesen @ 2008-10-13 6:18 UTC (permalink / raw)
To: Johannes Sixt
Cc: Dmitry Potapov, Joshua Juran, Giovanni Funchal, git,
Shawn O. Pearce
2008/10/13 Johannes Sixt <j.sixt@viscovery.net>:
> Alex Riesen schrieb:
>> I looked at the callers (briefly). Performance could be a problem: add
>> and checkout can work with real big file lists and long pathnames.
>> So ok, than. It is critical.
>
> You are kidding, aren't you? What you win by a few CPU instructions here
> is dwarfed by the time that the stat() implementation requires. Dmitry,
> please use the more readable tolower().
It is called for every element in a path, not just for every filename.
And maybe it is dwarfed, but they both are part of the same operation.
And this code makes it more CPU intensive.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] correct verify_path for Windows
2008-10-07 3:26 ` [PATCH v2] correct verify_path for Windows Dmitry Potapov
2008-10-07 6:18 ` Johannes Sixt
@ 2008-10-07 6:25 ` Alex Riesen
1 sibling, 0 replies; 19+ messages in thread
From: Alex Riesen @ 2008-10-07 6:25 UTC (permalink / raw)
To: Dmitry Potapov
Cc: Joshua Juran, Giovanni Funchal, git, Shawn O. Pearce,
Johannes Sixt
2008/10/7 Dmitry Potapov <dpotapov@gmail.com>:
> +#if defined(_WIN32) || defined(__CYGWIN__)
> + /* On Windows, file names are case-insensitive */
> + case 'G':
> + if ((rest[1]|0x20) != 'i')
> + break;
> + if ((rest[2]|0x20) != 't')
> + break;
> +#else
Maybe it is already time for FILESYSTEM_CASEINSENSITIVE?
^ permalink raw reply [flat|nested] 19+ messages in thread