git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 4/4] cygwin: Use native Win32 API for stat
@ 2008-09-27  8:43 Dmitry Potapov
       [not found] ` <347507080809270851y79764dbcgba1ef5a1d58bdd3e@mail.gmail.com>
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Dmitry Potapov @ 2008-09-27  8:43 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Junio C Hamano, Shawn O. Pearce, Alex Riesen, Dmitry Potapov,
	Johannes Sixt, Marcus Griep

lstat/stat functions in Cygwin are very slow, because they try to emulate
some *nix things that Git does not actually need. This patch adds Win32
specific implementation of these functions for Cygwin.

This implementation handles most situation directly but in some rare cases
it falls back on the implementation provided for Cygwin. This is necessary
for two reasons:

- Cygwin has its own file hierarchy, so absolute paths used in Cygwin is
  not suitable to be used Win32 API. cygwin_conv_to_win32_path can not be
  used because it automatically dereference Cygwin symbol links, also it
  causes extra syscall. Fortunately Git rarely use absolute paths, so we
  always use Cygwin implementation for absolute paths.

- Support of symbol links. Cygwin stores symbol links as ordinary using
  one of two possible formats. Therefore, the fast implementation falls
  back to Cygwin functions if it detects potential use of symbol links.

The speed of this implementation should be the same as mingw_lstat for
common cases, but it is considerable slower when the specified file name
does not exist.

Despite all efforts to make the fast implementation as robust as possible,
it may not work well for some very rare situations. I am aware only one
situation: use Cygwin mount to bind unrelated paths inside repository
together.  Therefore, the core.cygwinnativestat configuration option is
provided, which controls whether native or Cygwin version of stat is used.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---
 Documentation/config.txt |    9 +++
 Makefile                 |    4 ++
 compat/cygwin.c          |  125 ++++++++++++++++++++++++++++++++++++++++++++++
 compat/cygwin.h          |    9 +++
 git-compat-util.h        |    1 +
 5 files changed, 148 insertions(+), 0 deletions(-)
 create mode 100644 compat/cygwin.c
 create mode 100644 compat/cygwin.h

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bea867d..c198bc0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -117,6 +117,15 @@ core.fileMode::
 	the working copy are ignored; useful on broken filesystems like FAT.
 	See linkgit:git-update-index[1]. True by default.
 
+core.cygwinNativeStat::
+	This option is only used by Cygwin implementation of Git. If false,
+	the Cygwin stat() and lstat() functions are used. This may be useful
+	if your repository consists of a few separate directories joined in
+	one hierarchy using Cygwin mount. If true, Git uses native Win32 API
+	whenever it is possible and falls back to Cygwin functions only to
+	handle symbol links. The native mode is more than twice faster than
+	normal Cygwin l/stat() functions. True by default.
+
 core.trustctime::
 	If false, the ctime differences between the index and the
 	working copy are ignored; useful when the inode change time
diff --git a/Makefile b/Makefile
index 3c0664a..0708390 100644
--- a/Makefile
+++ b/Makefile
@@ -347,6 +347,7 @@ LIB_H += cache.h
 LIB_H += cache-tree.h
 LIB_H += commit.h
 LIB_H += compat/mingw.h
+LIB_H += compat/cygwin.h
 LIB_H += csum-file.h
 LIB_H += decorate.h
 LIB_H += delta.h
@@ -747,6 +748,9 @@ ifeq ($(uname_S),HP-UX)
 	NO_SYS_SELECT_H = YesPlease
 	SNPRINTF_RETURNS_BOGUS = YesPlease
 endif
+ifneq (,$(findstring CYGWIN,$(uname_S)))
+	COMPAT_OBJS += compat/cygwin.o
+endif
 ifneq (,$(findstring MINGW,$(uname_S)))
 	NO_MMAP = YesPlease
 	NO_PREAD = YesPlease
diff --git a/compat/cygwin.c b/compat/cygwin.c
new file mode 100644
index 0000000..ad09b17
--- /dev/null
+++ b/compat/cygwin.c
@@ -0,0 +1,125 @@
+#define WIN32_LEAN_AND_MEAN
+#include "../git-compat-util.h"
+#include "win32.h"
+#include "../cache.h" /* to read configuration */
+
+static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts)
+{
+	long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime;
+	winTime -= 116444736000000000LL; /* Windows to Unix Epoch conversion */
+	ts->tv_sec = (time_t)(winTime/10000000); /* 100-nanosecond interval to seconds */
+	ts->tv_nsec = (long)(winTime - ts->tv_sec*10000000LL) * 100; /* nanoseconds */
+}
+
+#define size_to_blocks(s) (((s)+511)/512)
+
+/* do_stat is a common implementation for cygwin_lstat and cygwin_stat.
+ *
+ * To simplify its logic, in the case of cygwin symlinks, this implementation
+ * falls back to the cygwin version of stat/lstat, which is provided as the
+ * last argument.
+ */
+static int do_stat(const char *file_name, struct stat *buf, stat_fn_t cygstat)
+{
+	WIN32_FILE_ATTRIBUTE_DATA fdata;
+
+	if (file_name[0] == '/')
+		return cygstat (file_name, buf);
+
+	if (!(errno = get_file_attr(file_name, &fdata))) {
+		/*
+		 * If the system attribute is set and it is not a directory then
+		 * it could be a symbol link created in the nowinsymlinks mode.
+		 * Normally, Cygwin works in the winsymlinks mode, so this situation
+		 * is very unlikely. For the sake of simplicity of our code, let's
+		 * Cygwin to handle it.
+		 */
+		if ((fdata.dwFileAttributes & FILE_ATTRIBUTE_SYSTEM) &&
+		    !(fdata.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
+			return cygstat (file_name, buf);
+
+		/* fill out the stat structure */
+		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
+		buf->st_ino = 0;
+		buf->st_mode = file_attr_to_st_mode (fdata.dwFileAttributes);
+		buf->st_nlink = 1;
+		buf->st_uid = buf->st_gid = 0;
+#ifdef __CYGWIN_USE_BIG_TYPES__
+		buf->st_size = ((_off64_t)fdata.nFileSizeHigh << 32) +
+			fdata.nFileSizeLow;
+#else
+		buf->st_size = (off_t)fdata.nFileSizeLow;
+#endif
+		buf->st_blocks = size_to_blocks(buf->st_size);
+		filetime_to_timespec(&fdata.ftLastAccessTime, &buf->st_atim);
+		filetime_to_timespec(&fdata.ftLastWriteTime, &buf->st_mtim);
+		filetime_to_timespec(&fdata.ftCreationTime, &buf->st_ctim);
+		return 0;
+	} else if (errno == ENOENT) {
+		/*
+		 * In the winsymlinks mode (which is the default), Cygwin
+		 * emulates symbol links using Windows shortcut files. These
+		 * files are formed by adding .lnk extension. So, if we have
+		 * not found the specified file name, it could be that it is
+		 * a symbol link. Let's Cygwin to deal with that.
+		 */
+		return cygstat (file_name, buf);
+	}
+	return -1;
+}
+
+/* We provide our own lstat/stat functions, since the provided Cygwin versions
+ * of these functions are too slow. These stat functions are tailored for Git's
+ * usage, and therefore they are not meant to be complete and correct emulation
+ * of lstat/stat functionality.
+ */
+static int cygwin_lstat(const char *path, struct stat *buf)
+{
+	return do_stat(path, buf, lstat);
+}
+
+static int cygwin_stat(const char *path, struct stat *buf)
+{
+	return do_stat(path, buf, stat);
+}
+
+
+/*
+ * At start up, we are trying to determine whether Win32 API or cygwin stat
+ * functions should be used. The choice is determined by core.cygwinnativestat.
+ * Reading this option is not always possible immediately as git_dir may be
+ * not be set yet. So until it is set, use cygwin lstat/stat functions.
+ */
+static int native_stat = 1;
+
+static int git_cygwin_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "core.cygwinnativestat"))
+		native_stat = git_config_bool(var, value);
+	return 0;
+}
+
+static int init_stat(void)
+{
+	if (have_git_dir()) {
+		git_config(git_cygwin_config, NULL);
+		cygwin_stat_fn = native_stat ? cygwin_stat : stat;
+		cygwin_lstat_fn = native_stat ? cygwin_lstat : lstat;
+		return 0;
+	}
+	return 1;
+}
+
+static int cygwin_stat_stub(const char *file_name, struct stat *buf)
+{
+	return (init_stat() ? stat : *cygwin_stat_fn)(file_name, buf);
+}
+
+static int cygwin_lstat_stub(const char *file_name, struct stat *buf)
+{
+	return (init_stat() ? lstat : *cygwin_lstat_fn)(file_name, buf);
+}
+
+stat_fn_t cygwin_stat_fn = cygwin_stat_stub;
+stat_fn_t cygwin_lstat_fn = cygwin_lstat_stub;
+
diff --git a/compat/cygwin.h b/compat/cygwin.h
new file mode 100644
index 0000000..a3229f5
--- /dev/null
+++ b/compat/cygwin.h
@@ -0,0 +1,9 @@
+#include <sys/types.h>
+#include <sys/stat.h>
+
+typedef int (*stat_fn_t)(const char*, struct stat*);
+extern stat_fn_t cygwin_stat_fn;
+extern stat_fn_t cygwin_lstat_fn;
+
+#define stat(path, buf) (*cygwin_stat_fn)(path, buf)
+#define lstat(path, buf) (*cygwin_lstat_fn)(path, buf)
diff --git a/git-compat-util.h b/git-compat-util.h
index db2836f..cd9752c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -85,6 +85,7 @@
 #undef _XOPEN_SOURCE
 #include <grp.h>
 #define _XOPEN_SOURCE 600
+#include "compat/cygwin.h"
 #else
 #undef _ALL_SOURCE /* AIX 5.3L defines a struct list with _ALL_SOURCE. */
 #include <grp.h>
-- 
1.6.0.2.237.g0297e5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
       [not found] ` <347507080809270851y79764dbcgba1ef5a1d58bdd3e@mail.gmail.com>
@ 2008-09-27 16:33   ` Dmitry Potapov
  0 siblings, 0 replies; 11+ messages in thread
From: Dmitry Potapov @ 2008-09-27 16:33 UTC (permalink / raw)
  To: Marcus Griep
  Cc: Git Mailing List, Junio C Hamano, Shawn O. Pearce, Alex Riesen,
	Johannes Sixt

On Sat, Sep 27, 2008 at 11:51:24AM -0400, Marcus Griep wrote:
> 
> Overall, looks good, though Alex's comment of using "cygwin.nativestat" may
> be a better descriptor for the config flag.

Perhaps... but I am not sure about policy for options in config file.
core.cygwinnativestat was proposed by Shawn, so I would like to hear
his opinion.

> I also think there is more refactoring that could be done, with there being
> common code paths still existing in MinGW and Cygwin.

I am not sure that there is much common code left: stat structures are
different in MinGW and Cygwin (different fields and their types), and I
do not think having one function with a lot of #ifdef __CYGWIN__ in it
is actual improvement. But you can send a patch on top of mine and let
other people decide if it is a worty goal.

Dmitry

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
  2008-09-27  8:43 [PATCH 4/4] cygwin: Use native Win32 API for stat Dmitry Potapov
       [not found] ` <347507080809270851y79764dbcgba1ef5a1d58bdd3e@mail.gmail.com>
@ 2008-09-27 18:35 ` Johannes Sixt
  2008-09-27 21:54   ` Dmitry Potapov
  2008-09-28  9:50 ` [PATCH 4/4] " Alex Riesen
  2 siblings, 1 reply; 11+ messages in thread
From: Johannes Sixt @ 2008-09-27 18:35 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: git, Junio C Hamano, Shawn O. Pearce, Alex Riesen, Marcus Griep

On Samstag, 27. September 2008, Dmitry Potapov wrote:
> lstat/stat functions in Cygwin are very slow, because they try to emulate
> some *nix things that Git does not actually need. This patch adds Win32
> specific implementation of these functions for Cygwin.
>
> This implementation handles most situation directly but in some rare cases
> it falls back on the implementation provided for Cygwin.

Even though I was concerned about code duplication earlier, with the 
factorization that you do in this series this is acceptable, in particular, 
since working out at a solution that deals with the time_t vs. timespec 
difference we would need dirty tricks that are not worth it.

(But see my comment about get_file_attr() in a separate mail.)

> +core.cygwinNativeStat::

This name is *really* odd, for two reasons:

- If I read "native" in connection with Windows, I would understand Windows's 
implementation as "native". Cygwin is not native - it's a bolted-on feature.

- This name talks about the implementation, not about its effect.

Perhaps a better name would be core.ignoreCygwinFSFeatures, and the 
description would only mention that setting this to true (the default) makes 
many operations much faster, but makes it impossible to use File System 
Features A and B and C in the repository. "If you need one of these features, 
set this to false."

(And after writing above paragraphs I notice, that you actually really meant 
Windows's "native" stat; see how confusing the name is?)

> +static inline void filetime_to_timespec(const FILETIME *ft, struct
> timespec *ts) +{
> +	long long winTime = ((long long)ft->dwHighDateTime << 32) +
> ft->dwLowDateTime; +	winTime -= 116444736000000000LL; /* Windows to Unix
> Epoch conversion */ +	ts->tv_sec = (time_t)(winTime/10000000); /*
> 100-nanosecond interval to seconds */ +	ts->tv_nsec = (long)(winTime -
> ts->tv_sec*10000000LL) * 100; /* nanoseconds */ +}

Shorter lines in this function would be appreciated (and not just because my 
MUA can't deal with them ;).

-- Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
  2008-09-27 18:35 ` Johannes Sixt
@ 2008-09-27 21:54   ` Dmitry Potapov
  2008-09-28  9:24     ` Johannes Sixt
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Potapov @ 2008-09-27 21:54 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: git, Junio C Hamano, Shawn O. Pearce, Alex Riesen, Marcus Griep

On Sat, Sep 27, 2008 at 08:35:03PM +0200, Johannes Sixt wrote:
> 
> > +core.cygwinNativeStat::
> 
> This name is *really* odd, for two reasons:
> 
> - If I read "native" in connection with Windows, I would understand Windows's 
> implementation as "native". Cygwin is not native - it's a bolted-on feature.
> 
> - This name talks about the implementation, not about its effect.
> 
> Perhaps a better name would be core.ignoreCygwinFSFeatures, and the 
> description would only mention that setting this to true (the default) makes 
> many operations much faster, but makes it impossible to use File System 
> Features A and B and C in the repository. "If you need one of these features, 
> set this to false."
> 
> (And after writing above paragraphs I notice, that you actually really meant 
> Windows's "native" stat; see how confusing the name is?)

It was Shawn's suggestion. I don't care much about the name as long as
it is explained in the documentation... Therefore, I accepted what Shawn
said without giving it any thought. Now, when you bring this name to my
attention, I believe core.useCygwinStat (in the opposite to the current
core.cygwinNativeStat) would be a better name. Your name is okay too,
but a bit too long for my taste and not specific enough (I suppose
Cygwin does many FS related tricks). Anyway, I don't have a strong
opinion here, so just whatever most people like is fine with me :)

> 
> > +static inline void filetime_to_timespec(const FILETIME *ft, struct
> > timespec *ts) +{
> > +	long long winTime = ((long long)ft->dwHighDateTime << 32) +
> > ft->dwLowDateTime; +	winTime -= 116444736000000000LL; /* Windows to Unix
> > Epoch conversion */ +	ts->tv_sec = (time_t)(winTime/10000000); /*
> > 100-nanosecond interval to seconds */ +	ts->tv_nsec = (long)(winTime -
> > ts->tv_sec*10000000LL) * 100; /* nanoseconds */ +}
> 
> Shorter lines in this function would be appreciated (and not just because my 
> MUA can't deal with them ;).

I am sorry, I did not notice that the line got longer than 80 columns.
I will resent the patch once the issue with the name of the option is
resolved.

Dmitry

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
  2008-09-27 21:54   ` Dmitry Potapov
@ 2008-09-28  9:24     ` Johannes Sixt
  2008-09-29 15:34       ` Shawn O. Pearce
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Sixt @ 2008-09-28  9:24 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: git, Junio C Hamano, Shawn O. Pearce, Alex Riesen, Marcus Griep

On Samstag, 27. September 2008, Dmitry Potapov wrote:
> On Sat, Sep 27, 2008 at 08:35:03PM +0200, Johannes Sixt wrote:
> > > +core.cygwinNativeStat::
> >
> > This name is *really* odd, for two reasons:
...
> It was Shawn's suggestion. I don't care much about the name as long as
> it is explained in the documentation... Therefore, I accepted what Shawn
> said without giving it any thought.

Shawn is an importen git-o-maniac, but it's certainly not blasphemy to 
question his words of wisdom ;)

> Now, when you bring this name to my 
> attention, I believe core.useCygwinStat (in the opposite to the current
> core.cygwinNativeStat) would be a better name. Your name is okay too,
> but a bit too long for my taste and not specific enough (I suppose
> Cygwin does many FS related tricks). Anyway, I don't have a strong
> opinion here, so just whatever most people like is fine with me :)

My point is that emphasis on "stat" in the name is wrong: That's about 
implementation, but not about the effect. Why wouldn't 'ignoreCygwinFSTricks' 
be specific enough? By using a native stat implementation, *all* of them are 
ignored. Yes, you fall back to Cygwin's stat sometimes, but these are cases 
where the *effect* is not that relevant. (And the length of the name doesn't 
worry me, considering how many people would want to change the default.)

-- Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
  2008-09-27  8:43 [PATCH 4/4] cygwin: Use native Win32 API for stat Dmitry Potapov
       [not found] ` <347507080809270851y79764dbcgba1ef5a1d58bdd3e@mail.gmail.com>
  2008-09-27 18:35 ` Johannes Sixt
@ 2008-09-28  9:50 ` Alex Riesen
  2 siblings, 0 replies; 11+ messages in thread
From: Alex Riesen @ 2008-09-28  9:50 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Git Mailing List, Junio C Hamano, Shawn O. Pearce, Johannes Sixt,
	Marcus Griep

Dmitry Potapov, Sat, Sep 27, 2008 10:43:49 +0200:
> Despite all efforts to make the fast implementation as robust as possible,
> it may not work well for some very rare situations. I am aware only one
> situation: use Cygwin mount to bind unrelated paths inside repository
> together.  Therefore, the core.cygwinnativestat configuration option is
> provided, which controls whether native or Cygwin version of stat is used.

cygwin.tryWindowsState? (I think cygwin has to get its own section)

> +static int do_stat(const char *file_name, struct stat *buf, stat_fn_t cygstat)
> +{
> +	WIN32_FILE_ATTRIBUTE_DATA fdata;
> +
> +	if (file_name[0] == '/')
> +		return cygstat (file_name, buf);
> +
> +	if (!(errno = get_file_attr(file_name, &fdata))) {
> +		/*
> +		 * If the system attribute is set and it is not a directory then
> +		 * it could be a symbol link created in the nowinsymlinks mode.
> +		 * Normally, Cygwin works in the winsymlinks mode, so this situation
> +		 * is very unlikely. For the sake of simplicity of our code, let's
> +		 * Cygwin to handle it.
> +		 */
> +		if ((fdata.dwFileAttributes & FILE_ATTRIBUTE_SYSTEM) &&
> +		    !(fdata.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
> +			return cygstat (file_name, buf);

formatting: space after function name.

> +
> +		/* fill out the stat structure */
> +		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
> +		buf->st_ino = 0;
> +		buf->st_mode = file_attr_to_st_mode (fdata.dwFileAttributes);
> +		buf->st_nlink = 1;
> +		buf->st_uid = buf->st_gid = 0;
> +#ifdef __CYGWIN_USE_BIG_TYPES__
> +		buf->st_size = ((_off64_t)fdata.nFileSizeHigh << 32) +
> +			fdata.nFileSizeLow;
> +#else
> +		buf->st_size = (off_t)fdata.nFileSizeLow;
> +#endif
> +		buf->st_blocks = size_to_blocks(buf->st_size);
> +		filetime_to_timespec(&fdata.ftLastAccessTime, &buf->st_atim);
> +		filetime_to_timespec(&fdata.ftLastWriteTime, &buf->st_mtim);
> +		filetime_to_timespec(&fdata.ftCreationTime, &buf->st_ctim);
> +		return 0;
> +	} else if (errno == ENOENT) {
> +		/*
> +		 * In the winsymlinks mode (which is the default), Cygwin
> +		 * emulates symbol links using Windows shortcut files. These
> +		 * files are formed by adding .lnk extension. So, if we have
> +		 * not found the specified file name, it could be that it is
> +		 * a symbol link. Let's Cygwin to deal with that.
> +		 */
> +		return cygstat (file_name, buf);
> +	}
> +	return -1;
> +}

I like it and will be keeping in my tree. Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
  2008-09-28  9:24     ` Johannes Sixt
@ 2008-09-29 15:34       ` Shawn O. Pearce
  2008-09-29 18:26         ` Dmitry Potapov
  2008-09-30 13:53         ` [PATCH 4/4 v2] " Dmitry Potapov
  0 siblings, 2 replies; 11+ messages in thread
From: Shawn O. Pearce @ 2008-09-29 15:34 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Dmitry Potapov, git, Junio C Hamano, Alex Riesen, Marcus Griep

Johannes Sixt <johannes.sixt@telecom.at> wrote:
> On Samstag, 27. September 2008, Dmitry Potapov wrote:
> > On Sat, Sep 27, 2008 at 08:35:03PM +0200, Johannes Sixt wrote:
> > > > +core.cygwinNativeStat::
> > >
> > > This name is *really* odd, for two reasons:
> ...
> > It was Shawn's suggestion. I don't care much about the name as long as
> > it is explained in the documentation... Therefore, I accepted what Shawn
> > said without giving it any thought.
> 
> Shawn is an importen git-o-maniac, but it's certainly not blasphemy to 
> question his words of wisdom ;)

As Hannes points out, blindly accepting anything I say might not
be a good idea.  I have my moments of sanity, but I'm far, far
from perfect.  ;-)

> My point is that emphasis on "stat" in the name is wrong: That's about 
> implementation, but not about the effect. Why wouldn't 'ignoreCygwinFSTricks' 
> be specific enough?

I like this a lot better.  I could see us also bypassing other Cygwin
functions like open() in order to get faster system calls for Git.
Since it would be byassing the same Cygwin path name translation
code it should be controlled by the same flag.

> (And the length of the name doesn't 
> worry me, considering how many people would want to change the default.)

Agreed.  Most people setting it would copy and paste from the
documentation anyway.

I wonder though if we can't automatically implement it by grabbing
a copy of the Cygwin mount table and comparing those paths to
$GIT_DIR or $GIT_WORK_TREE.  If any mount table entry is contained
within either of them then we know we can't use the native stat.
Its rather common for neither of these to contain a mount point,
and it is therefore easy to enable the native stat.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] cygwin: Use native Win32 API for stat
  2008-09-29 15:34       ` Shawn O. Pearce
@ 2008-09-29 18:26         ` Dmitry Potapov
  2008-09-30 13:53         ` [PATCH 4/4 v2] " Dmitry Potapov
  1 sibling, 0 replies; 11+ messages in thread
From: Dmitry Potapov @ 2008-09-29 18:26 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Johannes Sixt, git, Junio C Hamano, Alex Riesen, Marcus Griep

On Mon, Sep 29, 2008 at 08:34:00AM -0700, Shawn O. Pearce wrote:
> Johannes Sixt <johannes.sixt@telecom.at> wrote:
> 
> > My point is that emphasis on "stat" in the name is wrong: That's about 
> > implementation, but not about the effect. Why wouldn't 'ignoreCygwinFSTricks' 
> > be specific enough?
> 
> I like this a lot better.  I could see us also bypassing other Cygwin
> functions like open() in order to get faster system calls for Git.

If you think that it may be useful to bypass some other functions, and
you want to use the same option to control that then a general name like
that makes sense. Personally, I don't believe that we may want to bypass
something like open() as it is not performance critical, but I said
above I don't care about the name much, so I am going to change my patch
to use ignoreCygwinFSTricks.

Dmitry

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/4 v2] cygwin: Use native Win32 API for stat
  2008-09-29 15:34       ` Shawn O. Pearce
  2008-09-29 18:26         ` Dmitry Potapov
@ 2008-09-30 13:53         ` Dmitry Potapov
  2008-09-30 14:57           ` Marcus Griep
  1 sibling, 1 reply; 11+ messages in thread
From: Dmitry Potapov @ 2008-09-30 13:53 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Johannes Sixt, git, Junio C Hamano, Alex Riesen, Marcus Griep

lstat/stat functions in Cygwin are very slow, because they try to emulate
some *nix things that Git does not actually need. This patch adds Win32
specific implementation of these functions for Cygwin.

This implementation handles most situation directly but in some rare cases
it falls back on the implementation provided for Cygwin. This is necessary
for two reasons:

- Cygwin has its own file hierarchy, so absolute paths used in Cygwin is
  not suitable to be used Win32 API. cygwin_conv_to_win32_path can not be
  used because it automatically dereference Cygwin symbol links, also it
  causes extra syscall. Fortunately Git rarely use absolute paths, so we
  always use Cygwin implementation for absolute paths.

- Support of symbol links. Cygwin stores symbol links as ordinary using
  one of two possible formats. Therefore, the fast implementation falls
  back to Cygwin functions if it detects potential use of symbol links.

The speed of this implementation should be the same as mingw_lstat for
common cases, but it is considerable slower when the specified file name
does not exist.

Despite all efforts to make the fast implementation as robust as possible,
it may not work well for some very rare situations. I am aware only one
situation: use Cygwin mount to bind unrelated paths inside repository
together.  Therefore, the core.ignoreCygwinFSTricks configuration option is
provided, which controls whether native or Cygwin version of stat is used.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
---

This version of patch has the following correction:

1. cygwinNativeStat renamed as ignoreCygwinFSTricks
2. lines in filetime_to_timespec are reformatted to fit in 80 columns
3. extra spaces after function names are removed


 Documentation/config.txt |    9 +++
 Makefile                 |    4 ++
 compat/cygwin.c          |  127 ++++++++++++++++++++++++++++++++++++++++++++++
 compat/cygwin.h          |    9 +++
 git-compat-util.h        |    1 +
 5 files changed, 150 insertions(+), 0 deletions(-)
 create mode 100644 compat/cygwin.c
 create mode 100644 compat/cygwin.h

diff --git a/Documentation/config.txt b/Documentation/config.txt
index bea867d..61437fe 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -117,6 +117,15 @@ core.fileMode::
 	the working copy are ignored; useful on broken filesystems like FAT.
 	See linkgit:git-update-index[1]. True by default.
 
+core.ignoreCygwinFSTricks::
+	This option is only used by Cygwin implementation of Git. If false,
+	the Cygwin stat() and lstat() functions are used. This may be useful
+	if your repository consists of a few separate directories joined in
+	one hierarchy using Cygwin mount. If true, Git uses native Win32 API
+	whenever it is possible and falls back to Cygwin functions only to
+	handle symbol links. The native mode is more than twice faster than
+	normal Cygwin l/stat() functions. True by default.
+
 core.trustctime::
 	If false, the ctime differences between the index and the
 	working copy are ignored; useful when the inode change time
diff --git a/Makefile b/Makefile
index 3c0664a..0708390 100644
--- a/Makefile
+++ b/Makefile
@@ -347,6 +347,7 @@ LIB_H += cache.h
 LIB_H += cache-tree.h
 LIB_H += commit.h
 LIB_H += compat/mingw.h
+LIB_H += compat/cygwin.h
 LIB_H += csum-file.h
 LIB_H += decorate.h
 LIB_H += delta.h
@@ -747,6 +748,9 @@ ifeq ($(uname_S),HP-UX)
 	NO_SYS_SELECT_H = YesPlease
 	SNPRINTF_RETURNS_BOGUS = YesPlease
 endif
+ifneq (,$(findstring CYGWIN,$(uname_S)))
+	COMPAT_OBJS += compat/cygwin.o
+endif
 ifneq (,$(findstring MINGW,$(uname_S)))
 	NO_MMAP = YesPlease
 	NO_PREAD = YesPlease
diff --git a/compat/cygwin.c b/compat/cygwin.c
new file mode 100644
index 0000000..423ff20
--- /dev/null
+++ b/compat/cygwin.c
@@ -0,0 +1,127 @@
+#define WIN32_LEAN_AND_MEAN
+#include "../git-compat-util.h"
+#include "win32.h"
+#include "../cache.h" /* to read configuration */
+
+static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts)
+{
+	long long winTime = ((long long)ft->dwHighDateTime << 32) +
+			ft->dwLowDateTime;
+	winTime -= 116444736000000000LL; /* Windows to Unix Epoch conversion */
+	/* convert 100-nsecond interval to seconds and nanoseconds */
+	ts->tv_sec = (time_t)(winTime/10000000);
+	ts->tv_nsec = (long)(winTime - ts->tv_sec*10000000LL) * 100;
+}
+
+#define size_to_blocks(s) (((s)+511)/512)
+
+/* do_stat is a common implementation for cygwin_lstat and cygwin_stat.
+ *
+ * To simplify its logic, in the case of cygwin symlinks, this implementation
+ * falls back to the cygwin version of stat/lstat, which is provided as the
+ * last argument.
+ */
+static int do_stat(const char *file_name, struct stat *buf, stat_fn_t cygstat)
+{
+	WIN32_FILE_ATTRIBUTE_DATA fdata;
+
+	if (file_name[0] == '/')
+		return cygstat (file_name, buf);
+
+	if (!(errno = get_file_attr(file_name, &fdata))) {
+		/*
+		 * If the system attribute is set and it is not a directory then
+		 * it could be a symbol link created in the nowinsymlinks mode.
+		 * Normally, Cygwin works in the winsymlinks mode, so this situation
+		 * is very unlikely. For the sake of simplicity of our code, let's
+		 * Cygwin to handle it.
+		 */
+		if ((fdata.dwFileAttributes & FILE_ATTRIBUTE_SYSTEM) &&
+		    !(fdata.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
+			return cygstat(file_name, buf);
+
+		/* fill out the stat structure */
+		buf->st_dev = buf->st_rdev = 0; /* not used by Git */
+		buf->st_ino = 0;
+		buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes);
+		buf->st_nlink = 1;
+		buf->st_uid = buf->st_gid = 0;
+#ifdef __CYGWIN_USE_BIG_TYPES__
+		buf->st_size = ((_off64_t)fdata.nFileSizeHigh << 32) +
+			fdata.nFileSizeLow;
+#else
+		buf->st_size = (off_t)fdata.nFileSizeLow;
+#endif
+		buf->st_blocks = size_to_blocks(buf->st_size);
+		filetime_to_timespec(&fdata.ftLastAccessTime, &buf->st_atim);
+		filetime_to_timespec(&fdata.ftLastWriteTime, &buf->st_mtim);
+		filetime_to_timespec(&fdata.ftCreationTime, &buf->st_ctim);
+		return 0;
+	} else if (errno == ENOENT) {
+		/*
+		 * In the winsymlinks mode (which is the default), Cygwin
+		 * emulates symbol links using Windows shortcut files. These
+		 * files are formed by adding .lnk extension. So, if we have
+		 * not found the specified file name, it could be that it is
+		 * a symbol link. Let's Cygwin to deal with that.
+		 */
+		return cygstat(file_name, buf);
+	}
+	return -1;
+}
+
+/* We provide our own lstat/stat functions, since the provided Cygwin versions
+ * of these functions are too slow. These stat functions are tailored for Git's
+ * usage, and therefore they are not meant to be complete and correct emulation
+ * of lstat/stat functionality.
+ */
+static int cygwin_lstat(const char *path, struct stat *buf)
+{
+	return do_stat(path, buf, lstat);
+}
+
+static int cygwin_stat(const char *path, struct stat *buf)
+{
+	return do_stat(path, buf, stat);
+}
+
+
+/*
+ * At start up, we are trying to determine whether Win32 API or cygwin stat
+ * functions should be used. The choice is determined by core.ignorecygwinfstricks.
+ * Reading this option is not always possible immediately as git_dir may be
+ * not be set yet. So until it is set, use cygwin lstat/stat functions.
+ */
+static int native_stat = 1;
+
+static int git_cygwin_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "core.ignorecygwinfstricks"))
+		native_stat = git_config_bool(var, value);
+	return 0;
+}
+
+static int init_stat(void)
+{
+	if (have_git_dir()) {
+		git_config(git_cygwin_config, NULL);
+		cygwin_stat_fn = native_stat ? cygwin_stat : stat;
+		cygwin_lstat_fn = native_stat ? cygwin_lstat : lstat;
+		return 0;
+	}
+	return 1;
+}
+
+static int cygwin_stat_stub(const char *file_name, struct stat *buf)
+{
+	return (init_stat() ? stat : *cygwin_stat_fn)(file_name, buf);
+}
+
+static int cygwin_lstat_stub(const char *file_name, struct stat *buf)
+{
+	return (init_stat() ? lstat : *cygwin_lstat_fn)(file_name, buf);
+}
+
+stat_fn_t cygwin_stat_fn = cygwin_stat_stub;
+stat_fn_t cygwin_lstat_fn = cygwin_lstat_stub;
+
diff --git a/compat/cygwin.h b/compat/cygwin.h
new file mode 100644
index 0000000..a3229f5
--- /dev/null
+++ b/compat/cygwin.h
@@ -0,0 +1,9 @@
+#include <sys/types.h>
+#include <sys/stat.h>
+
+typedef int (*stat_fn_t)(const char*, struct stat*);
+extern stat_fn_t cygwin_stat_fn;
+extern stat_fn_t cygwin_lstat_fn;
+
+#define stat(path, buf) (*cygwin_stat_fn)(path, buf)
+#define lstat(path, buf) (*cygwin_lstat_fn)(path, buf)
diff --git a/git-compat-util.h b/git-compat-util.h
index db2836f..cd9752c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -85,6 +85,7 @@
 #undef _XOPEN_SOURCE
 #include <grp.h>
 #define _XOPEN_SOURCE 600
+#include "compat/cygwin.h"
 #else
 #undef _ALL_SOURCE /* AIX 5.3L defines a struct list with _ALL_SOURCE. */
 #include <grp.h>
-- 
1.6.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4 v2] cygwin: Use native Win32 API for stat
  2008-09-30 13:53         ` [PATCH 4/4 v2] " Dmitry Potapov
@ 2008-09-30 14:57           ` Marcus Griep
  2008-09-30 20:26             ` Shawn O. Pearce
  0 siblings, 1 reply; 11+ messages in thread
From: Marcus Griep @ 2008-09-30 14:57 UTC (permalink / raw)
  To: Dmitry Potapov
  Cc: Shawn O. Pearce, Johannes Sixt, git, Junio C Hamano, Alex Riesen

[-- Attachment #1: Type: text/plain, Size: 588 bytes --]

Dmitry Potapov wrote:
> lstat/stat functions in Cygwin are very slow, because they try to emulate
> some *nix things that Git does not actually need. This patch adds Win32
> specific implementation of these functions for Cygwin.

Can't wait to see this patch in next or master!  If you recall my benchmarks
from earlier, the speed-up is pretty good for cygwin users working with
large repositories.

> Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>

Thanks for the work, Dmitry!

-- 
Marcus Griep
GPG Key ID: 0x5E968152
——
http://www.boohaunt.net
את.ψο´


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 793 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4 v2] cygwin: Use native Win32 API for stat
  2008-09-30 14:57           ` Marcus Griep
@ 2008-09-30 20:26             ` Shawn O. Pearce
  0 siblings, 0 replies; 11+ messages in thread
From: Shawn O. Pearce @ 2008-09-30 20:26 UTC (permalink / raw)
  To: Marcus Griep
  Cc: Dmitry Potapov, Johannes Sixt, git, Junio C Hamano, Alex Riesen

Marcus Griep <marcus@griep.us> wrote:
> Dmitry Potapov wrote:
> > lstat/stat functions in Cygwin are very slow, because they try to emulate
> > some *nix things that Git does not actually need. This patch adds Win32
> > specific implementation of these functions for Cygwin.
> 
> Can't wait to see this patch in next or master!  If you recall my benchmarks
> from earlier, the speed-up is pretty good for cygwin users working with
> large repositories.
> 
> > Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
> 
> Thanks for the work, Dmitry!

Thanks folks.  I'm scheduling this for 'next'.  Lets see how
it goes...

-- 
Shawn.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-09-30 20:27 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-27  8:43 [PATCH 4/4] cygwin: Use native Win32 API for stat Dmitry Potapov
     [not found] ` <347507080809270851y79764dbcgba1ef5a1d58bdd3e@mail.gmail.com>
2008-09-27 16:33   ` Dmitry Potapov
2008-09-27 18:35 ` Johannes Sixt
2008-09-27 21:54   ` Dmitry Potapov
2008-09-28  9:24     ` Johannes Sixt
2008-09-29 15:34       ` Shawn O. Pearce
2008-09-29 18:26         ` Dmitry Potapov
2008-09-30 13:53         ` [PATCH 4/4 v2] " Dmitry Potapov
2008-09-30 14:57           ` Marcus Griep
2008-09-30 20:26             ` Shawn O. Pearce
2008-09-28  9:50 ` [PATCH 4/4] " Alex Riesen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).