Git development
 help / color / mirror / Atom feed
* Re: [RFC/PATCH] Remove repo-config
From: Junio C Hamano @ 2008-01-16 20:32 UTC (permalink / raw)
  To: Dan McGee; +Cc: git
In-Reply-To: <7vlk6psn5h.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Dan McGee <dpmcgee@gmail.com> writes:
>
>> On 01/15/2008 10:23 PM, Junio C Hamano wrote:
>>> I do not think it is unreasonable to add repo-config to feature
>>> removal schedule in 1.5.4 release notes.  Perhaps something like...
>>> 
>>> + * "git repo-config", which was an old name for "git config" command,
>>> +   has been supported without being advertised for a long time.  The
>>> +   next feature release will remove it.
>>> +
>>
>> Seems fine to me. Does it need to be put in command-list.txt for the time being too, or what all is that file used for? Sorry I am not familiar.
>>
>> Something like:
>>
>> git-repo-config                       ancillarymanipulators	deprecated
>
> Technically, you are right, but somehow it feels backwards.
>
> We stopped advertising the existence of the command in v1.5.0,
> and adding it to the list now would mean we need to add a new
> git-repo-config manual page that says "This is deprecated, use
> git-config instead".

This comment was wrong.  We can keep the existing repo-config
documentation that has deprecation notice, and add the above
line to the list.

Sorry about the confusion.

^ permalink raw reply

* Re: [PATCH 2/2] close_lock_file(): new function in the lockfile API
From: Junio C Hamano @ 2008-01-16 20:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Brandon Casey, Brandon Casey, Git Mailing List, Alex Riesen,
	Kristian Høgsberg
In-Reply-To: <alpine.LFD.1.00.0801161207220.2806@woody.linux-foundation.org>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, 16 Jan 2008, Junio C Hamano wrote:
>> +
>> +void close_lock_file(struct lock_file *lk)
>> +{
>> +	close(lk->fd);
>> +	lk->fd = -1;
>> +}
>
> Since one of the main purposes of closing would be the error testing of 
> writes that haven't made it out yet on filesystems like NFS that do 
> open-close cache serialization, I'd suggest doing this as
>
> 	int close_lock_file(struct lock_file *lk)
> 	{
> 		int fd = lk->fd;
> 		lk->df = -1;
> 		return close(fd);
> 	} 
>
> to give the return code.

Yup!  You are as always right.

^ permalink raw reply

* Re: [PATCH v2] Do not show "diff --git" metainfo with --no-prefix
From: Daniel Barkalow @ 2008-01-16 20:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Chris Ortman, Johannes Schindelin, git
In-Reply-To: <7v1w8hploy.fsf@gitster.siamese.dyndns.org>

On Wed, 16 Jan 2008, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > That's why tying "--git" together with any prefix handling is wrong: 
> > because it's a totally different issue. It's true that "git-apply" right 
> > now doesn't understand these things, but assuming we want to teach 
> > git-apply to apply to subprojects eventually (we do, don't we?) we'll 
> > eventually have to teach it.
> 
> That's all correct but
> 
>  * currently diff does not recurse, nor apply does not apply
>    recursively;
> 
>  * "git diff" that comes with 1.5.4, if we do not do anything,
>    can produce a diff that will be rejected by the stricter
>    check "git apply" has when used with --no-prefix and friends;
> 
>  * submodule aware versions of "git diff" can be told to add
>    "--mark-as-git-diff" when it passes "--src-prefix=a/git-gui"
>    and "--dst-prefix=b/git-gui" when it recurses internally, to
>    defeat what my proposed patch does.

Or it could pass an option to include the intermediate portion as part of 
the name rather than as part of the prefixes. And git-apply would probably 
be a lot happier to have confirmation that certain files are supposed to 
be from a submodule, which could be handled by including that option in 
the header after --git.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Kevin Ballard @ 2008-01-16 20:39 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Johannes Schindelin, Mark Junker, git
In-Reply-To: <m33asxn2gt.fsf@roke.D-201>

[-- Attachment #1: Type: text/plain, Size: 3763 bytes --]

On Jan 16, 2008, at 11:46 AM, Jakub Narebski wrote:

>>> More like, Mac OS X has standardized on Unicode and the rest of the
>>> world hasn't caught up yet. Git is the only tool I've ever heard  
>>> of that
>>> has a problem with OS X using Unicode.
>>
>> No.  That's not at all the problem.  Mac OS X insists on storing  
>> _another_
>> encoding of your filename.  Both are UTF-8.  Both encode the _same_
>> string.  Yet they are different, bytewise.  For no good reason.
>
> To be more exact encoding used to _create_ file differs from encoding
> returned when _reading directory_...
>
>> Stop spreading FUD.  Git can handle Unicode just fine.  In fact,  
>> Git does
>> not _care_ how the filename is encoded, it _respects_ the user's  
>> choice,
>> not only of the encoding _type_, but the _encoding_, too.
>
> ...which means that sequence of bytes differ. And Git by design is
> (both for filenames and for blob contents) encoding agnostic.
>
> HFS+ is just _stupid_. And unfortunately Git doesn't support stupid
> filesystems (e.g. case insensitive filesystems) well.

There's two different ways to do filesystem encodings. One is to have  
the fs simply not care about encoding, which is what the linux world  
seems to prefer. Sure, this is great in that what you create the file  
with is what you get back, but on the other hand, given an arbitrary  
non-ASCII file on disk, you have absolutely no idea what the encoding  
should be and you can't display it without making assumptions (yes you  
can use heuristics, but you're still making assumptions). Filesystems  
like HFS+ that standardize the encoding, on the other hand, make it  
such that you always know what the encoding of a file should be, so  
you can always display and use the filename intelligently. It also  
means it plays much nicer in a non-ASCII world, since you don't have  
to worry about different normalizations of a given string referring to  
different files (it's one thing to be case-sensitive, but claiming  
that "föo" and "föo" are different files just because one uses a  
composed character and the other doesn't is extremely user- 
unfriendly). On the other hand, what you create the file with may not  
be what you read back later, since the name has been standardized.  
It's hard to say one is better than the other, they're just different  
ways of doing it. However, I have noticed that everybody who's voiced  
an opinion on this list in favor of the encoding-agnostic approach  
seem to be unwilling to accept that any other approach might have  
validity, to the extent of calling an OS/filesystem that does things  
different stupid or insane. This strikes me as extremely elitist and  
risks alienating what I expect to be a fast-growing group of users  
(i.e. OS X users).

I'm willing to give Linus a free pass on calling other OS's stupid and  
insane, as I don't think Linux would exist as it does today without  
his strong opinions, but I don't think this should give carte blanche  
to the rest of the community for this inflammatory behavior.

I should note that I'm only taking the time to discuss this because,  
despite the fact that I'm new to git, I really like it and I want it  
to work better. And one area that it has a problem with is the de- 
facto filesystem on my OS of choice. However, attempts to discuss the  
problem invariable end up with multiple people calling my OS stupid  
and insane simply because it differs in a particular design decision.  
This is not a good way to build a community or to build a better  
product, and I hope it can be improved.

-Kevin Ballard

-- 
Kevin Ballard
http://kevin.sb.org
kevin@sb.org
http://www.tildesoft.com



[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 2432 bytes --]

^ permalink raw reply

* Re: [PATCH 2/2] close_lock_file(): new function in the lockfile API
From: Brandon Casey @ 2008-01-16 20:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Git Mailing List, Alex Riesen,
	=?X-UNKNOWN?Q?Kristian_H=C3=B8gsberg?=
In-Reply-To: <7vodblo6c9.fsf@gitster.siamese.dyndns.org>

On Wed, 16 Jan 2008, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> On Wed, 16 Jan 2008, Junio C Hamano wrote:
>>> +
>>> +void close_lock_file(struct lock_file *lk)
>>> +{
>>> +	close(lk->fd);
>>> +	lk->fd = -1;
>>> +}
>>
>> Since one of the main purposes of closing would be the error testing of
>> writes that haven't made it out yet on filesystems like NFS that do
>> open-close cache serialization, I'd suggest doing this as
>>
>> 	int close_lock_file(struct lock_file *lk)
>> 	{
>> 		int fd = lk->fd;
>> 		lk->df = -1;
>> 		return close(fd);
>> 	}
>>
>> to give the return code.
>
> Yup!  You are as always right.

My patch does this, though I understand it may take some time to review.

I left the lk->fd unmodified when close() failed in case the caller
would like to include it in an error message.

-brandon

^ permalink raw reply

* Re: [RFC/PATCH] Remove repo-config
From: Peter Oberndorfer @ 2008-01-16 20:47 UTC (permalink / raw)
  To: Dan McGee; +Cc: git, Catalin Marinas, Karl Hasselström
In-Reply-To: <1200453554-14163-1-git-send-email-dpmcgee@gmail.com>

On Mittwoch 16 Januar 2008, Dan McGee wrote:
> 'git config' has been used in place of 'git repo-config' for some time in
> the documentation and most of the tools, so remove traces of repo-config
> from the source.
> 
> Signed-off-by: Dan McGee <dpmcgee@gmail.com>
> ---
> 
> Mainly a request for comment, but I sent this patch becuase I noticed
> repo-config is not even listed in command-list.txt. Before the 1.5.4 release,
> I would think it either needs to be listed there or removed entirely.
> 
> Looking forward to 1.5.5, is there any reason to keep an old command like
> this around? It is rarely used and needlessly complicates things, and the
> manpage did nothing but send you to config anyway.
> 
I don't know if this should impact removing schedule, but stgit 0.14.1
(and the current development version) uses "git repo-config" instead of "git config"
in stgit/config.py and certain tests

will post a patch for this as a reply
> All tests pass on Linux 2.6.23.
> 
> -Dan
Greetings Peter
PS: i hope it is ok to add stgit people to CC

^ permalink raw reply

* [STGIT PATCH] replace "git repo-config" usage by "git config"
From: Peter Oberndorfer @ 2008-01-16 20:58 UTC (permalink / raw)
  To: Karl Hasselström; +Cc: Dan McGee, git, Catalin Marinas
In-Reply-To: <200801162147.33448.kumbayo84@arcor.de>

"git repo-config" will be removed soon

Signed-off-by: Peter Oberndorfer <kumbayo84@arcor.de>
---
since i am not good at creating log messages, feel free to change it :-)
built on top of kha experimental patch
passes all testcases for me
 stgit/config.py               |   14 +++++++-------
 t/t1900-mail.sh               |    2 +-
 t/t2100-pull-policy-fetch.sh  |    4 ++--
 t/t2101-pull-policy-pull.sh   |    4 ++--
 t/t2102-pull-policy-rebase.sh |    4 ++--
 5 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/stgit/config.py b/stgit/config.py
index 1d71cd2..9bfdd52 100644
--- a/stgit/config.py
+++ b/stgit/config.py
@@ -47,7 +47,7 @@ class GitConfig:
         if self.__cache.has_key(name):
             return self.__cache[name]
         try:
-            value = Run('git', 'repo-config', '--get', name).output_one_line()
+            value = Run('git', 'config', '--get', name).output_one_line()
         except RunException:
             value = self.__defaults.get(name, None)
         self.__cache[name] = value
@@ -56,7 +56,7 @@ class GitConfig:
     def getall(self, name):
         if self.__cache.has_key(name):
             return self.__cache[name]
-        values = Run('git', 'repo-config', '--get-all', name
+        values = Run('git', 'config', '--get-all', name
                      ).returns([0, 1]).output_lines()
         self.__cache[name] = values
         return values
@@ -71,23 +71,23 @@ class GitConfig:
     def rename_section(self, from_name, to_name):
         """Rename a section in the config file. Silently do nothing if
         the section doesn't exist."""
-        Run('git', 'repo-config', '--rename-section', from_name, to_name
+        Run('git', 'config', '--rename-section', from_name, to_name
             ).returns([0, 1]).run()
         self.__cache.clear()
 
     def remove_section(self, name):
         """Remove a section in the config file. Silently do nothing if
         the section doesn't exist."""
-        Run('git', 'repo-config', '--remove-section', name
+        Run('git', 'config', '--remove-section', name
             ).returns([0, 1]).discard_stderr().discard_output()
         self.__cache.clear()
 
     def set(self, name, value):
-        Run('git', 'repo-config', name, value).run()
+        Run('git', 'config', name, value).run()
         self.__cache[name] = value
 
     def unset(self, name):
-        Run('git', 'repo-config', '--unset', name)
+        Run('git', 'config', '--unset', name)
         self.__cache[name] = None
 
     def sections_matching(self, regexp):
@@ -96,7 +96,7 @@ class GitConfig:
         group contents, for all variable names matching the regexp.
         """
         result = []
-        for line in Run('git', 'repo-config', '--get-regexp', '"^%s$"' % regexp
+        for line in Run('git', 'config', '--get-regexp', '"^%s$"' % regexp
                         ).returns([0, 1]).output_lines():
             m = re.match('^%s ' % regexp, line)
             if m:
diff --git a/t/t1900-mail.sh b/t/t1900-mail.sh
index e83b2d3..cfdc6f3 100755
--- a/t/t1900-mail.sh
+++ b/t/t1900-mail.sh
@@ -6,7 +6,7 @@ test_description='Test the mail command'
 test_expect_success \
     'Initialize the StGIT repository' \
     '
-    git repo-config stgit.sender "A U Thor <author@example.com>" &&
+    git config stgit.sender "A U Thor <author@example.com>" &&
     for i in 1 2 3 4 5; do
       touch foo.txt &&
       echo "line $i" >> foo.txt &&
diff --git a/t/t2100-pull-policy-fetch.sh b/t/t2100-pull-policy-fetch.sh
index 9e4bc31..670c7c6 100755
--- a/t/t2100-pull-policy-fetch.sh
+++ b/t/t2100-pull-policy-fetch.sh
@@ -19,8 +19,8 @@ test_expect_success \
     (cd upstream && stg init) &&
     stg clone upstream clone &&
     (cd clone &&
-     git repo-config branch.master.stgit.pull-policy fetch-rebase &&
-     git repo-config --list &&
+     git config branch.master.stgit.pull-policy fetch-rebase &&
+     git config --list &&
      stg new c1 -m c1 &&
      echo a > file && git add file && stg refresh
     )
diff --git a/t/t2101-pull-policy-pull.sh b/t/t2101-pull-policy-pull.sh
index b4521f0..ce4b5c8 100755
--- a/t/t2101-pull-policy-pull.sh
+++ b/t/t2101-pull-policy-pull.sh
@@ -19,8 +19,8 @@ test_expect_success \
     (cd upstream && stg init) &&
     stg clone upstream clone &&
     (cd clone &&
-     git repo-config branch.master.stgit.pull-policy pull &&
-     git repo-config --list &&
+     git config branch.master.stgit.pull-policy pull &&
+     git config --list &&
      stg new c1 -m c1 &&
      echo a > file && git add file && stg refresh
     )
diff --git a/t/t2102-pull-policy-rebase.sh b/t/t2102-pull-policy-rebase.sh
index 135b48c..5619bda 100755
--- a/t/t2102-pull-policy-rebase.sh
+++ b/t/t2102-pull-policy-rebase.sh
@@ -13,8 +13,8 @@ test_expect_success \
     git branch -m master parent &&
     stg init &&
     stg branch --create stack &&
-    git repo-config branch.stack.stgit.pull-policy rebase &&
-    git repo-config --list &&
+    git config branch.stack.stgit.pull-policy rebase &&
+    git config --list &&
     stg new c1 -m c1 &&
     echo a > file && git add file && stg refresh
     '
-- 
1.5.4.rc3

^ permalink raw reply related

* Re: [STGIT PATCH] replace "git repo-config" usage by "git config"
From: Jakub Narebski @ 2008-01-16 21:13 UTC (permalink / raw)
  To: Peter Oberndorfer; +Cc: Karl Hasselström, Dan McGee, git, Catalin Marinas
In-Reply-To: <200801162158.26450.kumbayo84@arcor.de>

Peter Oberndorfer <kumbayo84@arcor.de> writes:

> "git repo-config" will be removed soon
> 
> Signed-off-by: Peter Oberndorfer <kumbayo84@arcor.de>
> ---
> since i am not good at creating log messages, feel free to change it :-)
> built on top of kha experimental patch
> passes all testcases for me

> @@ -47,7 +47,7 @@ class GitConfig:
>          if self.__cache.has_key(name):
>              return self.__cache[name]
>          try:
> -            value = Run('git', 'repo-config', '--get', name).output_one_line()
> +            value = Run('git', 'config', '--get', name).output_one_line()
>          except RunException:
>              value = self.__defaults.get(name, None)
>          self.__cache[name] = value

Strange that StGIT didn't abstracted out reading git config, like
Git.pm and gitweb.perl did.

BTW. will StGIT be using libgit-thin + PyGit, or is it not ready yet?

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply

* Re: Git Cygwin - unable to create any repository - help!
From: Paul Umbers @ 2008-01-16 21:44 UTC (permalink / raw)
  To: Alex Riesen; +Cc: Robin Rosenberg, git
In-Reply-To: <20080116191737.GD3181@steel.home>

[-- Attachment #1: Type: text/plain, Size: 822 bytes --]

Here's the log from the latest strace.

On Jan 16, 2008 12:17 PM, Alex Riesen <raa.lkml@gmail.com> wrote:
> Paul Umbers, Wed, Jan 16, 2008 19:48:28 +0100:
> > Latest output, log file attached:
> >
> > paulumbers@Devteam29 ~/workspace/git/git-1.5.3
> > $ uname -a > somefile
> >
> > paulumbers@Devteam29 ~/workspace/git/git-1.5.3
> > $ git init
> > Initialized empty Git repository in .git/
> >
> > paulumbers@Devteam29 ~/workspace/git/git-1.5.3
> > $ strace -o log -f -m syscall ./git --exec-path=$(pwd) add somefile
> >
>
> Could you please retry with "-m all"? "-m syscall" is not what one
> might expect, its some tracing of cygwin and not enough at that.
>
>



-- 
Computer Science is no more about computers than astronomy is about telescopes.
--- Edsger W. Dijkstra

Paul Umbers MSc MBCS MIAP
paul.umbers@gmail.com

[-- Attachment #2: log_2.tar.gz --]
[-- Type: application/x-gzip, Size: 35782 bytes --]

^ permalink raw reply

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Jakub Narebski @ 2008-01-16 21:51 UTC (permalink / raw)
  To: Kevin Ballard; +Cc: Johannes Schindelin, Mark Junker, git
In-Reply-To: <65026F2B-5CE8-4238-A9AB-D3545D336B41@sb.org>

On Wed, 16 Jan 2008, Kevin Ballard wrote:
> On Jan 16, 2008, at 11:46 AM, Jakub Narebski wrote:
>>>> More like, Mac OS X has standardized on Unicode and the rest of the
>>>> world hasn't caught up yet. Git is the only tool I've ever heard  
>>>> which has a problem with OS X using Unicode.
>>>
>>> No.  That's not at all the problem.  Mac OS X insists on storing  
>>> _another_  encoding of your filename.  Both are UTF-8.  Both encode
>>> the _same_ string.  Yet they are different, bytewise.  For no good
>>> reason. 
>>
>> To be more exact encoding used to _create_ file differs from encoding
>> returned when _reading directory_...
>>
>>> Stop spreading FUD.  Git can handle Unicode just fine.  In fact,  
>>> Git does not _care_ how the filename is encoded, it _respects_ the
>>> user's choice, not only of the encoding _type_, but the _encoding_,
>>> too. 
>>
>> ...which means that sequence of bytes differ. And Git by design is
>> (both for filenames and for blob contents) encoding agnostic.
>>
>> HFS+ is just _stupid_. And unfortunately Git doesn't support stupid
>> filesystems (e.g. case insensitive filesystems) well.

By the way, calling HFS+ stupid, or rather calling at least two 
different normalizations of UTF-8 (two different encodings) used for 
writing and reading filenames stupid is wrong _for me_. I have quoted 
Linus here, when I think I should use other description.
 
> There's two different ways to do filesystem encodings. One is to have  
> the fs simply not care about encoding, which is what the linux world  
> seems to prefer. Sure, this is great in that what you create the file  
> with is what you get back, but on the other hand, given an arbitrary  
> non-ASCII file on disk, you have absolutely no idea what the encoding  
> should be and you can't display it without making assumptions (yes you  
> can use heuristics, but you're still making assumptions). Filesystems  
> like HFS+ that standardize the encoding, on the other hand, make it  
> such that you always know what the encoding of a file should be, so  
> you can always display and use the filename intelligently. It also  
> means it plays much nicer in a non-ASCII world, since you don't have  
> to worry about different normalizations of a given string referring to  
> different files (it's one thing to be case-sensitive, but claiming  
> that "föo" and "föo" are different files just because one uses a  
> composed character and the other doesn't is extremely user- 
> unfriendly).

For me it looks like a layering violation... but my knowledge about 
filesystem is cluse to nil. IMHO it is VFS and libc which should do the 
translating.

> On the other hand, what you create the file with may not   
> be what you read back later, since the name has been standardized.  
> It's hard to say one is better than the other, they're just different  
> ways of doing it.

But using one encoding to create file, and another when reding filenames 
is strange. It is IMHO better to simply refuse creating filenames which 
are outside chosen encoding / normalization. But having different 
encodings used for reading and writing on the level of filesystem 
access (not on level of UI) is strange.

> However, I have noticed that everybody who's voiced   
> an opinion on this list in favor of the encoding-agnostic approach  
> seem to be unwilling to accept that any other approach might have  
> validity, to the extent of calling an OS/filesystem that does things  
> different stupid or insane. This strikes me as extremely elitist and  
> risks alienating what I expect to be a fast-growing group of users  
> (i.e. OS X users).

First, it is Git philosophy and very core of design to be encoding 
agnostic (to be "content tracker"). Second, using the same sequence of 
bytes on filesystem, in the index, and in 'tree' objects ensures good 
performance... this is something to think about if you want to add 
patches which would deal with HFS+ API/UI quirks.

[cut]
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: [STGIT PATCH] replace "git repo-config" usage by "git config"
From: Peter Oberndorfer @ 2008-01-16 21:46 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Karl Hasselström, Dan McGee, git, Catalin Marinas
In-Reply-To: <m3y7aplbie.fsf@roke.D-201>

On Mittwoch 16 Januar 2008, Jakub Narebski wrote:
> Peter Oberndorfer <kumbayo84@arcor.de> writes:
> 
> > "git repo-config" will be removed soon
> > 
> > Signed-off-by: Peter Oberndorfer <kumbayo84@arcor.de>
> > ---
> > since i am not good at creating log messages, feel free to change it :-)
> > built on top of kha experimental patch
> > passes all testcases for me
> 
> > @@ -47,7 +47,7 @@ class GitConfig:
> >          if self.__cache.has_key(name):
> >              return self.__cache[name]
> >          try:
> > -            value = Run('git', 'repo-config', '--get', 
name).output_one_line()
> > +            value = Run('git', 'config', '--get', name).output_one_line()
> >          except RunException:
> >              value = self.__defaults.get(name, None)
> >          self.__cache[name] = value
> 
> Strange that StGIT didn't abstracted out reading git config, like
> Git.pm and gitweb.perl did.
> 
It is abstracted, that is why the patch only affectes stgit/config.py :-)
(beside tests)
Or do you mean another type of abstraction like loading the whole config file 
at once?
> BTW. will StGIT be using libgit-thin + PyGit, or is it not ready yet?
> 
I did not hear about it lately.
But since i do not create stgit patches very often i have to say actually i 
actually have no idea.

Greetings Peter

^ permalink raw reply

* Re: [PATCH 2/2] close_lock_file(): new function in the lockfile API
From: Junio C Hamano @ 2008-01-16 21:57 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Linus Torvalds, Git Mailing List, Alex Riesen,
	Kristian Høgsberg
In-Reply-To: <Pine.LNX.4.64.0801161443340.31161@torch.nrlssc.navy.mil>

Brandon Casey <casey@nrlssc.navy.mil> writes:

> My patch does this, though I understand it may take some time to review.

This is what I have right now, squashed your change into [2/2] 
I sent earlier, along with a couple of further fixups.

Improvement since my v1 are:

 - close_lock_file() returns int, which is the return value from
   close(2);

 - remove_lock_file() avoids calling close(2) if
   close_lock_file() was called earlier on the lockfile;

 - commit_lock_file() does the same, and notices failure
   returned from close_lock_file().  In addition, it unlinks the
   lockfile and clears lk->filename upon failure;

 - commit_locked_index() does the same, and notices failure
   returned from close_lock_file() in alternate_index_output
   codepath.  It unlinks the lockfile and clears lk->filename
   upon failure;

 - The codepath in commit_locked_index() for writing to
   alternate_index_output clears the alternate_index_output
   variable when it is done.

Most of the above are thanks to the changes to lockfile.c in your
version and Linus's suggestion.

I am planning to take your patch (only the parts that fix the
callers, because the changes to lockfile.c are all included
here) on top of this one.

-- >8 --
close_lock_file(): new function in the lockfile API

The lockfile API is a handy way to obtain a file that is cleaned
up if you die().  But sometimes you would need this sequence to
work:

 1. hold_lock_file_for_update() to get a file descriptor for
    writing;

 2. write the contents out, without being able to decide if the
    results should be committed or rolled back;

 3. do something else that makes the decision --- and this
    "something else" needs the lockfile not to have an open file
    descriptor for writing (e.g. Windows do not want a open file
    to be renamed);

 4. call commit_lock_file() or rollback_lock_file() as
    appropriately.

This adds close_lock_file() you can call between step 2 and 3 in
the above sequence.

[jc: updated with Brandon's stricter error checking on return values
 from close() and a suggestion by Linus.]

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/technical/api-lockfile.txt |   15 +++++++++++--
 cache.h                                  |    2 +-
 lockfile.c                               |   32 ++++++++++++++++++++++++-----
 3 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/Documentation/technical/api-lockfile.txt b/Documentation/technical/api-lockfile.txt
index 5b1553e..dd89404 100644
--- a/Documentation/technical/api-lockfile.txt
+++ b/Documentation/technical/api-lockfile.txt
@@ -37,7 +37,8 @@ commit_lock_file::
 	Take a pointer to the `struct lock_file` initialized
 	with an earlier call to `hold_lock_file_for_update()`,
 	close the file descriptor and rename the lockfile to its
-	final destination.
+	final destination.  Returns 0 upon success, a negative
+	value on failure to close(2) or rename(2).
 
 rollback_lock_file::
 
@@ -45,6 +46,12 @@ rollback_lock_file::
 	with an earlier call to `hold_lock_file_for_update()`,
 	close the file descriptor and remove the lockfile.
 
+close_lock_file::
+	Take a pointer to the `struct lock_file` initialized
+	with an earlier call to `hold_lock_file_for_update()`,
+	and close the file descriptor.  Returns 0 upon success,
+	a negative value on failure to close(2).
+
 Because the structure is used in an `atexit(3)` handler, its
 storage has to stay throughout the life of the program.  It
 cannot be an auto variable allocated on the stack.
@@ -54,8 +61,10 @@ done writing to the file descriptor.  If you do not call either
 and simply `exit(3)` from the program, an `atexit(3)` handler
 will close and remove the lockfile.
 
-You should not close the file descriptor you obtained from
-`hold_lock_file_for_update` function yourself.  The `struct
+If you need to close the file descriptor you obtained from
+`hold_lock_file_for_update` function yourself, do so by calling
+`close_lock_file()`.  You should never call `close(2)` yourself!
+Otherwise the `struct
 lock_file` structure still remembers that the file descriptor
 needs to be closed, and a later call to `commit_lock_file()` or
 `rollback_lock_file()` will result in duplicate calls to
diff --git a/cache.h b/cache.h
index 39331c2..24735bd 100644
--- a/cache.h
+++ b/cache.h
@@ -308,7 +308,7 @@ extern int commit_lock_file(struct lock_file *);
 extern int hold_locked_index(struct lock_file *, int);
 extern int commit_locked_index(struct lock_file *);
 extern void set_alternate_index_output(const char *);
-
+extern int close_lock_file(struct lock_file *);
 extern void rollback_lock_file(struct lock_file *);
 extern int delete_ref(const char *, const unsigned char *sha1);
 
diff --git a/lockfile.c b/lockfile.c
index f45d3ed..fcf9285 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -13,7 +13,8 @@ static void remove_lock_file(void)
 	while (lock_file_list) {
 		if (lock_file_list->owner == me &&
 		    lock_file_list->filename[0]) {
-			close(lock_file_list->fd);
+			if (lock_file_list->fd >= 0)
+				close(lock_file_list->fd);
 			unlink(lock_file_list->filename);
 		}
 		lock_file_list = lock_file_list->next;
@@ -159,11 +160,23 @@ int hold_lock_file_for_update(struct lock_file *lk, const char *path, int die_on
 	return fd;
 }
 
+int close_lock_file(struct lock_file *lk)
+{
+	int fd = lk->fd;
+	lk->fd = -1;
+	return close(fd);
+}
+
 int commit_lock_file(struct lock_file *lk)
 {
 	char result_file[PATH_MAX];
 	int i;
-	close(lk->fd);
+
+	if (lk->fd >= 0 && close_lock_file(lk)) {
+		unlink(lk->filename);
+		lk->filename[0] = 0;
+		return -1;
+	}
 	strcpy(result_file, lk->filename);
 	i = strlen(result_file) - 5; /* .lock */
 	result_file[i] = 0;
@@ -185,9 +198,15 @@ void set_alternate_index_output(const char *name)
 int commit_locked_index(struct lock_file *lk)
 {
 	if (alternate_index_output) {
-		int result = rename(lk->filename, alternate_index_output);
-		lk->filename[0] = 0;
-		return result;
+		const char *newname = alternate_index_output;
+		alternate_index_output = NULL;
+
+		if (lk->fd >= 0 && close_lock_file(lk)) {
+			unlink(lk->filename);
+			lk->filename[0] = 0;
+			return -1;
+		}
+		return rename(lk->filename, newname);
 	}
 	else
 		return commit_lock_file(lk);
@@ -196,7 +215,8 @@ int commit_locked_index(struct lock_file *lk)
 void rollback_lock_file(struct lock_file *lk)
 {
 	if (lk->filename[0]) {
-		close(lk->fd);
+		if (lk->fd >= 0)
+			close(lk->fd);
 		unlink(lk->filename);
 	}
 	lk->filename[0] = 0;
-- 
1.5.4.rc3.14.g44397

^ permalink raw reply related

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Kevin Ballard @ 2008-01-16 22:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Johannes Schindelin, Mark Junker, git
In-Reply-To: <200801162251.54219.jnareb@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]

On Jan 16, 2008, at 4:51 PM, Jakub Narebski wrote:

>> On the other hand, what you create the file with may not
>> be what you read back later, since the name has been standardized.
>> It's hard to say one is better than the other, they're just different
>> ways of doing it.
>
> But using one encoding to create file, and another when reding  
> filenames
> is strange. It is IMHO better to simply refuse creating filenames  
> which
> are outside chosen encoding / normalization. But having different
> encodings used for reading and writing on the level of filesystem
> access (not on level of UI) is strange.

It's not using different encodings, it's all Unicode. However, it  
accepts different normalization variants of Unicode, since it can read  
them all and it would be folly to require everybody to conform to its  
own special internal variant. But it does have to normalize them,  
otherwise how would it detect the same filename using different  
normalizations? Also, it may seem strange to have different names  
between reading and writing, but that's only if you think of the name  
as a sequence of bytes - when treated as a sequence of characters, you  
get the same result. In other words, you're used to filenames as  
bytes, HFS+ treats filenames as strings.

>> However, I have noticed that everybody who's voiced
>> an opinion on this list in favor of the encoding-agnostic approach
>> seem to be unwilling to accept that any other approach might have
>> validity, to the extent of calling an OS/filesystem that does things
>> different stupid or insane. This strikes me as extremely elitist and
>> risks alienating what I expect to be a fast-growing group of users
>> (i.e. OS X users).
>
> First, it is Git philosophy and very core of design to be encoding
> agnostic (to be "content tracker"). Second, using the same sequence of
> bytes on filesystem, in the index, and in 'tree' objects ensures good
> performance... this is something to think about if you want to add
> patches which would deal with HFS+ API/UI quirks.

Sure, it makes sense from a performance perspective, but it causes  
problems with HFS+ and any other filesystem that behaves the same way.  
In the previous discussion about case-sensitivity, somebody suggested  
using a lookup table to map between git's internal representation and  
the name the filesystem returns, which seems like a decent idea and  
one that could be enabled with a config parameter to avoid penalizing  
repos on other filesystems. But I don't know enough about the  
internals of git to even think of trying to implement it myself.

-- 
Kevin Ballard
http://kevin.sb.org
kevin@sb.org
http://www.tildesoft.com



[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 2432 bytes --]

^ permalink raw reply

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Johannes Schindelin @ 2008-01-16 22:23 UTC (permalink / raw)
  To: Kevin Ballard; +Cc: Jakub Narebski, Mark Junker, git
In-Reply-To: <1574A90A-8C45-46AD-9402-34AE6F582B3F@sb.org>

Hi,

On Wed, 16 Jan 2008, Kevin Ballard wrote:

> It's not using different encodings, it's all Unicode.

But that's the _point_!  It _is_ Unicode, yet it uses _different_ 
encodings of the _same_ string.

Now, this discussion gets really annoying.  The real question is: will you 
do something about it, or reply with another 500-line email?

Ciao,
Dscho

^ permalink raw reply

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Linus Torvalds @ 2008-01-16 22:32 UTC (permalink / raw)
  To: Kevin Ballard; +Cc: Jakub Narebski, Johannes Schindelin, Mark Junker, git
In-Reply-To: <1574A90A-8C45-46AD-9402-34AE6F582B3F@sb.org>



On Wed, 16 Jan 2008, Kevin Ballard wrote:
> 
> It's not using different encodings, it's all Unicode. However, it accepts
> different normalization variants of Unicode, since it can read them all and it
> would be folly to require everybody to conform to its own special internal
> variant. But it does have to normalize them, otherwise how would it detect the
> same filename using different normalizations?

That's a singularly *stupid* argument.

Here, let me rephrase that same idiotic argument:

  "But it does have to uppercase them, otherwise how would it detect the 
   same filename using different cases?"

..and if you don't see how that's *exactly* the same argument, you really 
are stupid.

The fact is, normalization is wrong.

It's wrong when you normalize upper/lower case (no, the word "Polish" is 
not the same as "polish"), and it's equally wrong when you normalize for 
"looks similar".

> In other words, you're used to filenames as bytes, HFS+ treats filenames 
> as strings.

No. HFS+ treats users as idiots and thinks that it should "fix" the 
filename for them. And it causes problems.

It causes problems for exactly the same reasons case-independence causes 
problems, because it's EXACTLY THE SAME ISSUE. People may think that "but 
they are the same", but they aren't. Case matters. And so does "single 
character" vs "two character overlay". 

Does it always matter? Hell no. But the problem with a filesystem that 
thinks it knows better is that when it *sometimes* matters, the filesystem 
simply DOES THE WRONG THING.

Can't you understand that?

			Linus

^ permalink raw reply

* Re: I don't want the .git directory next to my code.
From: Wayne Davison @ 2008-01-16 22:33 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Mike, git
In-Reply-To: <alpine.LNX.1.00.0801152305050.13593@iabervon.org>

On Tue, Jan 15, 2008 at 11:13:58PM -0500, Daniel Barkalow wrote:
> export GIT_DIR=/somewhere-else.git
> 
> Note that this only really works if you've only got one repository; 
> there's no good way to have the repository information outside of the 
> working directory and still have which repository you're using depend on 
> which directory you're working in.

If you use zsh as your shell, put this chpwd function in .zprofile or
.zshrc:

chpwd() {
    local gitbasedir=/var/local/git
    local subdir=$gitbasedir$PWD
    while [[ ! -d $subdir ]]; do
	subdir=${subdir:h}
    done
    if [[ -d $subdir/.git ]]; then
	export GIT_DIR=$subdir/.git
	export GIT_WORK_TREE=${subdir/$gitbasedir/}
    else
	unset GIT_DIR GIT_WORK_TREE
    fi
}

Then, for each /path/.git dir, move it to /var/local/git/path/.git
(creating any needed parent dirs).  On every change of directory in
the shell, the variables GIT_DIR and GIT_WORK_TREE will be either
updated or cleared.  If you ever move a work-tree dir, you'll need
to remember to move the .git dir in the /var/local/git hierarchy.

..wayne..

^ permalink raw reply

* Re: Merging using only fast-forward
From: Sverre Hvammen Johansen @ 2008-01-16 22:38 UTC (permalink / raw)
  To: git
In-Reply-To: <7vwsq9o6ls.fsf@gitster.siamese.dyndns.org>

On Jan 16, 2008 12:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> A sane integration is a different story.
>
> We have --ff and --no-ff options to merge.  How should this new
> option --ff-only mesh with them?  Perhaps we would want to have
> an option --ff that takes three values?
>
>         --ff=never
>         --ff=normal
>         --ff=only
>
> and have the first one be synonym for existing --no-ff, the second
> one to be a synonym for not giving anything (or giving --ff
> explicitly), and the third one to be this new mode of operation?

Thanks for the patch.  I can probably look into it tonight and do the
suggested integration and test it out, I keep you posted.

-- 
Sverre Hvammen Johansen

^ permalink raw reply

* Re: [PATCH 2/2] close_lock_file(): new function in the lockfile API
From: Brandon Casey @ 2008-01-16 22:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Git Mailing List, Alex Riesen,
	Kristian Høgsberg
In-Reply-To: <7v7ii9o2ld.fsf@gitster.siamese.dyndns.org>

Junio C Hamano wrote:
> Brandon Casey <casey@nrlssc.navy.mil> writes:
> 
>> My patch does this, though I understand it may take some time to review.
> 
> This is what I have right now, squashed your change into [2/2] 
> I sent earlier, along with a couple of further fixups.
> 

Mainly, I prefer to not modify the data structures when a failure occurs.

Generally when commit_lock_file fails, the caller die()'s and then
remove_lock_file will delete the temporary file. If the caller were to
not call die, I think it should call rollback_lock_file() similarly to
how unlock_ref() is called everywhere in refs.c.

So I think the semantics should be:

	lock_fd = hold_lock_file(&lock);
	<do_something>
	if (close_lock_file(&lock)) {
		rollback_lock_file(&lock);
		return error;
	}
	if (commit_lock_file(&lock)) {
		rollback_lock_file(&lock);
		return error;
	}


> diff --git a/lockfile.c b/lockfile.c
> index f45d3ed..fcf9285 100644
> --- a/lockfile.c
> +++ b/lockfile.c
> @@ -13,7 +13,8 @@ static void remove_lock_file(void)
>  	while (lock_file_list) {
>  		if (lock_file_list->owner == me &&
>  		    lock_file_list->filename[0]) {
> -			close(lock_file_list->fd);
> +			if (lock_file_list->fd >= 0)
> +				close(lock_file_list->fd);
>  			unlink(lock_file_list->filename);
>  		}
>  		lock_file_list = lock_file_list->next;
> @@ -159,11 +160,23 @@ int hold_lock_file_for_update(struct lock_file *lk, const char *path, int die_on
>  	return fd;
>  }
>  
> +int close_lock_file(struct lock_file *lk)
> +{
> +	int fd = lk->fd;
> +	lk->fd = -1;
> +	return close(fd);
> +}

minor nit on this that I mentioned in another email.

> +
>  int commit_lock_file(struct lock_file *lk)
>  {
>  	char result_file[PATH_MAX];
>  	int i;
> -	close(lk->fd);
> +
> +	if (lk->fd >= 0 && close_lock_file(lk)) {
> +		unlink(lk->filename);
> +		lk->filename[0] = 0;
> +		return -1;
> +	}


I would rather have the caller call rollback_lock_file, or
fall back to remove_lock_file rather than do this unlinking
and modifying lk->filename here in commit_lock_file.


>  	strcpy(result_file, lk->filename);
>  	i = strlen(result_file) - 5; /* .lock */
>  	result_file[i] = 0;
> @@ -185,9 +198,15 @@ void set_alternate_index_output(const char *name)
>  int commit_locked_index(struct lock_file *lk)
>  {
>  	if (alternate_index_output) {
> -		int result = rename(lk->filename, alternate_index_output);
> -		lk->filename[0] = 0;
> -		return result;
> +		const char *newname = alternate_index_output;
> +		alternate_index_output = NULL;
> +
> +		if (lk->fd >= 0 && close_lock_file(lk)) {
> +			unlink(lk->filename);
> +			lk->filename[0] = 0;
> +			return -1;
> +		}

ditto here.

> +		return rename(lk->filename, newname);


If rename succeeds, we'll try to do an unnecessary unlink atexit.


Having said those things, I will defer to your more experienced judgment.

-brandon

^ permalink raw reply

* Re: Merging using only fast-forward
From: Junio C Hamano @ 2008-01-16 22:49 UTC (permalink / raw)
  To: Sverre Hvammen Johansen; +Cc: git
In-Reply-To: <402c10cd0801161438u7443ebf2xdf1d6f3040ceb1a9@mail.gmail.com>

"Sverre Hvammen Johansen" <hvammen@gmail.com> writes:

> On Jan 16, 2008 12:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> A sane integration is a different story.
>>
>> We have --ff and --no-ff options to merge.  How should this new
>> option --ff-only mesh with them?  Perhaps we would want to have
>> an option --ff that takes three values?
>>
>>         --ff=never
>>         --ff=normal
>>         --ff=only
>>
>> and have the first one be synonym for existing --no-ff, the second
>> one to be a synonym for not giving anything (or giving --ff
>> explicitly), and the third one to be this new mode of operation?
>
> Thanks for the patch.  I can probably look into it tonight and do the
> suggested integration and test it out, I keep you posted.

Well, no, I did not _suggest_ that particular integration.  I do
not even know what the right answer is to these questions.  The
UI that pastebin patch implements, which uses totally separate
flags, may turn out to be the right approach.  If you want to
carry the torch forward on this topic, that's great.  It is very
much appreciated.  But one of the things included in that work
includes thinking and deciding which UI is better between the
patch in the pastebin and the one that tries to introduce a
unified --ff=<value> option.  I simply do not know the answer,
and prefer not having to think about it during a pre-release
feature freeze.

Coming up with a 7-liner quick-and-dirty solution is one thing.
It was easy (although I haven't tested it, so it may not work at
all!).  But if we want to mainline it, we need to think
carefully how the new feature integrates well with what we
already have.  That's all I wanted to say.

^ permalink raw reply

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Linus Torvalds @ 2008-01-16 22:52 UTC (permalink / raw)
  To: Kevin Ballard; +Cc: Jakub Narebski, Johannes Schindelin, Mark Junker, git
In-Reply-To: <alpine.LFD.1.00.0801161424040.2806@woody.linux-foundation.org>



On Wed, 16 Jan 2008, Linus Torvalds wrote:
> 
> Does it always matter? Hell no. But the problem with a filesystem that 
> thinks it knows better is that when it *sometimes* matters, the filesystem 
> simply DOES THE WRONG THING.
> 
> Can't you understand that?

Side note: there are ways to do it right.

You can:

 - not do conversion at all (which is always right). Not corrupting the 
   user data means that the user never gets something back that he didn't 
   put in

   (And, btw, the "security" argument is total BS. The fact that two 
   characters look the same does not mean that they should act the same, 
   and it is *not* a security feature. Quite the reverse. Having programs 
   that get different results back from what they actually wrote, *that* 
   tends to be a security issue, because now you have a confused program, 
   and I guarantee that there are more bugs in unexpected cases than in 
   the expected ones)

 - Not accept data in formats that you don't like. This is also always 
   right, but can be rather impolite.

 - Not accept data in formats that you don't like, and give people 
   explicit conversion and comparison routines so that they can then make 
   their own decisions and they are *aware* of the conversion (so that 
   they don't come back to the problem of being confused)

So there are certainly many ways to handle things like this.

The one thing you shouldn't do is to silently convert data behind the 
programs back, without even giving any way to disable it (and that disable 
has to be on a use-by-use casis, not some "disable/enable for all users of 
this filesystem", because you can - and do - have different programs that 
have different expectations).

And finally: all of the above is true at *all* levels. It doesn't matter 
one whit whether the automatic conversion conversion is in the kernel or 
in a library. Doing it on a library level has advantages (namely the whole 
"disable/enable" thing tends to get *much* easier to do, and applications 
can decide to link against a particular version to get the behaviour 
*they* want, for example).

So doing it inside the kernel is just about the worst possible case, 
exactly because it makes it really hard to do a "on a case-by-case" basis. 

Yes, Linux does it too, but it does it only for filesystems that are 
*defined* to be insane. OS X really should have known better. Especially 
since they already fixed the applications (ie they do allow for 
case-sensitive filesystems).

I can understand normalization when it's about case-insensitivity (there 
are lots of _technical_ reasons to do it there), but once you let the 
case-insensitivity go, there just isn't any excuse any more.

			Linus

^ permalink raw reply

* Re: [PATCH 2/2] close_lock_file(): new function in the lockfile API
From: Junio C Hamano @ 2008-01-16 22:55 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Linus Torvalds, Git Mailing List, Alex Riesen,
	Kristian Høgsberg
In-Reply-To: <478E893F.4070100@nrlssc.navy.mil>

Brandon Casey <casey@nrlssc.navy.mil> writes:

> Mainly, I prefer to not modify the data structures when a failure occurs.

Ok.  Is the rest of your patch that fixes callers Ok with that
semantics?  If so, I'd agree that is probably cleaner.  I'll
scrap the one we are discussing, resurrecting only the api
documentation part, and replace it with the lockfile.c changes
from your patch, along with the fixes to callers.
 

^ permalink raw reply

* fixlet in docs.
From: Thomas Zander @ 2008-01-16 22:48 UTC (permalink / raw)
  To: git; +Cc: gitster

>From 785f928973e3ec3e5ca27f322fb44890e843a340 Mon Sep 17 00:00:00 2001
From: Thomas Zander <Thomas Zander zander@kde.org>
Date: Wed, 16 Jan 2008 23:42:00 +0100
Subject: [PATCH] Fixlet in doc; there was an open brace without a closing one.
 Signed-off-by: Thomas Zander <zander@kde.org>

---
 Documentation/core-tutorial.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/core-tutorial.txt b/Documentation/core-tutorial.txt
index bd6cd41..aa40dfd 100644
--- a/Documentation/core-tutorial.txt
+++ b/Documentation/core-tutorial.txt
@@ -578,7 +578,7 @@ particular state. You can, for example, do
 $ git diff my-first-tag
 ----------------
 
-to diff your current state against that tag (which at this point will
+to diff your current state against that tag which at this point will
 obviously be an empty diff, but if you continue to develop and commit
 stuff, you can use your tag as an "anchor-point" to see what has changed
 since you tagged it.
-- 
1.5.3
-- 
Thomas Zander

^ permalink raw reply related

* git apply behaves differently from patch(1)
From: Thomas Zander @ 2008-01-16 22:58 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1396 bytes --]

In the following usecase git apply (git version 1.5.4.rc3.15.g785f9) 
doesn't do what I expect it should do. I expect it to do the same as 
patch does in the same situation.

To reproduce;
Create a file 'test' with a number on each line.  Numbers 1 though 10.
$ for i in 1 2 3 4 5 6 7 8 9 10; do echo $i >> test; done
$ git add test
$ git commit test

$ rm test
$ for i in 1 3 4 5 6 7 10; do echo $i >> test; done
$ git diff-index -p --unified=0 HEAD test | tee mypatch

Now use your editor to edit 'mypatch' and remove the first hunk; the end 
result (after your editing) should be something like this;
$ cat mypatch
diff --git a/test b/test
index f00c965..319869c 100644
--- a/test
+++ b/test
@@ -8,2 +6,0 @@
-8
-9

apply revert this patch;
$ git apply -R --unidiff-zero --apply mypatch
$ git diff

What I expect (and what I get if I replace git apply with a 'patch -R -p1 
< mypatch') is that the diff shows line "2" is still missing.

What I get instead is that "2" is missing but also that "10" moved 2 lines 
up.
I conclude that git somehow doesn't like the patch to be removed, while 
patch(1) has no problem with that.

I hope you agree its a bug and fix it in an upcoming version, it would be 
great if I can avoid using patch(1) or worse.
If you have any questions feel free to ask; but please cc me as I am not 
subscribed.
-- 
Thomas Zander

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Wincent Colaiuta @ 2008-01-16 23:03 UTC (permalink / raw)
  To: Kevin Ballard; +Cc: Johannes Schindelin, Mark Junker, git
In-Reply-To: <427BE4FD-6534-4CB2-91F8-F9014DC82B54@sb.org>

El 16/1/2008, a las 16:43, Kevin Ballard escribió:

> On Jan 16, 2008, at 10:34 AM, Johannes Schindelin wrote:
>
>> On Wed, 16 Jan 2008, Mark Junker wrote:
>>
>>> I have some files like "Lüftung.txt" in my repository. The strange  
>>> thing is
>>> that I can pull / add / commit / push those files without problem  
>>> but
>>> git-status always complains that thoes files are untraced (but not  
>>> missing).
>>
>> This is a known problem.  Unfortunately, noone has implemented a fix,
>> although if you're serious about it, I can point you to threads  
>> where it
>> has been hinted how to solve the issue.
>>
>> FWIW the issue is that Mac OS X decides that it knows better how to  
>> encode
>> your filename than you could yourself.
>
> More like, Mac OS X has standardized on Unicode and the rest of the  
> world hasn't caught up yet. Git is the only tool I've ever heard of  
> that has a problem with OS X using Unicode.

As far as I know, Subversion has basically exactly the same problem,  
and any time you consume/produce files on Mac OS X that are be  
consumed/produced on other platforms you will run into this kind of  
issue, with any software.

Tell Mac OS X to write a file with "ó" in the file name ("\xc3\xb3" in  
UTF-8), and it will "normalize" it prior to writing by converting it  
into a decomposed form (that is, ASCII "o" followed by "\xcc\x81", or  
"combining acute accent"). So they're both valid Unicode, both valid  
UTF-8, and they encode exactly the same characters but the byte stream  
is different.

If you only work on Mac OS X then this will never be a problem because  
all the files you create and therefore all the files you add to your  
Git repository will have their names in decomposed UTF-8. But when you  
start cloning repositories containing files added on other systems,  
systems which might use precomposed rather than decomposed UTF-8 then  
you'll run into exactly this kind of problem. The git.git repo has one  
such file itself (gitweb/test/Märchen, if I remember correctly, which  
Git reports as untracked).

Now, Mac OS X's behaviour is not entirely "insane" as some would  
claim; there is indeed a rationale behind it even if you don't agree  
with it, but it *does* produce some unfortunate teething problems for  
people wanting to use Mac OS X in a cross-platform environment.

Here are some Apple docs on the subject:

http://developer.apple.com/qa/qa2001/qa1173.html

http://developer.apple.com/qa/qa2001/qa1235.html

I personally wish that UTF-8 didn't allow different normalization  
forms; then this kind of problem wouldn't arise. But it has arisen and  
we have to live with it. Some workarounds have been proposed for Git,  
but I haven't seen any convincing proposals yet.

Cheers,
Wincent

^ permalink raw reply

* Re: git on MacOSX and files with decomposed utf-8 file names
From: Eyvind Bernhardsen @ 2008-01-16 22:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Kevin Ballard, Mark Junker, git
In-Reply-To: <alpine.LSU.1.00.0801161629580.17650@racer.site>

On 16. jan.. 2008, at 17.32, Johannes Schindelin wrote:

>>> FWIW the issue is that Mac OS X decides that it knows better how to
>>> encode your filename than you could yourself.
>>
>> More like, Mac OS X has standardized on Unicode and the rest of the
>> world hasn't caught up yet. Git is the only tool I've ever heard of  
>> that
>> has a problem with OS X using Unicode.
>
> No.  That's not at all the problem.  Mac OS X insists on storing  
> _another_
> encoding of your filename.  Both are UTF-8.  Both encode the _same_
> string.  Yet they are different, bytewise.  For no good reason.
>
> Stop spreading FUD.  Git can handle Unicode just fine.  In fact, Git  
> does
> not _care_ how the filename is encoded, it _respects_ the user's  
> choice,
> not only of the encoding _type_, but the _encoding_, too.

"FUD" is a bit strong, don't you think?  HFS+ is the way it is and it  
would be nice if Git could deal with it.

The problem is that HFS+ normalizes filenames to avoid multiple files  
that appear to have the same name (eg "M<A WITH UMLAUT>rchen" vs  
"Ma<UMLAUT MODIFIER>rchen", in gitweb/test).  This is sort of like  
case sensitivity, but filenames are normalized when a file is  
_created_.  Git, not unreasonably, expects a file to keep the name it  
was created with.

As far as I can tell, as long as you add all your internationally  
becharactered files to git from an HFS+ file system using a gui or  
command-line completion, you'll be okay; trouble starts when you check  
in a file with the composed form of a character, by typing the name on  
the command line (I'm not sure about this one) or committing on  
another OS.  Git will store the filename in composed form, but the  
Mac's filesystem will decompose the filename when you check the file  
out.

The result looks like this:

vredefort:[git]% git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	gitweb/test/Märchen
nothing added to commit but untracked files present (use "git add" to  
track)

(this is directly after checking out git.git @ v1.5.4-rc3)

There are two things to note here.  One is that Git thinks that there  
is a new file called "gitweb/test/Märchen" (decomposed) when it's  
"really" just the same "gitweb/test/Märchen" (precomposed) that's in  
the repository.  The other is that git _thinks_ that the "gitweb/test/ 
Märchen" (precomposed) it's expecting is still there, because the  
filesystem, when asked for "gitweb/test/Märchen" in any form will  
return the file "gitweb/test/Märchen" (decomposed).

Trying to check out the "next" branch at this point is a pain since  
next's "Märchen" would overwrite the untracked "Märchen".

I can't provide links to any previous discussions about this, but  
here's Apple's Technical Q&A on the subject:

http://developer.apple.com/qa/qa2001/qa1235.html

Finding a sane way of allowing git to handle this behaviour is left as  
an exercise for the reader.

Eyvind Bernhardsen

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox