* git-svn does not seems to work with crlf convertion enabled.
@ 2008-07-23  8:44 Alexander Litvinov
  2008-07-23  9:18 ` Johannes Schindelin
  0 siblings, 1 reply; 45+ messages in thread
From: Alexander Litvinov @ 2008-07-23  8:44 UTC (permalink / raw)
  To: git

Hello list.

In short: I can't clone an svn repo into git when crlf conversion is
activated.

Long story. I use the latest git:

$ git version
git version 1.5.6.4

I have used git at work for a long time. The main repo is svn-powered and I
use git-svn to link git and svn. The project itself is a Windows C++ project.
I run git on a Linux machine (Debian etch with git manually backported from
sid) and work with the Linux-hosted project through samba.

From the beginning I did not enable crlf conversion, so I broke the CRLF line
endings in files one by one with my commits. My co-workers do not like this,
and I finally decided to try git's autocrlf feature.

So I took a copy of my git repo and converted all text files to unix LF line
endings:

git filter-branch --tree-filter "find -type f \( -iname '*.h' -or \
	-iname '*.cpp' -or -iname '*.vcproj' -or -iname '*.sln' -or \
	-iname '*.h.tmpl' -or -iname '*.bat' -or -iname '*.mp' -or \
	-iname '*.txt' -or -iname '*.nsi' -or -iname '*.def' -or \
	-iname '*.rc' -or -iname '*.ini' -or -iname '*.inf' -or \
	-iname '*.skin' -or -iname '*.c' -or -iname '*.dsp' \
	-or -iname '*.dsw' \) -print0 | xargs -r0 dos2unix" \
	`git branch -a | sed 's/^..//'`

It finished successfully. After it finished I added .git/info/attributes like
this:

* -crlf
*.h crlf
*.c crlf
*.cpp crlf

and so on... and set core.autocrlf to true and safecrlf to false. I also
cleared all of git-svn's caches:

rm -rf .git/svn

As I understand it, I now have a clean repo that is capable of working with
crlf conversion. Let's update it (on a branch forked from trunk):

git svn rebase
<.. some long list of revs during migration to new git-svn layout..>
Done rebuilding .git/svn/trunk/.rev_map.f1f59411-8b2e-0410-9ee3-aa470c928bf2
	M	FindHistory.cpp
Incomplete data: Delta source ended unexpectedly at /tmp/g/bin/git-svn line 3856

Oops! What's this? I am not able to update. I can update other branches but
not trunk.

So I had to try my old original repo without crlf conversion enabled. It
updated successfully. I can't show the log; it was lost and I was not able to
reproduce it.

Is there any way to fix this problem?

P.S. I can't even clone that svn repo from scratch with crlf conversion
enabled.

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 8:44 git-svn does not seems to work with crlf convertion enabled Alexander Litvinov @ 2008-07-23 9:18 ` Johannes Schindelin 2008-07-23 11:52 ` Alexander Litvinov ` (2 more replies) 0 siblings, 3 replies; 45+ messages in thread From: Johannes Schindelin @ 2008-07-23 9:18 UTC (permalink / raw) To: Alexander Litvinov; +Cc: git Hi, On Wed, 23 Jul 2008, Alexander Litvinov wrote: > In short: I can't clone svn repo into git when crlf convertion is > activated. This is a known issue, but since nobody with that itch seems to care enough to fix it, I doubt it will ever be fixed. Ciao, Dscho ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 9:18 ` Johannes Schindelin @ 2008-07-23 11:52 ` Alexander Litvinov 2008-07-23 12:57 ` Johannes Schindelin 2008-07-24 14:24 ` Dmitry Potapov 2008-07-30 4:37 ` Alexander Litvinov 2008-07-31 5:43 ` [PATCH] git-svn now " Alexander Litvinov 2 siblings, 2 replies; 45+ messages in thread From: Alexander Litvinov @ 2008-07-23 11:52 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > In short: I can't clone svn repo into git when crlf convertion is > > activated. > > This is a known issue, but since nobody with that itch seems to care > enough to fix it, I doubt it will ever be fixed. That is a bad news for me. Anyway I will spend some time at holidays during digging this bug. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 11:52 ` Alexander Litvinov @ 2008-07-23 12:57 ` Johannes Schindelin 2008-07-23 15:49 ` Avery Pennarun ` (2 more replies) 2008-07-24 14:24 ` Dmitry Potapov 1 sibling, 3 replies; 45+ messages in thread From: Johannes Schindelin @ 2008-07-23 12:57 UTC (permalink / raw) To: Alexander Litvinov; +Cc: git Hi, On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > > In short: I can't clone svn repo into git when crlf convertion is > > > activated. > > > > This is a known issue, but since nobody with that itch seems to care > > enough to fix it, I doubt it will ever be fixed. > > That is a bad news for me. Anyway I will spend some time at holidays > during digging this bug. Note that you will have to do your digging using msysGit (i.e. the developer's pack, not the installer for plain Git), since git-svn will be removed from the next official "Windows Git" release, due to lack of fixers. Ciao, Dscho ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 12:57 ` Johannes Schindelin @ 2008-07-23 15:49 ` Avery Pennarun 2008-07-23 16:07 ` Johannes Schindelin 2008-07-24 3:13 ` Alexander Litvinov 2008-08-06 11:15 ` Petr Baudis 2 siblings, 1 reply; 45+ messages in thread From: Avery Pennarun @ 2008-07-23 15:49 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Alexander Litvinov, git On 7/23/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > > > In short: I can't clone svn repo into git when crlf convertion is > > > > activated. > > > > > > This is a known issue, but since nobody with that itch seems to care > > > enough to fix it, I doubt it will ever be fixed. > > > > That is a bad news for me. Anyway I will spend some time at holidays > > during digging this bug. > > Note that you will have to do your digging using msysGit (i.e. the > developer's pack, not the installer for plain Git), since git-svn will be > removed from the next official "Windows Git" release, due to lack of > fixers. Presumably cygwin git will work too, right? Does this known issue apply only to msysGit, or both msys and Cygwin, or all versions? ie. could it be debugged on Linux? Thanks, Avery ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 15:49 ` Avery Pennarun @ 2008-07-23 16:07 ` Johannes Schindelin 0 siblings, 0 replies; 45+ messages in thread From: Johannes Schindelin @ 2008-07-23 16:07 UTC (permalink / raw) To: Avery Pennarun; +Cc: Alexander Litvinov, git Hi, On Wed, 23 Jul 2008, Avery Pennarun wrote: > On 7/23/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > > > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > > > > In short: I can't clone svn repo into git when crlf convertion > > > > > is activated. > > > > > > > > This is a known issue, but since nobody with that itch seems to > > > > care enough to fix it, I doubt it will ever be fixed. > > > > > > That is a bad news for me. Anyway I will spend some time at > > > holidays during digging this bug. > > > > Note that you will have to do your digging using msysGit (i.e. the > > developer's pack, not the installer for plain Git), since git-svn will > > be removed from the next official "Windows Git" release, due to lack > > of fixers. > > Presumably cygwin git will work too, right? Yes. > Does this known issue apply only to msysGit, or both msys and Cygwin, or > all versions? ie. could it be debugged on Linux? You mean the crlf vs git-svn issue? No, yes, yes, yes, and yes. Ciao, Dscho ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 12:57 ` Johannes Schindelin 2008-07-23 15:49 ` Avery Pennarun @ 2008-07-24 3:13 ` Alexander Litvinov 2008-08-06 11:15 ` Petr Baudis 2 siblings, 0 replies; 45+ messages in thread From: Alexander Litvinov @ 2008-07-24 3:13 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git > Note that you will have to do your digging using msysGit (i.e. the > developer's pack, not the installer for plain Git), since git-svn will be > removed from the next official "Windows Git" release, due to lack of > fixers. You will not believe me. I use git under Linux, develop under windows on network drive :-) ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-23 12:57 ` Johannes Schindelin 2008-07-23 15:49 ` Avery Pennarun 2008-07-24 3:13 ` Alexander Litvinov @ 2008-08-06 11:15 ` Petr Baudis 2008-08-06 12:35 ` Peter Harris ` (2 more replies) 2 siblings, 3 replies; 45+ messages in thread From: Petr Baudis @ 2008-08-06 11:15 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Alexander Litvinov, git Hi, On Wed, Jul 23, 2008 at 01:57:54PM +0100, Johannes Schindelin wrote: > Note that you will have to do your digging using msysGit (i.e. the > developer's pack, not the installer for plain Git), since git-svn will be > removed from the next official "Windows Git" release, due to lack of > fixers. is there any other problem with git-svn on Windows than the CRLF issue? I couldn't find anything significant in the issue tracker. If not, why do you want to drop git-svn from Windows Git? It seems that the CRLF issue has trivial workaround to set autocrlf=false; this will make git-svn-tracked repositories useful only on Windows, but I'd bet this is fine for large majority of Windows git-svn users? -- Petr "Pasky" Baudis The next generation of interesting software will be done on the Macintosh, not the IBM PC. -- Bill Gates ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled.
  2008-08-06 11:15 ` Petr Baudis
@ 2008-08-06 12:35 ` Peter Harris
  2008-08-06 12:43 ` Johannes Schindelin
  2008-08-06 16:11 ` git-svn does not seems to work with crlf convertion enabled Dmitry Potapov
  2 siblings, 0 replies; 45+ messages in thread
From: Peter Harris @ 2008-08-06 12:35 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Alexander Litvinov, git

On Wed, Aug 6, 2008 at 7:15 AM, Petr Baudis wrote:
> On Wed, Jul 23, 2008 at 01:57:54PM +0100, Johannes Schindelin wrote:
>> Note that you will have to do your digging using msysGit (i.e. the
>> developer's pack, not the installer for plain Git), since git-svn will be
>> removed from the next official "Windows Git" release, due to lack of
>> fixers.
>
> is there any other problem with git-svn on Windows than the CRLF
> issue? I couldn't find anything significant in the issue tracker.

The main problem currently is that git is Win32, and perl is Msys.
When perl asks git to read files from /tmp (a path that doesn't exist
outside Msys), it grinds to a screeching halt.

The quick and dirty fix is to convince git-svn to write temporary
files somewhere else (maybe by passing DIR => $ENV{GIT_DIR} to
File::Temp::tempname, but I've been too embarrassed to suggest that
publicly).

The correct fix is to switch the msysGit perl from Msys to Vanilla,
but I've been too lazy to finish that up (as the SVN modules quickly
descend into dependency hell).

Peter Harris

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-08-06 11:15 ` Petr Baudis 2008-08-06 12:35 ` Peter Harris @ 2008-08-06 12:43 ` Johannes Schindelin 2008-08-06 13:51 ` git-svn on MSysGit and why is it (going to be?) unsupported Petr Baudis 2008-08-06 16:11 ` git-svn does not seems to work with crlf convertion enabled Dmitry Potapov 2 siblings, 1 reply; 45+ messages in thread From: Johannes Schindelin @ 2008-08-06 12:43 UTC (permalink / raw) To: Petr Baudis; +Cc: Alexander Litvinov, git Hi, On Wed, 6 Aug 2008, Petr Baudis wrote: > On Wed, Jul 23, 2008 at 01:57:54PM +0100, Johannes Schindelin wrote: > > Note that you will have to do your digging using msysGit (i.e. the > > developer's pack, not the installer for plain Git), since git-svn will > > be removed from the next official "Windows Git" release, due to lack > > of fixers. > > is there any other problem with git-svn on Windows than the CRLF > issue? I couldn't find anything significant in the issue tracker. http://code.google.com/p/msysgit/issues/detail?id=120&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary It is also frustrating that http://code.google.com/p/msysgit/issues/detail?id=83&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary http://code.google.com/p/msysgit/issues/detail?id=103&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary http://code.google.com/p/msysgit/issues/detail?id=129&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary are probably the same issue. I cannot only blame the users for not really looking if their issue has been reported yet; there are 32 open issues in msysGit right now, number increasing, so it gets quite confusing. I once switched off the issue tracker, because I was the only one who took at least a little bit of care of it. Due to list consensus, it was turned back on -- against my will. Guess who takes care of it right now? Exactly. 
So I will soon be switching it off again, I think, because there are few more useless things than an unmonitored issue tracker. > If not, why do you want to drop git-svn from Windows Git? It seems > that the CRLF issue has trivial workaround to set autocrlf=false; this > will make git-svn-tracked repositories useful only on Windows, but I'd > bet this is fine for large majority of Windows git-svn users? If it was so trivial, why does nobody use it? Oh, and git-svn is slow, too. And _noone_ of those competent Windows git-svn users seemed fit or willing to do anything about git-svn, not even the simplest of issues. If you want to do something about it, go ahead. But I have no inclination of hearing from any Windows user about git-svn again, ever. Ciao, Dscho ^ permalink raw reply [flat|nested] 45+ messages in thread
* git-svn on MSysGit and why is it (going to be?) unsupported 2008-08-06 12:43 ` Johannes Schindelin @ 2008-08-06 13:51 ` Petr Baudis 2008-08-06 15:23 ` Avery Pennarun 0 siblings, 1 reply; 45+ messages in thread From: Petr Baudis @ 2008-08-06 13:51 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Alexander Litvinov, git Hi! On Wed, Aug 06, 2008 at 02:43:51PM +0200, Johannes Schindelin wrote: > On Wed, 6 Aug 2008, Petr Baudis wrote: > > > On Wed, Jul 23, 2008 at 01:57:54PM +0100, Johannes Schindelin wrote: > > > Note that you will have to do your digging using msysGit (i.e. the > > > developer's pack, not the installer for plain Git), since git-svn will > > > be removed from the next official "Windows Git" release, due to lack > > > of fixers. > > > > is there any other problem with git-svn on Windows than the CRLF > > issue? I couldn't find anything significant in the issue tracker. > > http://code.google.com/p/msysgit/issues/detail?id=120&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary Yes, that's why added the word "significant". ;-) This seems to be simple module-out-of-sync issue. > It is also frustrating that > > http://code.google.com/p/msysgit/issues/detail?id=83&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary > http://code.google.com/p/msysgit/issues/detail?id=103&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary > http://code.google.com/p/msysgit/issues/detail?id=129&colspec=ID%20Type%20Status%20Priority%20Component%20Owner%20Summary > > are probably the same issue. I cannot only blame the users for not really > looking if their issue has been reported yet; there are 32 open issues in > msysGit right now, number increasing, so it gets quite confusing. > > I once switched off the issue tracker, because I was the only one who took > at least a little bit of care of it. Due to list consensus, it was turned > back on -- against my will. > > Guess who takes care of it right now? > > Exactly. 
So I will soon be switching it off again, I think, because there > are few more useless things than an unmonitored issue tracker. Well, when looking through the tracker earlier today, I actually wanted to mark few dupes, but I did not find out how on the earth I'm supposed to do that. Either the operation is well-hidden in the web interface or I have to have some special rights to do that - in which case, it's no wonder the tracker is deteriorating. > > If not, why do you want to drop git-svn from Windows Git? It seems > > that the CRLF issue has trivial workaround to set autocrlf=false; this > > will make git-svn-tracked repositories useful only on Windows, but I'd > > bet this is fine for large majority of Windows git-svn users? > > If it was so trivial, why does nobody use it? Because it is not documented? Or is it? *Searches crlf in git-svn.html bundled with his msysgit* *Looks at Git FAQ* *Looks for release notes in the start menu ... unsuccessfully* *Tries to Google out MSysGit release notes ... unsuccessfully* *Founds MSysGit release notes sitting in Program Files* "git svn is slow or seems to be broken (see discussions on the mailing list)" What is "the" mailing list in MSysGit context? *Googles out MSysGit Google Group* *Searches git-svn and pages... and pages.* http://groups.google.com/group/msysgit/browse_thread/thread/8240da55a76f8c92/30656b448e9f5e74?lnk=gst&q=git-svn#30656b448e9f5e74 Okay. That was really easy to find, wasn't it... Somewhere deep inside, even few mentions of autocrlf can be found. > Oh, and git-svn is slow, too. > > And _noone_ of those competent Windows git-svn users seemed fit or willing > to do anything about git-svn, not even the simplest of issues. I can of course understand that argument, even though it's a bit sad to see when the issues are apparently either trivial or there is simple workaround available. 
My trouble was that the _concrete_ reasons for this are buried deep inside long mail threads (or threads on other mailing lists). > If you want to do something about it, go ahead. But I have no inclination > of hearing from any Windows user about git-svn again, ever. Not currently, I'm just afraid I *might* have to sometime in the future. ;-) -- Petr "Pasky" Baudis The next generation of interesting software will be done on the Macintosh, not the IBM PC. -- Bill Gates ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn on MSysGit and why is it (going to be?) unsupported 2008-08-06 13:51 ` git-svn on MSysGit and why is it (going to be?) unsupported Petr Baudis @ 2008-08-06 15:23 ` Avery Pennarun 0 siblings, 0 replies; 45+ messages in thread From: Avery Pennarun @ 2008-08-06 15:23 UTC (permalink / raw) To: Petr Baudis; +Cc: Johannes Schindelin, Alexander Litvinov, git On 8/6/08, Petr Baudis <pasky@suse.cz> wrote: > On Wed, Aug 06, 2008 at 02:43:51PM +0200, Johannes Schindelin wrote: > > And _noone_ of those competent Windows git-svn users seemed fit or willing > > to do anything about git-svn, not even the simplest of issues. > > I can of course understand that argument, even though it's a bit sad to > see when the issues are apparently either trivial or there is simple > workaround available. My trouble was that the _concrete_ reasons for > this are buried deep inside long mail threads (or threads on other > mailing lists). > > > If you want to do something about it, go ahead. But I have no inclination > > of hearing from any Windows user about git-svn again, ever. > > Not currently, I'm just afraid I *might* have to sometime in the future. > ;-) FWIW (and related to the subject line in this thread), I think there are a lot of git users on Windows who just use the cygwin one. That's what I do, and git-svn works fine (I don't use autocrlf though, which is probably why it worked). git's support for both platforms, and the fact that cygwin was first and works already, probably greatly reduces the number of developers who want to fix msysgit. Have fun, Avery ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled.
  2008-08-06 11:15 ` Petr Baudis
  2008-08-06 12:35 ` Peter Harris
  2008-08-06 12:43 ` Johannes Schindelin
@ 2008-08-06 16:11 ` Dmitry Potapov
  2 siblings, 0 replies; 45+ messages in thread
From: Dmitry Potapov @ 2008-08-06 16:11 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Alexander Litvinov, git

On Wed, Aug 6, 2008 at 3:15 PM, Petr Baudis <pasky@suse.cz> wrote:
>
> If not, why do you want to drop git-svn from Windows Git? It seems
> that the CRLF issue has trivial workaround to set autocrlf=false;
> this will make git-svn-tracked repositories useful only on Windows,
> but I'd bet this is fine for large majority of Windows git-svn users?

Actually, it is not so simple. If you have the svn properties set up
correctly for your text files (i.e. svn:eol-style=native), then
autocrlf=false is not what you want, because SVN then uses LF as EOL
when it stores these files. In many cases, just setting svn:eol-style
correctly in SVN may solve the problem.

However, to make git-svn work reliably in the presence of files with
different line endings, it should import files from SVN without applying
any filter. Therefore, the --no-filters option was recently added to
git-hash-object. Adding its use to git-svn should be easy (I have not
had time to test it):

===
diff --git a/perl/Git.pm b/perl/Git.pm
index 087d3d0..438b7fd 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -829,7 +829,7 @@ sub _open_hash_and_insert_object_if_needed {
 
 	($self->{hash_object_pid}, $self->{hash_object_in},
 	 $self->{hash_object_out}, $self->{hash_object_ctx}) =
-		command_bidi_pipe(qw(hash-object -w --stdin-paths));
+		command_bidi_pipe(qw(hash-object -w --stdin-paths --no-filters));
 }
 
 sub _close_hash_and_insert_object {
===

This should solve all problems with git-svn fetch. However, if you also
want svn:eol-style to be respected when you commit your changes, that
will require synchronizing svn:eol-style with the crlf values in your
.gitattributes, which is a much more ambitious task.

Dmitry

^ permalink raw reply related	[flat|nested] 45+ messages in thread
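The effect of --no-filters described above can be tried in a throwaway repository. What follows is only a sketch under stated assumptions: the repository, the file names (sample.txt, lf.txt) and their contents are invented for illustration, and it assumes a git new enough to have the --no-filters option. It hashes the same CRLF file once with the configured conversion and once with filters disabled.

```shell
#!/bin/sh
# Sketch only: the repository, file names and contents are invented.
set -eu

# Skip quietly where git is not installed.
command -v git >/dev/null 2>&1 || { echo "git not found, skipping" >&2; exit 0; }

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git config core.autocrlf true

# A file with a CRLF line ending, as an importer might receive it from SVN.
printf 'Hi\r\n' > sample.txt

# With filters: the CRLF -> LF conversion is applied before hashing.
filtered=$(git hash-object sample.txt)

# Without filters: the bytes are hashed exactly as they are on disk.
raw=$(git hash-object --no-filters sample.txt)

# Reference value: the LF-only content, hashed with filters disabled.
printf 'Hi\n' > lf.txt
lf_only=$(git hash-object --no-filters lf.txt)

echo "filtered: $filtered"
echo "raw:      $raw"
echo "lf only:  $lf_only"
```

If the conversion behaves as described in this thread, the "filtered" ID should equal the "lf only" ID, while the "raw" ID differs from both; an importer that needs byte-exact content would want the latter.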
* Re: git-svn does not seems to work with crlf convertion enabled.
  2008-07-23 11:52 ` Alexander Litvinov
  2008-07-23 12:57 ` Johannes Schindelin
@ 2008-07-24 14:24 ` Dmitry Potapov
  2008-07-24 14:40 ` Johannes Schindelin
  1 sibling, 1 reply; 45+ messages in thread
From: Dmitry Potapov @ 2008-07-24 14:24 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: Johannes Schindelin, git

On Wed, Jul 23, 2008 at 06:52:09PM +0700, Alexander Litvinov wrote:
> > On Wed, 23 Jul 2008, Alexander Litvinov wrote:
> > > In short: I can't clone svn repo into git when crlf convertion is
> > > activated.
> >
> > This is a known issue, but since nobody with that itch seems to care
> > enough to fix it, I doubt it will ever be fixed.
>
> That is a bad news for me. Anyway I will spend some time at holidays during
> digging this bug.

I don't want to discourage you from digging into this problem, but there
are two reasons why no one has fixed this issue yet. First, the
configuration of CRLF conversion in Git and SVN is quite different, so it
may not be easy to find a solution that works in all cases. Second, in
many cases you can work around this issue.

If I understood your situation correctly, you use an SVN repo where text
files are marked with svn:eol-style=native. In this case, SVN stores
these files with LF endings internally, and git-svn receives files in
that format (at least, it is so on Debian). Practically all Windows
editors have no problem opening and editing files with LF endings, but
some of them will write back using CRLF. You do not want CRLF to get
into your Git repository, and you can prevent that by setting
core.autocrlf=input. This might work for you...

Dmitry

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-24 14:24 ` Dmitry Potapov @ 2008-07-24 14:40 ` Johannes Schindelin 2008-07-24 16:28 ` Avery Pennarun 0 siblings, 1 reply; 45+ messages in thread From: Johannes Schindelin @ 2008-07-24 14:40 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Alexander Litvinov, git Hi, On Thu, 24 Jul 2008, Dmitry Potapov wrote: > On Wed, Jul 23, 2008 at 06:52:09PM +0700, Alexander Litvinov wrote: > > > On Wed, 23 Jul 2008, Alexander Litvinov wrote: > > > > In short: I can't clone svn repo into git when crlf convertion is > > > > activated. > > > > > > This is a known issue, but since nobody with that itch seems to care > > > enough to fix it, I doubt it will ever be fixed. > > > > That is a bad news for me. Anyway I will spend some time at holidays > > during digging this bug. > > I don't want to discourage from digging into this problem Great. Thanks. There is someone who is actually willing to work on the problem. > Practically all Windows editors do not have problems to open and edit > files with LF endings, but some of them will write back using CRLF. 95.23% of all statistics are made up on the spot. I would be surprised if that was not the case here. Ciao, Dscho ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled. 2008-07-24 14:40 ` Johannes Schindelin @ 2008-07-24 16:28 ` Avery Pennarun 0 siblings, 0 replies; 45+ messages in thread From: Avery Pennarun @ 2008-07-24 16:28 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Dmitry Potapov, Alexander Litvinov, git On 7/24/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > On Thu, 24 Jul 2008, Dmitry Potapov wrote: > > Practically all Windows editors do not have problems to open and edit > > files with LF endings, but some of them will write back using CRLF. > > 95.23% of all statistics are made up on the spot. I would be surprised if > that was not the case here. Without assigning a specific number, Dmitry's experience matches mine. I haven't seen an editor that can't *read* LF since notepad. But many of them happily mangle the files. Of course, notepad is probably at least 50% of the editors most Windows users actually use, on a per-transaction basis. Have fun, Avery ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: git-svn does not seems to work with crlf convertion enabled.
  2008-07-23  9:18 ` Johannes Schindelin
  2008-07-23 11:52 ` Alexander Litvinov
@ 2008-07-30  4:37 ` Alexander Litvinov
  2008-07-31  5:43 ` [PATCH] git-svn now " Alexander Litvinov
  2 siblings, 0 replies; 45+ messages in thread
From: Alexander Litvinov @ 2008-07-30  4:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

> This is a known issue, but since nobody with that itch seems to care
> enough to fix it, I doubt it will ever be fixed.

Hello again.

I have investigated this problem. Short result: git-svn does not currently
work with ANY file conversion.

In my case the problem is in the SVN::Git::Fetcher::apply_textdelta()
function, more precisely in its call to SVN::TxDelta::apply(). We fetch the
previous version of the file from git and then apply svn's delta to it.
Because conversion has modified that source file, SVN fails to apply its
delta. If I amend the last commit to contain the original version of the
file, everything works.

So it seems to me there are two solutions:
1. Store the original file somehow and use it to construct the new file
   version;
2. In case of this error, fetch the full blob with the new (or old) version
   of the file.

I did not find a way to fetch the full file content, nor do I feel ready to
rewrite git-svn to store the original file somewhere.

Can anybody help or comment on this?

^ permalink raw reply	[flat|nested] 45+ messages in thread
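The failure mode described above, a delta computed against one set of bytes and applied to another, can be shown without SVN or git-svn at all. The following is a minimal sketch, assuming a made-up two-line source file; the file names are hypothetical, and md5sum merely stands in for whatever checksum the delta's producer recorded for its base.

```shell
#!/bin/sh
# Sketch only: file names and contents are invented for illustration.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# The previous revision as SVN holds it (CRLF line endings).
printf 'int a;\r\nint b;\r\n' > "$work/svn_base.cpp"

# The same revision as it comes back out of git when the blob was
# stored with CRLF -> LF conversion enabled.
tr -d '\r' < "$work/svn_base.cpp" > "$work/git_base.cpp"

svn_sum=$(md5sum < "$work/svn_base.cpp" | cut -d' ' -f1)
git_sum=$(md5sum < "$work/git_base.cpp" | cut -d' ' -f1)
svn_size=$(( $(wc -c < "$work/svn_base.cpp") ))
git_size=$(( $(wc -c < "$work/git_base.cpp") ))

echo "base SVN built the delta against: $svn_sum ($svn_size bytes)"
echo "base git hands to the delta:      $git_sum ($git_size bytes)"
```

The two bases differ in both checksum and length (one byte per line), so a delta whose copy instructions refer to offsets in the longer base can run past the end of the shorter one.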
* [PATCH] git-svn now work with crlf convertion enabled.
  2008-07-23  9:18 ` Johannes Schindelin
  2008-07-23 11:52 ` Alexander Litvinov
  2008-07-30  4:37 ` Alexander Litvinov
@ 2008-07-31  5:43 ` Alexander Litvinov
  2008-07-31  5:57 ` Alexander Litvinov
  2008-08-04  0:48 ` Eric Wong
  2 siblings, 2 replies; 45+ messages in thread
From: Alexander Litvinov @ 2008-07-31  5:43 UTC (permalink / raw)
  To: git; +Cc: Eric Wong

Make git-svn work with crlf (or any other) file content conversion enabled.

When we modify file content, SVN can't apply its delta to it. To fix this
situation I take the full file content from SVN as the next revision. This is
dumb and slow, but it works.
---
 git-svn.perl |   34 +++++++++++++++++++---------------
 1 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index cf6dbbc..606a177 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -28,6 +28,7 @@ sub fatal (@) { print STDERR "@_\n"; exit 1 }
 require SVN::Core; # use()-ing this causes segfaults for me... *shrug*
 require SVN::Ra;
 require SVN::Delta;
+require SVN::Client;
 if ($SVN::Core::VERSION lt '1.1.0') {
 	fatal "Need SVN::Core 1.1.0 or better (got $SVN::Core::VERSION)";
 }
@@ -3075,6 +3076,7 @@ sub new {
 	my $self = SVN::Delta::Editor->new;
 	bless $self, $class;
 	$self->{c} = $git_svn->{last_commit} if exists $git_svn->{last_commit};
+	$self->{url} = $git_svn->{url};
 	$self->{empty} = {};
 	$self->{dir_prop} = {};
 	$self->{file_prop} = {};
@@ -3214,30 +3216,32 @@ sub change_file_prop {
 
 sub apply_textdelta {
 	my ($self, $fb, $exp) = @_;
-	my $fh = IO::File->new_tmpfile;
-	$fh->autoflush(1);
-	# $fh gets auto-closed() by SVN::TxDelta::apply(),
-	# (but $base does not,) so dup() it for reading in close_file
-	open my $dup, '<&', $fh or croak $!;
+
 	my $base = IO::File->new_tmpfile;
 	$base->autoflush(1);
 	if ($fb->{blob}) {
 		print $base 'link ' if ($fb->{mode_a} == 120000);
 		my $size = $::_repository->cat_blob($fb->{blob}, $base);
 		die "Failed to read object $fb->{blob}" if ($size < 0);
-
-		if (defined $exp) {
-			seek $base, 0, 0 or croak $!;
-			my $got = ::md5sum($base);
-			die "Checksum mismatch: $fb->{path} $fb->{blob}\n",
-			    "expected: $exp\n",
-			    "     got: $got\n" if ($got ne $exp);
-		}
 	}
 	seek $base, 0, 0 or croak $!;
-	$fb->{fh} = $dup;
+
+	my $fh = IO::File->new_tmpfile;
+	$fh->autoflush(1);
+
+	$fb->{fh} = $fh;
 	$fb->{base} = $base;
-	[ SVN::TxDelta::apply($base, $fh, undef, $fb->{path}, $fb->{pool}) ];
+
+	my $url = $self->{url};
+	$url =~ s/\/$//;
+	$url .= '/';
+	$url .= $fb->{path};
+
+	my $rev = $self->{file_prop}->{$fb->{path}}->{'svn:entry:committed-rev'};
+	die ("Can't find $fb->{path} revision") unless defined $rev;
+
+	my $ctx = SVN::Client->new();
+	$ctx->cat($fh, $url, $rev);
 }
 
 sub close_file {
-- 
1.5.6.2

^ permalink raw reply related	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-07-31  5:43 ` [PATCH] git-svn now " Alexander Litvinov
@ 2008-07-31  5:57 ` Alexander Litvinov
  2008-07-31 10:45 ` Dmitry Potapov
  2008-08-04  0:48 ` Eric Wong
  1 sibling, 1 reply; 45+ messages in thread
From: Alexander Litvinov @ 2008-07-31  5:57 UTC (permalink / raw)
  To: git; +Cc: Eric Wong

> Make git-svn works with crlf (or any other) file content convertion
> enabled.
>
> When we modify file content SVN cant apply its delta to it. To fix this
> situation I take full file content from SVN as next revision. This is
> dump and slow but it works.

Sorry for the noise.

git-svn fetches files with this patch, but I have found that git-svn uses
git-hash-object and passes it, on stdin, the name of the file to store.
Since that file is a temp file, git-hash-object can't correctly apply crlf
conversion to it.

Conclusion: git-svn does not apply crlf conversion to files being stored
into the git repo. This makes my patch useless.

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-07-31  5:57 ` Alexander Litvinov
@ 2008-07-31 10:45 ` Dmitry Potapov
  2008-07-31 19:09 ` [RFC] hash-object --no-filters Dmitry Potapov
  2008-08-01  3:23 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
  0 siblings, 2 replies; 45+ messages in thread
From: Dmitry Potapov @ 2008-07-31 10:45 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: git, Eric Wong

On Thu, Jul 31, 2008 at 12:57:48PM +0700, Alexander Litvinov wrote:
>
> git-svn fetch files with this patch but I have found that git-svn use
> git-hash-object and provide file name to store into stdin. As far as file is
> a temp file git-hash-object can't correctly apply crlf convertion for the
> file.

It does not look to be true. I did the following test:

mkdir hash_test
cd hash_test
git init
cat <<\=== > hash_test.pl
#!/usr/bin/env perl
use File::Temp qw/tempfile/;
my ($tmp_fh, $tmp_filename) = File::Temp::tempfile(UNLINK => 1);
print $tmp_fh "Hi\r\n";
$tmp_fh->flush;
system ("echo $tmp_filename | git hash-object --stdin-paths");
===
git config core.autocrlf true
perl hash_test.pl
git config core.autocrlf false
perl hash_test.pl

and the output was

b14df6442ea5a1b382985a6549b85d435376c351
ea6b6afbc2cbed0eb8c0f7561286ab72f349416c

which means that the autocrlf conversion is done for temporary files
created by perl. (I tested it on Linux and Windows/Cygwin).

In any case, I believe the right solution should be adding a new option
to git-hash-object to disable any conversion.

Dmitry

^ permalink raw reply	[flat|nested] 45+ messages in thread
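The two IDs in the test above differ because a blob ID depends only on the bytes that are hashed. As a cross-check that needs no git at all, the ID can be computed by hand: it is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the content. The sketch below assumes sha1sum from coreutils is available; blob_id is a hypothetical helper name, not a git command.

```shell
#!/bin/sh
# Sketch only: blob_id is a made-up helper, not part of git.
set -eu

# blob_id FILE: print the ID git assigns to FILE's bytes when no filter
# is applied, i.e. the SHA-1 of "blob <size>\0" followed by the content.
blob_id() {
    size=$(( $(wc -c < "$1") ))
    { printf 'blob %d\0' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'Hi\r\n' > "$tmp/crlf.txt"   # the bytes the Perl script writes
printf 'Hi\n'   > "$tmp/lf.txt"     # the same line after CRLF -> LF conversion
: > "$tmp/empty.txt"                # empty file, used as a sanity check

crlf_id=$(blob_id "$tmp/crlf.txt")
lf_id=$(blob_id "$tmp/lf.txt")
empty_id=$(blob_id "$tmp/empty.txt")

echo "CRLF content: $crlf_id"
echo "LF content:   $lf_id"
```

The two printed IDs can be compared with the pair reported in the message above; whichever one appears with core.autocrlf set to true shows which bytes actually reached the object store.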
* [RFC] hash-object --no-filters
  2008-07-31 10:45 ` Dmitry Potapov
@ 2008-07-31 19:09 ` Dmitry Potapov
  2008-08-01  3:23 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
  1 sibling, 0 replies; 45+ messages in thread
From: Dmitry Potapov @ 2008-07-31 19:09 UTC (permalink / raw)
  To: git; +Cc: Alexander Litvinov

Hi All,

I am trying to add the --no-filters option. It is useful for git-svn and
other importers that want to add a file as is, without being affected by
any filter (in particular, autocrlf).

Though the patch below works, I am not happy with the hackish way of
passing the no-filters requirement to the index_fd() function. So I
wonder which would be preferable:

- to change 'write_object' into flags
  (bit 0: write_object, bit 1: no-filters)
- to add a global no_filters flag to environment.c, which can be
  checked inside convert_to_git(), so that it may be used in other cases
  in the future (though I don't see where else it could be useful).

Another question: currently git hash-object --stdin implies no filters.
I don't know whether that was done intentionally (it can be argued both
ways). I don't think it is reasonable to change this behavior now, so I
just want to add one line to the documentation, so that users are not
surprised.

Dmitry

-- 8< --
From: Dmitry Potapov <dpotapov@gmail.com>
Date: Thu, 31 Jul 2008 21:10:26 +0400
Subject: [PATCH] hash-object --no-filters

The --no-filters option makes git hash-object work as if there were no
input filters. This option is useful for importers such as git-svn to
store new versions of files as is even if autocrlf is set.
---
 Documentation/git-hash-object.txt |    6 ++++++
 hash-object.c                     |    7 ++++++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt
index ac928e1..69a17c7 100644
--- a/Documentation/git-hash-object.txt
+++ b/Documentation/git-hash-object.txt
@@ -35,6 +35,12 @@ OPTIONS
 --stdin-paths::
 	Read file names from stdin instead of from the command-line.
 
+--no-filters::
+	If this option is given then the file is hashed as is ignoring
+	all filters specified in the configuration, including crlf
+	conversion. If the file is read from standard input then no
+	filters is always implied.
+
 Author
 ------
 Written by Junio C Hamano <gitster@pobox.com>
diff --git a/hash-object.c b/hash-object.c
index 46c06a9..1e7fe8a 100644
--- a/hash-object.c
+++ b/hash-object.c
@@ -8,6 +8,8 @@
 #include "blob.h"
 #include "quote.h"
 
+static unsigned no_filters;
+
 static void hash_object(const char *path, enum object_type type, int write_object)
 {
 	int fd;
@@ -16,7 +18,8 @@ static void hash_object(const char *path, enum object_type type, int write_objec
 	fd = open(path, O_RDONLY);
 	if (fd < 0 ||
 	    fstat(fd, &st) < 0 ||
-	    index_fd(sha1, fd, &st, write_object, type, path))
+	    ((no_filters ? st.st_mode &= ~S_IFREG : 0),
+	     index_fd(sha1, fd, &st, write_object, type, path)))
 		die(write_object
 		    ? "Unable to add %s to database"
 		    : "Unable to hash %s", path);
@@ -104,6 +107,8 @@ int main(int argc, char **argv)
 				die("Multiple --stdin arguments are not supported");
 			hashstdin = 1;
 		}
+		else if (!strcmp(argv[i], "--no-filters"))
+			no_filters = 1;
 		else
 			usage(hash_object_usage);
 	}
-- 
1.6.0.rc1.32.gc84cb

^ permalink raw reply related	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-07-31 10:45 ` Dmitry Potapov
  2008-07-31 19:09 ` [RFC] hash-object --no-filters Dmitry Potapov
@ 2008-08-01  3:23 ` Alexander Litvinov
  2008-08-01  5:09 ` Junio C Hamano
  2008-08-01  7:47 ` Dmitry Potapov
  1 sibling, 2 replies; 45+ messages in thread
From: Alexander Litvinov @ 2008-08-01  3:23 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: git, Eric Wong

> It does not look to be true. I did the following test:
...
> which means that the autocrlf conversion is done for temporary
> files created by perl. (I tested it on Linux and Windows/Cygwin).
>
> In any case, I believe the right solution should be adding a
> new option to git-hash-object to disable any conversion.

My bad, I did not write out my full thoughts. git-hash-object DOES do
autocrlf conversion, but it cannot do it correctly: all it can do is
autodetect text files. My setup has a .git/info/attributes file in which
all files except .cpp and .h are binary, while .cpp and .h are text
files. In this case git-hash-object does not know the real file name,
because git-svn uses temporary files.

I don't think that disabling conversion is a good way. I really want to
convert my files. A possible solution is to pass two file names to
git-hash-object: the real file with the content and the proposed file
name in the working directory. In that case git-hash-object will be able
to do the correct conversion.

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled. 2008-08-01 3:23 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov @ 2008-08-01 5:09 ` Junio C Hamano 2008-08-01 7:44 ` Dmitry Potapov 2008-08-01 7:47 ` Dmitry Potapov 1 sibling, 1 reply; 45+ messages in thread From: Junio C Hamano @ 2008-08-01 5:09 UTC (permalink / raw) To: Alexander Litvinov; +Cc: Dmitry Potapov, git, Eric Wong Alexander Litvinov <litvinov2004@gmail.com> writes: > I dont think that disabling convertion is a good way. I really want to convert > my files. Possible solution is to pass two file names to git-hash-object: the > real file with content and the proposed file name in the working directory. > In this case git-hash-object will be able to make correct convertion. I think the optional parameter to say "pretend the content is from this path" makes sense even for (and especially for) hashing --stdin. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled. 2008-08-01 5:09 ` Junio C Hamano @ 2008-08-01 7:44 ` Dmitry Potapov 2008-08-01 11:27 ` Alexander Litvinov 0 siblings, 1 reply; 45+ messages in thread From: Dmitry Potapov @ 2008-08-01 7:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong On Fri, Aug 1, 2008 at 9:09 AM, Junio C Hamano <gitster@pobox.com> wrote: > Alexander Litvinov <litvinov2004@gmail.com> writes: > >> I dont think that disabling convertion is a good way. I really want to convert >> my files. Possible solution is to pass two file names to git-hash-object: the >> real file with content and the proposed file name in the working directory. >> In this case git-hash-object will be able to make correct convertion. > > I think the optional parameter to say "pretend the content is from this > path" makes sense even for (and especially for) hashing --stdin. git-svn uses git hash-object --stdin-paths, which means that it reads filenames from the standard input, so one optional parameter cannot help here. Also, I am not sure how it can be useful for --stdin, which does not convert anything (it uses index_pipe, which does not call convert_to_git). Dmitry ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01  7:44 ` Dmitry Potapov
@ 2008-08-01 11:27 ` Alexander Litvinov
  0 siblings, 0 replies; 45+ messages in thread
From: Alexander Litvinov @ 2008-08-01 11:27 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: Junio C Hamano, git, Eric Wong

> git-svn uses git hash-object --stdin-paths, which means that it reads
> filenames from the standard input, so one optional parameter cannot
> help here.

We could add a parameter to git-hash-object to say that we will pass
two lines for each file: the real file name and the proposed file name
in the workdir. In that case git-hash-object will be able to do the
proper conversion.

The main problem is tracking the original file from svn. Perhaps we
could use a special directory in the worktree to store the original
files. Or we could make one special branch to track those files and a
second one to store the converted files.

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01  3:23 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
  2008-08-01  5:09 ` Junio C Hamano
@ 2008-08-01  7:47 ` Dmitry Potapov
  2008-08-01  8:08 ` Junio C Hamano
  1 sibling, 1 reply; 45+ messages in thread
From: Dmitry Potapov @ 2008-08-01  7:47 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: git, Eric Wong

On Fri, Aug 1, 2008 at 7:23 AM, Alexander Litvinov
<litvinov2004@gmail.com> wrote:
>
> I dont think that disabling convertion is a good way. I really want to convert
> my files.

To be able to synchronize efficiently in both directions, you need to
store files exactly as they were received from SVN; then there will be
no problem with applying the binary delta patch. All CRLF conversion
should be done on checkout from and checkin to the Git repository.

Dmitry

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled. 2008-08-01 7:47 ` Dmitry Potapov @ 2008-08-01 8:08 ` Junio C Hamano 2008-08-01 9:24 ` Dmitry Potapov 2008-08-01 11:11 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov 0 siblings, 2 replies; 45+ messages in thread From: Junio C Hamano @ 2008-08-01 8:08 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Alexander Litvinov, git, Eric Wong "Dmitry Potapov" <dpotapov@gmail.com> writes: > On Fri, Aug 1, 2008 at 7:23 AM, Alexander Litvinov > <litvinov2004@gmail.com> wrote: >> >> I dont think that disabling convertion is a good way. I really want to convert >> my files. > > To being able to synchronize efficiently in both ways, you need to store > files exactly as they were received from SVN then there will be no > problem with applying binary delta patch. All CRLF conversion should be > done on checkout and checkin from/to Git repository. Ahh,... if that is the philosophy, perhaps we can teach --stdin-paths to optionally open the file itself and use index_pipe() like --stdin codepath does? ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01  8:08 ` Junio C Hamano
@ 2008-08-01  9:24 ` Dmitry Potapov
  2008-08-01 19:42 ` Junio C Hamano
  2008-08-01 11:11 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
  1 sibling, 1 reply; 45+ messages in thread
From: Dmitry Potapov @ 2008-08-01  9:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong

On Fri, Aug 1, 2008 at 12:08 PM, Junio C Hamano <gitster@pobox.com> wrote:
> "Dmitry Potapov" <dpotapov@gmail.com> writes:
>>
>> To being able to synchronize efficiently in both ways, you need to store
>> files exactly as they were received from SVN then there will be no
>> problem with applying binary delta patch. All CRLF conversion should be
>> done on checkout and checkin from/to Git repository.
>
> Ahh,... if that is the philosophy, perhaps we can teach --stdin-paths to
> optionally open the file itself and use index_pipe() like --stdin codepath
> does?

It is possible to do it this way, but it is less efficient, because it
uses index_pipe, which does not know the actual size, so it reallocates
the buffer as it reads data from the descriptor, while index_fd uses
xmmap() instead. So I sent another solution yesterday:
http://article.gmane.org/gmane.comp.version-control.git/90968

It is a bit hackish because I unset the S_IFREG bit in st_mode to
disable conversion. In fact, my question was what would be a better way
to tell index_fd not to do any conversion. If you think it is better to
use index_pipe, which does not do any conversion, then I will redo my
patch to use it instead.

Dmitry

^ permalink raw reply	[flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled. 2008-08-01 9:24 ` Dmitry Potapov @ 2008-08-01 19:42 ` Junio C Hamano 2008-08-01 22:09 ` Dmitry Potapov 0 siblings, 1 reply; 45+ messages in thread From: Junio C Hamano @ 2008-08-01 19:42 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Alexander Litvinov, git, Eric Wong "Dmitry Potapov" <dpotapov@gmail.com> writes: > On Fri, Aug 1, 2008 at 12:08 PM, Junio C Hamano <gitster@pobox.com> wrote: >> "Dmitry Potapov" <dpotapov@gmail.com> writes: >>> >>> To being able to synchronize efficiently in both ways, you need to store >>> files exactly as they were received from SVN then there will be no >>> problem with applying binary delta patch. All CRLF conversion should be >>> done on checkout and checkin from/to Git repository. >> >> Ahh,... if that is the philosophy, perhaps we can teach --stdin-paths to >> optionally open the file itself and use index_pipe() like --stdin codepath >> does? > > It is possible to do in this way, but it less efficient, because it uses > index_pipe, which does not know the actual size, so it reallocates the buffer > as it reads data from the descriptor, while index_fd uses xmap() instead. > So I sent another solution yesterday: > http://article.gmane.org/gmane.comp.version-control.git/90968 > > It is a bit hackish because... Ok, earlier I was confused who was proposing what for what purpose, but that one was not just "a bit hackish" but an unacceptable hack ;-) Perhaps you would want to do the s/write_object/flags/ conversion, like this? 
-- cache.h | 9 ++++++--- sha1_file.c | 15 +++++++++------ 2 files changed, 15 insertions(+), 9 deletions(-) diff --git a/cache.h b/cache.h index 2475de9..39975fb 100644 --- a/cache.h +++ b/cache.h @@ -390,9 +390,12 @@ extern int ie_match_stat(const struct index_state *, struct cache_entry *, struc extern int ie_modified(const struct index_state *, struct cache_entry *, struct stat *, unsigned int); extern int ce_path_match(const struct cache_entry *ce, const char **pathspec); -extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path); -extern int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object); -extern int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object); + +#define HASH_OBJECT_DO_CREATE 01 +#define HASH_OBJECT_LITERALLY 02 +extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int flags, enum object_type type, const char *path); +extern int index_pipe(unsigned char *sha1, int fd, const char *type, int flags); +extern int index_path(unsigned char *sha1, const char *path, struct stat *st, int flags); extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st); #define REFRESH_REALLY 0x0001 /* ignore_valid */ diff --git a/sha1_file.c b/sha1_file.c index e281c14..5def648 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2353,10 +2353,11 @@ int has_sha1_file(const unsigned char *sha1) return has_loose_object(sha1); } -int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object) +int index_pipe(unsigned char *sha1, int fd, const char *type, int flags) { struct strbuf buf; int ret; + int write_object = flags & HASH_OBJECT_DO_CREATE; strbuf_init(&buf, 0); if (strbuf_read(&buf, fd, 4096) < 0) { @@ -2375,9 +2376,11 @@ int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object) return ret; } -int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, +int 
index_fd(unsigned char *sha1, int fd, struct stat *st, int flags, enum object_type type, const char *path) { + int write_object = flags & HASH_OBJECT_DO_CREATE; + int hash_literally = flags & HASH_OBJECT_LITERALLY; size_t size = xsize_t(st->st_size); void *buf = NULL; int ret, re_allocated = 0; @@ -2392,7 +2395,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, /* * Convert blobs to git internal format */ - if ((type == OBJ_BLOB) && S_ISREG(st->st_mode)) { + if (!hash_literally && (type == OBJ_BLOB) && S_ISREG(st->st_mode)) { struct strbuf nbuf; strbuf_init(&nbuf, 0); if (convert_to_git(path, buf, size, &nbuf, @@ -2416,7 +2419,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, return ret; } -int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object) +int index_path(unsigned char *sha1, const char *path, struct stat *st, int flags) { int fd; char *target; @@ -2428,7 +2431,7 @@ int index_path(unsigned char *sha1, const char *path, struct stat *st, int write if (fd < 0) return error("open(\"%s\"): %s", path, strerror(errno)); - if (index_fd(sha1, fd, st, write_object, OBJ_BLOB, path) < 0) + if (index_fd(sha1, fd, st, flags, OBJ_BLOB, path) < 0) return error("%s: failed to insert into database", path); break; @@ -2441,7 +2444,7 @@ int index_path(unsigned char *sha1, const char *path, struct stat *st, int write return error("readlink(\"%s\"): %s", path, errstr); } - if (!write_object) + if (!(flags & HASH_OBJECT_DO_CREATE)) hash_sha1_file(target, len, blob_type, sha1); else if (write_sha1_file(target, len, blob_type, sha1)) return error("%s: failed to insert into database", ^ permalink raw reply related [flat|nested] 45+ messages in thread
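[Editorial sketch, not part of the thread.] The point of the s/write_object/flags/ conversion above is that one int can carry two independent requests. The snippet below reuses the two constants exactly as they appear in the patch; wants_write() and wants_conversion() are hypothetical helpers written only to show how the bits are tested, and they do not exist in git.

```c
/* Values as given in the patch above. */
#define HASH_OBJECT_DO_CREATE 01
#define HASH_OBJECT_LITERALLY 02

/* Hypothetical helper: should the object be written to the database? */
int wants_write(int flags)
{
	return (flags & HASH_OBJECT_DO_CREATE) != 0;
}

/*
 * Hypothetical helper mirroring the condition that guards
 * convert_to_git() in the patched index_fd(): conversion applies only
 * to blobs read from regular files that were not requested literally.
 */
int wants_conversion(int flags, int is_blob, int is_regular_file)
{
	return !(flags & HASH_OBJECT_LITERALLY) && is_blob && is_regular_file;
}
```

Because HASH_OBJECT_DO_CREATE has the value 1, a caller that still passes the old boolean 0 or 1 keeps its previous meaning under this scheme, which is presumably why untouched call sites continue to work.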
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01 19:42 ` Junio C Hamano
@ 2008-08-01 22:09 ` Dmitry Potapov
  2008-08-01 22:14 ` Junio C Hamano
  2008-08-02 17:28 ` [PATCH] hash-object --no-filters Junio C Hamano
  0 siblings, 2 replies; 45+ messages in thread
From: Dmitry Potapov @ 2008-08-01 22:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong

On Fri, Aug 01, 2008 at 12:42:44PM -0700, Junio C Hamano wrote:
>
> Ok, earlier I was confused who was proposing what for what purpose, but
> that one was not just "a bit hackish" but an unacceptable hack ;-)

Thanks for correcting my wording ;-)

>
> Perhaps you would want to do the s/write_object/flags/ conversion, like
> this?

Yes, changing these index_xx functions was my preferred choice.

I have applied your patch and then corrected mine to use flags.
See below.

I wonder if something should be done about the other places where the
index_xx functions are called. I have looked at them and they all use
either 0 or 1 (a boolean expression which evaluates to 0 or 1), so they
should work as is, but I can correct them to use HASH_OBJECT_DO_CREATE
instead of 1 if it helps with readability.

-- 8< --
From: Dmitry Potapov <dpotapov@gmail.com>
Date: Thu, 31 Jul 2008 21:10:26 +0400
Subject: [PATCH] hash-object --no-filters

The --no-filters option makes git hash-object work as if there were no
input filters. This option is useful for importers such as git-svn to
store new versions of files as is even if autocrlf is set.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- Documentation/git-hash-object.txt | 6 ++++++ hash-object.c | 28 +++++++++++++++------------- 2 files changed, 21 insertions(+), 13 deletions(-) diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt index ac928e1..69a17c7 100644 --- a/Documentation/git-hash-object.txt +++ b/Documentation/git-hash-object.txt @@ -35,6 +35,12 @@ OPTIONS --stdin-paths:: Read file names from stdin instead of from the command-line. +--no-filters:: + If this option is given then the file is hashed as is ignoring + all filters specified in the configuration, including crlf + conversion. If the file is read from standard input then no + filters is always implied. + Author ------ Written by Junio C Hamano <gitster@pobox.com> diff --git a/hash-object.c b/hash-object.c index 46c06a9..2dd7283 100644 --- a/hash-object.c +++ b/hash-object.c @@ -8,7 +8,7 @@ #include "blob.h" #include "quote.h" -static void hash_object(const char *path, enum object_type type, int write_object) +static void hash_object(const char *path, enum object_type type, int flags) { int fd; struct stat st; @@ -16,23 +16,23 @@ static void hash_object(const char *path, enum object_type type, int write_objec fd = open(path, O_RDONLY); if (fd < 0 || fstat(fd, &st) < 0 || - index_fd(sha1, fd, &st, write_object, type, path)) - die(write_object + index_fd(sha1, fd, &st, flags, type, path)) + die((flags & HASH_OBJECT_DO_CREATE) ? 
"Unable to add %s to database" : "Unable to hash %s", path); printf("%s\n", sha1_to_hex(sha1)); maybe_flush_or_die(stdout, "hash to stdout"); } -static void hash_stdin(const char *type, int write_object) +static void hash_stdin(const char *type, int flags) { unsigned char sha1[20]; - if (index_pipe(sha1, 0, type, write_object)) + if (index_pipe(sha1, 0, type, flags)) die("Unable to add stdin to database"); printf("%s\n", sha1_to_hex(sha1)); } -static void hash_stdin_paths(const char *type, int write_objects) +static void hash_stdin_paths(const char *type, int flags) { struct strbuf buf, nbuf; @@ -45,7 +45,7 @@ static void hash_stdin_paths(const char *type, int write_objects) die("line is badly quoted"); strbuf_swap(&buf, &nbuf); } - hash_object(buf.buf, type_from_string(type), write_objects); + hash_object(buf.buf, type_from_string(type), flags); } strbuf_release(&buf); strbuf_release(&nbuf); @@ -58,7 +58,7 @@ int main(int argc, char **argv) { int i; const char *type = blob_type; - int write_object = 0; + int flags = 0; const char *prefix = NULL; int prefix_length = -1; int no_more_flags = 0; @@ -80,7 +80,7 @@ int main(int argc, char **argv) prefix_length = prefix ? 
strlen(prefix) : 0; } - write_object = 1; + flags |= HASH_OBJECT_DO_CREATE; } else if (!strcmp(argv[i], "--")) { no_more_flags = 1; @@ -104,6 +104,8 @@ int main(int argc, char **argv) die("Multiple --stdin arguments are not supported"); hashstdin = 1; } + else if (!strcmp(argv[i], "--no-filters")) + flags |= HASH_OBJECT_LITERALLY; else usage(hash_object_usage); } @@ -116,21 +118,21 @@ int main(int argc, char **argv) } if (hashstdin) { - hash_stdin(type, write_object); + hash_stdin(type, flags); hashstdin = 0; } if (0 <= prefix_length) arg = prefix_filename(prefix, prefix_length, arg); - hash_object(arg, type_from_string(type), write_object); + hash_object(arg, type_from_string(type), flags); no_more_flags = 1; } } if (stdin_paths) - hash_stdin_paths(type, write_object); + hash_stdin_paths(type, flags); if (hashstdin) - hash_stdin(type, write_object); + hash_stdin(type, flags); return 0; } -- 1.6.0.rc1.33.gb756f ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled. 2008-08-01 22:09 ` Dmitry Potapov @ 2008-08-01 22:14 ` Junio C Hamano 2008-08-01 23:10 ` Dmitry Potapov 2008-08-02 17:28 ` [PATCH] hash-object --no-filters Junio C Hamano 1 sibling, 1 reply; 45+ messages in thread From: Junio C Hamano @ 2008-08-01 22:14 UTC (permalink / raw) To: Dmitry Potapov; +Cc: Alexander Litvinov, git, Eric Wong Dmitry Potapov <dpotapov@gmail.com> writes: > I have applied your patch and then corrected mine to use flags. > See below. > > I wonder if something should be done about other places where index_xx > functions are called. I have looked at them and all they use either 0 or > 1 (boolean expression which will be evaluated to 0 or 1), so they should > work as is, but I can correct them to use HASH_OBJECT_DO_CREATE instead > of 1 if it helps with readability. Even though the patch was not compile tested, I did check the existing call sites are giving only 0 or 1, but I think converting these "please write -- I give you 1" callers to pass the bitmask would be a sane thing to do. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled. 2008-08-01 22:14 ` Junio C Hamano @ 2008-08-01 23:10 ` Dmitry Potapov 0 siblings, 0 replies; 45+ messages in thread From: Dmitry Potapov @ 2008-08-01 23:10 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong On Fri, Aug 01, 2008 at 03:14:15PM -0700, Junio C Hamano wrote: > > Even though the patch was not compile tested, I did check the existing > call sites are giving only 0 or 1, but I think converting these "please > write -- I give you 1" callers to pass the bitmask would be a sane thing > to do. Here it goes. It turned out that there are only two places that actually needs correction, while two others use '0'. I have run 'make test' and it's passed the tests. -- 8< -- From: Dmitry Potapov <dpotapov@gmail.com> Date: Sat, 2 Aug 2008 02:56:45 +0400 Subject: [PATCH] convert index_path callers to use bitmask instead of 1 Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- builtin-update-index.c | 5 +++-- read-cache.c | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin-update-index.c b/builtin-update-index.c index 38eb53c..d3e212c 100644 --- a/builtin-update-index.c +++ b/builtin-update-index.c @@ -85,7 +85,7 @@ static int process_lstat_error(const char *path, int err) static int add_one_path(struct cache_entry *old, const char *path, int len, struct stat *st) { - int option, size; + int option, flags, size; struct cache_entry *ce; /* Was the old index entry already up-to-date? */ @@ -99,7 +99,8 @@ static int add_one_path(struct cache_entry *old, const char *path, int len, stru fill_stat_cache_info(ce, st); ce->ce_mode = ce_mode_from_stat(old, st->st_mode); - if (index_path(ce->sha1, path, st, !info_only)) + flags = info_only ? 0 : HASH_OBJECT_DO_CREATE; + if (index_path(ce->sha1, path, st, flags)) return -1; option = allow_add ? ADD_CACHE_OK_TO_ADD : 0; option |= allow_replace ? 
ADD_CACHE_OK_TO_REPLACE : 0; diff --git a/read-cache.c b/read-cache.c index 2c03ec3..afd6005 100644 --- a/read-cache.c +++ b/read-cache.c @@ -550,7 +550,7 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st, alias->ce_flags |= CE_ADDED; return 0; } - if (index_path(ce->sha1, path, st, 1)) + if (index_path(ce->sha1, path, st, HASH_OBJECT_DO_CREATE)) return error("unable to index file %s", path); if (ignore_case && alias && different_name(ce, alias)) ce = create_alias_ce(ce, alias); -- 1.6.0.rc1.34.gad373 ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH] hash-object --no-filters
  2008-08-01 22:09 ` Dmitry Potapov
  2008-08-01 22:14 ` Junio C Hamano
@ 2008-08-02 17:28 ` Junio C Hamano
  2008-08-03  5:42 ` Dmitry Potapov
  1 sibling, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2008-08-02 17:28 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: Alexander Litvinov, git, Eric Wong

Dmitry Potapov <dpotapov@gmail.com> writes:

> The --no-filters option makes git hash-object to work as there were no
> input filters. This option is useful for importers such as git-svn to
> put new version of files as is even if autocrlf is set.

I think this is going in the right direction, but I have to wonder a few
things.

First, on hash-object.

 (1) "hash-object --stdin" always hashes literally.  We may want to be
     able to say "The contents is this but pretend it came from this path
     and apply the usual input rules", perhaps with "--path=" option;

 (2) "hash-object temporaryfile" may want to honor the same "--path"
     option;

 (3) "hash-object --stdin-paths" may want to get pair of paths (i.e. two
     lines per entry) to do the same.

If we want to do the above, the existing low-level interface needs to be
adjusted.

index_pipe() and index_fd() can learn to take an additional string
parameter for attribute lookup to implement (1) and (2) above.  Perhaps
the string can be NULL to signal --no-filters behaviour, in which case the
HASH_OBJECT_LITERALLY change may not be necessary for this codepath.

index_path() is a helper for add_to_index() which is used for normal
addition of working tree entities, and I do not see an immediate need to
teach it about this "use this different path for attribute lookup" at
least for now.

By the way, why do we have index_pipe() and index_fd() to begin with?  Is
it because users of index_pipe() do not know the path they are hashing,
and also because the fd, being a pipe, cannot be mmap'ed?

If these two are the only reasons, then I wonder if we can:

 - accept NULL as path and stat parameters for callers without a filename
   (which automatically implies we are doing a regular blob and we hash
   literally); and

 - first try to mmap(), and if it fails fall back to the "read once into
   strbuf" codepath to solve mmap-vs-pipe issue.

I am not sure if such a unification of these two functions is useful,
though.

^ permalink raw reply	[flat|nested] 45+ messages in thread
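[Editorial sketch, not part of the thread.] The "first try to mmap(), and if it fails fall back to reading" idea in the message above can be pictured roughly as follows. This is a standalone POSIX illustration under stated assumptions, not git code: struct fd_contents, load_fd(), read_all() and release_fd_contents() are hypothetical names, and a plain heap buffer stands in for git's strbuf.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the bytes behind a descriptor. */
struct fd_contents {
	void *data;
	size_t len;
	int mapped;	/* 1: release with munmap(), 0: release with free() */
};

/* Read until EOF into a growing heap buffer; works for pipes and FIFOs. */
static int read_all(int fd, struct fd_contents *out)
{
	size_t cap = 4096, len = 0;
	char *buf = malloc(cap);

	if (!buf)
		return -1;
	for (;;) {
		ssize_t n;

		if (len == cap) {
			char *tmp = realloc(buf, cap * 2);
			if (!tmp) {
				free(buf);
				return -1;
			}
			buf = tmp;
			cap *= 2;
		}
		n = read(fd, buf + len, cap - len);
		if (n < 0 && errno == EINTR)
			continue;	/* interrupted, try again */
		if (n < 0) {
			free(buf);
			return -1;
		}
		if (n == 0)
			break;		/* EOF */
		len += (size_t)n;
	}
	out->data = buf;
	out->len = len;
	out->mapped = 0;
	return 0;
}

/*
 * Try mmap() first and fall back to read() when it is not usable.
 * A size of 0 is never mapped, so an unmappable descriptor is not
 * mistaken for an empty file. Note that the mapping starts at offset 0,
 * whereas the fallback reads from the current file position.
 */
int load_fd(int fd, struct fd_contents *out)
{
	struct stat st;

	if (fstat(fd, &st) == 0 && st.st_size > 0) {
		void *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
			       MAP_PRIVATE, fd, 0);
		if (p != MAP_FAILED) {
			out->data = p;
			out->len = (size_t)st.st_size;
			out->mapped = 1;
			return 0;
		}
	}
	return read_all(fd, out);
}

void release_fd_contents(struct fd_contents *c)
{
	if (c->mapped)
		munmap(c->data, c->len);
	else
		free(c->data);
}
```

A caller would pass the read end of a pipe or an ordinary file descriptor to load_fd() in the same way, and release_fd_contents() picks the matching cleanup.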
* Re: [PATCH] hash-object --no-filters 2008-08-02 17:28 ` [PATCH] hash-object --no-filters Junio C Hamano @ 2008-08-03 5:42 ` Dmitry Potapov 2008-08-03 5:56 ` Dmitry Potapov 0 siblings, 1 reply; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 5:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong On Sat, Aug 02, 2008 at 10:28:13AM -0700, Junio C Hamano wrote: > Dmitry Potapov <dpotapov@gmail.com> writes: > > > The --no-filters option makes git hash-object to work as there were no > > input filters. This option is useful for importers such as git-svn to > > put new version of files as is even if autocrlf is set. > > I think this is going in the right direction, but I have to wonder a few > things. > > First, on hash-object. > > (1) "hash-object --stdin" always hashes literally. We may want to be > able to say "The contents is this but pretend it came from this path > and apply the usual input rules", perhaps with "--path=" option; It makes sense. > > (2) "hash-object temporaryfile" may want to honor the same "--path" > option; Agreed. > > (3) "hash-object --stdin-paths" may want to get pair of paths (i.e. two > lines per entry) to do the same. I cannot come up with a good name for this option. > > If we want to do the above, the existing low-level interface needs to be > adjusted. > > index_pipe() and index_fd() can learn to take an additional string > parameter for attribute lookup to implement (1) and (2) above. index_fd already has the 'path' parameter, which is used as hint for for blob conversion. > Perhaps > the string can be NULL to signal --no-filter behaviour, in which case the > HASH_OBJECT_LITERALLY change may not be necessary for this codepath. Sounds like a good idea :) > > By the way, why do we have index_pipe() and index_fd() to begin with? Is > it because users of index_pipe() do not know what the path it is hashing > and also the fd being a pipe we cannot mmap it? 
index_fd() does not need the path for anything but to choose filters. So, if index_pipe supported filters, it would have the same parameter. There is one more parameter that index_fd() has and index_pipe() does not. It is 'struct stat'. So I decided to look what this parameter is used for in index_fd(), and it turned out for two things: - to determine the size that needs to mmap - to check whether the file is regular and if it is not then skip convert_to_git(). That made me wonder whether index_fd() can be ever called for a non- regular file? I studied the source code and with the exception to git hash-object, which can pass anything what it can bed opened, in all other cases, we always call it for what is know as a regular file. In fact, it could be otherwise. It won't work for non-regular files. It is quite obvious that git hash-object for a directory will fail, but I wondered what would happen if I'd give it something different. For instance, a named pipe (FIFO) $mkfifo fifofile $git hash-object <wait for the other process to start write to it> e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 i.e. the same SHA-1 as for an empty file, and here is why: index_fd() tries to mmap the file descriptor and that obviously fails, but xmmap() has this particular code: if (ret == MAP_FAILED) { if (!length) return NULL; apparently, it was workaround for empty files, but because st_size is 0 for pipes, index_fd treats any pipe as empty file! > > If these two are the only reasons, then I wonder if we can: > > - accept NULL as path and stat parameters for callers without a filename > (which automatically implies we are doing a regular blob and we hash > literally); and I like this idea. > > - first try to mmap(), and if it fails fall back to the "read once into > strbuf" codepath to solve mmap-vs-pipe issue. I have an alternative proposal: Because we have stat structure given as a parameter, we can always check whether the file is regular or not. 
If it is regular, we can use mmap(), and if it is not, we use the "read
once into strbuf" approach.

> I am not sure if such a unification of these two functions is useful,
> though.

I have implemented this unification, and it reduces the code size, makes
git-hash-object work with named pipes, and makes it easier to add the
--path and --no-filters options, because there is no need to modify the
index_fd interface anymore, and there is a single place where
convert_to_git is invoked. So it looks like a good idea.

Here is the patch:

-- >8 --
From: Dmitry Potapov <dpotapov@gmail.com>
Date: Sun, 3 Aug 2008 08:39:16 +0400
Subject: [PATCH] teach index_fd to work with pipes

index_fd can now work with file descriptors that are not normal files
but any readable file. If the given file descriptor is a regular file
then mmap() is used; for other files, strbuf_read is used.

The path parameter, which has been used as hint for filters, can be
NULL now to indicate that the file should be hashed literally without
any filter.

The index_pipe function is removed as redundant.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- cache.h | 1 - hash-object.c | 29 +++++++++++-------------- sha1_file.c | 64 +++++++++++++++++++++++++++----------------------------- 3 files changed, 44 insertions(+), 50 deletions(-) git-hash-object before text data bss dec hex filename 148751 1332 93164 243247 3b62f git-hash-object and after patch text data bss dec hex filename 148687 1332 93164 243183 3b5ef git-hash-object diff --git a/cache.h b/cache.h index 2475de9..68ce6e6 100644 --- a/cache.h +++ b/cache.h @@ -391,7 +391,6 @@ extern int ie_modified(const struct index_state *, struct cache_entry *, struct extern int ce_path_match(const struct cache_entry *ce, const char **pathspec); extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path); -extern int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object); extern int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object); extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st); diff --git a/hash-object.c b/hash-object.c index 46c06a9..ce027b9 100644 --- a/hash-object.c +++ b/hash-object.c @@ -8,28 +8,25 @@ #include "blob.h" #include "quote.h" -static void hash_object(const char *path, enum object_type type, int write_object) +static void hash_fd(int fd, const char *type, int write_object, const char *path) { - int fd; struct stat st; unsigned char sha1[20]; - fd = open(path, O_RDONLY); - if (fd < 0 || - fstat(fd, &st) < 0 || - index_fd(sha1, fd, &st, write_object, type, path)) + if (fstat(fd, &st) < 0 || + index_fd(sha1, fd, &st, write_object, type_from_string(type), path)) die(write_object ? 
"Unable to add %s to database" : "Unable to hash %s", path); printf("%s\n", sha1_to_hex(sha1)); maybe_flush_or_die(stdout, "hash to stdout"); } - -static void hash_stdin(const char *type, int write_object) +static void hash_object(const char *path, const char *type, int write_object) { - unsigned char sha1[20]; - if (index_pipe(sha1, 0, type, write_object)) - die("Unable to add stdin to database"); - printf("%s\n", sha1_to_hex(sha1)); + int fd; + fd = open(path, O_RDONLY); + if (fd < 0) + die("Cannot open %s", path); + hash_fd(fd, type, write_object, path); } static void hash_stdin_paths(const char *type, int write_objects) @@ -45,7 +42,7 @@ static void hash_stdin_paths(const char *type, int write_objects) die("line is badly quoted"); strbuf_swap(&buf, &nbuf); } - hash_object(buf.buf, type_from_string(type), write_objects); + hash_object(buf.buf, type, write_objects); } strbuf_release(&buf); strbuf_release(&nbuf); @@ -116,13 +113,13 @@ int main(int argc, char **argv) } if (hashstdin) { - hash_stdin(type, write_object); + hash_fd(0, type, write_object, NULL); hashstdin = 0; } if (0 <= prefix_length) arg = prefix_filename(prefix, prefix_length, arg); - hash_object(arg, type_from_string(type), write_object); + hash_object(arg, type, write_object); no_more_flags = 1; } } @@ -131,6 +128,6 @@ int main(int argc, char **argv) hash_stdin_paths(type, write_object); if (hashstdin) - hash_stdin(type, write_object); + hash_fd(0, type, write_object, NULL); return 0; } diff --git a/sha1_file.c b/sha1_file.c index e281c14..765a7e7 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2353,51 +2353,22 @@ int has_sha1_file(const unsigned char *sha1) return has_loose_object(sha1); } -int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object) +static int index_mem(unsigned char *sha1, void *buf, size_t size, + int write_object, enum object_type type, const char *path) { - struct strbuf buf; - int ret; - - strbuf_init(&buf, 0); - if (strbuf_read(&buf, fd, 4096) < 0) { - 
strbuf_release(&buf); - return -1; - } - - if (!type) - type = blob_type; - if (write_object) - ret = write_sha1_file(buf.buf, buf.len, type, sha1); - else - ret = hash_sha1_file(buf.buf, buf.len, type, sha1); - strbuf_release(&buf); - - return ret; -} - -int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, - enum object_type type, const char *path) -{ - size_t size = xsize_t(st->st_size); - void *buf = NULL; int ret, re_allocated = 0; - if (size) - buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); - close(fd); - if (!type) type = OBJ_BLOB; /* * Convert blobs to git internal format */ - if ((type == OBJ_BLOB) && S_ISREG(st->st_mode)) { + if ((type == OBJ_BLOB) && path) { struct strbuf nbuf; strbuf_init(&nbuf, 0); if (convert_to_git(path, buf, size, &nbuf, write_object ? safe_crlf : 0)) { - munmap(buf, size); buf = strbuf_detach(&nbuf, &size); re_allocated = 1; } @@ -2411,8 +2382,35 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, free(buf); return ret; } - if (size) + return ret; +} + +int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, + enum object_type type, const char *path) +{ + size_t size = xsize_t(st->st_size); + int ret; + + if (!S_ISREG(st->st_mode)) + { + struct strbuf sbuf; + strbuf_init(&sbuf, 0); + if (strbuf_read(&sbuf, fd, 4096) >= 0) + ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object, + type, path); + else + ret = -1; + strbuf_release(&sbuf); + } + else if (size) + { + void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); + ret = index_mem(sha1, buf, size, write_object, type, path); munmap(buf, size); + } + else + ret = index_mem(sha1, NULL, size, write_object, type, path); + close(fd); return ret; } -- 1.6.0.rc1.53.gaeaa.dirty ^ permalink raw reply related [flat|nested] 45+ messages in thread
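
The regular-file versus pipe split that this patch gives index_fd can be illustrated outside of git. What follows is only a minimal sketch under stated assumptions, not the git implementation: slurp_fd is a hypothetical helper name, it relies on POSIX only, and it copies the mapped data into a malloc()ed buffer so that both branches hand back the same kind of result. The point it shows is the one made in the message: the decision is taken from st_mode, because st_size cannot be trusted for a pipe or FIFO.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git): load everything readable from fd
 * into a malloc()ed buffer. Regular files are mmap()ed and copied; any
 * other descriptor (pipe, FIFO, ...) is drained with read() until EOF.
 * Returns 0 on success, -1 on error. The caller frees *out.
 */
static int slurp_fd(int fd, char **out, size_t *len)
{
	struct stat st;
	char *buf;
	size_t cap = 4096, used = 0;

	*out = NULL;
	*len = 0;
	if (fstat(fd, &st) < 0)
		return -1;

	if (S_ISREG(st.st_mode)) {
		size_t size = (size_t)st.st_size;
		void *map;

		if (!size)
			return 0;	/* a genuinely empty regular file */
		map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
		if (map == MAP_FAILED)
			return -1;
		buf = malloc(size);
		if (!buf) {
			munmap(map, size);
			return -1;
		}
		memcpy(buf, map, size);
		munmap(map, size);
		*out = buf;
		*len = size;
		return 0;
	}

	/* Not a regular file: st_size is meaningless, so read until EOF. */
	buf = malloc(cap);
	if (!buf)
		return -1;
	for (;;) {
		ssize_t n;

		if (used == cap) {
			char *tmp = realloc(buf, cap * 2);
			if (!tmp) {
				free(buf);
				return -1;
			}
			buf = tmp;
			cap *= 2;
		}
		n = read(fd, buf + used, cap - used);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			return -1;
		}
		if (n == 0)
			break;
		used += (size_t)n;
	}
	*out = buf;
	*len = used;
	return 0;
}
```

With this shape, a FIFO that delivers five bytes yields a five-byte buffer rather than being mistaken for an empty file, which is the bug described for the old index_fd.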
* Re: [PATCH] hash-object --no-filters 2008-08-03 5:42 ` Dmitry Potapov @ 2008-08-03 5:56 ` Dmitry Potapov 2008-08-03 14:36 ` [PATCH 1/5] correct argument checking test for git hash-object Dmitry Potapov 2008-08-03 20:44 ` [PATCH] hash-object --no-filters Junio C Hamano 0 siblings, 2 replies; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 5:56 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong On Sun, Aug 03, 2008 at 09:42:18AM +0400, Dmitry Potapov wrote: > > Here is the patch: I am sorry, I forgot to commit a micro cleanup to my patch: @@ -2378,10 +2378,8 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size, ret = write_sha1_file(buf, size, typename(type), sha1); else ret = hash_sha1_file(buf, size, typename(type), sha1); - if (re_allocated) { + if (re_allocated) free(buf); - return ret; - } return ret; } So, here is the corrected version of my patch: -- >8 -- From: Dmitry Potapov <dpotapov@gmail.com> Date: Sun, 3 Aug 2008 08:39:16 +0400 Subject: [PATCH] teach index_fd to work with pipes index_fd can now work with file descriptors that are not normal files but any readable file. If the given file descriptor is a regular file then mmap() is used; for other files, strbuf_read is used. The path parameter, which has been used as hint for filters, can be NULL now to indicate that the file should be hashed literally without any filter. The index_pipe function is removed as redundant. 
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- cache.h | 1 - hash-object.c | 29 +++++++++++------------- sha1_file.c | 66 ++++++++++++++++++++++++++------------------------------ 3 files changed, 44 insertions(+), 52 deletions(-) diff --git a/cache.h b/cache.h index 2475de9..68ce6e6 100644 --- a/cache.h +++ b/cache.h @@ -391,7 +391,6 @@ extern int ie_modified(const struct index_state *, struct cache_entry *, struct extern int ce_path_match(const struct cache_entry *ce, const char **pathspec); extern int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, enum object_type type, const char *path); -extern int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object); extern int index_path(unsigned char *sha1, const char *path, struct stat *st, int write_object); extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st); diff --git a/hash-object.c b/hash-object.c index 46c06a9..ce027b9 100644 --- a/hash-object.c +++ b/hash-object.c @@ -8,28 +8,25 @@ #include "blob.h" #include "quote.h" -static void hash_object(const char *path, enum object_type type, int write_object) +static void hash_fd(int fd, const char *type, int write_object, const char *path) { - int fd; struct stat st; unsigned char sha1[20]; - fd = open(path, O_RDONLY); - if (fd < 0 || - fstat(fd, &st) < 0 || - index_fd(sha1, fd, &st, write_object, type, path)) + if (fstat(fd, &st) < 0 || + index_fd(sha1, fd, &st, write_object, type_from_string(type), path)) die(write_object ? 
"Unable to add %s to database" : "Unable to hash %s", path); printf("%s\n", sha1_to_hex(sha1)); maybe_flush_or_die(stdout, "hash to stdout"); } - -static void hash_stdin(const char *type, int write_object) +static void hash_object(const char *path, const char *type, int write_object) { - unsigned char sha1[20]; - if (index_pipe(sha1, 0, type, write_object)) - die("Unable to add stdin to database"); - printf("%s\n", sha1_to_hex(sha1)); + int fd; + fd = open(path, O_RDONLY); + if (fd < 0) + die("Cannot open %s", path); + hash_fd(fd, type, write_object, path); } static void hash_stdin_paths(const char *type, int write_objects) @@ -45,7 +42,7 @@ static void hash_stdin_paths(const char *type, int write_objects) die("line is badly quoted"); strbuf_swap(&buf, &nbuf); } - hash_object(buf.buf, type_from_string(type), write_objects); + hash_object(buf.buf, type, write_objects); } strbuf_release(&buf); strbuf_release(&nbuf); @@ -116,13 +113,13 @@ int main(int argc, char **argv) } if (hashstdin) { - hash_stdin(type, write_object); + hash_fd(0, type, write_object, NULL); hashstdin = 0; } if (0 <= prefix_length) arg = prefix_filename(prefix, prefix_length, arg); - hash_object(arg, type_from_string(type), write_object); + hash_object(arg, type, write_object); no_more_flags = 1; } } @@ -131,6 +128,6 @@ int main(int argc, char **argv) hash_stdin_paths(type, write_object); if (hashstdin) - hash_stdin(type, write_object); + hash_fd(0, type, write_object, NULL); return 0; } diff --git a/sha1_file.c b/sha1_file.c index e281c14..fe863f5 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -2353,51 +2353,22 @@ int has_sha1_file(const unsigned char *sha1) return has_loose_object(sha1); } -int index_pipe(unsigned char *sha1, int fd, const char *type, int write_object) +static int index_mem(unsigned char *sha1, void *buf, size_t size, + int write_object, enum object_type type, const char *path) { - struct strbuf buf; - int ret; - - strbuf_init(&buf, 0); - if (strbuf_read(&buf, fd, 4096) < 0) { - 
strbuf_release(&buf); - return -1; - } - - if (!type) - type = blob_type; - if (write_object) - ret = write_sha1_file(buf.buf, buf.len, type, sha1); - else - ret = hash_sha1_file(buf.buf, buf.len, type, sha1); - strbuf_release(&buf); - - return ret; -} - -int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, - enum object_type type, const char *path) -{ - size_t size = xsize_t(st->st_size); - void *buf = NULL; int ret, re_allocated = 0; - if (size) - buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); - close(fd); - if (!type) type = OBJ_BLOB; /* * Convert blobs to git internal format */ - if ((type == OBJ_BLOB) && S_ISREG(st->st_mode)) { + if ((type == OBJ_BLOB) && path) { struct strbuf nbuf; strbuf_init(&nbuf, 0); if (convert_to_git(path, buf, size, &nbuf, write_object ? safe_crlf : 0)) { - munmap(buf, size); buf = strbuf_detach(&nbuf, &size); re_allocated = 1; } @@ -2407,12 +2378,37 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, ret = write_sha1_file(buf, size, typename(type), sha1); else ret = hash_sha1_file(buf, size, typename(type), sha1); - if (re_allocated) { + if (re_allocated) free(buf); - return ret; + return ret; +} + +int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object, + enum object_type type, const char *path) +{ + size_t size = xsize_t(st->st_size); + int ret; + + if (!S_ISREG(st->st_mode)) + { + struct strbuf sbuf; + strbuf_init(&sbuf, 0); + if (strbuf_read(&sbuf, fd, 4096) >= 0) + ret = index_mem(sha1, sbuf.buf, sbuf.len, write_object, + type, path); + else + ret = -1; + strbuf_release(&sbuf); } - if (size) + else if (size) + { + void *buf = xmmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0); + ret = index_mem(sha1, buf, size, write_object, type, path); munmap(buf, size); + } + else + ret = index_mem(sha1, NULL, size, write_object, type, path); + close(fd); return ret; } -- 1.6.0.rc1.53.gf8e95 ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 1/5] correct argument checking test for git hash-object 2008-08-03 5:56 ` Dmitry Potapov @ 2008-08-03 14:36 ` Dmitry Potapov 2008-08-03 14:36 ` [PATCH 2/5] correct usage help string for git-hash-object Dmitry Potapov 2008-08-03 20:44 ` [PATCH] hash-object --no-filters Junio C Hamano 1 sibling, 1 reply; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 14:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong, Dmitry Potapov Because the file name given to stdin did not exist, git hash-object will fail to open it and exit with non-zero error code even if there is no check of arguments. Thus the test may pass despite the obvious error in argument checking. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- t/t1007-hash-object.sh | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 1ec0535..6d505fa 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -49,16 +49,16 @@ setup_repo # Argument checking test_expect_success "multiple '--stdin's are rejected" ' - test_must_fail git hash-object --stdin --stdin < example + echo example | test_must_fail git hash-object --stdin --stdin ' test_expect_success "Can't use --stdin and --stdin-paths together" ' - test_must_fail git hash-object --stdin --stdin-paths && - test_must_fail git hash-object --stdin-paths --stdin + echo example | test_must_fail git hash-object --stdin --stdin-paths && + echo example | test_must_fail git hash-object --stdin-paths --stdin ' test_expect_success "Can't pass filenames as arguments with --stdin-paths" ' - test_must_fail git hash-object --stdin-paths hello < example + echo example | test_must_fail git hash-object --stdin-paths hello ' # Behavior -- 1.6.0.rc1.58.gacdf ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 2/5] correct usage help string for git-hash-object 2008-08-03 14:36 ` [PATCH 1/5] correct argument checking test for git hash-object Dmitry Potapov @ 2008-08-03 14:36 ` Dmitry Potapov 2008-08-03 14:36 ` [PATCH 3/5] use parse_options() in git hash-object Dmitry Potapov 0 siblings, 1 reply; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 14:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong, Dmitry Potapov The usage string is corrected to make it fit in 80 columns and to make it unequivocal about what options can be used with --stdin-paths. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- Documentation/git-hash-object.txt | 4 +++- hash-object.c | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt index ac928e1..a4703ec 100644 --- a/Documentation/git-hash-object.txt +++ b/Documentation/git-hash-object.txt @@ -8,7 +8,9 @@ git-hash-object - Compute object ID and optionally creates a blob from a file SYNOPSIS -------- -'git hash-object' [-t <type>] [-w] [--stdin | --stdin-paths] [--] <file>... +[verse] +'git hash-object' [-t <type>] [-w] [--stdin] [--] <file>... +'git hash-object' [-t <type>] [-w] --stdin-paths < <list-of-paths> DESCRIPTION ----------- diff --git a/hash-object.c b/hash-object.c index ce027b9..ac44b4e 100644 --- a/hash-object.c +++ b/hash-object.c @@ -49,7 +49,8 @@ static void hash_stdin_paths(const char *type, int write_objects) } static const char hash_object_usage[] = -"git hash-object [ [-t <type>] [-w] [--stdin] <file>... | --stdin-paths < <list-of-paths> ]"; +"git hash-object [-t <type>] [-w] [--stdin] [--] <file>...\n" +" or: git hash-object --stdin-paths < <list-of-paths>"; int main(int argc, char **argv) { -- 1.6.0.rc1.58.gacdf ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 3/5] use parse_options() in git hash-object 2008-08-03 14:36 ` [PATCH 2/5] correct usage help string for git-hash-object Dmitry Potapov @ 2008-08-03 14:36 ` Dmitry Potapov 2008-08-03 14:36 ` [PATCH 4/5] add --path option to " Dmitry Potapov 0 siblings, 1 reply; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 14:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong, Dmitry Potapov Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- hash-object.c | 122 +++++++++++++++++++++++++-------------------------------- 1 files changed, 53 insertions(+), 69 deletions(-) diff --git a/hash-object.c b/hash-object.c index ac44b4e..b658fae 100644 --- a/hash-object.c +++ b/hash-object.c @@ -7,6 +7,7 @@ #include "cache.h" #include "blob.h" #include "quote.h" +#include "parse-options.h" static void hash_fd(int fd, const char *type, int write_object, const char *path) { @@ -48,87 +49,70 @@ static void hash_stdin_paths(const char *type, int write_objects) strbuf_release(&nbuf); } -static const char hash_object_usage[] = -"git hash-object [-t <type>] [-w] [--stdin] [--] <file>...\n" -" or: git hash-object --stdin-paths < <list-of-paths>"; +static const char * const hash_object_usage[] = { + "git hash-object [-t <type>] [-w] [--stdin] [--] <file>...", + "git hash-object --stdin-paths < <list-of-paths>", + NULL +}; -int main(int argc, char **argv) +static const char *type; +static int write_object; +static int hashstdin; +static int stdin_paths; + +static const struct option hash_object_options[] = { + OPT_STRING('t', NULL, &type, "type", "object type"), + OPT_BOOLEAN('w', NULL, &write_object, "write the object into the object database"), + OPT_BOOLEAN( 0 , "stdin", &hashstdin, "read the object from stdin"), + OPT_BOOLEAN( 0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_END() +}; + +int main(int argc, const char **argv) { int i; - const char *type = blob_type; - int write_object = 0; const char *prefix = NULL; int 
prefix_length = -1; - int no_more_flags = 0; - int hashstdin = 0; - int stdin_paths = 0; + const char *errstr = NULL; + + type = blob_type; git_config(git_default_config, NULL); - for (i = 1 ; i < argc; i++) { - if (!no_more_flags && argv[i][0] == '-') { - if (!strcmp(argv[i], "-t")) { - if (argc <= ++i) - usage(hash_object_usage); - type = argv[i]; - } - else if (!strcmp(argv[i], "-w")) { - if (prefix_length < 0) { - prefix = setup_git_directory(); - prefix_length = - prefix ? strlen(prefix) : 0; - } - write_object = 1; - } - else if (!strcmp(argv[i], "--")) { - no_more_flags = 1; - } - else if (!strcmp(argv[i], "--help")) - usage(hash_object_usage); - else if (!strcmp(argv[i], "--stdin-paths")) { - if (hashstdin) { - error("Can't use --stdin-paths with --stdin"); - usage(hash_object_usage); - } - stdin_paths = 1; - - } - else if (!strcmp(argv[i], "--stdin")) { - if (stdin_paths) { - error("Can't use %s with --stdin-paths", argv[i]); - usage(hash_object_usage); - } - if (hashstdin) - die("Multiple --stdin arguments are not supported"); - hashstdin = 1; - } - else - usage(hash_object_usage); - } - else { - const char *arg = argv[i]; - - if (stdin_paths) { - error("Can't specify files (such as \"%s\") with --stdin-paths", arg); - usage(hash_object_usage); - } - - if (hashstdin) { - hash_fd(0, type, write_object, NULL); - hashstdin = 0; - } - if (0 <= prefix_length) - arg = prefix_filename(prefix, prefix_length, - arg); - hash_object(arg, type, write_object); - no_more_flags = 1; - } + argc = parse_options(argc, argv, hash_object_options, hash_object_usage, 0); + + if (write_object) { + prefix = setup_git_directory(); + prefix_length = prefix ? 
strlen(prefix) : 0; } - if (stdin_paths) - hash_stdin_paths(type, write_object); + if (stdin_paths) { + if (hashstdin) + errstr = "Can't use --stdin-paths with --stdin"; + else if (argc) + errstr = "Can't specify files with --stdin-paths"; + } + else if (hashstdin > 1) + errstr = "Multiple --stdin arguments are not supported"; + + if (errstr) { + error (errstr); + usage_with_options(hash_object_usage, hash_object_options); + } if (hashstdin) hash_fd(0, type, write_object, NULL); + + for (i = 0 ; i < argc; i++) { + const char *arg = argv[i]; + + if (0 <= prefix_length) + arg = prefix_filename(prefix, prefix_length, arg); + hash_object(arg, type, write_object); + } + + if (stdin_paths) + hash_stdin_paths(type, write_object); + return 0; } -- 1.6.0.rc1.58.gacdf ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 4/5] add --path option to git hash-object 2008-08-03 14:36 ` [PATCH 3/5] use parse_options() in git hash-object Dmitry Potapov @ 2008-08-03 14:36 ` Dmitry Potapov 2008-08-03 14:36 ` [PATCH 5/5] add --no-filters " Dmitry Potapov 0 siblings, 1 reply; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 14:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong, Dmitry Potapov The --path option makes filters work as if the file were located at the specified path, while its actual location may be different. It is mostly useful for hashing temporary files outside of the working directory. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- Documentation/git-hash-object.txt | 12 +++++++++++- hash-object.c | 19 +++++++++++++------ t/t1007-hash-object.sh | 24 ++++++++++++++++++++++++ 3 files changed, 48 insertions(+), 7 deletions(-) diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt index a4703ec..fececbf 100644 --- a/Documentation/git-hash-object.txt +++ b/Documentation/git-hash-object.txt @@ -9,7 +9,7 @@ git-hash-object - Compute object ID and optionally creates a blob from a file SYNOPSIS -------- [verse] -'git hash-object' [-t <type>] [-w] [--stdin] [--] <file>... +'git hash-object' [-t <type>] [-w] [--path=<file>] [--stdin] [--] <file>... 'git hash-object' [-t <type>] [-w] --stdin-paths < <list-of-paths> DESCRIPTION @@ -37,6 +37,16 @@ OPTIONS --stdin-paths:: Read file names from stdin instead of from the command-line. +--path:: + Hash object as it were located at the given path. The location of + file does not directly influence on the hash value, but path is + used to determine what git filters should be applied to the object + before it can be placed to the object database, and, as result of + applying filters, the actual blob put into the object database may + differ from the given file.
This option is mainly useful for hashing + temporary files located outside of the working directory or files + read from stdin. + Author ------ Written by Junio C Hamano <gitster@pobox.com> diff --git a/hash-object.c b/hash-object.c index b658fae..b11f459 100644 --- a/hash-object.c +++ b/hash-object.c @@ -21,13 +21,14 @@ static void hash_fd(int fd, const char *type, int write_object, const char *path printf("%s\n", sha1_to_hex(sha1)); maybe_flush_or_die(stdout, "hash to stdout"); } -static void hash_object(const char *path, const char *type, int write_object) +static void hash_object(const char *path, const char *type, int write_object, + const char *vpath) { int fd; fd = open(path, O_RDONLY); if (fd < 0) die("Cannot open %s", path); - hash_fd(fd, type, write_object, path); + hash_fd(fd, type, write_object, vpath); } static void hash_stdin_paths(const char *type, int write_objects) @@ -43,14 +44,14 @@ static void hash_stdin_paths(const char *type, int write_objects) die("line is badly quoted"); strbuf_swap(&buf, &nbuf); } - hash_object(buf.buf, type, write_objects); + hash_object(buf.buf, type, write_objects, buf.buf); } strbuf_release(&buf); strbuf_release(&nbuf); } static const char * const hash_object_usage[] = { - "git hash-object [-t <type>] [-w] [--stdin] [--] <file>...", + "git hash-object [-t <type>] [-w] [--path=<file>] [--stdin] [--] <file>...", "git hash-object --stdin-paths < <list-of-paths>", NULL }; @@ -59,12 +60,14 @@ static const char *type; static int write_object; static int hashstdin; static int stdin_paths; +static const char *vpath; static const struct option hash_object_options[] = { OPT_STRING('t', NULL, &type, "type", "object type"), OPT_BOOLEAN('w', NULL, &write_object, "write the object into the object database"), OPT_BOOLEAN( 0 , "stdin", &hashstdin, "read the object from stdin"), OPT_BOOLEAN( 0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_STRING( 0 , "path", &vpath, "file", "process file as it were from this 
path"), OPT_END() }; @@ -84,6 +87,8 @@ int main(int argc, const char **argv) if (write_object) { prefix = setup_git_directory(); prefix_length = prefix ? strlen(prefix) : 0; + if (vpath && prefix) + vpath = prefix_filename(prefix, prefix_length, vpath); } if (stdin_paths) { @@ -91,6 +96,8 @@ int main(int argc, const char **argv) errstr = "Can't use --stdin-paths with --stdin"; else if (argc) errstr = "Can't specify files with --stdin-paths"; + else if (vpath) + errstr = "Can't use --stdin-paths with --path"; } else if (hashstdin > 1) errstr = "Multiple --stdin arguments are not supported"; @@ -101,14 +108,14 @@ int main(int argc, const char **argv) } if (hashstdin) - hash_fd(0, type, write_object, NULL); + hash_fd(0, type, write_object, vpath); for (i = 0 ; i < argc; i++) { const char *arg = argv[i]; if (0 <= prefix_length) arg = prefix_filename(prefix, prefix_length, arg); - hash_object(arg, type, write_object); + hash_object(arg, type, write_object, vpath ? vpath : arg); } if (stdin_paths) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 6d505fa..dbe1f04 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -61,6 +61,10 @@ test_expect_success "Can't pass filenames as arguments with --stdin-paths" ' echo example | test_must_fail git hash-object --stdin-paths hello ' +test_expect_success "Can't use --path with --stdin-paths" ' + echo example | test_must_fail git hash-object --stdin-paths --path=foo +' + # Behavior push_repo @@ -93,6 +97,26 @@ test_expect_success 'git hash-object --stdin file1 <file0 first operates on file test "$obname1" = "$obname1new" ' +test_expect_success 'check that approperiate filter is invoke when --path is used' ' + echo fooQ | tr Q "\\015" > file0 && + cp file0 file1 && + echo "file0 -crlf" > .gitattributes && + echo "file1 crlf" >> .gitattributes && + git config core.autocrlf true && + file0_sha=$(git hash-object file0) && + file1_sha=$(git hash-object file1) && + test "$file0_sha" != "$file1_sha" && + 
path1_sha=$(git hash-object --path=file1 file0) && + path0_sha=$(git hash-object --path=file0 file1) && + test "$file0_sha" = "$path0_sha" && + test "$file1_sha" = "$path1_sha" && + path1_sha=$(cat file0 | git hash-object --path=file1 --stdin) && + path0_sha=$(cat file1 | git hash-object --path=file0 --stdin) && + test "$file0_sha" = "$path0_sha" && + test "$file1_sha" = "$path1_sha" && + git config --unset core.autocrlf +' + pop_repo for args in "-w --stdin" "--stdin -w"; do -- 1.6.0.rc1.58.gacdf ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH 5/5] add --no-filters option to git hash-object 2008-08-03 14:36 ` [PATCH 4/5] add --path option to " Dmitry Potapov @ 2008-08-03 14:36 ` Dmitry Potapov 0 siblings, 0 replies; 45+ messages in thread From: Dmitry Potapov @ 2008-08-03 14:36 UTC (permalink / raw) To: Junio C Hamano; +Cc: Alexander Litvinov, git, Eric Wong, Dmitry Potapov If this option is given then the file is hashed as is ignoring all filters specified in the configuration. This option is incompatible with --path and --stdin-paths options. Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> --- Documentation/git-hash-object.txt | 8 +++++++- hash-object.c | 17 +++++++++++++---- t/t1007-hash-object.sh | 24 ++++++++++++++++++++++++ 3 files changed, 44 insertions(+), 5 deletions(-) diff --git a/Documentation/git-hash-object.txt b/Documentation/git-hash-object.txt index fececbf..340e49c 100644 --- a/Documentation/git-hash-object.txt +++ b/Documentation/git-hash-object.txt @@ -9,7 +9,7 @@ git-hash-object - Compute object ID and optionally creates a blob from a file SYNOPSIS -------- [verse] -'git hash-object' [-t <type>] [-w] [--path=<file>] [--stdin] [--] <file>... +'git hash-object' [-t <type>] [-w] [--path=<file>|--no-filters] [--stdin] [--] <file>... 'git hash-object' [-t <type>] [-w] --stdin-paths < <list-of-paths> DESCRIPTION @@ -47,6 +47,12 @@ OPTIONS temporary files located outside of the working directory or files read from stdin. +--no-filters:: + If this option is given then the file is hashed as is ignoring + all filters specified in the configuration, including crlf + conversion. If the file is read from standard input then no + filters is always implied unless the --path option is given. 
+ Author ------ Written by Junio C Hamano <gitster@pobox.com> diff --git a/hash-object.c b/hash-object.c index b11f459..3070a3e 100644 --- a/hash-object.c +++ b/hash-object.c @@ -51,7 +51,7 @@ static void hash_stdin_paths(const char *type, int write_objects) } static const char * const hash_object_usage[] = { - "git hash-object [-t <type>] [-w] [--path=<file>] [--stdin] [--] <file>...", + "git hash-object [-t <type>] [-w] [--path=<file>|--no-filters] [--stdin] [--] <file>...", "git hash-object --stdin-paths < <list-of-paths>", NULL }; @@ -60,6 +60,7 @@ static const char *type; static int write_object; static int hashstdin; static int stdin_paths; +static int no_filters; static const char *vpath; static const struct option hash_object_options[] = { @@ -67,6 +68,7 @@ static const struct option hash_object_options[] = { OPT_BOOLEAN('w', NULL, &write_object, "write the object into the object database"), OPT_BOOLEAN( 0 , "stdin", &hashstdin, "read the object from stdin"), OPT_BOOLEAN( 0 , "stdin-paths", &stdin_paths, "read file names from stdin"), + OPT_BOOLEAN( 0 , "no-filters", &no_filters, "store file as is without filters"), OPT_STRING( 0 , "path", &vpath, "file", "process file as it were from this path"), OPT_END() }; @@ -98,9 +100,15 @@ int main(int argc, const char **argv) errstr = "Can't specify files with --stdin-paths"; else if (vpath) errstr = "Can't use --stdin-paths with --path"; + else if (no_filters) + errstr = "Can't use --stdin-paths with --no-filters"; + } + else { + if (hashstdin > 1) + errstr = "Multiple --stdin arguments are not supported"; + if (vpath && no_filters) + errstr = "Can't use --path with --no-filters"; } - else if (hashstdin > 1) - errstr = "Multiple --stdin arguments are not supported"; if (errstr) { error (errstr); @@ -115,7 +123,8 @@ int main(int argc, const char **argv) if (0 <= prefix_length) arg = prefix_filename(prefix, prefix_length, arg); - hash_object(arg, type, write_object, vpath ? 
vpath : arg); + hash_object(arg, type, write_object, + no_filters ? NULL : vpath ? vpath : arg); } if (stdin_paths) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index dbe1f04..12195a5 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -65,6 +65,14 @@ test_expect_success "Can't use --path with --stdin-paths" ' echo example | test_must_fail git hash-object --stdin-paths --path=foo ' +test_expect_success "Can't use --stdin-paths with --no-filters" ' + echo example | test_must_fail git hash-object --stdin-paths --no-filters +' + +test_expect_success "Can't use --path with --no-filters" ' + test_must_fail git hash-object --no-filters --path=foo +' + # Behavior push_repo @@ -117,6 +125,22 @@ test_expect_success 'check that approperiate filter is invoke when --path is use git config --unset core.autocrlf ' +test_expect_success 'check that --no-filters option works' ' + echo fooQ | tr Q "\\015" > file0 && + cp file0 file1 && + echo "file0 -crlf" > .gitattributes && + echo "file1 crlf" >> .gitattributes && + git config core.autocrlf true && + file0_sha=$(git hash-object file0) && + file1_sha=$(git hash-object file1) && + test "$file0_sha" != "$file1_sha" && + nofilters_file1=$(git hash-object --no-filters file1) && + test "$file0_sha" = "$nofilters_file1" && + nofilters_file1=$(cat file1 | git hash-object --stdin) && + test "$file0_sha" = "$nofilters_file1" && + git config --unset core.autocrlf +' + pop_repo for args in "-w --stdin" "--stdin -w"; do -- 1.6.0.rc1.58.gacdf ^ permalink raw reply related [flat|nested] 45+ messages in thread
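
The option rules that patches 4/5 and 5/5 add to hash-object.c can be read in isolation as two small decisions: which combinations are rejected, and which path (if any) is handed down for filter lookup. The sketch below restates them as standalone C. It is an illustration only: struct hash_opts, check_opts and filter_path are hypothetical names introduced here, while the error strings and the order of checks follow the patch text above.

```c
#include <stddef.h>

/* Hypothetical mirror of the option variables in the patched hash-object.c. */
struct hash_opts {
	int hashstdin;		/* how many times --stdin was given */
	int stdin_paths;	/* --stdin-paths */
	int no_filters;		/* --no-filters */
	const char *vpath;	/* value of --path=<file>, or NULL */
};

/* Returns NULL when the combination is accepted, else an error message. */
static const char *check_opts(const struct hash_opts *o, int nfiles)
{
	const char *errstr = NULL;

	if (o->stdin_paths) {
		if (o->hashstdin)
			errstr = "Can't use --stdin-paths with --stdin";
		else if (nfiles)
			errstr = "Can't specify files with --stdin-paths";
		else if (o->vpath)
			errstr = "Can't use --stdin-paths with --path";
		else if (o->no_filters)
			errstr = "Can't use --stdin-paths with --no-filters";
	} else {
		if (o->hashstdin > 1)
			errstr = "Multiple --stdin arguments are not supported";
		if (o->vpath && o->no_filters)
			errstr = "Can't use --path with --no-filters";
	}
	return errstr;
}

/*
 * Path used to choose filters for a file named on the command line.
 * NULL means "hash literally", matching the NULL-path convention that
 * index_fd gained earlier in this thread.
 */
static const char *filter_path(const struct hash_opts *o, const char *arg)
{
	if (o->no_filters)
		return NULL;
	return o->vpath ? o->vpath : arg;
}
```

For data read with --stdin the patch passes vpath directly, so standard input is hashed without filters unless --path is given.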
* Re: [PATCH] hash-object --no-filters
  2008-08-03  5:56 ` Dmitry Potapov
  2008-08-03 14:36 ` [PATCH 1/5] correct argument checking test for git hash-object Dmitry Potapov
@ 2008-08-03 20:44 ` Junio C Hamano
  1 sibling, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2008-08-03 20:44 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: Alexander Litvinov, git, Eric Wong

Very nicely done; will queue along with the 5 patch series.

Thanks.

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01  8:08 ` Junio C Hamano
  2008-08-01  9:24 ` Dmitry Potapov
@ 2008-08-01 11:11 ` Alexander Litvinov
  2008-08-01 12:36 ` Dmitry Potapov
  1 sibling, 1 reply; 45+ messages in thread
From: Alexander Litvinov @ 2008-08-01 11:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dmitry Potapov, git, Eric Wong

> To be able to synchronize efficiently both ways, you need to store
> files exactly as they were received from SVN; then there will be no
> problem with applying binary delta patches. All CRLF conversion should be
> done on checkout and checkin from/to the Git repository.

Sorry, I have lost the train of thought here.

1. We 'fetch' files from svn as is. Yes, we know that svn uses deltas to
   rebuild the original file.
2. We commit the file to git. Right here we use git-hash-object. As I
   understand it, we _have_ to do the CRLF->LF conversion here.
3. Some days later we will check out the file from git and will do the
   LF->CRLF conversion.

I thought this was the right workflow.
- We could also store the original file at step 2 somehow, to be able to use
  the delta at step 1.
- We can't skip the conversion at step 2. Otherwise git will store files with
  CRLF.

Am I wrong?

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01 11:11 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
@ 2008-08-01 12:36 ` Dmitry Potapov
  2008-08-04  3:10 ` Alexander Litvinov
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Potapov @ 2008-08-01 12:36 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: Junio C Hamano, git, Eric Wong

On Fri, Aug 1, 2008 at 3:11 PM, Alexander Litvinov
<litvinov2004@gmail.com> wrote:
>> To be able to synchronize efficiently both ways, you need to store
>> files exactly as they were received from SVN; then there will be no
>> problem with applying binary delta patches. All CRLF conversion should be
>> done on checkout and checkin from/to the Git repository.
>
> Sorry, I have lost the train of thought here.
>
> 1. We 'fetch' files from svn as is. Yes, we know that svn uses deltas to
> rebuild the original file.
> 2. We commit the file to git. Right here we use git-hash-object. As I
> understand it, we _have_ to do the CRLF->LF conversion here.

No, you should not do any conversion here. There are two reasons for that:

1. If you do, then you will not be able to apply binary patches later.

2. You do not really need it if the SVN repository has correct eol
settings, because all files that have svn:eol-style set to either 'native'
or 'LF' will have LF. Those that do not have svn:eol-style, or have it set
to another value, should not be subject to CRLF conversion at all.

So, I believe all files received from SVN should be stored as is. Import is
not about creating new commits; it is about getting history from another
repository as it is.

> 3. Some days later we will check out the file from git and will do the
> LF->CRLF conversion.

It is done only for files that do not have CRLF already.

>
> I thought this was the right workflow.
> - We could also store the original file at step 2 somehow, to be able to use
> the delta at step 1.
> - We can't skip the conversion at step 2. Otherwise git will store files with
> CRLF.

It is okay for Git to store CRLF, because you want to treat these as binary
files. If you want them to be treated as text, you should change
svn:eol-style to 'native' for those files in SVN, and then new versions of
these files will have the right line endings. That is how the SVN client
works.

The only problem is how to keep SVN's view of which files are binary and
which are text in sync with what Git thinks about them.

Dmitry

^ permalink raw reply [flat|nested] 45+ messages in thread
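Editorial note, not part of the thread: the first reason given in the message above can be illustrated with a small sketch. A binary delta is computed against the exact bytes SVN holds, so once a CRLF->LF filter has rewritten the stored copy, the base no longer matches what the delta expects. The snippet below only compares byte counts of a made-up two-line file; it does not use SVN's real delta format, and the variable names are hypothetical.

```shell
# Hypothetical file content as SVN would send it (CRLF line endings).
svn_base=$(mktemp)
git_stored=$(mktemp)
printf 'foo\r\nbar\r\n' > "$svn_base"

# What a CRLF->LF filter would store in Git instead.
tr -d '\r' < "$svn_base" > "$git_stored"

# Byte counts of the two versions (tr strips padding some wc builds emit).
svn_len=$(wc -c < "$svn_base" | tr -d ' ')
git_len=$(wc -c < "$git_stored" | tr -d ' ')

echo "bytes the SVN delta is based on: $svn_len"   # 10
echo "bytes after CRLF->LF conversion: $git_len"   # 8

rm -f "$svn_base" "$git_stored"
```

Any offset or length recorded against the 10-byte base is meaningless for the 8-byte converted copy, which is consistent with the "Delta source ended unexpectedly" failure reported at the start of the thread.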
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-08-01 12:36 ` Dmitry Potapov
@ 2008-08-04  3:10 ` Alexander Litvinov
  0 siblings, 0 replies; 45+ messages in thread
From: Alexander Litvinov @ 2008-08-04 3:10 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: git

> 2. You do not really need it if the SVN repository has correct eol
> settings, because all files that have svn:eol-style set to either 'native'
> or 'LF' will have LF. Those that do not have svn:eol-style, or have it set
> to another value, should not be subject to CRLF conversion at all.
>
> So, I believe all files received from SVN should be stored as is. Import is
> not about creating new commits; it is about getting history from another
> repository as it is.

I understand the idea now. Some of my files in the svn repo are missing the
eol-style property altogether. Will fix this :-)

Thanks for the help!

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH] git-svn now work with crlf convertion enabled.
  2008-07-31  5:43 ` [PATCH] git-svn now " Alexander Litvinov
  2008-07-31  5:57 ` Alexander Litvinov
@ 2008-08-04  0:48 ` Eric Wong
  1 sibling, 0 replies; 45+ messages in thread
From: Eric Wong @ 2008-08-04 0:48 UTC (permalink / raw)
  To: Alexander Litvinov; +Cc: Junio C Hamano, git

Alexander Litvinov <litvinov2004@gmail.com> wrote:
> Make git-svn work with crlf (or any other) file content conversion enabled.
>
> When we modify file content SVN can't apply its delta to it. To fix this
> situation I take the full file content from SVN as the next revision. This is
> dumb and slow but it works.

> +	my $ctx = SVN::Client->new();
> +	$ctx->cat($fh, $url, $rev);
> }

I know you've already (at least for now) pulled this patch, but I won't
accept anything that opens a second connection to the server.

With some svn:// servers I've intermittently seen git-svn get its
connection terminated whenever it opens a second connection (it happens
with parent-following).

git-svn used to do this more frequently, but most of those cases got
fixed (one remains, with parent-following).

Additionally, git-svnimport and older versions of git-svn used the
equivalent of $ctx->cat without deltas from the SVN::Ra object, so you
should be able to do something functionally equivalent without opening a
new socket.

As far as crlf issues with git-svn go, I'm blissfully ignorant of the
complexities behind what git (or svn, for that matter) does with crlf
conversions[1]. I'll be alright with any changes to git-svn that don't
modify existing behavior for crlf-ignorant users such as myself. I'll
trust Junio and other folks on the list to know and do what makes the
most sense here.

[1] I would much rather git didn't implement or care about crlf filters
at all, but maybe I'm just in a small minority.

-- 
Eric Wong

^ permalink raw reply [flat|nested] 45+ messages in thread
end of thread, other threads:[~2008-08-06 16:12 UTC | newest]

Thread overview: 45+ messages
2008-07-23  8:44 git-svn does not seems to work with crlf convertion enabled Alexander Litvinov
2008-07-23  9:18 ` Johannes Schindelin
2008-07-23 11:52 ` Alexander Litvinov
2008-07-23 12:57 ` Johannes Schindelin
2008-07-23 15:49 ` Avery Pennarun
2008-07-23 16:07 ` Johannes Schindelin
2008-07-24  3:13 ` Alexander Litvinov
2008-08-06 11:15 ` Petr Baudis
2008-08-06 12:35 ` Peter Harris
2008-08-06 12:43 ` Johannes Schindelin
2008-08-06 13:51 ` git-svn on MSysGit and why is it (going to be?) unsupported Petr Baudis
2008-08-06 15:23 ` Avery Pennarun
2008-08-06 16:11 ` git-svn does not seems to work with crlf convertion enabled Dmitry Potapov
2008-07-24 14:24 ` Dmitry Potapov
2008-07-24 14:40 ` Johannes Schindelin
2008-07-24 16:28 ` Avery Pennarun
2008-07-30  4:37 ` Alexander Litvinov
2008-07-31  5:43 ` [PATCH] git-svn now " Alexander Litvinov
2008-07-31  5:57 ` Alexander Litvinov
2008-07-31 10:45 ` Dmitry Potapov
2008-07-31 19:09 ` [RFC] hash-object --no-filters Dmitry Potapov
2008-08-01  3:23 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
2008-08-01  5:09 ` Junio C Hamano
2008-08-01  7:44 ` Dmitry Potapov
2008-08-01 11:27 ` Alexander Litvinov
2008-08-01  7:47 ` Dmitry Potapov
2008-08-01  8:08 ` Junio C Hamano
2008-08-01  9:24 ` Dmitry Potapov
2008-08-01 19:42 ` Junio C Hamano
2008-08-01 22:09 ` Dmitry Potapov
2008-08-01 22:14 ` Junio C Hamano
2008-08-01 23:10 ` Dmitry Potapov
2008-08-02 17:28 ` [PATCH] hash-object --no-filters Junio C Hamano
2008-08-03  5:42 ` Dmitry Potapov
2008-08-03  5:56 ` Dmitry Potapov
2008-08-03 14:36 ` [PATCH 1/5] correct argument checking test for git hash-object Dmitry Potapov
2008-08-03 14:36 ` [PATCH 2/5] correct usage help string for git-hash-object Dmitry Potapov
2008-08-03 14:36 ` [PATCH 3/5] use parse_options() in git hash-object Dmitry Potapov
2008-08-03 14:36 ` [PATCH 4/5] add --path option to " Dmitry Potapov
2008-08-03 14:36 ` [PATCH 5/5] add --no-filters " Dmitry Potapov
2008-08-03 20:44 ` [PATCH] hash-object --no-filters Junio C Hamano
2008-08-01 11:11 ` [PATCH] git-svn now work with crlf convertion enabled Alexander Litvinov
2008-08-01 12:36 ` Dmitry Potapov
2008-08-04  3:10 ` Alexander Litvinov
2008-08-04  0:48 ` Eric Wong