git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Johannes Sixt <j.sixt@viscovery.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, Orgad and Raizel Shaneh <orgads@gmail.com>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Subject: Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes
Date: Fri, 07 Sep 2012 00:30:53 +0200	[thread overview]
Message-ID: <5049241D.10604@alum.mit.edu> (raw)
In-Reply-To: <50471003.9010207@viscovery.net>

On 09/05/2012 10:40 AM, Johannes Sixt wrote:
> Am 9/4/2012 10:14, schrieb mhagger@alum.mit.edu:
>> From: Michael Haggerty <mhagger@alum.mit.edu>
>>
>> These tests already pass, but make sure they don't break in the
>> future.
>>
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>
>> It would be great if somebody would check whether these tests pass on
>> Windows, and if not, give me a tip about how to fix them.
>>
>>  t/t0000-basic.sh | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
>> index d929578..3c75e97 100755
>> --- a/t/t0000-basic.sh
>> +++ b/t/t0000-basic.sh
>> @@ -468,6 +468,17 @@ test_expect_success 'real path works on absolute paths' '
>>  	test "$d/$nopath" = "$(test-path-utils real_path "$d/$nopath")"
>>  '
>>  
>> +test_expect_success 'real path removes extra slashes' '
>> +	nopath="hopefully-absent-path" &&
>> +	test "/" = "$(test-path-utils real_path "///")" &&
>> +	test "/$nopath" = "$(test-path-utils real_path "///$nopath")" &&
> 
> Same here: test-path-utils operates on a DOS-style absolute path.

ACK.

> Furthermore, as Junio pointed out elsewhere, it is desirable that slashes
> (dir-separators) at the beginning are not collapsed if there are exactly
> two of them. Three or more can be collapsed to a single slash.

The empirical behavior of real_path() on Linux is (assuming "/tmp"
exists and "/foo" does not)

                         Before my changes       After my changes
real_path("/tmp")        /tmp                    /tmp
real_path("//tmp")       /tmp                    /tmp
real_path("/foo")        $(pwd)/foo              /foo
real_path("//foo")       /foo                    /foo

real_path("/tmp/bar")    /tmp/bar                /tmp/bar
real_path("//tmp/bar")   /tmp/bar                /tmp/bar
real_path("/foo/bar")    --dies--                --dies--
real_path("//foo/bar")   --dies--                --dies--

So I think that my changes makes things strictly better (albeit probably
not perfect).

real_path() relies on chdir() / getcwd() to canonicalize the path,
except for one case:

If the specified path is not itself a directory, then it strips off the
last component of the name, tries chdir() / getcwd() on the shortened
path, then pastes the last component on again.  The stripping off is
done by find_last_dir_sep(), which literally just looks for the last '/'
(or for Windows also '\') in the string.  Please note that the pasting
back is done using '/' as a separator.

So I think that the only problematic case is a path that only includes a
single group of slashes, like "//foo" or maybe "c:\\foo", and also is
not is_directory() [1].  What should be the algorithm for such cases,
and how should it depend on the platform?  (And does it really matter?
Top-level Linux paths are usually directories.  Are Windows-style
network shares also considered directories in the sense of is_directory()?)

I will make an easy improvement: not to remove *any* slashes when
stripping the last path component from the end of the path (letting
chdir() deal with them).  This results in the same results for Linux and
perhaps more hope under Windows:

On Linux: "//foo" -> (chdir("//"), "foo") -> ("/", "foo") -> "/foo"

On Windows: "//foo" -> (chdir("//"), "foo") -> (?, "foo") -> ?
(what is the result of chdir("//") then getcwd()?)

On Windows: "c:\foo" -> (chdir("c:\"), "foo") -> (?, "foo") -> ?
(what is the result of chdir("c:\") then getcwd()?)

>> +	# We need an existing top-level directory to use in the
>> +	# remaining tests.  Use the top-level ancestor of $(pwd):
>> +	d=$(pwd -P | sed -e "s|^\(/[^/]*\)/.*|\1|") &&
> 
> Same here: Account for the drive letter-colon.

ACK

>> +	test "$d" = "$(test-path-utils real_path "//$d///")" &&
>> +	test "$d/$nopath" = "$(test-path-utils real_path "//$d///$nopath")"
> 
> Since $d is a DOS-style path on Windows, //$d means something entirely
> different than $d. You should split the test in two: One test checks that
> slashes at the end or before $nopath are collapsed (this must work on
> Windows as well), and another test protected by POSIX prerequisite to
> check that slashes at the beginning are collapsed.

Done.

Thanks again for your help.

Michael

[1] If there is more than one group of slashes in the name, like
"//foo//bar" or "c:\\foo\\bar" then:

* All but the last groups of slashes are handled by
  chdir("//foo/")/getcwd() and presumably de-duplicated or not as
  appropriate

* The extras from the last group of slashes are trailing slashes in
  the string passed to chdir() and are therefore removed.

so everything should be OK.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

  reply	other threads:[~2012-09-06 22:31 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-04  8:14 [PATCH 0/7] Fix some bugs in abspath.c mhagger
2012-09-04  8:14 ` [PATCH 1/7] t0000: verify that absolute_path() fails if passed the empty string mhagger
2012-09-04  8:14 ` [PATCH 2/7] absolute_path(): reject " mhagger
2012-09-04  8:14 ` [PATCH 3/7] t0000: verify that real_path() fails if passed " mhagger
2012-09-04  8:14 ` [PATCH 4/7] real_path(): reject " mhagger
2012-09-04  8:14 ` [PATCH 5/7] t0000: verify that real_path() works correctly with absolute paths mhagger
2012-09-05  8:40   ` Johannes Sixt
2012-09-06 21:11     ` Michael Haggerty
2012-09-06 23:08   ` Junio C Hamano
2012-09-09  4:31     ` Michael Haggerty
2012-09-09  4:50       ` Junio C Hamano
2012-09-09  5:27         ` Junio C Hamano
2012-09-04  8:14 ` [PATCH 6/7] real_path(): properly handle nonexistent top-level paths mhagger
2012-09-04  8:14 ` [PATCH 7/7] t0000: verify that real_path() removes extra slashes mhagger
2012-09-04 18:19   ` Junio C Hamano
2012-09-05 10:52     ` Nguyen Thai Ngoc Duy
     [not found]       ` <CAGHpTB+LipLt3CgVt5O0s9xj5HHf9Y5z9QWEy+FKYH8s4O7Q2w@mail.gmail.com>
2012-09-05 11:13         ` Nguyen Thai Ngoc Duy
2012-09-06  3:23       ` Junio C Hamano
2012-09-06  5:44         ` Nguyen Thai Ngoc Duy
2012-09-06 17:34           ` Junio C Hamano
2012-09-07  1:18             ` Nguyen Thai Ngoc Duy
2012-09-05  8:40   ` Johannes Sixt
2012-09-06 22:30     ` Michael Haggerty [this message]
2012-09-04 18:08 ` [PATCH 0/7] Fix some bugs in abspath.c Junio C Hamano
2012-09-04 19:03   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5049241D.10604@alum.mit.edu \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j.sixt@viscovery.net \
    --cc=orgads@gmail.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).