From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Daniel Lyubomirov <daniel@digitalus.bg>, git@vger.kernel.org
Subject: Re: Bug: problem with file named with dash character
Date: Wed, 27 Jun 2012 15:17:54 -0700 [thread overview]
Message-ID: <7v7gus7625.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <20120627210039.GA2292@sigill.intra.peff.net> (Jeff King's message of "Wed, 27 Jun 2012 17:00:39 -0400")
Jeff King <peff@peff.net> writes:
> From a cursory look, these definitely go in the right direction. I like
> how the third one was able to rip populate_from_stdin entirely out of
> diff.c.
>
> I suspect we could get by without even the nongit_stdin flag you added;
> the only place which uses it is diff_fill_sha1_info, but theoretically
> we don't even need it there; we could just index_mem the file contents
> we get via diff_populate_filespec, and the stdin contents will
> already be there.
Probably that is a sane thing to do.
And as you hinted, we definitely need to give the "the copy in the
filespec->data is the only one; do not discard until we are really
done with it" semantics to the bit, or introduce a separate bit that
is not necessarily related to the "this came from the standard input"
bit. I offhand do not know what data source we would later add
that needs such a treatment. "git diff http://example.com/{foo,bar}",
perhaps? The "do not discard" bit at that point may need to learn
to swap it out to a temporary file or something when it happens.
> Right now we call index_path for worktree files, but I don't really see
> much point; we have to read the whole data either way, and
> populate_filespec should be mmap-ing them for us.
>
>> - We say on the "diff --git" header uglyness like "a/-", "b/-";
>> likewise in the metainfo;
>
> I'd consider changing the path to "/dev/stdin" for this case. It doesn't
> exist on some platforms, of course, but neither does /dev/null, which we
> use similarly.
Sensible.
>> - We show on the "index" header "0*" value for these entries, even
>> though we should be able to compute it (after all we do so for
>> files on disk in a non-git directory);
>
> The index_mem I mentioned above would fix that.
Yes.
>> - We still apply attributes and textconv as if we are dealing with
>> a regular file "-" at the root level.
>
> I think that's bad. I wonder if it should have "*" attributes applied to
> it or not. While I can see it being convenient in some cases, I think it
> makes the rules confusingly complex.
It is likely that the crlf conversion on Dos systems wants to be
applied regardless.
This is unrelated to the "standard input confusion" issue, but there
are two more things coming either from the way the no-index code is
done or from the way it is bolted onto git.
With the current code, this:
mkdir foo bar
echo hello >foo/1
echo bye >bar/2
git diff --no-index foo bar
shows differences between a/foo/1 and b/bar/1, as the no-index code
records foo/1 and bar/1 as the paths in the filespec for them.
But if you are comparing two directory hierarchies, it is a lot more
likely that you would want to see corresponding files in these two
directories. In other words, the patch is better shown as
differences between a/1 and b/1 (the code makes the core compare
foo/1 and bar/2 after all). This of course requires no-index to
differentiate the logical pathname (i.e. "this is the path inside
collection a/ (or b/)") and the physical location from which the
contents are read. Such a differentiation would allow us to also do
renames and rename classifications much more sanely. We had to add
DIFF_PAIR_RENAME() and filepair->renamed_pair only to support this
codepath because of this misdesign. We could have just run strcmp()
between the logical pathname of one/two members of the filepair to
see if the pair was renamed if it was done that way.
And the way diff-no-index.c::queue_diff() walks two directories to
grab paths from them in parallel and incrementally means that the
filesystem walking code cannot be reused for something like this:
git diff master:Documentation /var/tmp/docs
to compare a hierarchy represented with a tree object with another
hierarchy stored in the filesystem outside git's control. A natural
way to do so would be to grab a set of paths from /var/tmp/docs and
have that set compared against the other set that comes from the tree,
and the "grab a set of paths from /var/tmp/docs" machinery can be
used twice to implement the current
git diff --no-index /var/tmp/foo /var/tmp/bar
which would have been a lot cleaner implementation.
Oh well.
next prev parent reply other threads:[~2012-06-27 22:18 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <52ae7682-3e9a-4b52-bec1-08ba3aadffc0@office.digitalus.nl>
2012-06-27 7:32 ` Bug: problem with file named with dash character Daniel Lyubomirov -|- Digitalus Bulgaria
2012-06-27 9:57 ` faux
2012-06-27 18:28 ` Junio C Hamano
2012-06-27 19:52 ` Jeff King
2012-06-27 20:25 ` Jeff King
2012-06-27 20:27 ` Junio C Hamano
2012-06-27 20:33 ` Junio C Hamano
2012-06-27 20:35 ` Junio C Hamano
2012-06-27 20:39 ` Junio C Hamano
2012-06-27 20:48 ` Junio C Hamano
2012-06-27 21:00 ` Jeff King
2012-06-27 22:17 ` Junio C Hamano [this message]
2012-06-27 22:41 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7v7gus7625.fsf@alter.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=daniel@digitalus.bg \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).