From: Junio C Hamano <gitster@pobox.com>
To: Mark Junker <mjscod@web.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] [RFC] Design for pathname encoding gitattribute [RESEND]
Date: Tue, 22 Jan 2008 01:16:15 -0800
Message-ID: <7vbq7erzj4.fsf@gitster.siamese.dyndns.org>
In-Reply-To: <fn48bp$ff8$1@ger.gmane.org> (Mark Junker's message of "Tue, 22 Jan 2008 09:09:33 +0100")
Mark Junker <mjscod@web.de> writes:
> Just to sum up what you wrote and to be sure that I understand you
> correctly:
>
> Let's have two encodings:
> - Encoding for path names stored in the repository
> - Encoding for path names from/to file systems
>
> Do conversion only if they are different. Both encodings are configurable.
Not really.
1. Encoding for the project does not have to be specified at
all. The project participants are expected to know about it
out of band.
2. Conversion for path names between filesystems and the
project (i.e. "paths in tree objects") can be specified per
repository (i.e. "a particular clone of the project"). We
could even allow the conversion function to be different
per-path-component, but I suspect that would be a far-future
addition that nobody would use in practice.
3. Suggest use of UTF-8-NFC as the project encoding as a BCP,
but never enforce it. It is a responsibility of the owner
of the particular repository to make sure that the
conversions used in a particular repository (again, "a
particular clone of the project") produce the desired
encoding in the tree objects.
But please take these with a moderately large grain of salt, as
I was more or less handwaving and pretending to know what I was
talking about ;-). I think this should work in theory, but I at
the same time suspect that there are many more places than just
readdir(3) that need to be wrapped if we take this approach, and
the intrusiveness factor might make this infeasible in practice.
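To make point 2 and the readdir(3) concern a bit more concrete, here is a minimal Python sketch. It is not git code. The names fs_to_tree, tree_to_fs and listdir_converted are hypothetical, and the NFD-on-disk / NFC-in-tree pairing is only an assumed example of what one particular clone might configure.

```python
import os
import unicodedata


def fs_to_tree(name):
    """Hypothetical per-repository conversion: filesystem name -> tree name.

    Assumes this filesystem hands back decomposed (NFD) names and the
    project wants composed (NFC) names in its tree objects.
    """
    return unicodedata.normalize("NFC", name)


def tree_to_fs(name):
    """Hypothetical inverse: tree name -> the form this filesystem uses."""
    return unicodedata.normalize("NFD", name)


def listdir_converted(path):
    """Stand-in for a wrapped readdir(3).

    Every name read from disk passes through fs_to_tree() before the
    rest of the program sees it.
    """
    return sorted(fs_to_tree(entry) for entry in os.listdir(path))
```

The wrapper covers only directory listing. Any other place that takes a path from, or hands a path to, the filesystem would need the same treatment, which is where the intrusiveness worry comes from.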
The difference between your version and my 1. and 2. is very
subtle, but comes primarily from my desire not to have to use
the word "canonical". Yours defines "this canonical encoding is
used in the repository, and we convert back and forth to that
local encoding", as opposed to my saying "here are to and from
conversion functions". The latter is more in line with how we
define smudge/clean filters for blob contents conversion, in
that the "encoding" used in in-repository blob does not have to
even have a name.
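The distinction can be sketched as two configuration shapes. This is an illustration only, in Python rather than git's C, and the class names NamedEncodings and PathFilter and the sample filters are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class NamedEncodings:
    """Two named encodings; convert only if they differ."""
    repo_encoding: str
    fs_encoding: str

    def to_tree(self, raw: bytes) -> bytes:
        if self.repo_encoding == self.fs_encoding:
            return raw
        return raw.decode(self.fs_encoding).encode(self.repo_encoding)

    def to_fs(self, raw: bytes) -> bytes:
        if self.repo_encoding == self.fs_encoding:
            return raw
        return raw.decode(self.repo_encoding).encode(self.fs_encoding)


@dataclass
class PathFilter:
    """An opaque pair of functions, in the spirit of clean/smudge."""
    to_tree: Callable[[bytes], bytes]
    to_fs: Callable[[bytes], bytes]


# A named-encoding setup can always be expressed as a function pair ...
named = NamedEncodings(repo_encoding="utf-8", fs_encoding="latin-1")
as_filter = PathFilter(to_tree=named.to_tree, to_fs=named.to_fs)

# ... but a function pair need not correspond to any named encoding.
# This made-up one folds names to lower case on the way into the tree.
unnamed = PathFilter(to_tree=bytes.lower, to_fs=lambda raw: raw)
```

With the function-pair shape, nothing in the repository has to say what the in-tree form is called; the pair of conversions is the whole definition.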