From: Junio C Hamano <gitster@pobox.com>
To: "Torsten Bögershausen" <tboegi@web.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v7] git on Mac OS and precomposed unicode
Date: Mon, 02 Jul 2012 15:21:39 -0700
Message-ID: <7vr4styfbg.fsf@alter.siamese.dyndns.org>
In-Reply-To: <201207022232.51737.tboegi@web.de> ("Torsten Bögershausen"'s message of "Mon, 2 Jul 2012 22:32:50 +0200")

Torsten Bögershausen <tboegi@web.de> writes:

> +core.precomposedunicode::
> +	...
> +	When false, file names are handled fully transparent by git, which means
> +	that file names are stored as decomposed unicode in the repository.

I do not think it means any such thing.

We just take whatever the platform throws at us and shove that in
the repository.  On MacOS X with HFS+, it may be decomposed UTF-8,
but we do not even try to ensure everything (like the path added by
somebody else on a BSD system in a commit that you fetched) is in a
particular encoding.
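
To make that concrete, here is a minimal sketch (not from the patch;
same_bytes() and check_spellings() are made-up names for illustration)
of how the same visible name can arrive as two different byte
sequences.  Without normalization, a byte-wise comparison treats them
as two different paths.

```c
#include <assert.h>
#include <string.h>

/* "ä" as the single precomposed code point U+00E4, encoded in UTF-8. */
static const char precomposed[] = "\xc3\xa4";

/* "ä" as "a" followed by U+0308 COMBINING DIAERESIS, encoded in UTF-8. */
static const char decomposed[] = "a\xcc\x88";

/*
 * Hypothetical helper, not part of git: returns 1 when two path
 * names are identical byte for byte, which is the only notion of
 * equality available when nothing normalizes the names.
 */
static int same_bytes(const char *a, const char *b)
{
	return !strcmp(a, b);
}

/* Sanity checks on the two spellings of the same visible name. */
static void check_spellings(void)
{
	assert(strlen(precomposed) == 2);
	assert(strlen(decomposed) == 3);
	assert(!same_bytes(precomposed, decomposed));
}
```

A path recorded with the first spelling on one system and reported
with the second by another file system would look like two unrelated
entries to a comparison like this one.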

> diff --git a/Makefile b/Makefile
> index f62ca2a..55ceb10 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -607,6 +607,7 @@ LIB_H += compat/bswap.h
>  LIB_H += compat/cygwin.h
>  LIB_H += compat/mingw.h
>  LIB_H += compat/obstack.h
> +LIB_H += compat/precomposed_utf8.h

Micronit.  Shouldn't these all be called "precompose_utf8"
throughout the patch?

We are asking Git "please normalize any UTF-8 pathnames by
precomposing them" when we give the -DPRECOMPOSE_UNICODE C-preprocessor
macro, and compat/precompose_utf8.[ch] are there to implement the
machinery to do so.

> diff --git a/compat/precomposed_utf8.c b/compat/precomposed_utf8.c
> new file mode 100644
> index 0000000..14bb0ce
> --- /dev/null
> +++ b/compat/precomposed_utf8.c
> @@ -0,0 +1,189 @@
> +/* Converts filenames from decomposed unicode into precomposed unicode.
> +   Used on MacOS X.
> +*/

Micronit.

	/*
	 * A multi-line comment begins with slash, asterisk, newline,
	 * and ends with a run of SP to align the asterisk, then
	 * asterisk, slash and newline, like this.
	 */

> +#define __PRECOMPOSED_UNICODE_C__
> +
> +#include "cache.h"
> +#include "utf8.h"
> +#include "precomposed_utf8.h"


> +#include "stdio.h"

You shouldn't need "stdio.h" as you are including "git-compat-util.h"
via "cache.h".

> diff --git a/compat/precomposed_utf8.h b/compat/precomposed_utf8.h
> new file mode 100644
> index 0000000..708a1c6
> --- /dev/null
> +++ b/compat/precomposed_utf8.h
> ...
> +#ifndef __PRECOMPOSED_UNICODE_C__
> +#define dirent dirent_prec_psx
> +#define opendir(n) precomposed_utf8_opendir(n)
> +#define readdir(d) precomposed_utf8_readdir(d)
> +#define closedir(d) precomposed_utf8_closedir(d)
> +#define DIR PREC_DIR
> +#endif /* __PRECOMPOSED_UNICODE_C__ */

Hrm, this is not wrong per se, but looks somewhat unwieldy.
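
For readers unfamiliar with the pattern, a reduced sketch of the
redirection follows.  It is an illustration only: wrapped_readdir(),
wrapped_calls and count_entries() are hypothetical names, and only
one function is redirected, whereas the quoted header also renames
the dirent and DIR types.  Here the wrapper is simply defined before
the macro, which plays the role the __PRECOMPOSED_UNICODE_C__ guard
plays in the patch: the implementation still sees the real readdir().

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Hypothetical counter, only here to make the redirection observable. */
static int wrapped_calls;

/*
 * Hypothetical wrapper standing in for precomposed_utf8_readdir().
 * It is defined before the macro below, so the readdir() it calls
 * is still the one supplied by the platform.
 */
static struct dirent *wrapped_readdir(DIR *dir)
{
	wrapped_calls++;
	return readdir(dir);
}

/* Drop any platform-provided macro of the same name before redefining. */
#undef readdir
/* From here on, every readdir() in this file goes through the wrapper. */
#define readdir(d) wrapped_readdir(d)

/* Counts directory entries; the caller is unaware of the redirection. */
static int count_entries(const char *path)
{
	DIR *dir;
	int n = 0;

	assert(path);
	dir = opendir(path);
	if (!dir)
		return -1;
	while (readdir(dir))
		n++;
	closedir(dir);
	return n;
}
```

Every caller after the #define is rerouted without being edited,
which is convenient, but the more names are redirected this way
(functions and types alike), the harder it becomes to tell what a
given call site really invokes.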

> +#define  __PRECOMPOSED_UNICODE_H__
> +#endif /* __PRECOMPOSED_UNICODE_H__ */

> diff --git a/utf8.c b/utf8.c
> index 8acbc66..a544f15 100644
> --- a/utf8.c
> +++ b/utf8.c
> @@ -433,19 +433,12 @@ int is_encoding_utf8(const char *name)
> ...
> @@ -478,6 +470,20 @@ char *reencode_string(const char *in, const char *out_encoding, const char *in_e
>  			break;
>  		}
>  	}
> +	return out;
> +}
> +
> +char *reencode_string(const char *in, const char *out_encoding, const char *in_encoding)
> +{
> +	iconv_t conv;
> +	char *out;
> +
> +	if (!in_encoding)
> +		return NULL;
> +	conv = iconv_open(out_encoding, in_encoding);
> +	if (conv == (iconv_t) -1)
> +		return NULL;
> +	out = reencode_string_iconv(in, strlen(in), conv);
>  	iconv_close(conv);
>  	return out;
>  }

Much nicer ;-).
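
The shape of that split, as a standalone sketch under stated
assumptions: convert_with() and convert() are hypothetical stand-ins
for reencode_string_iconv() and reencode_string(), the output buffer
is assumed to need at most four bytes per input byte, and iconv() is
assumed to take a non-const input pointer as it does in POSIX.  This
is not the code in utf8.c.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patch's reencode_string_iconv():
 * converts insz bytes with an already opened descriptor and returns
 * a NUL-terminated malloc()ed string, or NULL on failure.
 * Assumption: the output needs at most 4 bytes per input byte;
 * anything larger makes iconv() fail and we return NULL.
 */
static char *convert_with(const char *in, size_t insz, iconv_t conv)
{
	size_t outsz = insz * 4, outleft = outsz;
	char *out = malloc(outsz + 1);
	char *outpos = out;
	char *inpos = (char *)in;	/* iconv() wants a non-const pointer */

	if (!out)
		return NULL;
	if (iconv(conv, &inpos, &insz, &outpos, &outleft) == (size_t)-1) {
		free(out);
		return NULL;
	}
	assert(outpos <= out + outsz);
	*outpos = '\0';
	return out;
}

/* Open, convert, close: the same outline as the new reencode_string(). */
static char *convert(const char *in, const char *to, const char *from)
{
	iconv_t conv;
	char *out;

	if (!from)
		return NULL;
	conv = iconv_open(to, from);
	if (conv == (iconv_t)-1)
		return NULL;
	out = convert_with(in, strlen(in), conv);
	iconv_close(conv);
	return out;
}
```

The benefit of the split is that a caller converting many strings,
such as a readdir() wrapper, can open the descriptor once and call
the inner function repeatedly instead of paying for iconv_open() on
every name.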
