All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Yi EungJun <semtlenori@gmail.com>
Cc: Git List <git@vger.kernel.org>,
	Yi EungJun <eungjun.yi@navercorp.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Jeff King <peff@peff.net>,
	Peter Krefting <peter@softwolves.pp.se>
Subject: Re: [PATCH v4 1/1] http: Add Accept-Language header if possible
Date: Mon, 21 Jul 2014 12:01:39 -0700	[thread overview]
Message-ID: <xmqqwqb6ilik.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1405792730-13539-2-git-send-email-eungjun.yi@navercorp.com> (Yi EungJun's message of "Sun, 20 Jul 2014 02:58:50 +0900")

Yi EungJun <semtlenori@gmail.com> writes:

> From: Yi EungJun <eungjun.yi@navercorp.com>
>
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
> ---
>  http.c                     | 134 +++++++++++++++++++++++++++++++++++++++++++++
>  remote-curl.c              |   2 +
>  t/t5550-http-fetch-dumb.sh |  31 +++++++++++
>  3 files changed, 167 insertions(+)
>
> diff --git a/http.c b/http.c
> index 3a28b21..ed4e8e1 100644
> --- a/http.c
> +++ b/http.c
> @@ -67,6 +67,8 @@ static struct curl_slist *no_pragma_header;
>  
>  static struct active_request_slot *active_queue_head;
>  
> +static struct strbuf *cached_accept_language = NULL;

Please drop " = NULL" that is unnecessary for BSS.

> @@ -512,6 +514,9 @@ void http_cleanup(void)
>  		cert_auth.password = NULL;
>  	}
>  	ssl_cert_password_required = 0;
> +
> +	if (cached_accept_language)
> +		strbuf_release(cached_accept_language);
>  }



> @@ -983,6 +988,129 @@ static void extract_content_type(struct strbuf *raw, struct strbuf *type,
>  		strbuf_addstr(charset, "ISO-8859-1");
>  }
>  
> +/*
> + * Guess the user's preferred languages from the value in LANGUAGE environment
> + * variable and LC_MESSAGES locale category.
> + *
> + * The result can be a colon-separated list like "ko:ja:en".
> + */
> +static const char *get_preferred_languages(void)
> +{
> +	const char *retval;
> +
> +	retval = getenv("LANGUAGE");
> +	if (retval && *retval)
> +		return retval;
> +
> +	retval = setlocale(LC_MESSAGES, NULL);
> +	if (retval && *retval &&
> +		strcmp(retval, "C") &&
> +		strcmp(retval, "POSIX"))
> +		return retval;
> +
> +	return NULL;
> +}
> +
> +/*
> + * Get an Accept-Language header which indicates user's preferred languages.
> + *
> + * Examples:
> + *   LANGUAGE= -> ""
> + *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
> + *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
> + *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
> + *   LANGUAGE= LANG=C -> ""
> + */
> +static struct strbuf *get_accept_language(void)
> +{
> +	const char *lang_begin, *pos;
> +	int q, max_q;
> +	int num_langs;
> +	int decimal_places;
> +	int is_codeset_or_modifier = 0;
> +	static struct strbuf buf = STRBUF_INIT;
> +	struct strbuf q_format_buf = STRBUF_INIT;
> +	char *q_format;
> +
> +	if (cached_accept_language)
> +		return cached_accept_language;
> +
> +	lang_begin = get_preferred_languages();
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (!(lang_begin && *lang_begin)) {

It is not wrong per-se, but given how hard get_preferred_languages()
tries not to return a pointer to an empty string, this seems a bit
overly defensive to me.

> +		cached_accept_language = &buf;
> +		return cached_accept_language;

It is somewhat unconventional to have a static pointer outside to
point at a singleton and then have a singleton actually as a static
structure.  I would have done without "buf" in this function and
instead started this function like so:

	if (cached_accept_language)
        	return cached_accept_language;

	cached_accept_language = xmalloc(sizeof(struct strbuf));
        strbuf_init(cached_accept_language, 0);
        lang_begin =  get_preferred_languages();
	if (!lang_begin)
		return cached_accept_language;

> +	}
> +
> +	/* Count number of preferred lang_begin to decide precision of q-factor */
> +	for (num_langs = 1, pos = lang_begin; *pos; pos++)
> +		if (*pos == ':')
> +			num_langs++;
> +
> +	/* Decide the precision for q-factor on number of preferred lang_begin. */
> +	num_langs += 1; /* for '*' */


> +	decimal_places = 1 + (num_langs > 10) + (num_langs > 100);

What if you got 60000 languages ;-)?  I do not think we want to bend
backwards and make the code list all 60000 of them, assigning a
unique and decreasing q value to each of them, forming an overlong
Accept-Language header, but at the same time, I do not think we want
to show nonsense output because we compute the precision incorrectly
here.

> +	strbuf_addf(&q_format_buf, "; q=0.%%0%dd", decimal_places);
> +	q_format = strbuf_detach(&q_format_buf, NULL);

q_format_buf is an overkill use of strbuf, isn't it?  Just

	char q_format_buf[32];
	sprintf(q_format_buf, ";q=0.%%0%d", decimal_places);

or something should be more than sufficient, no?

> +	for (max_q = 1; decimal_places-- > 0;) max_q *= 10;

As you have to do one loop like this that amounts to computing log10
of num_langs, why not compute decimal_places the same way while at
it?  It may also make sense to cap the number of languages to avoid
spitting out overly long Accept-Language header with practicaly
useless list of many languages.  That is, something along the lines
of ... (note that I may very well have off-by-one or off-by-ten
errors here you may need to tweak to get right):

        if (MAX_LANGS < num_langs)
        	num_langs = MAX_LANGS;
        for (max_q = 1, decimal_places = 1;
             max_q < num_langs;
             decimal_places++, max_q *= 10)
             ;

If you are to use the MAX_LANGS cap, the main loop would also need
to pay attention to it by breaking out of the loop early before you
reach the end of the string, of course.

> +	q = max_q;
> +
> +	strbuf_addstr(&buf, "Accept-Language: ");
> +
> +	/*
> +	 * Convert a list of colon-separated locale values [1][2] to a list of
> +	 * comma-separated language tags [3] which can be used as a value of
> +	 * Accept-Language header.
> +	 *
> +	 * [1]: http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html
> +	 * [2]: http://www.gnu.org/software/libc/manual/html_node/Using-gettextized-software.html
> +	 * [3]: http://tools.ietf.org/html/rfc7231#section-5.3.5
> +	 */
> +	for (pos = lang_begin; ; pos++) {
> +		if (*pos == ':' || !*pos) {
> +			/* Ignore if this character is the first one. */
> +			if (pos == lang_begin)
> +				continue;

By doing this "ignore empty" here, but not doing the same when you
count num_langs, are you potentially miscounting num_langs?

> +			is_codeset_or_modifier = 0;
> +
> +			/* Put a q-factor only if it is less than 1.0. */
> +			if (q < max_q)

... is it the same thing as "do not do this for the first round, but
do so for all the other round"?

> +				strbuf_addf(&buf, q_format, q);
> +
> +			if (q > 1)

Hmm, I am puzzled.  C this ever be an issue (unless you have
off-by-one error or you add "cap num_langs to MAX_LANGS", that is)?

> +				q--;

> +			/* NULL pos means this is the last language. */
> +			if (*pos)
> +				strbuf_addstr(&buf, ", ");
> +			else
> +				break;
> +
> +		} else if (is_codeset_or_modifier)
> +			continue;
> +		else if (*pos == '.' || *pos == '@') /* Remove .codeset and @modifier. */
> +			is_codeset_or_modifier = 1;
> +		else
> +			strbuf_addch(&buf, *pos == '_' ? '-' : *pos);
> +	}
> +
> +	/* Don't add Accept-Language header if no language is preferred. */
> +	if (q >= max_q) {

Can q go over max_q, or is it "q may be max_q"?  In other words, is
this essentially saying "if we did not find any language in the
preferred languages list"?

> +		cached_accept_language = &buf;
> +		return cached_accept_language;
> +	}
> +
> +	/* Add '*' with minimum q-factor greater than 0.0. */
> +	strbuf_addstr(&buf, ", *");
> +	strbuf_addf(&buf, q_format, 1);
> +
> +	cached_accept_language = &buf;
> +	return cached_accept_language;
> +}
> +
>  /* http_request() targets */
>  #define HTTP_REQUEST_STRBUF	0
>  #define HTTP_REQUEST_FILE	1
> @@ -995,6 +1123,7 @@ static int http_request(const char *url,
>  	struct slot_results results;
>  	struct curl_slist *headers = NULL;
>  	struct strbuf buf = STRBUF_INIT;
> +	struct strbuf* accept_language;

As we write in C, not C++, our asterisks stick to the variable, not
the type.

  reply	other threads:[~2014-07-21 19:02 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-19 17:58 [PATCH v4 0/1] http: Add Accept-Language header if possible Yi EungJun
2014-07-19 17:58 ` [PATCH v4 1/1] " Yi EungJun
2014-07-21 19:01   ` Junio C Hamano [this message]
2014-08-03  7:35     ` Yi, EungJun
2014-12-02 12:12 ` [PATCH v5 0/1] " Yi EungJun
2014-12-02 12:12   ` [PATCH v5 1/1] " Yi EungJun
2014-12-03 18:22     ` Junio C Hamano
2014-12-03 19:31     ` Eric Sunshine
2014-12-03 21:37       ` Junio C Hamano
2014-12-03 22:00         ` Michael Blume
2014-12-03 22:06           ` Michael Blume
2014-12-22 16:44             ` [PATCH v6 0/1] " Yi EungJun
2014-12-22 16:44               ` [PATCH v6 1/1] " Yi EungJun
2014-12-22 19:34                 ` Junio C Hamano
2014-12-24 20:35                 ` Eric Sunshine
2014-12-29 16:18                   ` Junio C Hamano
2015-01-18 12:23                     ` [PATCH v7 0/1] " Yi EungJun
2015-01-18 12:26                       ` [PATCH v7 1/1] " Yi EungJun
2015-01-18 15:14                         ` Torsten Bögershausen
2015-01-19 20:21                         ` [PATCH v6 0/1] " Eric Sunshine
2015-01-22  7:54                         ` [PATCH v7 1/1] " Junio C Hamano
2015-01-27 15:51                           ` [PATCH v8 0/1] " Yi EungJun
2015-01-27 15:51                             ` [PATCH] " Yi EungJun
2015-01-27 23:34                               ` Junio C Hamano
2015-01-28  6:15                                 ` Junio C Hamano
2015-01-28 11:59                                   ` Yi, EungJun
2015-01-28 12:04                                   ` [PATCH v9 0/1] " Yi EungJun
2015-01-28 12:04                                     ` [PATCH v9 1/1] " Yi EungJun
2015-02-25 22:52                                       ` Junio C Hamano
2015-02-26  3:04                                         ` Jeff King
2015-02-26  3:10                                           ` Jeff King
2015-02-26 20:59                                           ` Junio C Hamano
2015-02-26 21:33                                             ` Jeff King
2015-02-26 21:42                                               ` Junio C Hamano
2015-02-26 21:47                                                 ` Stefan Beller
2015-02-26 22:06                                                   ` Jeff King
2015-02-26 22:07                                                     ` Jeff King
2015-02-26 22:26                                                       ` Stefan Beller
2015-02-26 22:36                                                         ` Jeff King
2015-02-26 22:45                                                           ` Jeff King
2015-02-26 23:29                                                             ` Junio C Hamano
2015-02-26 22:13                                                     ` Junio C Hamano
2015-01-29  6:19                                     ` [PATCH v9 0/1] " Junio C Hamano
2015-01-30 17:23                                       ` Yi, EungJun
2015-03-06 16:13 ` [PATCH] http: Include locale.h when using setlocale() Ævar Arnfjörð Bjarmason
2015-03-06 19:01   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqwqb6ilik.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=eungjun.yi@navercorp.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=peter@softwolves.pp.se \
    --cc=semtlenori@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.