git.vger.kernel.org archive mirror
From: "Carlos Martín Nieto" <cmn@elego.de>
To: git@vger.kernel.org
Subject: Re: [PATCH] dir: allow a BOM at the beginning of exclude files
Date: Thu, 16 Apr 2015 17:10:02 +0200
Message-ID: <1429197002.3097.16.camel@elego.de>
In-Reply-To: <1429193112-41184-1-git-send-email-cmn@elego.de>

On Thu, 2015-04-16 at 16:05 +0200, Carlos Martín Nieto wrote:
> Some text editors like Notepad or LibreOffice write a UTF-8 BOM in
> order to indicate that the file is Unicode text rather than whatever the
> current locale would indicate.
> 
> If someone uses such an editor to edit a gitignore file, we are left
> with those three bytes at the beginning of the file. If we do not skip
> them, we will attempt to match a filename with the BOM as prefix, which
> won't match the files the user is expecting.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>

which I keep forgetting.

> 
> ---
> 
> If you're wondering how I came up with LibreOffice, I was doing a
> workshop recently and one of the participants was not content with the
> choice of vim or nano, so he opened LibreOffice to edit the gitignore
> file with confusing consequences.
> 
> This codepath doesn't go as far as the config code in validating that
> we do not have a partial BOM which would mean there's some invalid
> content, but we don't really have invalid content any other way, as
> we're just dealing with a list of paths in the file.
> 
>  dir.c                      | 8 +++++++-
>  t/t7061-wtstatus-ignore.sh | 2 ++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/dir.c b/dir.c
> index 0943a81..6368247 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -581,6 +581,7 @@ int add_excludes_from_file_to_list(const char *fname,
>  	struct stat st;
>  	int fd, i, lineno = 1;
>  	size_t size = 0;
> +	static const unsigned char *utf8_bom = (unsigned char *) "\xef\xbb\xbf";
>  	char *buf, *entry;
>  
>  	fd = open(fname, O_RDONLY);
> @@ -617,7 +618,12 @@ int add_excludes_from_file_to_list(const char *fname,
>  	}
>  
>  	el->filebuf = buf;
> -	entry = buf;
> +
> +	if (size >= 3 && !memcmp(buf, utf8_bom, 3))
> +		entry = buf + 3;
> +	else
> +		entry = buf;
> +
>  	for (i = 0; i < size; i++) {
>  		if (buf[i] == '\n') {
>  			if (entry != buf + i && entry[0] != '#') {
> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
> index 460789b..0a06fbf 100755
> --- a/t/t7061-wtstatus-ignore.sh
> +++ b/t/t7061-wtstatus-ignore.sh
> @@ -13,6 +13,8 @@ EOF
>  
>  test_expect_success 'status untracked directory with --ignored' '
>  	echo "ignored" >.gitignore &&
> +	printf "\357\273\277" | cat - .gitignore >.gitignore.new &&
> +	mv .gitignore.new .gitignore &&
>  	mkdir untracked &&
>  	: >untracked/ignored &&
>  	: >untracked/uncommitted &&
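
For readers who want the skipping logic in isolation, here is a minimal
sketch of the check the dir.c hunk performs. The function name
bom_prefix_len() is hypothetical and does not appear in the patch; it
only assumes that the caller has the whole file in a buffer of known
size, as add_excludes_from_file_to_list() does. Like the patch, it does
not try to diagnose a partial BOM.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: return how many leading
 * bytes of buf should be skipped because they form a UTF-8 BOM
 * (0xEF 0xBB 0xBF). Returns 0 when the buffer is too short or does
 * not start with a complete BOM.
 */
static size_t bom_prefix_len(const char *buf, size_t size)
{
	static const char utf8_bom[] = "\xef\xbb\xbf";
	const size_t bom_len = sizeof(utf8_bom) - 1; /* drop the NUL */

	if (size >= bom_len && !memcmp(buf, utf8_bom, bom_len))
		return bom_len;
	return 0;
}
```

With such a helper, the first entry would start at
buf + bom_prefix_len(buf, size), so a pattern such as "ignored" is
compared without the three BOM bytes in front of it.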

Thread overview: 24+ messages
2015-04-16 14:05 [PATCH] dir: allow a BOM at the beginning of exclude files Carlos Martín Nieto
2015-04-16 15:03 ` Johannes Schindelin
2015-04-16 15:09   ` Carlos Martín Nieto
2015-04-16 15:10 ` Carlos Martín Nieto [this message]
2015-04-16 15:39 ` Junio C Hamano
2015-04-16 15:55   ` Jeff King
2015-04-16 17:16     ` Junio C Hamano
2015-04-16 17:52       ` [PATCH 0/3] UTF8 BOM follow-up Junio C Hamano
2015-04-16 17:52         ` [PATCH 1/3] utf8-bom: introduce skip_utf8_bom() helper Junio C Hamano
2015-04-16 18:14           ` Jeff King
2015-04-16 18:23             ` Junio C Hamano
2015-04-16 17:52         ` [PATCH 2/3] config: use utf8_bom[] from utf.[ch] in git_parse_source() Junio C Hamano
2015-04-16 17:52         ` [PATCH 3/3] attr: skip UTF8 BOM at the beginning of the input file Junio C Hamano
2015-04-16 18:27       ` [PATCH] dir: allow a BOM at the beginning of exclude files Carlos Martín Nieto
2015-04-16 18:39       ` [PATCH v2 0/4] UTF8 BOM follow-up Junio C Hamano
2015-04-16 18:39         ` [PATCH v2 1/4] add_excludes_from_file: clarify the bom skipping logic Junio C Hamano
2015-04-16 18:39         ` [PATCH v2 2/4] utf8-bom: introduce skip_utf8_bom() helper Junio C Hamano
2015-04-16 18:39         ` [PATCH v2 3/4] config: use utf8_bom[] from utf.[ch] in git_parse_source() Junio C Hamano
2015-04-16 18:39         ` [PATCH v2 4/4] attr: skip UTF8 BOM at the beginning of the input file Junio C Hamano
2015-04-16 19:26         ` [PATCH v2 0/4] UTF8 BOM follow-up Jeff King
2015-04-17 22:44         ` Karsten Blees
2015-04-20 21:50           ` Junio C Hamano
2015-04-16 16:08   ` [PATCH] dir: allow a BOM at the beginning of exclude files Johannes Schindelin
2015-04-16 16:10 ` Torsten Bögershausen
