From: Jakub Narebski <jnareb@gmail.com>
To: Dmitry Potapov <dpotapov@gmail.com>
Cc: Stephen Bash <bash@genarts.com>, git@vger.kernel.org
Subject: Re: Git EOL Normalization
Date: Wed, 25 May 2011 23:02:35 -0700 (PDT)
Message-ID: <m3y61uxan2.fsf@localhost.localdomain>
In-Reply-To: <BANLkTik3iRKx4P_3nbzygadmLPEOr2vGhA@mail.gmail.com>

Dmitry Potapov <dpotapov@gmail.com> writes:
> On Wed, May 25, 2011 at 7:20 PM, Stephen Bash <bash@genarts.com> wrote:
> >
> > The open questions for me are:
> > 1) what is the actual text file detection algorithm?
> > 2) what is the autocrlf LF/CRLF detection algorithm?
> > 3) how does autocrlf handle mixed line endings? (either in the working copy or repo)
>
> Git looks at the text attribute of a file. If it is set or unset then it
> treats the file as text or binary accordingly. If the text attribute is
> 'auto', or it is unspecified but core.autocrlf is true, then git uses
> heuristics to detect text files.
>
> Currently, the following heuristics are used:
>
> A file is considered as text if it does not have '\0' or a bare CR, and
> the number of non-printable characters is less than 1 in 128.
>
> Non-printable characters are DEL (127) and anything less than 32 except
> CR, LF, BS, HT, ESC and FF.

I think git examines only the first block of a file or so. The heuristic
to detect binary-ness of a file is, as I have heard, the same as or
similar to the one that GNU diff uses.

See also `perldoc -f -X`, description of the "-T" and "-B" switches,
though those might differ somewhat in detection and thresholds.
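
To make the quoted heuristic concrete, here is a rough Python sketch of it.
This is not git's actual code: the function name, the constant name, and the
exact form of the "1 in 128" comparison (non-printable count against the
printable count divided by 128) are assumptions made for illustration.
CR and LF are counted as neither printable nor non-printable here, which is
also an assumption.

```python
# Control characters the quoted text lists as acceptable, besides CR and LF:
# BS (0x08), HT (0x09), FF (0x0C), ESC (0x1B).
ALLOWED_CONTROL = {0x08, 0x09, 0x0C, 0x1B}


def looks_like_text(data: bytes) -> bool:
    """Guess whether data is text, following the rules quoted above."""
    printable = 0
    nonprintable = 0
    for i, c in enumerate(data):
        if c == 0x00:
            return False  # any NUL byte means binary
        if c == 0x0D:  # CR
            if i + 1 >= len(data) or data[i + 1] != 0x0A:
                return False  # bare CR (not followed by LF) means binary
            continue
        if c == 0x0A:  # LF
            continue
        if c == 127 or (c < 32 and c not in ALLOWED_CONTROL):
            nonprintable += 1
        else:
            printable += 1
    # Assumed reading of "less than 1 in 128".
    return nonprintable <= printable // 128
```

With this sketch, `looks_like_text(b"hello\r\nworld\n")` is true, while a
buffer containing a NUL byte or a CR that is not followed by LF is rejected.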

> Also, to avoid problems with autocrlf=true when someone has already put
> a text file with CRLF, CRLF->LF conversion happens only if the tracked
> file in the index does not have any CR.

See also the documentation of the `core.safecrlf` config variable
(defaults to `warn`, IIRC).
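
The rule about the index quoted above could be sketched like this in Python.
Again, this is only an illustration, not git's implementation: the function
name and its parameters are made up, it assumes the file has already been
judged to be text and that autocrlf=true, and treating a file that is not
yet in the index as convertible is my assumption.

```python
from typing import Optional


def normalize_for_index(worktree: bytes, index_blob: Optional[bytes]) -> bytes:
    """Convert CRLF to LF unless the version in the index already has a CR.

    index_blob is the content currently tracked in the index, or None for a
    file that is not tracked yet (assumed to be converted).
    """
    if index_blob is not None and b"\r" in index_blob:
        # The tracked version already contains CR: leave the content alone.
        return worktree
    return worktree.replace(b"\r\n", b"\n")
```

So a working-tree file with CRLF endings is stored with LF when the indexed
version is clean, and stored unchanged when the indexed version already
contains a CR.
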
--
Jakub Narebski
Poland
ShadeHawk on #git