From: heapcrash heapcrash <heapcrash@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	git@vger.kernel.org
Subject: Re: Enhancing --show-function and --function-context in default configurations
Date: Mon, 2 Aug 2021 21:34:32 -0500	[thread overview]
Message-ID: <CAM1Tk16Duh9ySS1xwW3yK4ogz=HbMp9k5B7bjCGSgqOEx=pdtA@mail.gmail.com> (raw)
In-Reply-To: <xmqqmtpzwzbm.fsf@gitster.g>
Thanks all for the feedback!  I'll try to address it below:
> Two other places may be diff hunk headers and --diff-words output, I think.
I didn't think of those.  Does this mean that diffs generated with a
given e.g. diff=cpp configuration might not apply cleanly if run on
some other user's system without that setting?
> That depends.  If you are going to introduce a mechanism to
> introduce hardcoded configurations, depending on how it is done, it
> will become a huge security headache.
I don't intend to add hard-coded configurations, just hard-coded
defaults.  User configurations (~/.gitattributes and gitconfig
diff.funcname/diff.xcfuncname) will still take precedence.
> It makes emitting the diffs take more CPU, but the same is true of other
> options like colorMoved etc, so that in itself is not a dealbreaker.
I didn't think of this scenario, that it would add CPU time even
without -W/--function-context/--show-function.  I'd definitely be fine
with preserving the current behavior in these cases (more below).
> As long as the default built-in ones are
>
> (1) at least 90% of the time improvement over, or at least is not
>     broken compared to, the unconfigured case, and
>
> (2) at the lowest priority that users can easily countermand for
>     the rest 10% cases
>
> I do not think it is too bad to resurrect the old patches from these
The currently-hard-coded regexps for e.g. cpp, python, objc, etc. (in
userdiff.c) are all VAST improvements over the default, which is
effectively '^[A-Za-z0-9_:]+' (i.e., any word which starts on the
zeroth column, IIRC).  This has lots of issues for e.g. Python where
the function name is often indented (e.g. inside a class definition)
and cpp is compatible with, and an improvement over, the default "C"
setting (which is particularly broken as it marks any "goto" label as
a function start / end).
All of those points covered, I think we can make this work in a way
that preserves backward compatibility (no defaults at all, user must
manually configure ~/.gitattributes) by only setting "smart" defaults
(e.g. "*.cpp diff=cpp", "*.cc diff=cpp" etc) when the command is
EXPLICITLY invoked with -W/--function-context/--show-function.
Another case I didn't think of in the original post is with the "git
log -L:funcname:path/to/file.cpp" option, which tracks changes within
a function over time.  Having "better" default function boundary
detection here would also be useful.
Ultimately, I agree we should not have any extra overhead for any
commands (log, diff, grep, etc) unless the user EXPLICITLY uses flags
that indicate function boundary detection are key to functionality of
those flags.
On Mon, Aug 2, 2021 at 12:06 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeff King <peff@peff.net> writes:
>
> > On Mon, Aug 02, 2021 at 10:45:25AM +0200, Ævar Arnfjörð Bjarmason wrote:
> >
> >> I would like to see us have a setting to turn these on by default, but
> >> think it would be better to make that a diff.* config setting to put
> >> into ~/.gitconfig, i.e. we'd extend git itself to know about a list of
> >> extensions for the given userdiff drivers, and use them when rendering
> >> diffs.
> >
> > A long time ago we discussed doing this. The relevant thread is:
> >
> >   https://lore.kernel.org/git/20111216110000.GA15676@sigill.intra.peff.net/
> >
> > which references a few others:
> >
> >   https://lore.kernel.org/git/4E569F10.8060808@panasas.com/
> >
> >   https://lore.kernel.org/git/4E6E928A.6080003@sunshineco.com/
> > ...
>
> Thanks for pointers.
>
> One good suggestion given there was to use diff=c and diff=perl in
> our own .gitattributes to use the patterns ourselves, which we seem
> to have been doing just fine ;-)
>
> As long as the default built-in ones are
>
>  (1) at least 90% of the time improvement over, or at least is not
>      broken compared to, the unconfigured case, and
>
>  (2) at the lowest priority that users can easily countermand for
>      the rest 10% cases
>
> I do not think it is too bad to resurrect the old patches from these
> threads.
>
> Thanks.
>
>
next prev parent reply	other threads:[~2021-08-03  2:34 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-01 21:40 Enhancing --show-function and --function-context in default configurations heapcrash heapcrash
2021-08-02  2:21 ` Junio C Hamano
2021-08-02  8:45 ` Ævar Arnfjörð Bjarmason
2021-08-02 16:17   ` Jeff King
2021-08-02 17:06     ` Junio C Hamano
2021-08-03  2:34       ` heapcrash heapcrash [this message]
2021-08-03 14:55         ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to='CAM1Tk16Duh9ySS1xwW3yK4ogz=HbMp9k5B7bjCGSgqOEx=pdtA@mail.gmail.com' \
    --to=heapcrash@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).