From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 0/6] getenv() timing fixes
Date: Sat, 12 Jan 2019 12:31:21 +0100 [thread overview]
Message-ID: <87va2u3yeu.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <20190111221414.GA31335@sigill.intra.peff.net>
On Fri, Jan 11 2019, Jeff King wrote:
> Similar to the recent:
>
> https://public-inbox.org/git/20190109221007.21624-1-kgybels@infogroep.be/
>
> there are some other places where we do not follow the POSIX rule that
> getenv()'s return value may be invalidated by other calls to getenv() or
> setenv().
>
> For the most part we haven't noticed because:
>
> - on many platforms, you can call getenv() as many times as you want.
> This changed recently in our mingw_getenv() helper, which is why
> people are noticing now.
>
> - calling setenv() in between _often_ works, but it depends on whether
> libc feels like it needs to reallocate memory. Which is itself
> platform specific, and even on a single platform may depend on
> things like how many environment variables you have set.
>
> The first patch here is a problem somebody actually found in the wild.
> That led me to start looking through the results of:
>
> git grep '= getenv('
>
> There are a ton of hits. I poked at the first 20 or so. A lot of them
> are fine, as they do something like this:
>
> rla = getenv("GIT_REFLOG_ACTION");
> strbuf_addstr("blah blah %s", rla);
>
> That's not _strictly_ correct, because strbuf_addstr() may actually look
> at the environment. But it works for our mingw_getenv() case, because
> there we use a rotating series of buffers. So as long as it doesn't look at
> 30 environment variables, we're fine. And many calls fall into that
> bucket (a more complicated one is get_ssh_command(), which runs a fair
> bit of code while holding the pointer, but ultimately probably has a
> small fixed number of opportunities to call getenv(). What is more
> worrisome is code that holds a pointer across an arbitrary number of
> calls (like once per diff'd file, or once per submodule, etc).
>
> Of course it's possible for some platform libc to use a single buffer.
> But in that case, I'd argue that the path of least resistance is
> wrapping getenv, like I described in:
>
> https://public-inbox.org/git/20181025062037.GC11460@sigill.intra.peff.net/
>
> So anyway. Here are a handful of what seem like pretty low-hanging
> fruit. Beyond the first one, I'm not sure if they're triggerable, but
> they're easy to fix. There are 100+ grep matches that I _didn't_ audit,
> so this is by no means a complete fix. I was mostly trying to get a
> sense of how painful these fixes would be.
I wonder, and not as "you should do this" feedback on this series, just
on future development, whether we shouldn't just make our own getenv()
wrapper for the majority of the GIT_* variables. The semantics would be
fetch value X, and if it's ever requested again return the value we
found the first time.
For some things we rely on getenv(X) -> setenv(X) -> getenv(X) returning
different values of X, e.g. in passing along config, but for
e.g. GIT_TEST_* variables we just want to check them once, and have our
own ad-hoc caches (via static variables) in a couple of places.
Maybe such an API would just loop over "environ" on startup, looking for
any GIT_* variables, i.e. called from the setup.c code.
next prev parent reply other threads:[~2019-01-12 11:39 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-11 22:14 [PATCH 0/6] getenv() timing fixes Jeff King
2019-01-11 22:15 ` [PATCH 1/6] get_super_prefix(): copy getenv() result Jeff King
2019-01-12 3:02 ` Junio C Hamano
2019-01-11 22:15 ` [PATCH 2/6] commit: copy saved " Jeff King
2019-01-12 3:07 ` Junio C Hamano
2019-01-12 10:26 ` Jeff King
2019-01-15 14:05 ` Johannes Schindelin
2019-01-15 19:17 ` Jeff King
2019-01-15 19:25 ` Stefan Beller
2019-01-15 19:32 ` Jeff King
2019-01-16 14:06 ` Johannes Schindelin
2019-01-11 22:15 ` [PATCH 3/6] config: make a copy of $GIT_CONFIG string Jeff King
2019-01-11 22:16 ` [PATCH 4/6] init: make a copy of $GIT_DIR string Jeff King
2019-01-12 3:08 ` Junio C Hamano
2019-01-11 22:16 ` [PATCH 5/6] merge-recursive: copy $GITHEAD strings Jeff King
2019-01-12 3:10 ` Junio C Hamano
2019-01-11 22:17 ` [PATCH 6/6] builtin_diff(): read $GIT_DIFF_OPTS closer to use Jeff King
2019-01-12 11:31 ` Ævar Arnfjörð Bjarmason [this message]
2019-01-12 18:51 ` [PATCH 0/6] getenv() timing fixes Stefan Beller
2019-01-15 19:13 ` Jeff King
2019-01-15 19:32 ` Junio C Hamano
2019-01-15 19:38 ` Stefan Beller
2019-01-15 19:41 ` Jeff King
2019-01-15 19:47 ` Jeff King
2019-01-15 20:49 ` Junio C Hamano
2019-01-15 19:12 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87va2u3yeu.fsf@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.