From: Stefan Beller <sbeller@google.com>
To: "Martin Ågren" <martin.agren@gmail.com>
Cc: Eric Sunshine <sunshine@sunshineco.com>,
git <git@vger.kernel.org>,
Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: Re: [PATCH] regex: do not call `regfree()` if compilation fails
Date: Mon, 21 May 2018 11:43:03 -0700 [thread overview]
Message-ID: <CAGZ79kZotwAFauTkCJ6YZ_C-MuaQpNaaS8LCniL_Or=_ccfC4w@mail.gmail.com> (raw)
In-Reply-To: <20180520105032.9464-1-martin.agren@gmail.com>
On Sun, May 20, 2018 at 3:50 AM, Martin Ågren <martin.agren@gmail.com> wrote:
> It is apparently undefined behavior to call `regfree()` on a regex where
> `regcomp()` failed. The language in [1] is a bit muddy, at least to me,
> but the clearest hint is this (`preg` is the `regex_t *`):
>
> Upon successful completion, the regcomp() function shall return 0.
> Otherwise, it shall return an integer value indicating an error as
> described in <regex.h>, and the content of preg is undefined.
>
> Funnily enough, there is also the `regerror()` function which should be
> given a pointer to such a "failed" `regex_t` -- the content of which
> would supposedly be undefined -- and which may investigate it to come up
> with a detailed error message.
>
> In any case, the example in that document shows how `regfree()` is not
> called after `regcomp()` fails.
>
> We have quite a few users of this API and most get this right. These
> three users do not.
>
> Several implementations can handle this just fine [2] and these code paths
> supposedly have not wreaked havoc or we'd have heard about it. (These
> are all in code paths where git got bad input and is just about to die
> anyway.) But let's just avoid the issue altogether.
>
> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html
>
> [2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html
>
> Researched-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-byi Martin Ågren <martin.agren@gmail.com>
> ---
>
> On 14 May 2018 at 05:05, Eric Sunshine <sunshine@sunshineco.com> wrote:
>> My research (for instance [1,2]) seems to indicate that it would be
>> better to avoid regfree() upon failed regcomp(), even though such a
>> situation is handled sanely in some implementations.
>>
>> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
>> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html
>
> Thank you for researching this. I think it would make sense to get rid
> of the few places we have where we get this wrong (if our understanding
> of this being undefined is right).
>
> diffcore-pickaxe.c | 1 -
> grep.c | 2 --
> 2 files changed, 3 deletions(-)
>
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 239ce5122b..800a899c86 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char *needle, int cflags)
> /* The POSIX.2 people are surely sick */
> char errbuf[1024];
> regerror(err, regex, errbuf, 1024);
> - regfree(regex);
While the commit message is very clear why we supposedly introduce a leak here,
it is hard to be found from the source code (as we only delete code
there, so digging
for history is not obvious), so maybe
/* regfree(regex) is invalid here */
instead?
> die("invalid regex: %s", errbuf);
> }
> }
> diff --git a/grep.c b/grep.c
> index 65b90c10a3..5e4f3f9a9d 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
> if (err) {
> char errbuf[1024];
> regerror(err, &p->regexp, errbuf, sizeof(errbuf));
> - regfree(&p->regexp);
> compile_regexp_failed(p, errbuf);
> }
> }
> @@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
> if (err) {
> char errbuf[1024];
> regerror(err, &p->regexp, errbuf, 1024);
> - regfree(&p->regexp);
> compile_regexp_failed(p, errbuf);
> }
> }
> --
> 2.17.0.840.g5d83f92caf
>
next prev parent reply other threads:[~2018-05-21 18:43 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-13 8:23 [PATCH] config: free resources of `struct config_store_data` Martin Ågren
2018-05-13 8:59 ` Eric Sunshine
2018-05-13 9:58 ` Martin Ågren
2018-05-13 9:58 ` [PATCH 2/1] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
2018-05-14 3:05 ` Eric Sunshine
2018-05-20 10:50 ` [PATCH] regex: do not call `regfree()` if compilation fails Martin Ågren
2018-05-21 18:43 ` Stefan Beller [this message]
2018-05-22 2:20 ` Eric Sunshine
2018-05-22 11:00 ` Martin Ågren
2018-05-13 9:58 ` [PATCH 3/1] config: let `config_store_data_clear()` handle `key` Martin Ågren
2018-05-14 3:03 ` [PATCH] config: free resources of `struct config_store_data` Eric Sunshine
2018-05-20 10:42 ` [PATCH v2 0/3] " Martin Ågren
2018-05-20 10:42 ` [PATCH v2 1/3] " Martin Ågren
2018-05-20 10:42 ` [PATCH v2 2/3] config: let `config_store_data_clear()` handle `value_regex` Martin Ågren
2018-05-20 10:42 ` [PATCH v2 3/3] config: let `config_store_data_clear()` handle `key` Martin Ågren
2018-05-23 7:03 ` Eric Sunshine
2018-05-23 7:01 ` [PATCH v2 0/3] config: free resources of `struct config_store_data` Eric Sunshine
2018-05-13 18:40 ` [PATCH] " Martin Ågren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAGZ79kZotwAFauTkCJ6YZ_C-MuaQpNaaS8LCniL_Or=_ccfC4w@mail.gmail.com' \
--to=sbeller@google.com \
--cc=git@vger.kernel.org \
--cc=johannes.schindelin@gmx.de \
--cc=martin.agren@gmail.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).