git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Erik Faye-Lund <kusmabite@gmail.com>
Cc: git@vger.kernel.org, jwa@urbancode.com, drew.northup@maine.edu
Subject: Re: [PATCH] config: support values longer than 1024 bytes
Date: Wed, 6 Apr 2011 11:35:09 -0400	[thread overview]
Message-ID: <20110406153509.GA1864@sigill.intra.peff.net> (raw)
In-Reply-To: <BANLkTim0N0kM+OX5Tztz-Kh+eRRsNixX0A@mail.gmail.com>

On Wed, Apr 06, 2011 at 11:10:42AM +0200, Erik Faye-Lund wrote:

> > But for the other, one of the invariants of strbuf is that the string is
> > always NUL-terminated. So I would expect strbuf_init to properly
> > NUL-terminate after growing based on the hint.
> 
> I agree. An unterminated yet non-NULL return from strbuf_detach is
> just dangerous behavior. Something like this should probably be
> applied:
> 
> ---8<---
> diff --git a/strbuf.c b/strbuf.c
> index 77444a9..538035a 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -24,14 +24,16 @@ int suffixcmp(const char *str, const char *suffix)
>   * buf is non NULL and ->buf is NUL terminated even for a freshly
>   * initialized strbuf.
>   */
> -char strbuf_slopbuf[1];
> +char strbuf_slopbuf[1] = { '\0' };

This hunk is redundant. slopbuf will already be initialized to 0.

>  void strbuf_init(struct strbuf *sb, size_t hint)
>  {
>  	sb->alloc = sb->len = 0;
>  	sb->buf = strbuf_slopbuf;
> -	if (hint)
> +	if (hint) {
>  		strbuf_grow(sb, hint);
> +		sb->buf[0] = '\0';
> +	}
>  }

But this one is the right fix.

> that. But this brings a new issue: leaving potentially huge blocks of
> memory (especially since this patch is about long lines) allocated
> inside a function can be a bit nasty. But it's probably not a big deal

Yeah. It's just one block, though, and in the normal case it is probably
only about 80 characters. So it is more efficient than what's there now. :)

Somebody could have some gigantic value, though, and yes, we'll grow to
the biggest one and never free that memory. You could also have
parse_value take a strbuf parameter to output into, and then free it
after config reading is done.

> In other words: I think you're right, it's a much better approach.
> Less allocations, less penalty on the start-up time for every little
> git-command.

I doubt the efficiency increase is measurable. We end up xstrdup'ing
quite a few of the values in the config callbacks anyway. I would do
whatever seems most natural for reading/writing the code.

> > I do wonder, though, if we could be reusing the unquote_c_style()
> > function in quote.c. They are obviously similar, but I haven't checked
> > if there is more going on in the config code.
> 
> Hmm, this is an interesting suggestion. It would be a part of a bigger
> change though: unquote_c_style requires it's input to be in memory,
> while parse_value uses a function called get_next_char to feed the
> parser. So we'd either have to read the entire file into memory, or
> find some way to read the file line-by-line while handling \n-escaping
> correctly.
> 
> It also seems like there's differences in what kind of escaping and
> normalization the two functions handle; unquote_c_style handles more
> escaped character sequences, while parse_value normalize all
> non-escaped space-characters ('\t' et. al) into SP. This might not be
> such a big problem in reality.

This was just a random thought that I had, and I didn't investigate it
how hard it would be.  If it turns out to be too much trouble, just
forget it.

-Peff

  reply	other threads:[~2011-04-06 15:35 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-05 23:30 [PATCH] config: support values longer than 1024 bytes Erik Faye-Lund
2011-04-05 23:38 ` Erik Faye-Lund
2011-04-06  0:52 ` Jeff King
2011-04-06  9:10   ` Erik Faye-Lund
2011-04-06 15:35     ` Jeff King [this message]
2011-04-06 16:16       ` Erik Faye-Lund

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110406153509.GA1864@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=drew.northup@maine.edu \
    --cc=git@vger.kernel.org \
    --cc=jwa@urbancode.com \
    --cc=kusmabite@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).