From: Jeff King <peff@peff.net>
To: Erik Faye-Lund <kusmabite@gmail.com>
Cc: git@vger.kernel.org, jwa@urbancode.com, drew.northup@maine.edu
Subject: Re: [PATCH] config: support values longer than 1024 bytes
Date: Wed, 6 Apr 2011 11:35:09 -0400
Message-ID: <20110406153509.GA1864@sigill.intra.peff.net>
In-Reply-To: <BANLkTim0N0kM+OX5Tztz-Kh+eRRsNixX0A@mail.gmail.com>
On Wed, Apr 06, 2011 at 11:10:42AM +0200, Erik Faye-Lund wrote:
> > But for the other, one of the invariants of strbuf is that the string is
> > always NUL-terminated. So I would expect strbuf_init to properly
> > NUL-terminate after growing based on the hint.
>
> I agree. An unterminated yet non-NULL return from strbuf_detach is
> just dangerous behavior. Something like this should probably be
> applied:
>
> ---8<---
> diff --git a/strbuf.c b/strbuf.c
> index 77444a9..538035a 100644
> --- a/strbuf.c
> +++ b/strbuf.c
> @@ -24,14 +24,16 @@ int suffixcmp(const char *str, const char *suffix)
> * buf is non NULL and ->buf is NUL terminated even for a freshly
> * initialized strbuf.
> */
> -char strbuf_slopbuf[1];
> +char strbuf_slopbuf[1] = { '\0' };
This hunk is redundant. slopbuf will already be initialized to 0.
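To spell out why: objects with static storage duration are zero-initialized by the C standard when no initializer is given, so both declarations produce the same one-byte empty string. A minimal sketch (the names slop_implicit, slop_explicit and slopbufs_are_equivalent are made up for illustration and are not in strbuf.c):

```c
#include <assert.h>

/* Hypothetical names mirroring the two spellings in the hunk above. */
char slop_implicit[1];            /* no initializer: static storage is zeroed */
char slop_explicit[1] = { '\0' }; /* same value, spelled out explicitly */

/* Returns 1 when both buffers hold an empty, NUL-terminated string. */
int slopbufs_are_equivalent(void)
{
	assert(sizeof(slop_implicit) == sizeof(slop_explicit));
	return slop_implicit[0] == '\0' && slop_explicit[0] == '\0';
}
```

This only holds for file-scope or static objects; an automatic (stack) array without an initializer would be indeterminate.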
> void strbuf_init(struct strbuf *sb, size_t hint)
> {
> sb->alloc = sb->len = 0;
> sb->buf = strbuf_slopbuf;
> - if (hint)
> + if (hint) {
> strbuf_grow(sb, hint);
> + sb->buf[0] = '\0';
> + }
> }
But this one is the right fix.
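Here is a rough, self-contained sketch of the invariant in question. It is a simplified stand-in, not the real strbuf.c: struct mini_strbuf and the mini_* functions are hypothetical names, and the growth policy is deliberately naive.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for struct strbuf; not git's implementation. */
struct mini_strbuf {
	size_t alloc;
	size_t len;
	char *buf;
};

static char mini_slopbuf[1]; /* shared empty string for fresh buffers */

/* Make room for at least `extra` more bytes plus the trailing NUL. */
static void mini_grow(struct mini_strbuf *sb, size_t extra)
{
	size_t need = sb->len + extra + 1;
	char *p;

	if (need <= sb->alloc)
		return;
	/* Never realloc the shared slopbuf; start from NULL instead. */
	p = realloc(sb->alloc ? sb->buf : NULL, need);
	if (!p)
		abort();
	sb->buf = p;
	sb->alloc = need;
}

void mini_init(struct mini_strbuf *sb, size_t hint)
{
	sb->alloc = sb->len = 0;
	sb->buf = mini_slopbuf;
	if (hint) {
		mini_grow(sb, hint);
		/* The fix: memory from realloc() is not zeroed. */
		sb->buf[0] = '\0';
	}
	assert(sb->buf[sb->len] == '\0'); /* invariant: always terminated */
}

/* Hand the buffer to the caller, who must free() it; NULL if never allocated. */
char *mini_detach(struct mini_strbuf *sb)
{
	char *res = sb->alloc ? sb->buf : NULL;

	mini_init(sb, 0);
	return res;
}
```

Without the sb->buf[0] assignment, calling mini_init(&sb, 64) followed by mini_detach(&sb) would hand back a non-NULL pointer to uninitialized bytes, which is exactly the dangerous case described above.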
> that. But this brings a new issue: leaving potentially huge blocks of
> memory (especially since this patch is about long lines) allocated
> inside a function can be a bit nasty. But it's probably not a big deal
Yeah. It's just one block, though, and in the normal case it is probably
only about 80 characters. So it is more efficient than what's there now. :)
Somebody could have some gigantic value, though, and yes, we'll grow to
the biggest one and never free that memory. You could also have
parse_value take a strbuf parameter to output into, and then free it
after config reading is done.
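Something along these lines is what I have in mind. This is only a sketch: struct value_buf, parse_value_into and value_buf_release are hypothetical names, and the "parser" just copies up to a newline instead of doing the real config unquoting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer owned by the caller of the parser. */
struct value_buf {
	char *buf;
	size_t len, alloc;
};

/* Append one byte, keeping the buffer NUL-terminated. */
static void value_buf_add(struct value_buf *vb, char c)
{
	if (vb->len + 2 > vb->alloc) {
		size_t n = vb->alloc ? vb->alloc * 2 : 16;
		char *p = realloc(vb->buf, n);

		if (!p)
			abort();
		vb->buf = p;
		vb->alloc = n;
	}
	assert(vb->len + 1 < vb->alloc);
	vb->buf[vb->len++] = c;
	vb->buf[vb->len] = '\0';
}

/*
 * Stand-in for parse_value(): copy `src` up to the first newline into
 * `out`, reusing whatever allocation `out` already has.
 */
const char *parse_value_into(struct value_buf *out, const char *src)
{
	out->len = 0;
	if (out->alloc)
		out->buf[0] = '\0';
	for (; *src && *src != '\n'; src++)
		value_buf_add(out, *src);
	return out->buf ? out->buf : "";
}

/* Called once, after config reading is done. */
void value_buf_release(struct value_buf *vb)
{
	free(vb->buf);
	vb->buf = NULL;
	vb->len = vb->alloc = 0;
}
```

The config reader would declare one struct value_buf (zero-initialized), pass it to every parse_value_into() call, and call value_buf_release() at the end, so even a gigantic value is held only for the duration of the read.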
> In other words: I think you're right, it's a much better approach.
> Less allocations, less penalty on the start-up time for every little
> git-command.
I doubt the efficiency increase is measurable. We end up xstrdup'ing
quite a few of the values in the config callbacks anyway. I would do
whatever seems most natural for reading/writing the code.
> > I do wonder, though, if we could be reusing the unquote_c_style()
> > function in quote.c. They are obviously similar, but I haven't checked
> > if there is more going on in the config code.
>
> Hmm, this is an interesting suggestion. It would be a part of a bigger
> change though: unquote_c_style requires its input to be in memory,
> while parse_value uses a function called get_next_char to feed the
> parser. So we'd either have to read the entire file into memory, or
> find some way to read the file line-by-line while handling \n-escaping
> correctly.
>
> It also seems like there are differences in what kind of escaping and
> normalization the two functions handle; unquote_c_style handles more
> escaped character sequences, while parse_value normalizes all
> non-escaped space characters ('\t' et al.) into SP. This might not be
> such a big problem in reality.
This was just a random thought that I had, and I didn't investigate
how hard it would be. If it turns out to be too much trouble, just
forget it.
-Peff