git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Use of strbuf.buf when strbuf.len == 0
@ 2007-09-27  6:21 Junio C Hamano
  2007-09-27 10:13 ` Pierre Habouzit
  2007-09-27 11:22 ` Pierre Habouzit
  0 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2007-09-27  6:21 UTC (permalink / raw)
  To: git

I am starting to hate some parts of the strbuf API.

Here is an example.  Can you spot what goes wrong?

static int handle_file(const char *path,
	 unsigned char *sha1, const char *output)
{
	SHA_CTX ctx;
	char buf[1024];
	int hunk = 0, hunk_no = 0;
	struct strbuf one, two;
	...
	if (sha1)
		SHA1_Init(&ctx);

	strbuf_init(&one, 0);
	strbuf_init(&two,  0);
	while (fgets(buf, sizeof(buf), f)) {
		if (!prefixcmp(buf, "<<<<<<< "))
			hunk = 1;
		else if (!prefixcmp(buf, "======="))
			hunk = 2;
		else if (!prefixcmp(buf, ">>>>>>> ")) {
			int cmp = strbuf_cmp(&one, &two);

			hunk_no++;
			hunk = 0;
			if (cmp > 0) {
				strbuf_swap(&one, &two);
			}
			if (out) {
				fputs("<<<<<<<\n", out);
				fwrite(one.buf, one.len, 1, out);
				fputs("=======\n", out);
				fwrite(two.buf, two.len, 1, out);
				fputs(">>>>>>>\n", out);
			}
			if (sha1) {
				SHA1_Update(&ctx, one.buf, one.len + 1);
				SHA1_Update(&ctx, two.buf, two.len + 1);
			}
			strbuf_reset(&one);
			strbuf_reset(&two);
		} else if (hunk == 1)
			strbuf_addstr(&one, buf);
		else if (hunk == 2)
			strbuf_addstr(&two, buf);
		else if (out)
			fputs(buf, out);
	}
	strbuf_release(&one);
	strbuf_release(&two);
	...
}

When one or two are empty and the caller asked for checksumming,
the code still relies on one.buf being an allocated memory with
an extra NUL termination and tries to feed the NULL pointer to
SHA1_Update() with length of 1!

An obvious workaround is to say "strbuf_init(&one, !!sha1)" to
force the allocation.  However, because the second parameter to
strbuf_init() is defined to be merely a hint, we should not rely
on strbuf_init() with non-zero hint to have allocation from the
beginning.  It is assuming too much.

A more defensive way would be for the user to do something like:

	SHA1_Update(&ctx, one.buf ? one.buf : "", one.len + 1);
	SHA1_Update(&ctx, two.buf ? two.buf : "", two.len + 1);

which I am leaning towards, but this looks ugly.

I was already bitten by another bug in strbuf_setlen() that
shared the same roots; an empty strbuf can have two internal
representations ("alloc == 0 && buf == NULL" vs "alloc != 0 &&
buf != NULL") and they behave differently.

For example, this happens to be Ok but the validity of it is
subtle.  If a or b is empty, memcmp may get a NULL pointer but
we ask only 0 byte comparison.

        int strbuf_cmp(struct strbuf *a, struct strbuf *b)
        {
                int cmp;
                if (a->len < b->len) {
                        cmp = memcmp(a->buf, b->buf, a->len);
                        return cmp ? cmp : -1;
                } else {
                        cmp = memcmp(a->buf, b->buf, b->len);
                        return cmp ? cmp : a->len != b->len;
                }
        }

It might be an easier and safer fix to define that strbuf_init()
to always have allocation.  Use of a strbuf in the code _and_
not adding any contents to the buffer should be an exception and
avoiding malloc()/free() for that special case feels optimizing
for the wrong case.

However, there are strbuf instances that are not initialized
(i.e. in BSS or initialized by declaring with STRBUF_INIT), so
we still need to handle (.len == 0 && .alloc == 0) case
specially anyway.

It would be appreciated if somebody with a fresh pair of eyes
can go over the strbuf series one more time to make sure that we
do not try to blindly use buf.buf, assuming buf.buf[0] is NUL if
(buf.len == 0).

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-09-29  7:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-27  6:21 Use of strbuf.buf when strbuf.len == 0 Junio C Hamano
2007-09-27 10:13 ` Pierre Habouzit
2007-09-27 10:51   ` [PATCH 1/2] double free in builtin-update-index.c Pierre Habouzit
2007-09-27 10:58   ` [PATCH 2/2] strbuf change: be sure ->buf is never ever NULL Pierre Habouzit
2007-09-29  0:51   ` Use of strbuf.buf when strbuf.len == 0 Linus Torvalds
2007-09-29  7:48     ` Pierre Habouzit
2007-09-27 11:22 ` Pierre Habouzit
2007-09-27 11:33   ` [PATCH 1/1] Make read_patch_file work on a strbuf Pierre Habouzit
2007-09-27 11:33   ` [PROPER PATCH " Pierre Habouzit
2007-09-27 11:37   ` Use of strbuf.buf when strbuf.len == 0 Pierre Habouzit

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).