git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] grep: don't segfault
@ 2010-01-18 21:55 Jim Meyering
  2010-01-18 22:40 ` Linus Torvalds
  0 siblings, 1 reply; 2+ messages in thread
From: Jim Meyering @ 2010-01-18 21:55 UTC (permalink / raw)
  To: git list


Today, git grep -l '^	' failed with a segfault.
I tracked it down to a26345b6085340ebd61e156aa8154a80196bee0f.
Debugging it, I found that look_ahead calls regexec with a "bol"
argument pointing to a buffer that is not NUL-terminated.
Yet regexec requires the NUL-termination.  Without that, its
initial strlen will read beyond the end of the buffer.

    $ valgrind ./git grep -l '^  '
    Memcheck, a memory error detector
    Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
    Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
    Command: ./git grep -l ^

    Conditional jump or move depends on uninitialised value(s)
       at 0x4A06318: __GI_strlen (mc_replace_strmem.c:284)
       by 0x3AF00C04C4: regexec@@GLIBC_2.3.4 (regexec.c:243)
       by 0x496A0A: look_ahead (grep.c:644)
       by 0x496D24: grep_buffer_1 (grep.c:726)
       by 0x497161: grep_buffer (grep.c:852)
       by 0x432720: grep_file (builtin-grep.c:201)
       by 0x432854: grep_cache (builtin-grep.c:230)
       by 0x433F2F: cmd_grep (builtin-grep.c:621)
       by 0x40488D: run_builtin (git.c:257)
       by 0x404A28: handle_internal_command (git.c:401)
       by 0x404B1D: run_argv (git.c:443)
       by 0x404C9A: main (git.c:514)
    ...

That buffer is read by builtin-grep.c(grep_file) and here's the code.
You can see from the allocation of "sz + 1" and read of one less
that there is already room for the required trailing NUL byte:

	sz = xsize_t(st.st_size);
	i = open(filename, O_RDONLY);
	if (i < 0)
		goto err_ret;
	data = xmalloc(sz + 1);
	if (st.st_size != read_in_full(i, data, sz)) {
		error("'%s': short read %s", filename, strerror(errno));
		close(i);
		free(data);
		return 0;
	}
	close(i);
>>>     data[sz] = 0;    <======  added line
	if (opt->relative && opt->prefix_length)
		filename = quote_path_relative(filename, -1, &buf, opt->prefix);
	i = grep_buffer(opt, filename, data, sz);

Signed-off-by: Jim Meyering <meyering@redhat.com>
---
This is relative to the head of git's next branch,
d7346b15ca5bda881f5abc37b0844e9b35c8cffc
I noticed the segfault by running the same command in
libvirt's git repository, but it's not consistently reproducible.

 builtin-grep.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 4dd5aaf..da854fa 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -196,6 +196,7 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 		return 0;
 	}
 	close(i);
+	data[sz] = 0;
 	if (opt->relative && opt->prefix_length)
 		filename = quote_path_relative(filename, -1, &buf, opt->prefix);
 	i = grep_buffer(opt, filename, data, sz);
--
1.6.4.4.1.ga2634

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] grep: don't segfault
  2010-01-18 21:55 [PATCH] grep: don't segfault Jim Meyering
@ 2010-01-18 22:40 ` Linus Torvalds
  0 siblings, 0 replies; 2+ messages in thread
From: Linus Torvalds @ 2010-01-18 22:40 UTC (permalink / raw)
  To: Jim Meyering; +Cc: git list



On Mon, 18 Jan 2010, Jim Meyering wrote:
> 
> 	sz = xsize_t(st.st_size);
> 	i = open(filename, O_RDONLY);
> 	if (i < 0)
> 		goto err_ret;
> 	data = xmalloc(sz + 1);
> 	if (st.st_size != read_in_full(i, data, sz)) {
> 		error("'%s': short read %s", filename, strerror(errno));
> 		close(i);
> 		free(data);
> 		return 0;
> 	}
> 	close(i);
> >>>     data[sz] = 0;    <======  added line

Heh. Fredrik's threaded grep patch also fixed this, although he did the 
"data[*sz] = 0;" up front.

So an obvious ack on the whole thing (the xmalloc() itself makes it 
obvious that that thing should be NULL-terminated).

That said, I also suspect that we have a _lot_ of these patterns, and I 
wonder if we should just add a 

	void *read_file(int fd, struct stat *st);

helper function, which reads a file and NULL-terminates it. Doing a

	git grep -5 read_in_full

on the git sources, and it looks like there are several cases that 
basically do that xmalloc+read_in_full+error handling pattern.

		Linus

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2010-01-18 22:40 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-18 21:55 [PATCH] grep: don't segfault Jim Meyering
2010-01-18 22:40 ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).