From: Michael Haggerty <mhagger@alum.mit.edu>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Petr Baudis <pasky@suse.cz>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind
Date: Tue, 27 Sep 2011 06:28:07 +0200
Message-ID: <1317097687-11098-3-git-send-email-mhagger@alum.mit.edu>
In-Reply-To: <1317097687-11098-1-git-send-email-mhagger@alum.mit.edu>

A temporary buffer produced by get_pathname() is recycled after
PATHNAME_BUFFER_COUNT (currently four) subsequent calls of
get_pathname().  Using such a buffer after it has been recycled can
cause the wrong file to be accessed, with very strange effects.
Moreover, such a bug can lie dormant until unrelated code elsewhere
starts using a temporary buffer, at which point it causes mysterious,
nonlocal failures that are hard to analyze.

Add a second implementation of get_pathname() (activated if the
VALGRIND preprocessor macro is defined) that allocates and frees
buffers instead of recycling statically allocated buffers.  This does
not make the problem less serious, but it turns the errors into
use-after-free errors, making it possible to locate the guilty code
using valgrind.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---

I believe that it is frowned upon to use #ifdefs in git code, but no
good alternative is obvious to me for this type of use.  Suggestions
are welcome.

I would also welcome suggestions for a better name than "VALGRIND" for
the preprocessor macro.  Are there standard names used elsewhere in
git for such purposes?
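
For readers who have not run into this class of bug, here is a small
standalone sketch of the failure mode.  It is not code from path.c: the
demo_* names and DEMO_PATH_MAX are made up for illustration, and
demo_mkpath() is only a simplified stand-in for mkpath().  It assumes
the same round-robin scheme with four buffers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define DEMO_BUFFER_COUNT 4	/* mirrors PATHNAME_BUFFER_COUNT (1 << 2) */
#define DEMO_PATH_MAX 256	/* hypothetical stand-in for PATH_MAX */

/* Hand out one of DEMO_BUFFER_COUNT static buffers in round-robin order. */
static char *demo_get_pathname(void)
{
	static char bufs[DEMO_BUFFER_COUNT][DEMO_PATH_MAX];
	static unsigned int next;

	next = (next + 1) & (DEMO_BUFFER_COUNT - 1);
	return bufs[next];
}

/* Simplified stand-in for mkpath(): format into a rotating buffer. */
static char *demo_mkpath(const char *fmt, ...)
{
	char *buf = demo_get_pathname();
	va_list args;

	assert(fmt != NULL);
	va_start(args, fmt);
	vsnprintf(buf, DEMO_PATH_MAX, fmt, args);
	va_end(args);
	return buf;
}

/*
 * Keep a pointer across DEMO_BUFFER_COUNT later calls, as a careless
 * caller might.  Returns 1 if the string it points to was overwritten.
 */
static int demo_stale_pointer_is_clobbered(void)
{
	const char *mine = demo_mkpath("%s/config", "my-repo");
	int i;

	for (i = 0; i < DEMO_BUFFER_COUNT; i++)
		demo_mkpath("other/%d", i);	/* the last call reuses mine's slot */

	return strcmp(mine, "my-repo/config") != 0;
}
```

In this sketch the stale pointer stays valid memory, so nothing crashes
and valgrind has nothing to report; the caller silently sees someone
else's path.  That is the symptom the VALGRIND variant in this patch is
meant to convert into a reportable use-after-free.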

 path.c |   40 ++++++++++++++++++++++++++++++++++++++--
 1 files changed, 38 insertions(+), 2 deletions(-)

diff --git path.c path.c
index 6c4714d..3021207 100644
--- path.c
+++ path.c
@@ -9,6 +9,20 @@
  *   f = open(mkpath("%s/%s.git", base, name), O_RDONLY);
  *
  * which is what it's designed for.
+ *
+ * The temporary buffers returned by these functions will be clobbered
+ * by later calls to these functions.  Therefore it is important not
+ * to expect such buffers to keep their values across calls to other
+ * git functions.  Violations of this rule can cause the original
+ * buffer to be overwritten and lead to very confusing, nonlocal bugs,
+ * including data loss (you think you are writing to your file but are
+ * actually writing to a filename created by some other caller).
+ *
+ * If the VALGRIND preprocessor macro is defined, then each buffer is
+ * allocated with xmalloc() and the buffer it replaces is released
+ * with free().  This changes the symptom of abuse of the buffers from
+ * mysterious, random errors into use-after-free errors that are
+ * detectable by valgrind.
  */
 #include "cache.h"
 #include "strbuf.h"
@@ -17,12 +31,34 @@
 #define PATHNAME_BUFFER_COUNT (1 << 2)
 
 static char bad_path[] = "/bad-path/";
+#ifdef VALGRIND
+static char buggy_path[] = "/git-internal-error/";
+#endif
 
 static char *get_pathname(void)
 {
-	static char pathname_array[PATHNAME_BUFFER_COUNT][PATH_MAX];
 	static int index;
-	return pathname_array[(PATHNAME_BUFFER_COUNT - 1) & ++index];
+#ifdef VALGRIND
+	static char *pathname_array[PATHNAME_BUFFER_COUNT];
+	index = (index + 1) & (PATHNAME_BUFFER_COUNT - 1);
+	if (pathname_array[index]) {
+		/*
+		 * In a correct program, this will have no effect, but
+		 * *if* somebody erroneously uses this buffer after it
+		 * has been freed, it gives more of a chance that the
+		 * error will be detected even if valgrind is not
+		 * running:
+		 */
+		strcpy(pathname_array[index], buggy_path);
+
+		free(pathname_array[index]);
+	}
+	pathname_array[index] = xmalloc(PATH_MAX);
+	return pathname_array[index];
+#else
+	static char pathname_array[PATHNAME_BUFFER_COUNT][PATH_MAX];
+	return pathname_array[(PATHNAME_BUFFER_COUNT - 1) & ++index];
+#endif
 }
 
 static char *cleanup_path(char *path)
-- 
1.7.7.rc2
