* [RFC 0/2] Debugging tools for get_pathname()
@ 2011-09-27 4:28 Michael Haggerty
2011-09-27 4:28 ` [RFC 1/2] Make the number of pathname buffers a compile-time constant Michael Haggerty
2011-09-27 4:28 ` [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind Michael Haggerty
0 siblings, 2 replies; 5+ messages in thread
From: Michael Haggerty @ 2011-09-27 4:28 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Petr Baudis, Michael Haggerty
I just wasted a bunch of time diagnosing a problem that was caused by
other code's misuse of a temporary buffer created by get_pathname()
(patch forthcoming). The bug was triggered by totally unrelated (and
innocent) code changes that I had made. I found it quite difficult to
figure out the source of the problem.
It is probably obvious to everybody else, but here are the two tricks
that finally enabled me to diagnose the problem:
1. I increased the number of temporary buffers available to
get_pathname(). This made the test suite failure go away, which
was a good indication that the problem was related to the buffers.
2. I changed get_pathname() to allocate and free the buffers instead
of using statically-allocated buffers. This made it trivial to
find the offender using valgrind.
This patch series make it easier to apply these tricks. I hope it
helps somebody else avoid the pain that I just experienced.
Michael Haggerty (2):
Make the number of pathname buffers a compile-time constant
Make misuse of get_pathname() buffers detectable by valgrind
path.c | 43 +++++++++++++++++++++++++++++++++++++++++--
1 files changed, 41 insertions(+), 2 deletions(-)
--
1.7.7.rc2
^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC 1/2] Make the number of pathname buffers a compile-time constant
2011-09-27 4:28 [RFC 0/2] Debugging tools for get_pathname() Michael Haggerty
@ 2011-09-27 4:28 ` Michael Haggerty
2011-09-27 4:28 ` [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind Michael Haggerty
1 sibling, 0 replies; 5+ messages in thread
From: Michael Haggerty @ 2011-09-27 4:28 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Petr Baudis, Michael Haggerty
Changing the number of available temporary buffers for get_pathname()
can help diagnose abuse of such buffers. If a bug goes away when
PATHNAME_BUFFER_COUNT is increased, then it might be due to a buffer
being used after it has been recycled. Similarly, if a bug appears
when PATHNAME_BUFFER_COUNT is decreased, then there might be a latent
problem that could emerge after unrelated code changes.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
path.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)
diff --git path.c path.c
index 6f3f5d5..6c4714d 100644
--- path.c
+++ path.c
@@ -13,13 +13,16 @@
#include "cache.h"
#include "strbuf.h"
+/* This must be a power of 2: */
+#define PATHNAME_BUFFER_COUNT (1 << 2)
+
static char bad_path[] = "/bad-path/";
static char *get_pathname(void)
{
- static char pathname_array[4][PATH_MAX];
+ static char pathname_array[PATHNAME_BUFFER_COUNT][PATH_MAX];
static int index;
- return pathname_array[3 & ++index];
+ return pathname_array[(PATHNAME_BUFFER_COUNT - 1) & ++index];
}
static char *cleanup_path(char *path)
--
1.7.7.rc2
^ permalink raw reply [flat|nested] 5+ messages in thread
* [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind
2011-09-27 4:28 [RFC 0/2] Debugging tools for get_pathname() Michael Haggerty
2011-09-27 4:28 ` [RFC 1/2] Make the number of pathname buffers a compile-time constant Michael Haggerty
@ 2011-09-27 4:28 ` Michael Haggerty
2011-11-16 13:57 ` Nguyen Thai Ngoc Duy
2011-11-16 14:18 ` Nguyen Thai Ngoc Duy
1 sibling, 2 replies; 5+ messages in thread
From: Michael Haggerty @ 2011-09-27 4:28 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Petr Baudis, Michael Haggerty
A temporary buffer produced by get_pathname() is recycled after a few
subsequent calls of get_pathname(). The use of such a buffer after it
has been recycled can result in the wrong file being accessed with
very strange effects. Moreover, such a bug can lie dormant until code
elsewhere is changed to use a temporary buffer, causing very
mysterious, nonlocal failures that are hard to analyze.
Add a second implementation of get_pathname() (activated if the
VALGRIND preprocessor macro is defined) that allocates and frees
buffers instead of recycling statically-allocated buffers. This does
not make the problem less serious, but it turns the errors into
access-after-free errors, making it possible to locate the guilty code
using valgrind.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
I believe that it is frowned upon to use #ifdefs in git code, but no
good alternative is obvious to me for this type of use. Suggestions
are welcome.
I would also welcome suggestions for a better name than "VALGRIND" for
the preprocessor macro. Are there standard names used elsewhere in
git for such purposes?
path.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 files changed, 38 insertions(+), 2 deletions(-)
diff --git path.c path.c
index 6c4714d..3021207 100644
--- path.c
+++ path.c
@@ -9,6 +9,20 @@
* f = open(mkpath("%s/%s.git", base, name), O_RDONLY);
*
* which is what it's designed for.
+ *
+ * The temporary buffers returned by these functions will be clobbered
+ * by later calls to these functions. Therefore it is important not
+ * to expect such buffers to keep their values across calls to other
+ * git functions. Violations of this rule can cause the original
+ * buffer to be overwritten and lead to very confusing, nonlocal bugs,
+ * including data loss (you think you are writing to your file but are
+ * actually writing to a filename created by some other caller).
+ *
+ * If the VALGRIND preprocessor macro is defined, then buffers are
+ * created via xmalloc and old temporary buffers are recycled using
+ * free(). This changes the symptom of abuse of the buffers from
+ * mysterious, random errors into access-after-free errors that are
+ * detectable by valgrind.
*/
#include "cache.h"
#include "strbuf.h"
@@ -17,12 +31,34 @@
#define PATHNAME_BUFFER_COUNT (1 << 2)
static char bad_path[] = "/bad-path/";
+#ifdef VALGRIND
+static char buggy_path[] = "/git-internal-error/";
+#endif
static char *get_pathname(void)
{
- static char pathname_array[PATHNAME_BUFFER_COUNT][PATH_MAX];
static int index;
- return pathname_array[(PATHNAME_BUFFER_COUNT - 1) & ++index];
+#ifdef VALGRIND
+ static char *pathname_array[PATHNAME_BUFFER_COUNT];
+ index = (index + 1) & (PATHNAME_BUFFER_COUNT - 1);
+ if (pathname_array[index]) {
+ /*
+ * In a correct program, this will have no effect, but
+ * *if* somebody erroneously uses this buffer after it
+ * has been freed, it gives more of a chance that the
+ * error will be detected even if valgrind is not
+ * running:
+ */
+ strcpy(pathname_array[index], buggy_path);
+
+ free(pathname_array[index]);
+ }
+ pathname_array[index] = xmalloc(PATH_MAX);
+ return pathname_array[index];
+#else
+ static char pathname_array[PATHNAME_BUFFER_COUNT][PATH_MAX];
+ return pathname_array[(PATHNAME_BUFFER_COUNT - 1) & ++index];
+#endif
}
static char *cleanup_path(char *path)
--
1.7.7.rc2
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind
2011-09-27 4:28 ` [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind Michael Haggerty
@ 2011-11-16 13:57 ` Nguyen Thai Ngoc Duy
2011-11-16 14:18 ` Nguyen Thai Ngoc Duy
1 sibling, 0 replies; 5+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-11-16 13:57 UTC (permalink / raw)
To: Michael Haggerty; +Cc: git, Junio C Hamano, Petr Baudis
On Tue, Sep 27, 2011 at 11:28 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> A temporary buffer produced by get_pathname() is recycled after a few
> subsequent calls of get_pathname(). The use of such a buffer after it
> has been recycled can result in the wrong file being accessed with
> very strange effects. Moreover, such a bug can lie dormant until code
> elsewhere is changed to use a temporary buffer, causing very
> mysterious, nonlocal failures that are hard to analyze.
>
> Add a second implementation of get_pathname() (activated if the
> VALGRIND preprocessor macro is defined) that allocates and frees
> buffers instead of recycling statically-allocated buffers. This does
> not make the problem less serious, but it turns the errors into
> access-after-free errors, making it possible to locate the guilty code
> using valgrind.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>
> I believe that it is frowned upon to use #ifdefs in git code, but no
> good alternative is obvious to me for this type of use. Suggestions
> are welcome.
Enable the code based on an environment variable, e.g.
GIT_DEBUG_FENCE, then enable it by default in test-lib.sh :-)
--
Duy
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind
2011-09-27 4:28 ` [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind Michael Haggerty
2011-11-16 13:57 ` Nguyen Thai Ngoc Duy
@ 2011-11-16 14:18 ` Nguyen Thai Ngoc Duy
1 sibling, 0 replies; 5+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-11-16 14:18 UTC (permalink / raw)
To: Michael Haggerty; +Cc: git, Junio C Hamano
On Tue, Sep 27, 2011 at 11:28 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> +#ifdef VALGRIND
> + static char *pathname_array[PATHNAME_BUFFER_COUNT];
> + index = (index + 1) & (PATHNAME_BUFFER_COUNT - 1);
> + if (pathname_array[index]) {
> + /*
> + * In a correct program, this will have no effect, but
> + * *if* somebody erroneously uses this buffer after it
> + * has been freed, it gives more of a chance that the
> + * error will be detected even if valgrind is not
> + * running:
> + */
> + strcpy(pathname_array[index], buggy_path);
> +
> + free(pathname_array[index]);
> + }
> + pathname_array[index] = xmalloc(PATH_MAX);
> + return pathname_array[index];
> +#else
Not sure if it works (just read man pages, haven't tried anything) I'm
thinking to use mmap() with MAP_ANONYMOUS instead of xmalloc(), then
mprotect() instead of free() to remove read access from that area. Any
access after that should be caught. Leaking may not be severe for
git_path(), hopefully.
--
Duy
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-11-16 14:19 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-27 4:28 [RFC 0/2] Debugging tools for get_pathname() Michael Haggerty
2011-09-27 4:28 ` [RFC 1/2] Make the number of pathname buffers a compile-time constant Michael Haggerty
2011-09-27 4:28 ` [RFC 2/2] Make misuse of get_pathname() buffers detectable by valgrind Michael Haggerty
2011-11-16 13:57 ` Nguyen Thai Ngoc Duy
2011-11-16 14:18 ` Nguyen Thai Ngoc Duy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).