* [PATCH] urlmatch: append_normalized_escapes can reallocate norm.buf @ 2013-09-12 9:57 Thomas Rast 2013-09-12 14:15 ` [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes Kyle J. McKay 0 siblings, 1 reply; 8+ messages in thread From: Thomas Rast @ 2013-09-12 9:57 UTC (permalink / raw) To: git; +Cc: Kyle J. McKay, Jeff King The calls to strbuf_add* within append_normalized_escapes() can reallocate the buffer passed to it. Therefore, the seg_start pointer into the string cannot be kept across such calls. The actual bug is from 3402a8d (config: add helper to normalize and match URLs, 2013-07-31). It can first be detected by valgrind after 6a56993 (config: parse http.<url>.<variable> using urlmatch, 2013-08-05) introduced tests covering url_normalize(). Signed-off-by: Thomas Rast <trast@inf.ethz.ch> --- My apologies if this is redundant; I didn't have time to watch the list over the last two weeks. However it seems today's pu is still broken. The valgrind error looks like this: ==4607== Invalid read of size 1 ==4607== at 0x4C2D3A1: __GI_strcmp (mc_replace_strmem.c:731) ==4607== by 0x404C68: url_normalize (urlmatch.c:300) ==4607== by 0x403F33: main (test-urlmatch-normalization.c:34) ==4607== Address 0x5be9046 is 6 bytes inside a block of size 24 free'd ==4607== at 0x4C2BFC6: realloc (vg_replace_malloc.c:687) ==4607== by 0x405F6B: xrealloc (wrapper.c:100) ==4607== by 0x40794E: strbuf_grow (strbuf.c:74) ==4607== by 0x40854D: strbuf_vaddf (strbuf.c:268) ==4607== by 0x40817E: strbuf_addf (strbuf.c:203) ==4607== by 0x404300: append_normalized_escapes (urlmatch.c:58) ==4607== by 0x404C0A: url_normalize (urlmatch.c:291) ==4607== by 0x403F33: main (test-urlmatch-normalization.c:34) It went undetected for a while because it does not fail the test: the calls to test-urlmatch-normalization happen inside a $() substitution. I checked the other call sites to append_normalized_escapes() for the same type of problem, and they seem to be okay. urlmatch.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/urlmatch.c b/urlmatch.c index 1db76c8..59abc80 100644 --- a/urlmatch.c +++ b/urlmatch.c @@ -281,7 +281,8 @@ char *url_normalize(const char *url, struct url_info *out_info) url_len--; } for (;;) { - const char *seg_start = norm.buf + norm.len; + const char *seg_start; + size_t prev_len = norm.len; const char *next_slash = url + strcspn(url, "/?#"); int skip_add_slash = 0; /* @@ -297,6 +298,7 @@ char *url_normalize(const char *url, struct url_info *out_info) strbuf_release(&norm); return NULL; } + seg_start = norm.buf + prev_len; if (!strcmp(seg_start, ".")) { /* ignore a . segment; be careful not to remove initial '/' */ if (seg_start == path_start + 1) { -- 1.8.4.609.g4395a4f ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 9:57 [PATCH] urlmatch: append_normalized_escapes can reallocate norm.buf Thomas Rast @ 2013-09-12 14:15 ` Kyle J. McKay 2013-09-12 15:25 ` Thomas Rast 2013-09-12 18:30 ` Junio C Hamano 0 siblings, 2 replies; 8+ messages in thread From: Kyle J. McKay @ 2013-09-12 14:15 UTC (permalink / raw) To: Thomas Rast, Junio C Hamano; +Cc: Jeff King, git On Sep 12, 2013, at 02:57, Thomas Rast wrote: > The calls to strbuf_add* within append_normalized_escapes() can > reallocate the buffer passed to it. Therefore, the seg_start pointer > into the string cannot be kept across such calls. Thanks for finding this. > It went undetected for a while because it does not fail the test: the > calls to test-urlmatch-normalization happen inside a $() substitution. > > I checked the other call sites to append_normalized_escapes() for the > same type of problem, and they seem to be okay. > diff --git a/urlmatch.c b/urlmatch.c > index 1db76c8..59abc80 100644 > --- a/urlmatch.c > +++ b/urlmatch.c > @@ -281,7 +281,8 @@ char *url_normalize(const char *url, struct url_info *out_info) > url_len--; > } > for (;;) { > - const char *seg_start = norm.buf + norm.len; > + const char *seg_start; > + size_t prev_len = norm.len; How about a more descriptive name for what prev_len is? It's actually the segment start offset. > const char *next_slash = url + strcspn(url, "/?#"); > int skip_add_slash = 0; > /* > @@ -297,6 +298,7 @@ char *url_normalize(const char *url, struct url_info *out_info) > strbuf_release(&norm); > return NULL; > } > + seg_start = norm.buf + prev_len; A comment would be nice here to remind folks who might be tempted to revert this to the previous version why it's being done this way. I'm sure at some point someone will propose a "simplification" patch otherwise. Also some nits. The patch description should be imperative mood (cf. Documentation/SubmittingPatches). And instead of mentioning the seg_start pointer in the description (which will be meaningless to just about everyone and it's clear from the diff), mention the bad thing the code was doing in more general terms that will be clear to anyone familiar with a strbuf. So how about this patch instead... -- 8< -- From: Thomas Rast <trast@inf.ethz.ch> Subject: urlmatch.c: recompute pointer after append_normalized_escapes When append_normalized_escapes is called, its internal strbuf_add* calls can cause the strbuf's buf to be reallocated changing the value of the buf pointer. Do not use the strbuf buf pointer from before any append_normalized_escapes calls afterwards. Instead recompute the needed pointer. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Kyle J. McKay <mackyle@gmail.com> --- urlmatch.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/urlmatch.c b/urlmatch.c index 1db76c89..01c67467 100644 --- a/urlmatch.c +++ b/urlmatch.c @@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info *out_info) url_len--; } for (;;) { - const char *seg_start = norm.buf + norm.len; + const char *seg_start; + size_t seg_start_off = norm.len; const char *next_slash = url + strcspn(url, "/?#"); int skip_add_slash = 0; /* * RFC 3689 indicates that any . or .. segments should be @@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info *out_info) strbuf_release(&norm); return NULL; } + /* append_normalized_escapes can cause norm.buf to change */ + seg_start = norm.buf + seg_start_off; if (!strcmp(seg_start, ".")) { /* ignore a . segment; be careful not to remove initial '/' */ if (seg_start == path_start + 1) { -- 1.8.3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 14:15 ` [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes Kyle J. McKay @ 2013-09-12 15:25 ` Thomas Rast 2013-09-12 18:23 ` Junio C Hamano 2013-09-12 18:30 ` Junio C Hamano 1 sibling, 1 reply; 8+ messages in thread From: Thomas Rast @ 2013-09-12 15:25 UTC (permalink / raw) To: Kyle J. McKay; +Cc: Junio C Hamano, Jeff King, git "Kyle J. McKay" <mackyle@gmail.com> writes: > Also some nits. The patch description should be imperative mood > (cf. Documentation/SubmittingPatches). Heh. Serves me right to go away for a while and get SubmittingPatches cited at me on return ;-) Thanks for the updated patch. I agree with the changes. I particularly like the better variable name. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 15:25 ` Thomas Rast @ 2013-09-12 18:23 ` Junio C Hamano 0 siblings, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2013-09-12 18:23 UTC (permalink / raw) To: Thomas Rast; +Cc: Kyle J. McKay, Jeff King, git Thanks, both. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 14:15 ` [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes Kyle J. McKay 2013-09-12 15:25 ` Thomas Rast @ 2013-09-12 18:30 ` Junio C Hamano 2013-09-12 20:38 ` Kyle J. McKay 1 sibling, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2013-09-12 18:30 UTC (permalink / raw) To: Kyle J. McKay; +Cc: Thomas Rast, Jeff King, git "Kyle J. McKay" <mackyle@gmail.com> writes: > So how about this patch instead... > > -- 8< -- > From: Thomas Rast <trast@inf.ethz.ch> > Subject: urlmatch.c: recompute pointer after append_normalized_escapes > > When append_normalized_escapes is called, its internal strbuf_add* calls can > cause the strbuf's buf to be reallocated changing the value of the buf pointer. > > Do not use the strbuf buf pointer from before any append_normalized_escapes > calls afterwards. Instead recompute the needed pointer. > > Signed-off-by: Thomas Rast <trast@inf.ethz.ch> > Signed-off-by: Kyle J. McKay <mackyle@gmail.com> > --- > urlmatch.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/urlmatch.c b/urlmatch.c > index 1db76c89..01c67467 100644 > --- a/urlmatch.c > +++ b/urlmatch.c > @@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info *out_info) > url_len--; > } > for (;;) { > - const char *seg_start = norm.buf + norm.len; > + const char *seg_start; > + size_t seg_start_off = norm.len; > const char *next_slash = url + strcspn(url, "/?#"); > int skip_add_slash = 0; > /* > * RFC 3689 indicates that any . or .. segments should be > @@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info *out_info) > strbuf_release(&norm); > return NULL; > } > + /* append_normalized_escapes can cause norm.buf to change */ > + seg_start = norm.buf + seg_start_off; The change looks good, but I find that this comment is not placed in the right place. It is good if the reader knows about an old bug to put it here, but if the first thing a reader reads is this updated version, the comment is better placed close to the place where the start_ofs variable captures the original value (i.e. "because the next call may relocate the buffer, we cannot grab seg_start upfront; instead we need to record the start_ofs here, and that is what this variable is about"). It is too minor a point for a reroll, so I'll try to tweak it locally. Something like this (but now I think about it, the comment may not even be necessary). urlmatch.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/urlmatch.c b/urlmatch.c index 01c6746..d1600e2 100644 --- a/urlmatch.c +++ b/urlmatch.c @@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct url_info *out_info) } for (;;) { const char *seg_start; - size_t seg_start_off = norm.len; + size_t seg_start_off; const char *next_slash = url + strcspn(url, "/?#"); int skip_add_slash = 0; + + /* + * record the starting offset; appending escapes may + * relocate the buffer, so we cannot capture seg_start + * upfront and use it later. + */ + seg_start_off = norm.len; + /* * RFC 3689 indicates that any . or .. segments should be * unescaped before being checked for. @@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct url_info *out_info) strbuf_release(&norm); return NULL; } - /* append_normalized_escapes can cause norm.buf to change */ + seg_start = norm.buf + seg_start_off; if (!strcmp(seg_start, ".")) { /* ignore a . segment; be careful not to remove initial '/' */ ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 18:30 ` Junio C Hamano @ 2013-09-12 20:38 ` Kyle J. McKay 2013-09-12 22:05 ` Jonathan Nieder 0 siblings, 1 reply; 8+ messages in thread From: Kyle J. McKay @ 2013-09-12 20:38 UTC (permalink / raw) To: Junio C Hamano; +Cc: Thomas Rast, Jeff King, git On Sep 12, 2013, at 11:30, Junio C Hamano wrote: >> + /* append_normalized_escapes can cause norm.buf to change */ >> + seg_start = norm.buf + seg_start_off; > > The change looks good, but I find that this comment is not placed in > the right place. It is good if the reader knows about an old bug to > put it here, but if the first thing a reader reads is this updated > version, the comment is better placed close to the place where the > start_ofs variable captures the original value (i.e. "because the > next call may relocate the buffer, we cannot grab seg_start upfront; > instead we need to record the start_ofs here, and that is what this > variable is about"). > > It is too minor a point for a reroll, so I'll try to tweak it > locally. Something like this (but now I think about it, the comment > may not even be necessary). The longer comment looks good to me. If you think the code will be safe from simplification patches without a comment, that works for me too. I've just seen so many "simplification" patches go by on the list I'm concerned it will be a target otherwise leading to re-introduction of the problem. > diff --git a/urlmatch.c b/urlmatch.c > index 01c6746..d1600e2 100644 > --- a/urlmatch.c > +++ b/urlmatch.c > @@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct > url_info *out_info) > } > for (;;) { > const char *seg_start; > - size_t seg_start_off = norm.len; > + size_t seg_start_off; > const char *next_slash = url + strcspn(url, "/?#"); > int skip_add_slash = 0; > + > + /* > + * record the starting offset; appending escapes may > + * relocate the buffer, so we cannot capture seg_start > + * upfront and use it later. > + */ > + seg_start_off = norm.len; > + > /* > * RFC 3689 indicates that any . or .. segments should be > * unescaped before being checked for. > @@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct > url_info *out_info) > strbuf_release(&norm); > return NULL; > } > - /* append_normalized_escapes can cause norm.buf to change */ > + > seg_start = norm.buf + seg_start_off; > if (!strcmp(seg_start, ".")) { > /* ignore a . segment; be careful not to remove initial '/' */ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 20:38 ` Kyle J. McKay @ 2013-09-12 22:05 ` Jonathan Nieder 2013-09-12 22:26 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Jonathan Nieder @ 2013-09-12 22:05 UTC (permalink / raw) To: Kyle J. McKay; +Cc: Junio C Hamano, Thomas Rast, Jeff King, git Kyle J. McKay wrote: > The longer comment looks good to me. If you think the code will be safe from > simplification patches without a comment, that works for me too. I think if we can't trust reviewers to catch this kind of thing, we're in trouble (i.e., moving too fast). :) So FWIW my instinct is to leave the comment out, since I actually find it more readable that way (otherwise I would wonder, "Why am I being told that a strbuf's buffer has a nonconstant address? Do some other strbufs have a constant address or something?") Thanks, Jonathan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes 2013-09-12 22:05 ` Jonathan Nieder @ 2013-09-12 22:26 ` Junio C Hamano 0 siblings, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2013-09-12 22:26 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Kyle J. McKay, Thomas Rast, Jeff King, git Jonathan Nieder <jrnieder@gmail.com> writes: > Kyle J. McKay wrote: > >> The longer comment looks good to me. If you think the code will be safe from >> simplification patches without a comment, that works for me too. > > I think if we can't trust reviewers to catch this kind of thing, we're > in trouble (i.e., moving too fast). :) > > So FWIW my instinct is to leave the comment out, since I actually find > it more readable that way (otherwise I would wonder, "Why am I being > told that a strbuf's buffer has a nonconstant address? Do some other > strbufs have a constant address or something?") Yeah, I was staring at that message and coming to the same conclusion. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2013-09-12 22:26 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-09-12 9:57 [PATCH] urlmatch: append_normalized_escapes can reallocate norm.buf Thomas Rast 2013-09-12 14:15 ` [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes Kyle J. McKay 2013-09-12 15:25 ` Thomas Rast 2013-09-12 18:23 ` Junio C Hamano 2013-09-12 18:30 ` Junio C Hamano 2013-09-12 20:38 ` Kyle J. McKay 2013-09-12 22:05 ` Jonathan Nieder 2013-09-12 22:26 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).