* [PATCH 0/3] Fix some errors reported by valgrind @ 2011-03-14 19:18 Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself Carlos Martín Nieto ` (2 more replies) 0 siblings, 3 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 19:18 UTC (permalink / raw) To: git; +Cc: Junio C Hamano This patch series fixes some errors I found when running the test suite with --valgrind. Carlos Martín Nieto (3): make_absolute_path: Don't try to copy a string to itself setup_path(): Free temporary buffer clone: Free a few paths abspath.c | 2 +- builtin/clone.c | 9 ++++++--- exec_cmd.c | 9 ++++++--- 3 files changed, 13 insertions(+), 7 deletions(-) -- 1.7.4.1 ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-14 19:18 [PATCH 0/3] Fix some errors reported by valgrind Carlos Martín Nieto @ 2011-03-14 19:18 ` Carlos Martín Nieto 2011-03-14 20:02 ` Jeff King 2011-03-14 20:25 ` Junio C Hamano 2011-03-14 19:18 ` [PATCH 2/3] setup_path(): Free temporary buffer Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 3/3] clone: Free a few paths Carlos Martín Nieto 2 siblings, 2 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 19:18 UTC (permalink / raw) To: git; +Cc: Junio C Hamano Sometimes (at least in t-0001-init.sh test 12), the return value of make_absolute_path() is passed to it as an argument, making the first and second arguments to strlcpy() the same, making the test fail when run under valgrind. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- This patch assumes the path returned by make_absolute_path() is never longer than PATH_MAX, which I think is a safe assumption. abspath.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/abspath.c b/abspath.c index 91ca00f..9149a98 100644 --- a/abspath.c +++ b/abspath.c @@ -24,7 +24,7 @@ const char *make_absolute_path(const char *path) char *last_elem = NULL; struct stat st; - if (strlcpy(buf, path, PATH_MAX) >= PATH_MAX) + if (buf != path && strlcpy(buf, path, PATH_MAX) >= PATH_MAX) die ("Too long path: %.*s", 60, path); while (depth--) { -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-14 19:18 ` [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself Carlos Martín Nieto @ 2011-03-14 20:02 ` Jeff King 2011-03-14 20:25 ` Junio C Hamano 1 sibling, 0 replies; 48+ messages in thread From: Jeff King @ 2011-03-14 20:02 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Junio C Hamano On Mon, Mar 14, 2011 at 08:18:36PM +0100, Carlos Martín Nieto wrote: > Sometimes (at least in t-0001-init.sh test 12), the return value of > make_absolute_path() is passed to it as an argument, making the first > and second arguments to strlcpy() the same, making the test fail when > run under valgrind. Thanks, I can reproduce here and your fix looks good. I used to periodically run valgrind on the whole suite, but I haven't for quite a while. I should probably start doing it again. -Peff ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-14 19:18 ` [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself Carlos Martín Nieto 2011-03-14 20:02 ` Jeff King @ 2011-03-14 20:25 ` Junio C Hamano 2011-03-14 22:02 ` Carlos Martín Nieto 1 sibling, 1 reply; 48+ messages in thread From: Junio C Hamano @ 2011-03-14 20:25 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git Carlos Martín Nieto <cmn@elego.de> writes: > Sometimes (at least in t-0001-init.sh test 12), the return value of > make_absolute_path() is passed to it as an argument, making the first I don't think it is a bad idea per-se to avoid a copy from the same memory location into the same memory location, but independent of the necessity of fixes at the low-level, shouldn't we fix the callers that do not check if what they have is already absolute? > and second arguments to strlcpy() the same, making the test fail when > run under valgrind. > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > --- > > This patch assumes the path returned by make_absolute_path() is never > longer than PATH_MAX, which I think is a safe assumption. > > abspath.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/abspath.c b/abspath.c > index 91ca00f..9149a98 100644 > --- a/abspath.c > +++ b/abspath.c > @@ -24,7 +24,7 @@ const char *make_absolute_path(const char *path) > char *last_elem = NULL; > struct stat st; > > - if (strlcpy(buf, path, PATH_MAX) >= PATH_MAX) > + if (buf != path && strlcpy(buf, path, PATH_MAX) >= PATH_MAX) > die ("Too long path: %.*s", 60, path); > > while (depth--) { ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-14 20:25 ` Junio C Hamano @ 2011-03-14 22:02 ` Carlos Martín Nieto 2011-03-14 22:58 ` Junio C Hamano 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 22:02 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On lun, 2011-03-14 at 13:25 -0700, Junio C Hamano wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > > > Sometimes (at least in t-0001-init.sh test 12), the return value of > > make_absolute_path() is passed to it as an argument, making the first > > I don't think it is a bad idea per-se to avoid a copy from the same memory > location into the same memory location, but independent of the necessity > of fixes at the low-level, shouldn't we fix the callers that do not check > if what they have is already absolute? If we'd like the semantics to be "whatever I had, I now know what the absolute path is" then we could make the check in the beginning of the function, to centralise the check. If the semantics should be "I don't have an absolute path, so I need to figure out what it is", then there should be a check before calling make_absolute_path() (the name suggests the second). As there are only ~15-20 uses of make_absolute_path(), we could just leave this as it is (as it works under normal conditions and causes valgrind to warn us), and I'll examine the callers tomorrow. There is however the extra functionality the function offers, namely resolving links. It might be good to split it into two functions so each caller can specify what it wants. There is also this in setup.c:prefix_path() const char *orig = path; char *sanitized; if (is_absolute_path(orig)) { const char *temp = make_absolute_path(path); sanitized = xmalloc(len + strlen(temp) + 1); strcpy(sanitized, temp); } else { sanitized = xmalloc(len + strlen(path) + 1); if (len) memcpy(sanitized, prefix, len); strcpy(sanitized + len, path); } which doesn't seem to make sense. Should the call to make_absolute_path() actually be a call to a function resolve_links() that just incorporates that functionality from make_absolute_path()? > > > and second arguments to strlcpy() the same, making the test fail when > > run under valgrind. > > > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > > --- > > > > This patch assumes the path returned by make_absolute_path() is never > > longer than PATH_MAX, which I think is a safe assumption. > > > > > abspath.c | 2 +- > > 1 files changed, 1 insertions(+), 1 deletions(-) > > > > diff --git a/abspath.c b/abspath.c > > index 91ca00f..9149a98 100644 > > --- a/abspath.c > > +++ b/abspath.c > > @@ -24,7 +24,7 @@ const char *make_absolute_path(const char *path) > > char *last_elem = NULL; > > struct stat st; > > > > - if (strlcpy(buf, path, PATH_MAX) >= PATH_MAX) > > + if (buf != path && strlcpy(buf, path, PATH_MAX) >= PATH_MAX) > > die ("Too long path: %.*s", 60, path); > > > > while (depth--) { ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-14 22:02 ` Carlos Martín Nieto @ 2011-03-14 22:58 ` Junio C Hamano 2011-03-15 11:59 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Junio C Hamano @ 2011-03-14 22:58 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git Carlos Martín Nieto <cmn@elego.de> writes: >> I don't think it is a bad idea per-se to avoid a copy from the same memory >> location into the same memory location, but independent of the necessity >> of fixes at the low-level, shouldn't we fix the callers that do not check >> if what they have is already absolute? > > If we'd like the semantics to be "whatever I had, I now know what the > absolute path is" then we could make the check in the beginning of the > function, to centralise the check. If the semantics should be "I don't > have an absolute path, so I need to figure out what it is", then there > should be a check before calling make_absolute_path() (the name suggests > the second). Good thinking, and I think the former semantics would be easier to use. > There is however the extra functionality the function offers, namely > resolving links. It might be good to split it into two functions so each > caller can specify what it wants. Probably. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-14 22:58 ` Junio C Hamano @ 2011-03-15 11:59 ` Carlos Martín Nieto 2011-03-15 12:40 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-15 11:59 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On lun, 2011-03-14 at 15:58 -0700, Junio C Hamano wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > > >> I don't think it is a bad idea per-se to avoid a copy from the same memory > >> location into the same memory location, but independent of the necessity > >> of fixes at the low-level, shouldn't we fix the callers that do not check > >> if what they have is already absolute? > > > > If we'd like the semantics to be "whatever I had, I now know what the > > absolute path is" then we could make the check in the beginning of the > > function, to centralise the check. If the semantics should be "I don't > > have an absolute path, so I need to figure out what it is", then there > > should be a check before calling make_absolute_path() (the name suggests > > the second). > > Good thinking, and I think the former semantics would be easier to use. I was asking about why we didn't just use realpath, but it seems there is no portable way of dealing with it (even without counting NFS over different systems). I'm thinking of renaming make_absolute_path to real_path as it's in fact our implementation of realpath (and actually describes what it does), without a is_absolute_path check, as a path could be absolute but point to a link, adding a comment to say that if you don't mind links, you should use make_nonrelative_path (I'd rename it to absolute_path, but that may be too close to the old make_absolute_path) > > > There is however the extra functionality the function offers, namely > > resolving links. It might be good to split it into two functions so each > > caller can specify what it wants. > > Probably. With the changes mentioned earlier, if you want an absolute pathname, you'd call absolute_path/make_nonrelative_path and if you want to make sure you have the real path of the target file, you'd use real_path just as you'd use realpath on a sane system, with cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-15 11:59 ` Carlos Martín Nieto @ 2011-03-15 12:40 ` Carlos Martín Nieto 2011-03-15 17:02 ` Junio C Hamano 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-15 12:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On mar, 2011-03-15 at 12:59 +0100, Carlos Martín Nieto wrote: > On lun, 2011-03-14 at 15:58 -0700, Junio C Hamano wrote: > > Carlos Martín Nieto <cmn@elego.de> writes: [...] > > > > > There is however the extra functionality the function offers, namely > > > resolving links. It might be good to split it into two functions so each > > > caller can specify what it wants. > > > > Probably. > > With the changes mentioned earlier, if you want an absolute pathname, > you'd call absolute_path/make_nonrelative_path and if you want to make > sure you have the real path of the target file, you'd use real_path just > as you'd use realpath on a sane system, with ... a comment on the functions and maybe some documentation in Documentation/techncal, as it doesn't seem to exist yet. cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-15 12:40 ` Carlos Martín Nieto @ 2011-03-15 17:02 ` Junio C Hamano 2011-03-15 17:27 ` Carlos Martín Nieto 2011-03-16 14:04 ` Nguyen Thai Ngoc Duy 0 siblings, 2 replies; 48+ messages in thread From: Junio C Hamano @ 2011-03-15 17:02 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy; +Cc: Carlos Martín Nieto, git Carlos Martín Nieto <cmn@elego.de> writes: > On mar, 2011-03-15 at 12:59 +0100, Carlos Martín Nieto wrote: >> On lun, 2011-03-14 at 15:58 -0700, Junio C Hamano wrote: >> > Carlos Martín Nieto <cmn@elego.de> writes: > [...] >> > >> > > There is however the extra functionality the function offers, namely >> > > resolving links. It might be good to split it into two functions so each >> > > caller can specify what it wants. >> > >> > Probably. >> >> With the changes mentioned earlier, if you want an absolute pathname, >> you'd call absolute_path/make_nonrelative_path and if you want to make >> sure you have the real path of the target file, you'd use real_path just >> as you'd use realpath on a sane system, with > > ... a comment on the functions and maybe some documentation in > Documentation/techncal, as it doesn't seem to exist yet. We probably should involve Nguyễn in this thread as his fingers are everywhere on the codepaths related to setup. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-15 17:02 ` Junio C Hamano @ 2011-03-15 17:27 ` Carlos Martín Nieto 2011-03-16 14:16 ` Nguyen Thai Ngoc Duy 2011-03-16 14:04 ` Nguyen Thai Ngoc Duy 1 sibling, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-15 17:27 UTC (permalink / raw) To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git On mar, 2011-03-15 at 10:02 -0700, Junio C Hamano wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > > > On mar, 2011-03-15 at 12:59 +0100, Carlos Martín Nieto wrote: > >> On lun, 2011-03-14 at 15:58 -0700, Junio C Hamano wrote: > >> > Carlos Martín Nieto <cmn@elego.de> writes: > > [...] > >> > > >> > > There is however the extra functionality the function offers, namely > >> > > resolving links. It might be good to split it into two functions so each > >> > > caller can specify what it wants. > >> > > >> > Probably. > >> > >> With the changes mentioned earlier, if you want an absolute pathname, > >> you'd call absolute_path/make_nonrelative_path and if you want to make > >> sure you have the real path of the target file, you'd use real_path just > >> as you'd use realpath on a sane system, with > > > > ... a comment on the functions and maybe some documentation in > > Documentation/techncal, as it doesn't seem to exist yet. > > We probably should involve Nguyễn in this thread as his fingers are > everywhere on the codepaths related to setup. > I've been changing this a bit, trying to make all the paths normalized, but it'll take a bit longer. I'll send a partial patch when I've finished something worth seeing (for the moment, the test fail if there is a symlink somewhere in the tree, as I've mixed real_path/make_absolute_path and absolute_path/make_nonrelative_path a bit). Is it a good idea to normalize the paths? Otherwise, everything could be replaced by real_path/make_absolute_path (as most calls already are). As it's transitive and these paths aren't stored permanently (other than with clone), as long as we agree on one representation, it should be fine. Is there a performance hit if we resolve links all the time? If we run everything through normalize_path(_copy), is it slower than resolving links? cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-15 17:27 ` Carlos Martín Nieto @ 2011-03-16 14:16 ` Nguyen Thai Ngoc Duy 2011-03-16 14:49 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-16 14:16 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: Junio C Hamano, git 2011/3/16 Carlos Martín Nieto <cmn@elego.de>: > I've been changing this a bit, trying to make all the paths normalized, > but it'll take a bit longer. I'll send a partial patch when I've > finished something worth seeing (for the moment, the test fail if there > is a symlink somewhere in the tree, as I've mixed > real_path/make_absolute_path and absolute_path/make_nonrelative_path a > bit). > > Is it a good idea to normalize the paths? Otherwise, everything could > be replaced by real_path/make_absolute_path (as most calls already are). > As it's transitive and these paths aren't stored permanently (other than > with clone), as long as we agree on one representation, it should be > fine. I think the question is whether it's _necessary_ to do that. Any gain? make_absolute_path() calls are not in critical path, I don't think we should bother much, unless there are bugs like one you fixed in your patch. > Is there a performance hit if we resolve links all the time? If we run > everything through normalize_path(_copy), is it slower than resolving > links? What paths are you talking about? If they are inside $GIT_DIR, we touch them quite often. But there are not many of them (unless you spread loose objects all over the place), resolving links should not be an issue. If they are inside worktree, maybe. Though I'm not sure if we want to normalize all of them. -- Duy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-16 14:16 ` Nguyen Thai Ngoc Duy @ 2011-03-16 14:49 ` Carlos Martín Nieto 2011-03-16 14:58 ` Nguyen Thai Ngoc Duy 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-16 14:49 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git On mié, 2011-03-16 at 21:16 +0700, Nguyen Thai Ngoc Duy wrote: > 2011/3/16 Carlos Martín Nieto <cmn@elego.de>: > > I've been changing this a bit, trying to make all the paths normalized, > > but it'll take a bit longer. I'll send a partial patch when I've > > finished something worth seeing (for the moment, the test fail if there > > is a symlink somewhere in the tree, as I've mixed > > real_path/make_absolute_path and absolute_path/make_nonrelative_path a > > bit). > > > > Is it a good idea to normalize the paths? Otherwise, everything could > > be replaced by real_path/make_absolute_path (as most calls already are). > > As it's transitive and these paths aren't stored permanently (other than > > with clone), as long as we agree on one representation, it should be > > fine. > > I think the question is whether it's _necessary_ to do that. Any gain? > make_absolute_path() calls are not in critical path, I don't think we > should bother much, unless there are bugs like one you fixed in your > patch. I was under the wrong impression that non-normalized paths were what was causing is_inside_dir not to recognize the paths (this with a patch using non-resolved absolute paths as given by make_nonrelative_path rather than make_absolute_path). As it turns out, getcwd resolves the links for us, so is_inside_dir would say e.g. that /home/cmn/two/git/t wasn't under /home/cmn/two/git, because getcwd said the cwd was /home/cmn/one/git (two is a symlink to one). At any rate, I think make_absolute_path is mis-named and it should be called real_path (or make_real_path). The difference between make_nonrelative_path and make_absolute_path is certainly not clear without looking at the implementation. > > > Is there a performance hit if we resolve links all the time? If we run > > everything through normalize_path(_copy), is it slower than resolving > > links? > > What paths are you talking about? If they are inside $GIT_DIR, we > touch them quite often. But there are not many of them (unless you > spread loose objects all over the place), resolving links should not > be an issue. There aren't in fact that many calls to these functions, so resolving should be fine. More on this as an answer to your other mail. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-16 14:49 ` Carlos Martín Nieto @ 2011-03-16 14:58 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 48+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-16 14:58 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: Junio C Hamano, git On Wed, Mar 16, 2011 at 9:49 PM, Carlos Martín Nieto <cmn@elego.de> wrote: > At any rate, I think make_absolute_path is mis-named and it should be > called real_path (or make_real_path). The difference between > make_nonrelative_path and make_absolute_path is certainly not clear > without looking at the implementation. More precise, descriptive names are always appreciated :) -- Duy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-15 17:02 ` Junio C Hamano 2011-03-15 17:27 ` Carlos Martín Nieto @ 2011-03-16 14:04 ` Nguyen Thai Ngoc Duy 2011-03-16 15:08 ` Carlos Martín Nieto 1 sibling, 1 reply; 48+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-16 14:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: Carlos Martín Nieto, git On Wed, Mar 16, 2011 at 12:02 AM, Junio C Hamano <gitster@pobox.com> wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > >> On mar, 2011-03-15 at 12:59 +0100, Carlos Martín Nieto wrote: >>> On lun, 2011-03-14 at 15:58 -0700, Junio C Hamano wrote: >>> > Carlos Martín Nieto <cmn@elego.de> writes: >> [...] >>> > >>> > > There is however the extra functionality the function offers, namely >>> > > resolving links. It might be good to split it into two functions so each >>> > > caller can specify what it wants. >>> > >>> > Probably. >>> >>> With the changes mentioned earlier, if you want an absolute pathname, >>> you'd call absolute_path/make_nonrelative_path and if you want to make >>> sure you have the real path of the target file, you'd use real_path just >>> as you'd use realpath on a sane system, with >> >> ... a comment on the functions and maybe some documentation in >> Documentation/techncal, as it doesn't seem to exist yet. > > We probably should involve Nguyễn in this thread as his fingers are > everywhere on the codepaths related to setup. Thanks, my attempt to fix up setup code leaves more traces that I expect. Splitting functions is fine, but is there any use of absolute_path/make_nonrelative_path alone? Most setup code uses make_absolute_path() for $GIT_{DIR,WORK_TREE}. For GIT_DIR, a resolved/normalized path is preferred. For GIT_WORK_TREE, I'm not sure. I tend to treat it the same way as GIT_DIR, but you guys may have a special case. -- Duy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself 2011-03-16 14:04 ` Nguyen Thai Ngoc Duy @ 2011-03-16 15:08 ` Carlos Martín Nieto 0 siblings, 0 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-16 15:08 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git On mié, 2011-03-16 at 21:04 +0700, Nguyen Thai Ngoc Duy wrote: > On Wed, Mar 16, 2011 at 12:02 AM, Junio C Hamano <gitster@pobox.com> wrote: > > Carlos Martín Nieto <cmn@elego.de> writes: > > > >> On mar, 2011-03-15 at 12:59 +0100, Carlos Martín Nieto wrote: > >>> On lun, 2011-03-14 at 15:58 -0700, Junio C Hamano wrote: > >>> > Carlos Martín Nieto <cmn@elego.de> writes: > >> [...] > >>> > > >>> > > There is however the extra functionality the function offers, namely > >>> > > resolving links. It might be good to split it into two functions so each > >>> > > caller can specify what it wants. > >>> > > >>> > Probably. > >>> > >>> With the changes mentioned earlier, if you want an absolute pathname, > >>> you'd call absolute_path/make_nonrelative_path and if you want to make > >>> sure you have the real path of the target file, you'd use real_path just > >>> as you'd use realpath on a sane system, with > >> > >> ... a comment on the functions and maybe some documentation in > >> Documentation/techncal, as it doesn't seem to exist yet. > > > > We probably should involve Nguyễn in this thread as his fingers are > > everywhere on the codepaths related to setup. > > Thanks, my attempt to fix up setup code leaves more traces that I expect. > > Splitting functions is fine, but is there any use of > absolute_path/make_nonrelative_path alone? Most setup code uses > make_absolute_path() for $GIT_{DIR,WORK_TREE}. For GIT_DIR, a > resolved/normalized path is preferred. For GIT_WORK_TREE, I'm not > sure. I tend to treat it the same way as GIT_DIR, but you guys may > have a special case. There only real use for absolute_path/make_nonrelative_path would be for paths that are written to disk (as we should respect the user's path preferences). For paths we never write out, real_path/make_absolute_path makes sense. The way I want to split it up is to have real_path use a simplistic approach, and use a resolve_link function if it needs to. From logging what paths make_absolute_path is called with, all I could find were directories ("." is used very often as well). This should make real_path very easy to understand. I'll start a new thread with the renaming patches and then split up the function. ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 2/3] setup_path(): Free temporary buffer 2011-03-14 19:18 [PATCH 0/3] Fix some errors reported by valgrind Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself Carlos Martín Nieto @ 2011-03-14 19:18 ` Carlos Martín Nieto 2011-03-14 20:09 ` Jeff King 2011-03-14 20:14 ` [PATCH 2/3] setup_path(): Free temporary buffer Junio C Hamano 2011-03-14 19:18 ` [PATCH 3/3] clone: Free a few paths Carlos Martín Nieto 2 siblings, 2 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 19:18 UTC (permalink / raw) To: git; +Cc: Junio C Hamano Make sure the pointer git_exec_path() returns is in the heap and free it after it's no longer needed. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- exec_cmd.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/exec_cmd.c b/exec_cmd.c index 38545e8..c16c3d4 100644 --- a/exec_cmd.c +++ b/exec_cmd.c @@ -73,11 +73,11 @@ const char *git_exec_path(void) const char *env; if (argv_exec_path) - return argv_exec_path; + return xstrdup(argv_exec_path); env = getenv(EXEC_PATH_ENVIRONMENT); if (env && *env) { - return env; + return xstrdup(env); } return system_path(GIT_EXEC_PATH); @@ -99,10 +99,13 @@ void setup_path(void) { const char *old_path = getenv("PATH"); struct strbuf new_path = STRBUF_INIT; + char *exec_path = (char *) git_exec_path(); - add_path(&new_path, git_exec_path()); + add_path(&new_path, exec_path); add_path(&new_path, argv0_path); + free(exec_path); + if (old_path) strbuf_addstr(&new_path, old_path); else -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 2/3] setup_path(): Free temporary buffer 2011-03-14 19:18 ` [PATCH 2/3] setup_path(): Free temporary buffer Carlos Martín Nieto @ 2011-03-14 20:09 ` Jeff King 2011-03-14 22:18 ` Carlos Martín Nieto 2011-03-16 11:26 ` [PATCH] system_path: use a static buffer Carlos Martín Nieto 2011-03-14 20:14 ` [PATCH 2/3] setup_path(): Free temporary buffer Junio C Hamano 1 sibling, 2 replies; 48+ messages in thread From: Jeff King @ 2011-03-14 20:09 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Junio C Hamano On Mon, Mar 14, 2011 at 08:18:37PM +0100, Carlos Martín Nieto wrote: > Make sure the pointer git_exec_path() returns is in the heap and free > it after it's no longer needed. Could you explain the motivation a bit more? From looking at the code, it looks like the situation is that git_exec_path sometimes returns a newly allocated string and sometimes not, so the caller may leak in the former case. That seems to be the fault of system_path, which sometimes allocates and sometimes does not. Should we also be fixing system_path, which is a source of leaks? Or instead converting system_path to use a static buffer? > @@ -99,10 +99,13 @@ void setup_path(void) > { > const char *old_path = getenv("PATH"); > struct strbuf new_path = STRBUF_INIT; > + char *exec_path = (char *) git_exec_path(); Ick. If it is now returning an allocated buffer that the caller is responsible for free-ing, then its declaration should drop the const in the return value. What about other callers of git_exec_path? Aren't load_command_list and list_commands leaking, too (and possibly worse, since we now always allocated memory)? -Peff ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 2/3] setup_path(): Free temporary buffer 2011-03-14 20:09 ` Jeff King @ 2011-03-14 22:18 ` Carlos Martín Nieto 2011-03-16 11:26 ` [PATCH] system_path: use a static buffer Carlos Martín Nieto 1 sibling, 0 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 22:18 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano On lun, 2011-03-14 at 16:09 -0400, Jeff King wrote: > On Mon, Mar 14, 2011 at 08:18:37PM +0100, Carlos Martín Nieto wrote: > > > Make sure the pointer git_exec_path() returns is in the heap and free > > it after it's no longer needed. > > Could you explain the motivation a bit more? From looking at the code, > it looks like the situation is that git_exec_path sometimes returns a > newly allocated string and sometimes not, so the caller may leak in the > former case. I mostly wanted valgrind to shut up, but it seems there are a lot of small leaks we don't really care about. > > That seems to be the fault of system_path, which sometimes allocates and > sometimes does not. Should we also be fixing system_path, which is a > source of leaks? Or instead converting system_path to use a static > buffer? Converting system_path() to use a static buffer would also work, and would probably be an easier solution. I'll look at that tomorrow and resend. > > > @@ -99,10 +99,13 @@ void setup_path(void) > > { > > const char *old_path = getenv("PATH"); > > struct strbuf new_path = STRBUF_INIT; > > + char *exec_path = (char *) git_exec_path(); > > Ick. If it is now returning an allocated buffer that the caller is > responsible for free-ing, then its declaration should drop the const in > the return value. Yeah... I meant to delete that. It seems I've completely botched this patch. > > What about other callers of git_exec_path? Aren't load_command_list and > list_commands leaking, too (and possibly worse, since we now always > allocated memory)? Test-suite-induced tunnel vision. I only tested what reported a failure (under valgrind) without the patch. cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH] system_path: use a static buffer 2011-03-14 20:09 ` Jeff King 2011-03-14 22:18 ` Carlos Martín Nieto @ 2011-03-16 11:26 ` Carlos Martín Nieto 2011-03-16 15:58 ` Erik Faye-Lund 1 sibling, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-16 11:26 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Jeff King Make system_path behave like the other path functions by using a static buffer, fixing a memory leak. Also make sure the prefix pointer is always initialized to either PREFIX or NULL. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- This fixes the leak I was trying to fix with my original patch, but this seems much cleaner exec_cmd.c | 9 ++++----- 1 files changed, 4 insertions(+), 5 deletions(-) diff --git a/exec_cmd.c b/exec_cmd.c index 38545e8..12ce017 100644 --- a/exec_cmd.c +++ b/exec_cmd.c @@ -9,11 +9,11 @@ static const char *argv0_path; const char *system_path(const char *path) { #ifdef RUNTIME_PREFIX - static const char *prefix; + static const char *prefix = NULL; #else static const char *prefix = PREFIX; #endif - struct strbuf d = STRBUF_INIT; + static char buf[PATH_MAX+1]; if (is_absolute_path(path)) return path; @@ -33,9 +33,8 @@ const char *system_path(const char *path) } #endif - strbuf_addf(&d, "%s/%s", prefix, path); - path = strbuf_detach(&d, NULL); - return path; + snprintf(buf, PATH_MAX, "%s/%s", prefix, path); + return buf; } const char *git_extract_argv0_path(const char *argv0) -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-16 11:26 ` [PATCH] system_path: use a static buffer Carlos Martín Nieto @ 2011-03-16 15:58 ` Erik Faye-Lund 2011-03-16 16:24 ` Carlos Martín Nieto 2011-03-16 16:33 ` Carlos Martín Nieto 0 siblings, 2 replies; 48+ messages in thread From: Erik Faye-Lund @ 2011-03-16 15:58 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Junio C Hamano, Jeff King On Wed, Mar 16, 2011 at 12:26 PM, Carlos Martín Nieto <cmn@elego.de> wrote: > Make system_path behave like the other path functions by using a > static buffer, fixing a memory leak. > > Also make sure the prefix pointer is always initialized to either > PREFIX or NULL. > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > --- > > This fixes the leak I was trying to fix with my original patch, but > this seems much cleaner > > exec_cmd.c | 9 ++++----- > 1 files changed, 4 insertions(+), 5 deletions(-) > > diff --git a/exec_cmd.c b/exec_cmd.c > index 38545e8..12ce017 100644 > --- a/exec_cmd.c > +++ b/exec_cmd.c > @@ -9,11 +9,11 @@ static const char *argv0_path; > const char *system_path(const char *path) > { > #ifdef RUNTIME_PREFIX > - static const char *prefix; > + static const char *prefix = NULL; > #else > static const char *prefix = PREFIX; > #endif > - struct strbuf d = STRBUF_INIT; > + static char buf[PATH_MAX+1]; Why PATH_MAX + 1? POSIX says that PATH_MAX includes the null-termination... > @@ -33,9 +33,8 @@ const char *system_path(const char *path) > } > #endif > > - strbuf_addf(&d, "%s/%s", prefix, path); > - path = strbuf_detach(&d, NULL); > - return path; > + snprintf(buf, PATH_MAX, "%s/%s", prefix, path); Perhaps "snprintf(buf, sizeof(buf) - 1, "%s/%s", prefix, path);" instead? ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-16 15:58 ` Erik Faye-Lund @ 2011-03-16 16:24 ` Carlos Martín Nieto 2011-03-16 16:33 ` Carlos Martín Nieto 1 sibling, 0 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-16 16:24 UTC (permalink / raw) To: kusmabite; +Cc: git, Junio C Hamano, Jeff King On mié, 2011-03-16 at 16:58 +0100, Erik Faye-Lund wrote: > On Wed, Mar 16, 2011 at 12:26 PM, Carlos Martín Nieto <cmn@elego.de> wrote: > > Make system_path behave like the other path functions by using a > > static buffer, fixing a memory leak. > > > > Also make sure the prefix pointer is always initialized to either > > PREFIX or NULL. > > > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > > --- > > > > This fixes the leak I was trying to fix with my original patch, but > > this seems much cleaner > > > > exec_cmd.c | 9 ++++----- > > 1 files changed, 4 insertions(+), 5 deletions(-) > > > > diff --git a/exec_cmd.c b/exec_cmd.c > > index 38545e8..12ce017 100644 > > --- a/exec_cmd.c > > +++ b/exec_cmd.c > > @@ -9,11 +9,11 @@ static const char *argv0_path; > > const char *system_path(const char *path) > > { > > #ifdef RUNTIME_PREFIX > > - static const char *prefix; > > + static const char *prefix = NULL; > > #else > > static const char *prefix = PREFIX; > > #endif > > - struct strbuf d = STRBUF_INIT; > > + static char buf[PATH_MAX+1]; > > Why PATH_MAX + 1? POSIX says that PATH_MAX includes the null-termination... A lot of other code I've been looking at uses it. I was not aware it included the terminator. > > > @@ -33,9 +33,8 @@ const char *system_path(const char *path) > > } > > #endif > > > > - strbuf_addf(&d, "%s/%s", prefix, path); > > - path = strbuf_detach(&d, NULL); > > - return path; > > + snprintf(buf, PATH_MAX, "%s/%s", prefix, path); > > Perhaps "snprintf(buf, sizeof(buf) - 1, "%s/%s", prefix, path);" instead? The manpage states that the size parameter includes the null-terminator, so sizeof(buf) would be better, I think. I'll send out a new mail with the updated patch. Shouldn't we check to see if there was truncation (i.e. return value >= sizeof(buf)) and die in that case? ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH] system_path: use a static buffer 2011-03-16 15:58 ` Erik Faye-Lund 2011-03-16 16:24 ` Carlos Martín Nieto @ 2011-03-16 16:33 ` Carlos Martín Nieto 2011-03-16 20:43 ` Junio C Hamano 1 sibling, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-16 16:33 UTC (permalink / raw) To: git; +Cc: Erik Faye-Lund, Junio C Hamano Make system_path behave like the other path functions by using a static buffer, fixing a memory leak. Also make sure the prefix pointer is always initialized to either PREFIX or NULL. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- exec_cmd.c | 11 ++++++----- 1 files changed, 6 insertions(+), 5 deletions(-) diff --git a/exec_cmd.c b/exec_cmd.c index 38545e8..5686952 100644 --- a/exec_cmd.c +++ b/exec_cmd.c @@ -9,11 +9,11 @@ static const char *argv0_path; const char *system_path(const char *path) { #ifdef RUNTIME_PREFIX - static const char *prefix; + static const char *prefix = NULL; #else static const char *prefix = PREFIX; #endif - struct strbuf d = STRBUF_INIT; + static char buf[PATH_MAX]; if (is_absolute_path(path)) return path; @@ -33,9 +33,10 @@ const char *system_path(const char *path) } #endif - strbuf_addf(&d, "%s/%s", prefix, path); - path = strbuf_detach(&d, NULL); - return path; + if (snprintf(buf, sizeof(buf), "%s/%s", prefix, path) >= sizeof(buf)) + die("system path too long for %s", path); + + return buf; } const char *git_extract_argv0_path(const char *argv0) -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-16 16:33 ` Carlos Martín Nieto @ 2011-03-16 20:43 ` Junio C Hamano 2011-03-17 11:01 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Junio C Hamano @ 2011-03-16 20:43 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Erik Faye-Lund Carlos Martín Nieto <cmn@elego.de> writes: > Make system_path behave like the other path functions by using a > static buffer, fixing a memory leak. > > Also make sure the prefix pointer is always initialized to either > PREFIX or NULL. > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > --- Have you made sure all the callers are Ok with this change? If somebody called system_path(GIT_EXEC_PATH), saved the result in a variable without copying, and then called system_path(ETC_GITATTRIBUTES), his variable may now have a value unrelated to GIT_EXEC_PATH, and you would fix all such callers to save the value away with strdup(). ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH] system_path: use a static buffer 2011-03-16 20:43 ` Junio C Hamano @ 2011-03-17 11:01 ` Carlos Martín Nieto 2011-03-17 14:24 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-17 11:01 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Erik Faye-Lund Make system_path behave like the other path functions by using a static buffer, fixing a memory leak. Also make sure the prefix pointer is always initialized to either PREFIX or NULL. git_etc_gitattributes and git_etc_gitconfig are the only users who are affected by this change. Make them use a static buffer, which fits their use better as well. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- On mié, 2011-03-16 at 13:43 -0700, Junio C Hamano wrote: Carlos Martín Nieto <cmn@elego.de> writes: > > > Make system_path behave like the other path functions by using a > > static buffer, fixing a memory leak. > > > > Also make sure the prefix pointer is always initialized to either > > PREFIX or NULL. > > > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > > --- > > Have you made sure all the callers are Ok with this change? > > If somebody called system_path(GIT_EXEC_PATH), saved the result in a > variable without copying, and then called system_path(ETC_GITATTRIBUTES), > his variable may now have a value unrelated to GIT_EXEC_PATH, and you > would fix all such callers to save the value away with strdup(). I checked again, and except for the ones changed in this patch, the rest copy it to their own buffer or pass it to puts, setenv or strbuf_addstr. The way these functions are used suggest the caller expects them to deal with their own memory, so that's what I've done. TBH, valgrind only reports a win of ~6-7kB when doing git log on git.git, but it's a step in the right direction (and adds consistency to system_path, which is the main win). attr.c | 6 +++--- config.c | 6 +++--- exec_cmd.c | 11 ++++++----- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/attr.c b/attr.c index 6aff695..64d803f 100644 --- a/attr.c +++ b/attr.c @@ -467,9 +467,9 @@ static void drop_attr_stack(void) const char *git_etc_gitattributes(void) { - static const char *system_wide; - if (!system_wide) - system_wide = system_path(ETC_GITATTRIBUTES); + static char system_wide[PATH_MAX]; + if (!system_wide[0]) + strlcpy(system_wide, system_path(ETC_GITATTRIBUTES), PATH_MAX); return system_wide; } diff --git a/config.c b/config.c index 822ef83..cd1c295 100644 --- a/config.c +++ b/config.c @@ -808,9 +808,9 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) const char *git_etc_gitconfig(void) { - static const char *system_wide; - if (!system_wide) - system_wide = system_path(ETC_GITCONFIG); + static char system_wide[PATH_MAX]; + if (!system_wide[0]) + strlcpy(system_wide, system_path(ETC_GITCONFIG), PATH_MAX); return system_wide; } diff --git a/exec_cmd.c b/exec_cmd.c index 38545e8..5686952 100644 --- a/exec_cmd.c +++ b/exec_cmd.c @@ -9,11 +9,11 @@ static const char *argv0_path; const char *system_path(const char *path) { #ifdef RUNTIME_PREFIX - static const char *prefix; + static const char *prefix = NULL; #else static const char *prefix = PREFIX; #endif - struct strbuf d = STRBUF_INIT; + static char buf[PATH_MAX]; if (is_absolute_path(path)) return path; @@ -33,9 +33,10 @@ const char *system_path(const char *path) } #endif - strbuf_addf(&d, "%s/%s", prefix, path); - path = strbuf_detach(&d, NULL); - return path; + if (snprintf(buf, sizeof(buf), "%s/%s", prefix, path) >= sizeof(buf)) + die("system path too long for %s", path); + + return buf; } const char *git_extract_argv0_path(const char *argv0) -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH] system_path: use a static buffer 2011-03-17 11:01 ` Carlos Martín Nieto @ 2011-03-17 14:24 ` Carlos Martín Nieto 2011-03-18 7:25 ` Junio C Hamano 2011-03-18 10:34 ` Nguyen Thai Ngoc Duy 0 siblings, 2 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-17 14:24 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Erik Faye-Lund, Carlos Martín Nieto Make system_path behave like the other path functions by using a static buffer, fixing a memory leak. Also make sure the prefix pointer is always initialized to either PREFIX or NULL. git_etc_gitattributes and git_etc_gitconfig are the only users who are affected by this change. Make them use a static buffer, which fits their use better as well. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- It was pointed out that one should always check for an encoding error (-1) with the printf family. I'm not sure how likely this is to happen, but this should make the code extra portable :) attr.c | 6 +++--- config.c | 6 +++--- exec_cmd.c | 15 ++++++++++----- 3 files changed, 16 insertions(+), 11 deletions(-) diff --git a/attr.c b/attr.c index 6aff695..64d803f 100644 --- a/attr.c +++ b/attr.c @@ -467,9 +467,9 @@ static void drop_attr_stack(void) const char *git_etc_gitattributes(void) { - static const char *system_wide; - if (!system_wide) - system_wide = system_path(ETC_GITATTRIBUTES); + static char system_wide[PATH_MAX]; + if (!system_wide[0]) + strlcpy(system_wide, system_path(ETC_GITATTRIBUTES), PATH_MAX); return system_wide; } diff --git a/config.c b/config.c index 822ef83..cd1c295 100644 --- a/config.c +++ b/config.c @@ -808,9 +808,9 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) const char *git_etc_gitconfig(void) { - static const char *system_wide; - if (!system_wide) - system_wide = system_path(ETC_GITCONFIG); + static char system_wide[PATH_MAX]; + if (!system_wide[0]) + strlcpy(system_wide, system_path(ETC_GITCONFIG), PATH_MAX); return system_wide; } diff --git a/exec_cmd.c b/exec_cmd.c index 38545e8..35d5cd8 100644 --- a/exec_cmd.c +++ b/exec_cmd.c @@ -9,11 +9,12 @@ static const char *argv0_path; const char *system_path(const char *path) { #ifdef RUNTIME_PREFIX - static const char *prefix; + static const char *prefix = NULL; #else static const char *prefix = PREFIX; #endif - struct strbuf d = STRBUF_INIT; + static char buf[PATH_MAX]; + int ret; if (is_absolute_path(path)) return path; @@ -33,9 +34,13 @@ const char *system_path(const char *path) } #endif - strbuf_addf(&d, "%s/%s", prefix, path); - path = strbuf_detach(&d, NULL); - return path; + ret = snprintf(buf, sizeof(buf), "%s/%s", prefix, path); + if (ret >= sizeof(buf)) + die("system path too long for %s", path); + else if (ret < 0) + die_errno("encoding error"); + + return buf; } const char *git_extract_argv0_path(const char *argv0) -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-17 14:24 ` Carlos Martín Nieto @ 2011-03-18 7:25 ` Junio C Hamano 2011-03-21 9:56 ` Carlos Martín Nieto 2011-03-18 10:34 ` Nguyen Thai Ngoc Duy 1 sibling, 1 reply; 48+ messages in thread From: Junio C Hamano @ 2011-03-18 7:25 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Erik Faye-Lund Carlos Martín Nieto <cmn@elego.de> writes: > + ret = snprintf(buf, sizeof(buf), "%s/%s", prefix, path); > + if (ret >= sizeof(buf)) > + die("system path too long for %s", path); > + else if (ret < 0) > + die_errno("encoding error"); POSIX says snprintf() should set errno in this case, and your use of die_errno() would show that information, but what is "encoding error"? Just being curious, as I suspect that "snprintf() returned an error" may be more appropriate, if the answer is "I don't know what kind of error it is, but snprintf() found something faulty while encoding so I chose to call it encoding error". By the way, thanks for checking all the callers. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-18 7:25 ` Junio C Hamano @ 2011-03-21 9:56 ` Carlos Martín Nieto 2011-03-21 11:14 ` Jeff King 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-21 9:56 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Erik Faye-Lund On vie, 2011-03-18 at 00:25 -0700, Junio C Hamano wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > > > + ret = snprintf(buf, sizeof(buf), "%s/%s", prefix, path); > > + if (ret >= sizeof(buf)) > > + die("system path too long for %s", path); > > + else if (ret < 0) > > + die_errno("encoding error"); > > POSIX says snprintf() should set errno in this case, and your use of > die_errno() would show that information, but what is "encoding error"? > > Just being curious, as I suspect that "snprintf() returned an error" may > be more appropriate, if the answer is "I don't know what kind of error it > is, but snprintf() found something faulty while encoding so I chose to > call it encoding error". My manpage says snprintf returns -1 if there was an output or encoding error. As there couldn't be an output error because it's writing to memory and we can't output what snprintf chocked on because whatever die_errno uses will also choke on it, I just put "encoding error". I'd put "error assembling system path" as the actual error message, I guess. cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-21 9:56 ` Carlos Martín Nieto @ 2011-03-21 11:14 ` Jeff King 2011-03-21 15:26 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Jeff King @ 2011-03-21 11:14 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: Junio C Hamano, git, Erik Faye-Lund On Mon, Mar 21, 2011 at 10:56:19AM +0100, Carlos Martín Nieto wrote: > On vie, 2011-03-18 at 00:25 -0700, Junio C Hamano wrote: > > Carlos Martín Nieto <cmn@elego.de> writes: > > > > > + ret = snprintf(buf, sizeof(buf), "%s/%s", prefix, path); > > > + if (ret >= sizeof(buf)) > > > + die("system path too long for %s", path); > > > + else if (ret < 0) > > > + die_errno("encoding error"); > > > > POSIX says snprintf() should set errno in this case, and your use of > > die_errno() would show that information, but what is "encoding error"? > > > > Just being curious, as I suspect that "snprintf() returned an error" may > > be more appropriate, if the answer is "I don't know what kind of error it > > is, but snprintf() found something faulty while encoding so I chose to > > call it encoding error". > > My manpage says snprintf returns -1 if there was an output or encoding > error. As there couldn't be an output error because it's writing to > memory and we can't output what snprintf chocked on because whatever > die_errno uses will also choke on it, I just put "encoding error". I'd > put "error assembling system path" as the actual error message, I guess. FWIW, we don't catch snprintf failures in 99% of the calls in git. Most calls just ignore the return value, and some even directly use the return value to add to a length. The one place that actually does check for the error is strbuf_vaddf, which just says "your vsnprintf is broken" and dies. So I'm not sure how much we really care about this error code path. If anything, we should be replacing all of the calls with something like: static const char buggy_sprintf_msg[] = "BUG: vsnprintf returned %d; either we fed it a bogus format string\n" "(our bug) or your libc is buggy and returns an error when it should\n" "tell us how much space is needed. The format string was:\n" "%s\n"; int xsnprintf(char *out, size_t size, const char *fmt, ...) { va_list ap; int r; va_start(ap, fmt); r = vsnprintf(out, size, fmt, ap); va_end(ap); if (r < 0) die(buggy_sprintf_msg, r, fmt); return r; } -Peff ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-21 11:14 ` Jeff King @ 2011-03-21 15:26 ` Carlos Martín Nieto 2011-03-21 15:51 ` Jeff King 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-21 15:26 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git, Erik Faye-Lund On lun, 2011-03-21 at 07:14 -0400, Jeff King wrote: > On Mon, Mar 21, 2011 at 10:56:19AM +0100, Carlos Martín Nieto wrote: > > > On vie, 2011-03-18 at 00:25 -0700, Junio C Hamano wrote: > > > Carlos Martín Nieto <cmn@elego.de> writes: > > > > > > > + ret = snprintf(buf, sizeof(buf), "%s/%s", prefix, path); > > > > + if (ret >= sizeof(buf)) > > > > + die("system path too long for %s", path); > > > > + else if (ret < 0) > > > > + die_errno("encoding error"); > > > > > > POSIX says snprintf() should set errno in this case, and your use of > > > die_errno() would show that information, but what is "encoding error"? > > > > > > Just being curious, as I suspect that "snprintf() returned an error" may > > > be more appropriate, if the answer is "I don't know what kind of error it > > > is, but snprintf() found something faulty while encoding so I chose to > > > call it encoding error". > > > > My manpage says snprintf returns -1 if there was an output or encoding > > error. As there couldn't be an output error because it's writing to > > memory and we can't output what snprintf chocked on because whatever > > die_errno uses will also choke on it, I just put "encoding error". I'd > > put "error assembling system path" as the actual error message, I guess. > > FWIW, we don't catch snprintf failures in 99% of the calls in git. Most > calls just ignore the return value, and some even directly use the > return value to add to a length. The one place that actually does check > for the error is strbuf_vaddf, which just says "your vsnprintf is > broken" and dies. It's not actually likely we'll ever meet this error if the only one allowed to set the format string is the programmer (and to do otherwise is a security risk). > > So I'm not sure how much we really care about this error code path. If > anything, we should be replacing all of the calls with something like: > > static const char buggy_sprintf_msg[] = > "BUG: vsnprintf returned %d; either we fed it a bogus format string\n" > "(our bug) or your libc is buggy and returns an error when it should\n" > "tell us how much space is needed. The format string was:\n" > "%s\n"; > int xsnprintf(char *out, size_t size, const char *fmt, ...) > { > va_list ap; > int r; > > va_start(ap, fmt); > r = vsnprintf(out, size, fmt, ap); > va_end(ap); > > if (r < 0) > die(buggy_sprintf_msg, r, fmt); > return r; > } Or we could overload (#define) snprintf and replace it with the paranoid. It'd go nicely with the vsnprintf that tries to work around the Windows implementation. I don't feel that strongly we should have the extra check there, seeing how it's rare and not checked anywhere else. cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-21 15:26 ` Carlos Martín Nieto @ 2011-03-21 15:51 ` Jeff King 2011-03-21 15:57 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Jeff King @ 2011-03-21 15:51 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: Junio C Hamano, git, Erik Faye-Lund On Mon, Mar 21, 2011 at 04:26:29PM +0100, Carlos Martín Nieto wrote: > > FWIW, we don't catch snprintf failures in 99% of the calls in git. Most > > calls just ignore the return value, and some even directly use the > > return value to add to a length. The one place that actually does check > > for the error is strbuf_vaddf, which just says "your vsnprintf is > > broken" and dies. > > It's not actually likely we'll ever meet this error if the only one > allowed to set the format string is the programmer (and to do otherwise > is a security risk). Agreed. > Or we could overload (#define) snprintf and replace it with the > paranoid. It'd go nicely with the vsnprintf that tries to work around > the Windows implementation. > > I don't feel that strongly we should have the extra check there, seeing > how it's rare and not checked anywhere else. Yeah, I am happy to just drop it. AFAICT, an error return from snprintf is a bug in your program[1] or a bug in libc. In either case, it should fail or produce bogus output on the first run, and we don't need to worry about checking errors all the time. Noting the error helps a little with detection, but since we already install our own vsnprintf on known-buggy platforms, I haven't seen a single complaint. -Peff [1] The only error I managed to get out of eglibc was trying to format "%" (i.e., percent followed by NUL). Skimming the eglibc code (gaah, my eyes!) it looks like under some conditions it can allocate small work buffers, and return an error if malloc fails. So yeah, I guess it can fail randomly, but it seems pretty unlikely. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-21 15:51 ` Jeff King @ 2011-03-21 15:57 ` Carlos Martín Nieto 0 siblings, 0 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-21 15:57 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git, Erik Faye-Lund On lun, 2011-03-21 at 11:51 -0400, Jeff King wrote: > Yeah, I am happy to just drop it. AFAICT, an error return from snprintf > is a bug in your program[1] or a bug in libc. In either case, it should > fail or produce bogus output on the first run, and we don't need to > worry about checking errors all the time. Noting the error helps a > little with detection, but since we already install our own vsnprintf on > known-buggy platforms, I haven't seen a single complaint. Then the patch in <1300359664-6230-1-git-send-email-cmn@elego.de> should do the trick. cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] system_path: use a static buffer 2011-03-17 14:24 ` Carlos Martín Nieto 2011-03-18 7:25 ` Junio C Hamano @ 2011-03-18 10:34 ` Nguyen Thai Ngoc Duy 2011-03-18 11:38 ` PATH_MAX (Re: [PATCH] system_path: use a static buffer) Jonathan Nieder 2011-03-18 11:39 ` [PATCH 1/2] wrapper.c: add xgetcwd() Nguyễn Thái Ngọc Duy 1 sibling, 2 replies; 48+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-18 10:34 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Junio C Hamano, Erik Faye-Lund On Thu, Mar 17, 2011 at 9:24 PM, Carlos Martín Nieto <cmn@elego.de> wrote: > - static const char *system_wide; > - if (!system_wide) > - system_wide = system_path(ETC_GITATTRIBUTES); > + static char system_wide[PATH_MAX]; ... > - static const char *system_wide; > - if (!system_wide) > - system_wide = system_path(ETC_GITCONFIG); > + static char system_wide[PATH_MAX]; ... > #ifdef RUNTIME_PREFIX > - static const char *prefix; > + static const char *prefix = NULL; > #else > static const char *prefix = PREFIX; > #endif > - struct strbuf d = STRBUF_INIT; > + static char buf[PATH_MAX]; > + int ret; It was pointed out elsewhere [1] that PATH_MAX only specifies max length of a path element, not full path. I think we'd need to stay away from preallocated PATH_MAX-sized arrays. [1] http://mid.gmane.org/AANLkTikXvx7-Q8B_dqG5mMHGK_Rw-dFaeQdXi0zW98SD@mail.gmail.com -- Duy ^ permalink raw reply [flat|nested] 48+ messages in thread
* PATH_MAX (Re: [PATCH] system_path: use a static buffer) 2011-03-18 10:34 ` Nguyen Thai Ngoc Duy @ 2011-03-18 11:38 ` Jonathan Nieder 2011-03-18 11:54 ` Nguyen Thai Ngoc Duy ` (2 more replies) 2011-03-18 11:39 ` [PATCH 1/2] wrapper.c: add xgetcwd() Nguyễn Thái Ngọc Duy 1 sibling, 3 replies; 48+ messages in thread From: Jonathan Nieder @ 2011-03-18 11:38 UTC (permalink / raw) To: Nguyen Thai Ngoc Duy Cc: Carlos Martín Nieto, git, Junio C Hamano, Erik Faye-Lund Hi, Nguyen Thai Ngoc Duy wrote: > It was pointed out elsewhere [1] that PATH_MAX only specifies max > length of a path element, not full path. I think we'd need to stay > away from preallocated PATH_MAX-sized arrays. No, PATH_MAX is actually the maximum length of a path, and when you use, say, open(2), it will fail if your path is longer than that. The maximum length of a path component on most filesytems is 255 or 256; PATH_MAX on Linux is 4096. It is indeed possible to have paths with length longer than that. The way to support that is to use relative paths wherever possible, which does sound to me like an interesting long-term goal (mostly because I suspect the result would be easier to read and, especially, to reason about with respect to race conditions). Hope that helps, Jonathan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: PATH_MAX (Re: [PATCH] system_path: use a static buffer) 2011-03-18 11:38 ` PATH_MAX (Re: [PATCH] system_path: use a static buffer) Jonathan Nieder @ 2011-03-18 11:54 ` Nguyen Thai Ngoc Duy 2011-03-21 9:47 ` Carlos Martín Nieto 2011-03-21 11:19 ` Nguyen Thai Ngoc Duy 2 siblings, 0 replies; 48+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-18 11:54 UTC (permalink / raw) To: Jonathan Nieder Cc: Carlos Martín Nieto, git, Junio C Hamano, Erik Faye-Lund On Fri, Mar 18, 2011 at 6:38 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > Hi, > > Nguyen Thai Ngoc Duy wrote: > >> It was pointed out elsewhere [1] that PATH_MAX only specifies max >> length of a path element, not full path. I think we'd need to stay >> away from preallocated PATH_MAX-sized arrays. > > No, PATH_MAX is actually the maximum length of a path, and when you > use, say, open(2), it will fail if your path is longer than that. The > maximum length of a path component on most filesytems is 255 or 256; > PATH_MAX on Linux is 4096. Hmm.. too late I sent two patches before reading this :) Cloning linux-2.6.git for verifying. But getcwd() can return a path longer than PATH_MAX, right? -- Duy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: PATH_MAX (Re: [PATCH] system_path: use a static buffer) 2011-03-18 11:38 ` PATH_MAX (Re: [PATCH] system_path: use a static buffer) Jonathan Nieder 2011-03-18 11:54 ` Nguyen Thai Ngoc Duy @ 2011-03-21 9:47 ` Carlos Martín Nieto 2011-03-21 12:37 ` Lasse Makholm 2011-03-21 11:19 ` Nguyen Thai Ngoc Duy 2 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-21 9:47 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Nguyen Thai Ngoc Duy, git, Junio C Hamano, Erik Faye-Lund On vie, 2011-03-18 at 06:38 -0500, Jonathan Nieder wrote: > Hi, > > Nguyen Thai Ngoc Duy wrote: > > > It was pointed out elsewhere [1] that PATH_MAX only specifies max > > length of a path element, not full path. I think we'd need to stay > > away from preallocated PATH_MAX-sized arrays. > > No, PATH_MAX is actually the maximum length of a path, and when you > use, say, open(2), it will fail if your path is longer than that. The > maximum length of a path component on most filesytems is 255 or 256; > PATH_MAX on Linux is 4096. > > It is indeed possible to have paths with length longer than that. The > way to support that is to use relative paths wherever possible, which So what PATH_MAX describes is the maximum length of a string representing a path, but not necessarily the length of the path itself. > does sound to me like an interesting long-term goal (mostly because I > suspect the result would be easier to read and, especially, to reason > about with respect to race conditions). There is also the following effect with git carlos@bee:~/apps$ mkdir one carlos@bee:~/apps$ ln -s one two carlos@bee:~/apps$ ln -s two three carlos@bee:~/apps$ cd three carlos@bee:~/apps/three$ git init Initialized empty Git repository in /home/carlos/apps/one/.git/ which is at best a bit annoying. Many instances of real_path (formerly make_absolute_path) could be skipped and is_inside_dir could do the transformation to physical path if it needs to. There may be a few edge cases, but most of the transformation should be fairly straightforward (he says as he steps off the cliff...) cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: PATH_MAX (Re: [PATCH] system_path: use a static buffer) 2011-03-21 9:47 ` Carlos Martín Nieto @ 2011-03-21 12:37 ` Lasse Makholm 0 siblings, 0 replies; 48+ messages in thread From: Lasse Makholm @ 2011-03-21 12:37 UTC (permalink / raw) To: git On 21 March 2011 10:47, Carlos Martín Nieto <cmn@elego.de> wrote: > On vie, 2011-03-18 at 06:38 -0500, Jonathan Nieder wrote: >> Hi, >> >> Nguyen Thai Ngoc Duy wrote: >> >> > It was pointed out elsewhere [1] that PATH_MAX only specifies max >> > length of a path element, not full path. I think we'd need to stay >> > away from preallocated PATH_MAX-sized arrays. >> >> No, PATH_MAX is actually the maximum length of a path, and when you >> use, say, open(2), it will fail if your path is longer than that. The >> maximum length of a path component on most filesytems is 255 or 256; >> PATH_MAX on Linux is 4096. >> >> It is indeed possible to have paths with length longer than that. The >> way to support that is to use relative paths wherever possible, which > > So what PATH_MAX describes is the maximum length of a string > representing a path, but not necessarily the length of the path itself. According to this at least, PATH_MAX is bogus: http://insanecoding.blogspot.com/2007/11/pathmax-simply-isnt.html I think the sane thing would be to never rely on a fixed max path length. -- /Lasse ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: PATH_MAX (Re: [PATCH] system_path: use a static buffer) 2011-03-18 11:38 ` PATH_MAX (Re: [PATCH] system_path: use a static buffer) Jonathan Nieder 2011-03-18 11:54 ` Nguyen Thai Ngoc Duy 2011-03-21 9:47 ` Carlos Martín Nieto @ 2011-03-21 11:19 ` Nguyen Thai Ngoc Duy 2 siblings, 0 replies; 48+ messages in thread From: Nguyen Thai Ngoc Duy @ 2011-03-21 11:19 UTC (permalink / raw) To: Jonathan Nieder Cc: Carlos Martín Nieto, git, Junio C Hamano, Erik Faye-Lund On Fri, Mar 18, 2011 at 6:38 PM, Jonathan Nieder <jrnieder@gmail.com> wrote: > It is indeed possible to have paths with length longer than that. The > way to support that is to use relative paths wherever possible, which > does sound to me like an interesting long-term goal (mostly because I > suspect the result would be easier to read and, especially, to reason > about with respect to race conditions). A shorter-term goal is to set a path limit config on trees inside the repo, so people on Linux won't accidentally make paths longer than the limit their Windows fellows have. I think it would be easy to do the check around index (old trees in repo are left alone, inspecting history does not have such a limit). -- Duy ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 1/2] wrapper.c: add xgetcwd() 2011-03-18 10:34 ` Nguyen Thai Ngoc Duy 2011-03-18 11:38 ` PATH_MAX (Re: [PATCH] system_path: use a static buffer) Jonathan Nieder @ 2011-03-18 11:39 ` Nguyễn Thái Ngọc Duy 2011-03-18 11:39 ` [PATCH 2/2] setup_gently: use xgetcwd() Nguyễn Thái Ngọc Duy 1 sibling, 1 reply; 48+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2011-03-18 11:39 UTC (permalink / raw) To: git, Junio C Hamano, kusmabite, cmn; +Cc: Nguyễn Thái Ngọc Duy getcwd() requires the input buffer must be large enough to contain current working directory, or it will return ERANGE. We currently treat ERANGE critical and die out. xgetcwd() handles ERANGE and grows the buffer if necessary. Like other functions in x* family, it will die() if fails. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- git-compat-util.h | 2 ++ wrapper.c | 12 ++++++++++++ 2 files changed, 14 insertions(+), 0 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index 49b50ee..e8b1398 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -433,6 +433,8 @@ extern int xmkstemp(char *template); extern int xmkstemp_mode(char *template, int mode); extern int odb_mkstemp(char *template, size_t limit, const char *pattern); extern int odb_pack_keep(char *name, size_t namesz, unsigned char *sha1); +struct strbuf; +extern void xgetcwd(struct strbuf *); static inline size_t xsize_t(off_t len) { diff --git a/wrapper.c b/wrapper.c index 2829000..5c8cbdf 100644 --- a/wrapper.c +++ b/wrapper.c @@ -217,6 +217,20 @@ int xmkstemp(char *template) return fd; } +void xgetcwd(struct strbuf *sb) +{ + const char *ret; + if (sb->alloc - sb->len < PATH_MAX) + strbuf_grow(sb, PATH_MAX); + while ((ret = getcwd(sb->buf + sb->len, + sb->alloc - sb->len - 1)) == NULL && + errno == ERANGE) + strbuf_grow(sb, PATH_MAX); + if (!ret) + die_errno("Unable to read current working directory"); + sb->len += strlen(sb->buf + sb->len); +} + /* git_mkstemp() - create tmp file honoring TMPDIR variable */ int git_mkstemp(char *path, size_t len, const char *template) { -- 1.7.4.74.g639db ^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 2/2] setup_gently: use xgetcwd() 2011-03-18 11:39 ` [PATCH 1/2] wrapper.c: add xgetcwd() Nguyễn Thái Ngọc Duy @ 2011-03-18 11:39 ` Nguyễn Thái Ngọc Duy 0 siblings, 0 replies; 48+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2011-03-18 11:39 UTC (permalink / raw) To: git, Junio C Hamano, kusmabite, cmn; +Cc: Nguyễn Thái Ngọc Duy Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- On Fri, Mar 18, 2011 at 5:34 PM, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote: > It was pointed out elsewhere [1] that PATH_MAX only specifies max > length of a path element, not full path. I think we'd need to stay > away from preallocated PATH_MAX-sized arrays. > > [1] http://mid.gmane.org/AANLkTikXvx7-Q8B_dqG5mMHGK_Rw-dFaeQdXi0zW98SD@mail.gmail.com Perhaps this as a start? I don't intend to kill all PATH_MAX by myself though until I finish all other topics I'm working on. setup.c | 9 +++++---- 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/setup.c b/setup.c index 03cd84f..f7b3556 100644 --- a/setup.c +++ b/setup.c @@ -506,7 +506,8 @@ static dev_t get_device_or_die(const char *path, const char *prefix) static const char *setup_git_directory_gently_1(int *nongit_ok) { const char *env_ceiling_dirs = getenv(CEILING_DIRECTORIES_ENVIRONMENT); - static char cwd[PATH_MAX+1]; + static struct strbuf sb_cwd = STRBUF_INIT; + char *cwd; const char *gitdirenv, *ret; char *gitfile; int len, offset, ceil_offset; @@ -521,9 +522,9 @@ static const char *setup_git_directory_gently_1(int *nongit_ok) if (nongit_ok) *nongit_ok = 0; - if (!getcwd(cwd, sizeof(cwd)-1)) - die_errno("Unable to read current working directory"); - offset = len = strlen(cwd); + xgetcwd(&sb_cwd); + offset = len = sb_cwd.len; + cwd = sb_cwd.buf; /* * If GIT_DIR is set explicitly, we're not going -- 1.7.4.74.g639db ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 2/3] setup_path(): Free temporary buffer 2011-03-14 19:18 ` [PATCH 2/3] setup_path(): Free temporary buffer Carlos Martín Nieto 2011-03-14 20:09 ` Jeff King @ 2011-03-14 20:14 ` Junio C Hamano 2011-03-14 22:01 ` Carlos Martín Nieto 1 sibling, 1 reply; 48+ messages in thread From: Junio C Hamano @ 2011-03-14 20:14 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git Carlos Martín Nieto <cmn@elego.de> writes: > Make sure the pointer git_exec_path() returns is in the heap and free > it after it's no longer needed. > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> > --- > exec_cmd.c | 9 ++++++--- > 1 files changed, 6 insertions(+), 3 deletions(-) The proposed log message does not explain half of what this patch does, if I am reading the patch correctly. Here is what I would consider as a fair description: Subject: git_exec_path: always return a free-able string git_exec_path() returns a string that the callers do not have to free in most cases, but when it calls into system_path() and ends up going into runtime-prefix codepath, it allocates an extra string on the heap, which ends up leaking because the callers are not allowed to free it. In order to allow the callers to clean up in all cases, change the API of git_exec_path() to always return a string on heap. This requires all the callers to free it. Update only one caller setup_path() to follow the new API, but leave other callers to leak even in normal cases. The caller in git.c exits immediately after using it, so we don't care about the leak there anyway. Also help.c has a few calls to it but the number of calls to the function is small and bounded, so the leak is small and we don't care. And then reviewers can agree or disagree if the small leaks in git.c (just one string allocation that immediately is followed by exit after its use) and help.c (in list_commands() and load_commands_list(), neither of which is called millions of times anyway) are OK to accept. I tend to think they are Ok, but then I also tend to think one leak of exec-path return value in setup_path() is perfectly fine for the same reason, so in that sense I don't see a point in this patch... > diff --git a/exec_cmd.c b/exec_cmd.c > index 38545e8..c16c3d4 100644 > --- a/exec_cmd.c > +++ b/exec_cmd.c > @@ -73,11 +73,11 @@ const char *git_exec_path(void) > const char *env; > > if (argv_exec_path) > - return argv_exec_path; > + return xstrdup(argv_exec_path); > > env = getenv(EXEC_PATH_ENVIRONMENT); > if (env && *env) { > - return env; > + return xstrdup(env); > } > > return system_path(GIT_EXEC_PATH); > @@ -99,10 +99,13 @@ void setup_path(void) > { > const char *old_path = getenv("PATH"); > struct strbuf new_path = STRBUF_INIT; > + char *exec_path = (char *) git_exec_path(); > > - add_path(&new_path, git_exec_path()); > + add_path(&new_path, exec_path); > add_path(&new_path, argv0_path); > > + free(exec_path); > + > if (old_path) > strbuf_addstr(&new_path, old_path); > else ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 2/3] setup_path(): Free temporary buffer 2011-03-14 20:14 ` [PATCH 2/3] setup_path(): Free temporary buffer Junio C Hamano @ 2011-03-14 22:01 ` Carlos Martín Nieto 2011-03-15 1:12 ` Jeff King 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 22:01 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On lun, 2011-03-14 at 13:14 -0700, Junio C Hamano wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > [...] > > Update only one caller setup_path() to follow the new API, but leave > other callers to leak even in normal cases. The caller in git.c exits > immediately after using it, so we don't care about the leak there > anyway. Also help.c has a few calls to it but the number of calls to > the function is small and bounded, so the leak is small and we don't > care. Oops. I blindly tested the path only on the test case where it was triggering an error without it and completely missed the other uses, which is embarrassing. > > > And then reviewers can agree or disagree if the small leaks in git.c (just > one string allocation that immediately is followed by exit after its use) > and help.c (in list_commands() and load_commands_list(), neither of which > is called millions of times anyway) are OK to accept. > > I tend to think they are Ok, but then I also tend to think one leak of > exec-path return value in setup_path() is perfectly fine for the same > reason, so in that sense I don't see a point in this patch... Which brings us to the matter of whether we actually care about memory leaks, as the processes are short-lived and the system is going to clean up after us. Do we, unless the leaks are huge? As there is built-in valgrind support in the test suite, I went in with the assumption that we did. It seems however that hardly any code paths free their memory, other than when using strbuf. In case we don't, valgrind should be told not to bother reporting leaks (and maybe mention in some document that small leaks are not an issue). > > > diff --git a/exec_cmd.c b/exec_cmd.c > > index 38545e8..c16c3d4 100644 > > --- a/exec_cmd.c > > +++ b/exec_cmd.c > > @@ -73,11 +73,11 @@ const char *git_exec_path(void) > > const char *env; > > > > if (argv_exec_path) > > - return argv_exec_path; > > + return xstrdup(argv_exec_path); > > > > env = getenv(EXEC_PATH_ENVIRONMENT); > > if (env && *env) { > > - return env; > > + return xstrdup(env); > > } > > > > return system_path(GIT_EXEC_PATH); > > @@ -99,10 +99,13 @@ void setup_path(void) > > { > > const char *old_path = getenv("PATH"); > > struct strbuf new_path = STRBUF_INIT; > > + char *exec_path = (char *) git_exec_path(); > > > > - add_path(&new_path, git_exec_path()); > > + add_path(&new_path, exec_path); > > add_path(&new_path, argv0_path); > > > > + free(exec_path); > > + > > if (old_path) > > strbuf_addstr(&new_path, old_path); > > else ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 2/3] setup_path(): Free temporary buffer 2011-03-14 22:01 ` Carlos Martín Nieto @ 2011-03-15 1:12 ` Jeff King 2011-03-15 9:32 ` [PATCH] t/README: Add a note about running commands under valgrind Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Jeff King @ 2011-03-15 1:12 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: Junio C Hamano, git On Mon, Mar 14, 2011 at 11:01:53PM +0100, Carlos Martín Nieto wrote: > > I tend to think they are Ok, but then I also tend to think one leak of > > exec-path return value in setup_path() is perfectly fine for the same > > reason, so in that sense I don't see a point in this patch... > > Which brings us to the matter of whether we actually care about memory > leaks, as the processes are short-lived and the system is going to clean > up after us. Do we, unless the leaks are huge? As there is built-in > valgrind support in the test suite, I went in with the assumption that > we did. It seems however that hardly any code paths free their memory, > other than when using strbuf. > > In case we don't, valgrind should be told not to bother reporting leaks > (and maybe mention in some document that small leaks are not an issue). Don't we already use --leak-check=no in our valgrind support for the test suite? The valgrind support is there primarily to find memory access errors, not leaks. It would be nice if it could find leaks, too, but there is currently a lot of noise due to uninteresting leaks like this. I haven't looked at valgrind's support for suppressing leak reporting, but presumably we could build a suppression file that would drop the uninteresting ones. We could also fix them, of course, but if they are one-time-per-program allocations, it might not be worth cluttering the code. -Peff ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH] t/README: Add a note about running commands under valgrind 2011-03-15 1:12 ` Jeff King @ 2011-03-15 9:32 ` Carlos Martín Nieto 2011-03-15 17:06 ` Junio C Hamano 0 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-15 9:32 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git The test suite runs valgrind with certain options activated. Add a note saying how to run commands under the same conditions as the test suite does. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- As Jeff pointed out, the test suite does use --leak-check=no. I was using valgrind manually as I was chasing a different error that does show up. How about adding this to the README? t/README | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/t/README b/t/README index 78c9e65..2a73fc3 100644 --- a/t/README +++ b/t/README @@ -98,6 +98,13 @@ appropriately before running "make". not see any output, this option implies --verbose. For convenience, it also implies --tee. + *NOTE*: As the git process is short-lived and some errors are + not interesting, valgrind is run with (among others) the + option --leak-check=no. In order to run a single command under + the same conditions manually, you should set GIT_VALGRIND to + point to the 't/valgrind/' directory and use the commands + under 't/valgrind/bin/'. + --tee:: In addition to printing the test output to the terminal, write it to files named 't/test-results/$TEST_NAME.out'. -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH] t/README: Add a note about running commands under valgrind 2011-03-15 9:32 ` [PATCH] t/README: Add a note about running commands under valgrind Carlos Martín Nieto @ 2011-03-15 17:06 ` Junio C Hamano 2011-03-15 17:08 ` Carlos Martín Nieto 0 siblings, 1 reply; 48+ messages in thread From: Junio C Hamano @ 2011-03-15 17:06 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: Jeff King, Junio C Hamano, git Carlos Martín Nieto <cmn@elego.de> writes: > As Jeff pointed out, the test suite does use --leak-check=no. I was > using valgrind manually as I was chasing a different error that does > show up. How about adding this to the README? > diff --git a/t/README b/t/README > index 78c9e65..2a73fc3 100644 > --- a/t/README > +++ b/t/README > @@ -98,6 +98,13 @@ appropriately before running "make". > not see any output, this option implies --verbose. For > convenience, it also implies --tee. > > + *NOTE*: As the git process is short-lived and some errors are > + not interesting, valgrind is run with (among others) the > + option --leak-check=no. In order to run a single command under > + the same conditions manually, you should set GIT_VALGRIND to > + point to the 't/valgrind/' directory and use the commands > + under 't/valgrind/bin/'. I think what the text says is a good addition. If I were writing this myself, I would rephrase "*NOTE*:" to say "Note that ...", though, as that seems to be more in line with the other parts of the document. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH] t/README: Add a note about running commands under valgrind 2011-03-15 17:06 ` Junio C Hamano @ 2011-03-15 17:08 ` Carlos Martín Nieto 0 siblings, 0 replies; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-15 17:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git On mar, 2011-03-15 at 10:06 -0700, Junio C Hamano wrote: > Carlos Martín Nieto <cmn@elego.de> writes: > > > As Jeff pointed out, the test suite does use --leak-check=no. I was > > using valgrind manually as I was chasing a different error that does > > show up. How about adding this to the README? > > > diff --git a/t/README b/t/README > > index 78c9e65..2a73fc3 100644 > > --- a/t/README > > +++ b/t/README > > @@ -98,6 +98,13 @@ appropriately before running "make". > > not see any output, this option implies --verbose. For > > convenience, it also implies --tee. > > > > + *NOTE*: As the git process is short-lived and some errors are > > + not interesting, valgrind is run with (among others) the > > + option --leak-check=no. In order to run a single command under > > + the same conditions manually, you should set GIT_VALGRIND to > > + point to the 't/valgrind/' directory and use the commands > > + under 't/valgrind/bin/'. > > I think what the text says is a good addition. If I were writing this > myself, I would rephrase "*NOTE*:" to say "Note that ...", though, as that > seems to be more in line with the other parts of the document. Good point. Should I resend? cmn ^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 3/3] clone: Free a few paths 2011-03-14 19:18 [PATCH 0/3] Fix some errors reported by valgrind Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 2/3] setup_path(): Free temporary buffer Carlos Martín Nieto @ 2011-03-14 19:18 ` Carlos Martín Nieto 2011-03-14 19:45 ` Jonathan Nieder 2 siblings, 1 reply; 48+ messages in thread From: Carlos Martín Nieto @ 2011-03-14 19:18 UTC (permalink / raw) To: git; +Cc: Junio C Hamano Free the path, repo, dir buffers Signed-off-by: Carlos Martín Nieto <cmn@elego.de> --- builtin/clone.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 60d9a64..d90770a 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -365,8 +365,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) { int is_bundle = 0, is_local; struct stat buf; - const char *repo_name, *repo, *work_tree, *git_dir; - char *path, *dir; + const char *repo_name, *work_tree, *git_dir; + char *path, *dir, *repo; int dest_exists; const struct ref *refs, *remote_head; const struct ref *remote_head_points_at; @@ -415,7 +415,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) else if (!strchr(repo_name, ':')) repo = xstrdup(make_absolute_path(repo_name)); else - repo = repo_name; + repo = xstrdup(repo_name); is_local = path && !is_bundle; if (is_local && option_depth) warning("--depth is ignored in local clones; use file:// instead."); @@ -665,6 +665,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix) err = run_command_v_opt(argv_submodule, RUN_GIT_CMD); } + free(dir); + free(repo); + free(path); strbuf_release(&reflog_msg); strbuf_release(&branch_top); strbuf_release(&key); -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 3/3] clone: Free a few paths 2011-03-14 19:18 ` [PATCH 3/3] clone: Free a few paths Carlos Martín Nieto @ 2011-03-14 19:45 ` Jonathan Nieder 2011-03-18 7:25 ` Junio C Hamano 0 siblings, 1 reply; 48+ messages in thread From: Jonathan Nieder @ 2011-03-14 19:45 UTC (permalink / raw) To: Carlos Martín Nieto; +Cc: git, Junio C Hamano Hi, Carlos Martín Nieto wrote: > Free the path, repo, dir buffers > > Signed-off-by: Carlos Martín Nieto <cmn@elego.de> [...] > + free(dir); > + free(repo); > + free(path); > strbuf_release(&reflog_msg); > strbuf_release(&branch_top); > strbuf_release(&key); Thanks. The commit message should probably mention that this is for the sake of valgrind rather a true memory leak, since the memory is freed by _exit at the appropriate time already. The patch itself seems sane, since the performance effect should be negligible. But it reminds me: does "valgrind --tool=memcheck" provide a way to annotate allocations like these? In other words, is it be possible to have functions xmalloc_permanent and xstrdup_permanent that * allocate a buffer that is never meant to be freed; * do not cause valgrind to complain; * could be reimplemented some day by taking allocations from a large contiguous pool, to avoid malloc overhead and to take advantage of the knowledge that these allocations never need to be freed ? Curious, Jonathan ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 3/3] clone: Free a few paths 2011-03-14 19:45 ` Jonathan Nieder @ 2011-03-18 7:25 ` Junio C Hamano 0 siblings, 0 replies; 48+ messages in thread From: Junio C Hamano @ 2011-03-18 7:25 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Carlos Martín Nieto, git, Junio C Hamano Jonathan Nieder <jrnieder@gmail.com> writes: > Thanks. The commit message should probably mention that this is for > the sake of valgrind rather a true memory leak, since the memory is > freed by _exit at the appropriate time already. Or perhaps you are planning to split larger later half of cmd_clone() into a helper function, so that other people can call it after doing their own command line processing. > The patch itself seems sane, since the performance effect should be > negligible. Perhaps and yes. I would have preferred a patch to free path and dir without doing anything else, as path is coming from get_repo_path() that always return a string on heap, and dir comes either directly from xstrdup() or guess_dir_name() that gives strbuf_detach() or xstrndup() result to us, and they are clearly leaking and are safely freed, though. ^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2011-03-21 15:57 UTC | newest] Thread overview: 48+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-03-14 19:18 [PATCH 0/3] Fix some errors reported by valgrind Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 1/3] make_absolute_path: Don't try to copy a string to itself Carlos Martín Nieto 2011-03-14 20:02 ` Jeff King 2011-03-14 20:25 ` Junio C Hamano 2011-03-14 22:02 ` Carlos Martín Nieto 2011-03-14 22:58 ` Junio C Hamano 2011-03-15 11:59 ` Carlos Martín Nieto 2011-03-15 12:40 ` Carlos Martín Nieto 2011-03-15 17:02 ` Junio C Hamano 2011-03-15 17:27 ` Carlos Martín Nieto 2011-03-16 14:16 ` Nguyen Thai Ngoc Duy 2011-03-16 14:49 ` Carlos Martín Nieto 2011-03-16 14:58 ` Nguyen Thai Ngoc Duy 2011-03-16 14:04 ` Nguyen Thai Ngoc Duy 2011-03-16 15:08 ` Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 2/3] setup_path(): Free temporary buffer Carlos Martín Nieto 2011-03-14 20:09 ` Jeff King 2011-03-14 22:18 ` Carlos Martín Nieto 2011-03-16 11:26 ` [PATCH] system_path: use a static buffer Carlos Martín Nieto 2011-03-16 15:58 ` Erik Faye-Lund 2011-03-16 16:24 ` Carlos Martín Nieto 2011-03-16 16:33 ` Carlos Martín Nieto 2011-03-16 20:43 ` Junio C Hamano 2011-03-17 11:01 ` Carlos Martín Nieto 2011-03-17 14:24 ` Carlos Martín Nieto 2011-03-18 7:25 ` Junio C Hamano 2011-03-21 9:56 ` Carlos Martín Nieto 2011-03-21 11:14 ` Jeff King 2011-03-21 15:26 ` Carlos Martín Nieto 2011-03-21 15:51 ` Jeff King 2011-03-21 15:57 ` Carlos Martín Nieto 2011-03-18 10:34 ` Nguyen Thai Ngoc Duy 2011-03-18 11:38 ` PATH_MAX (Re: [PATCH] system_path: use a static buffer) Jonathan Nieder 2011-03-18 11:54 ` Nguyen Thai Ngoc Duy 2011-03-21 9:47 ` Carlos Martín Nieto 2011-03-21 12:37 ` Lasse Makholm 2011-03-21 11:19 ` Nguyen Thai Ngoc Duy 2011-03-18 11:39 ` [PATCH 1/2] wrapper.c: add xgetcwd() Nguyễn Thái Ngọc Duy 2011-03-18 11:39 ` [PATCH 2/2] setup_gently: use xgetcwd() Nguyễn Thái Ngọc Duy 2011-03-14 20:14 ` [PATCH 2/3] setup_path(): Free temporary buffer Junio C Hamano 2011-03-14 22:01 ` Carlos Martín Nieto 2011-03-15 1:12 ` Jeff King 2011-03-15 9:32 ` [PATCH] t/README: Add a note about running commands under valgrind Carlos Martín Nieto 2011-03-15 17:06 ` Junio C Hamano 2011-03-15 17:08 ` Carlos Martín Nieto 2011-03-14 19:18 ` [PATCH 3/3] clone: Free a few paths Carlos Martín Nieto 2011-03-14 19:45 ` Jonathan Nieder 2011-03-18 7:25 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).