* [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations @ 2009-03-11 15:17 Johan Sørensen 2009-03-11 15:58 ` Johannes Sixt 0 siblings, 1 reply; 14+ messages in thread From: Johan Sørensen @ 2009-03-11 15:17 UTC (permalink / raw) To: git; +Cc: Johan Sørensen The argument is an executable script that will receive the path to the repos the client wishes to clone as an argument. It is then the responsibility of the script to return a zero-terminated string on its stdout with the real path of the target repository. This buys us a lot of flexibility when it comes to managing different repositories, possibly located in many different dirs, but with a uniform url-structure to the outside world. --- Documentation/git-daemon.txt | 7 ++++ daemon.c | 75 +++++++++++++++++++++++++++++++++++++++++- 2 files changed, 81 insertions(+), 1 deletions(-) diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt index 36f00ae..1eca344 100644 --- a/Documentation/git-daemon.txt +++ b/Documentation/git-daemon.txt @@ -13,6 +13,7 @@ SYNOPSIS [--strict-paths] [--base-path=path] [--base-path-relaxed] [--user-path | --user-path=path] [--interpolated-path=pathtemplate] + [--path-filter=executable] [--reuseaddr] [--detach] [--pid-file=file] [--enable=service] [--disable=service] [--allow-override=service] [--forbid-override=service] @@ -71,6 +72,12 @@ OPTIONS After interpolation, the path is validated against the directory whitelist. +--path-filter=executable:: + To support a more flexible directory layout a path filter script + can be used. The executable will receive the requested path from + the client as arg0. The executable must return a zero-terminated + string on stdout which is the real path 'git-daemon' should serve. + --export-all:: Allow pulling from all directories that look like GIT repositories (have the 'objects' and 'refs' subdirectories), even if they diff --git a/daemon.c b/daemon.c index d93cf96..b2571df 100644 --- a/daemon.c +++ b/daemon.c @@ -22,6 +22,7 @@ static const char daemon_usage[] = " [--strict-paths] [--base-path=path] [--base-path-relaxed]\n" " [--user-path | --user-path=path]\n" " [--interpolated-path=path]\n" +" [--path-filter=path]\n" " [--reuseaddr] [--detach] [--pid-file=file]\n" " [--[enable|disable|allow-override|forbid-override]=service]\n" " [--inetd | [--listen=host_or_ipaddr] [--port=n]\n" @@ -58,6 +59,10 @@ static char *canon_hostname; static char *ip_address; static char *tcp_port; +/* if defined, the script will be executed with the requested path on stdin + * and _must_ return with a successful exitcode and the new path on stdout */ +static char *path_filter_script; + static void logreport(int priority, const char *err, va_list params) { if (log_syslog) { @@ -287,9 +292,62 @@ static int git_daemon_config(const char *var, const char *value, void *cb) return 0; } +static char *run_path_filter_script(char *requested_dir) { + pid_t pid; + char result[256]; /* arbitary */ + char *real_path; + int pipe_out[2]; + int exit_code = 1; + + pipe(pipe_out); + + loginfo("Executing path filter script: '%s %s'", path_filter_script, requested_dir); + + switch ((pid = fork())) { + case -1: + logerror("path filter script fork() failed: %s", strerror(errno)); + return NULL; + case 0: + close(pipe_out[0]); + dup2(pipe_out[1], 1); + close(pipe_out[1]); + + execl(path_filter_script, path_filter_script, requested_dir, NULL); + + /* execl failed if we got here */ + logerror("path filter script execl() failed: %s", strerror(errno)); + return NULL; + default: + close(pipe_out[1]); + + while(waitpid(pid, &exit_code, 0) < 0) { + switch(errno) { + case EINTR: + continue; + default: + logerror("path filter script waitpid() fail: %s", strerror(errno)); + goto waitpid_error; + } + } + if (WIFEXITED(exit_code) && WEXITSTATUS(exit_code) == 0) { + read(pipe_out[0], result, sizeof(result) - 1); + loginfo("path filter script result: %s", result); + } + waitpid_error: + close(pipe_out[0]); + } + + if (result) { + real_path = result; + return real_path; + } + return NULL; +} + static int run_service(char *dir, struct daemon_service *service) { const char *path; + char *real_dir; int enabled = service->enabled; loginfo("Request %s for '%s'", service->name, dir); @@ -299,8 +357,19 @@ static int run_service(char *dir, struct daemon_service *service) errno = EACCES; return -1; } + loginfo("path_filter_script %s", path_filter_script); + if (!path_filter_script) { + real_dir = dir; + } else { + char *tdir; + if ((tdir = run_path_filter_script(dir))) { + real_dir = tdir; + } else { + real_dir = dir; + } + } - if (!(path = path_ok(dir))) + if (!(path = path_ok(real_dir))) return -1; /* @@ -1018,6 +1087,10 @@ int main(int argc, char **argv) pid_file = arg + 11; continue; } + if (!prefixcmp(arg, "--path-filter=")) { + path_filter_script = arg + 14; + continue; + } if (!strcmp(arg, "--detach")) { detach = 1; log_syslog = 1; -- 1.6.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-11 15:17 [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations Johan Sørensen @ 2009-03-11 15:58 ` Johannes Sixt 2009-03-12 10:13 ` Johan Sørensen 2009-03-12 10:26 ` Johan Sørensen 0 siblings, 2 replies; 14+ messages in thread From: Johannes Sixt @ 2009-03-11 15:58 UTC (permalink / raw) To: Johan Sørensen; +Cc: git Johan Sørensen schrieb: > The argument is an executable script that will receive the path to the repos > the client wishes to clone as an argument. It is then the responsibility of the > script to return a zero-terminated string on its stdout with the real path of > the target repository. > > This buys us a lot of flexibility when it comes to managing different > repositories, possibly located in many different dirs, but with a uniform > url-structure to the outside world. It's the first time that I see a deamon with this feature - except perhaps Apache's ModRewrite. Are you sure you are not working around your problem at the wrong place? Doesn't --interpolated-path already solve your problem? If not, then you at least you must describe in the documentation the use-cases when --path-filter should be preferred. Your implementation does not pass the target hostname to the script, but it should; otherwise you lose flexibility (for virtual hosting). > +static char *run_path_filter_script(char *requested_dir) { > + pid_t pid; > + char result[256]; /* arbitary */ > + char *real_path; > + int pipe_out[2]; > + int exit_code = 1; > + > + pipe(pipe_out); > + > + loginfo("Executing path filter script: '%s %s'", path_filter_script, requested_dir); > + > + switch ((pid = fork())) { > + case -1: > + logerror("path filter script fork() failed: %s", strerror(errno)); > + return NULL; > + case 0: > + close(pipe_out[0]); > + dup2(pipe_out[1], 1); > + close(pipe_out[1]); > + > + execl(path_filter_script, path_filter_script, requested_dir, NULL); Use start_command()/finish_command() instead of rolling your own fork/exec combo. -- Hannes ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-11 15:58 ` Johannes Sixt @ 2009-03-12 10:13 ` Johan Sørensen 2009-03-12 11:29 ` Johannes Schindelin 2009-03-12 10:26 ` Johan Sørensen 1 sibling, 1 reply; 14+ messages in thread From: Johan Sørensen @ 2009-03-12 10:13 UTC (permalink / raw) To: git; +Cc: Johan Sørensen The parameter for filter-path is an executable that will receive the service name, the client hostname and path to the repos the client requests as as arguments. It is then the responsibility of the script to return a zero terminated string on its stdout with the real path of the target repository. --- Documentation/git-daemon.txt | 13 ++++++++++ daemon.c | 53 +++++++++++++++++++++++++++++++++++++++++- 2 files changed, 65 insertions(+), 1 deletions(-) diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt index 36f00ae..efd1687 100644 --- a/Documentation/git-daemon.txt +++ b/Documentation/git-daemon.txt @@ -13,6 +13,7 @@ SYNOPSIS [--strict-paths] [--base-path=path] [--base-path-relaxed] [--user-path | --user-path=path] [--interpolated-path=pathtemplate] + [--path-filter=executable] [--reuseaddr] [--detach] [--pid-file=file] [--enable=service] [--disable=service] [--allow-override=service] [--forbid-override=service] @@ -71,6 +72,18 @@ OPTIONS After interpolation, the path is validated against the directory whitelist. +--path-filter=executable:: + To support a more flexible directory layout a path filter script + can be used. The executable will receive the service name (upload-pack, + upload-archive or receive-pack), the client hostname and the request git + directory as arguments. The executable must return a zero-terminated string + on stdout which is the real path 'git-daemon' should serve. This is useful + when --interpolated-path doesn't buy you enough flexibility. You could for + instance keep support for old clone urls if you rename your repository, or + fetch a custom url-mapping from a third-party repo manager application, or + map deeply nested repository directories to a more sensible layout for the + outside world. + --export-all:: Allow pulling from all directories that look like GIT repositories (have the 'objects' and 'refs' subdirectories), even if they diff --git a/daemon.c b/daemon.c index d93cf96..e6777c6 100644 --- a/daemon.c +++ b/daemon.c @@ -1,6 +1,7 @@ #include "cache.h" #include "pkt-line.h" #include "exec_cmd.h" +#include "run-command.h" #include <syslog.h> @@ -22,6 +23,7 @@ static const char daemon_usage[] = " [--strict-paths] [--base-path=path] [--base-path-relaxed]\n" " [--user-path | --user-path=path]\n" " [--interpolated-path=path]\n" +" [--path-filter=path]\n" " [--reuseaddr] [--detach] [--pid-file=file]\n" " [--[enable|disable|allow-override|forbid-override]=service]\n" " [--inetd | [--listen=host_or_ipaddr] [--port=n]\n" @@ -58,6 +60,10 @@ static char *canon_hostname; static char *ip_address; static char *tcp_port; +/* if defined, the script will be executed with the requested path on stdin + * and _must_ return with a successful exitcode and the new path on stdout */ +static char *path_filter_script; + static void logreport(int priority, const char *err, va_list params) { if (log_syslog) { @@ -287,6 +293,37 @@ static int git_daemon_config(const char *var, const char *value, void *cb) return 0; } +static char *run_path_filter_script(struct daemon_service *s, char *host, char *dir) { + char result[256]; /* arbitary */ + char *real_path; + struct child_process filter_cmd; + const char *args[] = { path_filter_script, s->name, host, dir, NULL }; + + loginfo("Executing path filter script: '%s %s'", path_filter_script, dir); + memset(&filter_cmd, 0, sizeof(filter_cmd)); + filter_cmd.argv = args; + filter_cmd.out = -1; + + if (start_command(&filter_cmd)) { + logerror("path filter: unable to fork path_filter_script"); + return NULL; + } + + read(filter_cmd.out, result, sizeof(result) - 1); + + close(filter_cmd.out); + if (finish_command(&filter_cmd)) { + logerror("path filter died with strange error"); + return NULL; + } + + if (result) { + real_path = result; + return real_path; + } + return NULL; +} + static int run_service(char *dir, struct daemon_service *service) { const char *path; @@ -495,6 +532,7 @@ static void parse_extra_args(char *extra_args, int buflen) static int execute(struct sockaddr *addr) { static char line[1000]; + char *path; int pktlen, len, i; if (addr) { @@ -553,11 +591,20 @@ static int execute(struct sockaddr *addr) if (!prefixcmp(line, "git-") && !strncmp(s->name, line + 4, namelen) && line[namelen + 4] == ' ') { + path = line + namelen + 5; + if (path_filter_script) { + loginfo("path_filter_script %s for path %s", path_filter_script, path); + char *tdir; + if ((tdir = run_path_filter_script(s, hostname, path))) { + path = tdir; + } + } + /* * Note: The directory here is probably context sensitive, * and might depend on the actual service being performed. */ - return run_service(line + namelen + 5, s); + return run_service(path, s); } } @@ -1018,6 +1065,10 @@ int main(int argc, char **argv) pid_file = arg + 11; continue; } + if (!prefixcmp(arg, "--path-filter=")) { + path_filter_script = arg + 14; + continue; + } if (!strcmp(arg, "--detach")) { detach = 1; log_syslog = 1; -- 1.6.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-12 10:13 ` Johan Sørensen @ 2009-03-12 11:29 ` Johannes Schindelin 2009-03-12 15:48 ` Johan Sørensen 2009-03-12 19:06 ` Johan Sørensen 0 siblings, 2 replies; 14+ messages in thread From: Johannes Schindelin @ 2009-03-12 11:29 UTC (permalink / raw) To: Johan Sørensen; +Cc: git [-- Attachment #1: Type: TEXT/PLAIN, Size: 4612 bytes --] Hi, Disclaimer: if you are offended by constructive criticism, or likely to answer with insults to the comments I offer, please stop reading this mail now (and please do not answer my mail, either). :-) Still with me? Good. Nice to meet you. Just for the record: responding to a patch is my strongest way of saying that I appreciate your work. On Thu, 12 Mar 2009, Johan Sørensen wrote: > The parameter for filter-path is an executable that will receive the > service name, the client hostname and path to the repos the client > requests as as arguments. It is then the responsibility of the script to > return a zero terminated string on its stdout with the real path of the > target repository. > --- A sign-off is missing... More importantly, you might want to point out the security concerns of running a script with the full permissions of git-daemon. (AFAICT from your patch you are not dropping any privileges at any point.) Which brings me to another idea: we have quite a few places in Git where we use regular expressions. Would they not be enough for your use case? > diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt > index 36f00ae..efd1687 100644 > --- a/Documentation/git-daemon.txt > +++ b/Documentation/git-daemon.txt > @@ -71,6 +72,18 @@ OPTIONS > After interpolation, the path is validated against the directory > whitelist. > > +--path-filter=executable:: > + To support a more flexible directory layout a path filter script > + can be used. The executable will receive the service name (upload-pack, > + upload-archive or receive-pack), the client hostname and the request git > + directory as arguments. The executable must return a zero-terminated string > + on stdout which is the real path 'git-daemon' should serve. This is useful > + when --interpolated-path doesn't buy you enough flexibility. You could for > + instance keep support for old clone urls if you rename your repository, or > + fetch a custom url-mapping from a third-party repo manager application, or > + map deeply nested repository directories to a more sensible layout for the > + outside world. Please keep the lines shorter than 81 characters. > diff --git a/daemon.c b/daemon.c > index d93cf96..e6777c6 100644 > --- a/daemon.c > +++ b/daemon.c > @@ -287,6 +293,37 @@ static int git_daemon_config(const char *var, const char *value, void *cb) > return 0; > } > > +static char *run_path_filter_script(struct daemon_service *s, char *host, char *dir) { Again, pretty long line. (I will refrain from saying that for every long line, but please cooperate by pretending I did ;-) But there is more: what about concurrent accesses? > + char result[256]; /* arbitary */ Why not PATH_MAX? > + char *real_path; > + struct child_process filter_cmd; > + const char *args[] = { path_filter_script, s->name, host, dir, NULL }; > + > + loginfo("Executing path filter script: '%s %s'", path_filter_script, dir); > + memset(&filter_cmd, 0, sizeof(filter_cmd)); > + filter_cmd.argv = args; > + filter_cmd.out = -1; > + > + if (start_command(&filter_cmd)) { > + logerror("path filter: unable to fork path_filter_script"); > + return NULL; > + } > + > + read(filter_cmd.out, result, sizeof(result) - 1); No error checking? BTW we do have strbuf_read(), which would solve your "static char *" problem nicely. > + close(filter_cmd.out); > + if (finish_command(&filter_cmd)) { > + logerror("path filter died with strange error"); > + return NULL; > + } > + > + if (result) { > + real_path = result; > + return real_path; > + } > + return NULL; What would be the difference if you wrote return result; instead? > @@ -495,6 +532,7 @@ static void parse_extra_args(char *extra_args, int buflen) > static int execute(struct sockaddr *addr) > { > static char line[1000]; > + char *path; Is it not rather "const char *"? But that point would be moot should you decide to use strbufs. > @@ -553,11 +591,20 @@ static int execute(struct sockaddr *addr) > if (!prefixcmp(line, "git-") && > !strncmp(s->name, line + 4, namelen) && > line[namelen + 4] == ' ') { > + path = line + namelen + 5; > + if (path_filter_script) { > + loginfo("path_filter_script %s for path %s", path_filter_script, path); > + char *tdir; Declaration after a call to a function. > + if ((tdir = run_path_filter_script(s, hostname, path))) { > + path = tdir; > + } Unnecessary curly brackets. And your code would be even easier to read if your run_path_filter_script() would never return NULL, but the unchanged path instead. Ciao, Dscho ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-12 11:29 ` Johannes Schindelin @ 2009-03-12 15:48 ` Johan Sørensen 2009-03-12 16:50 ` Johannes Schindelin 2009-03-12 19:06 ` Johan Sørensen 1 sibling, 1 reply; 14+ messages in thread From: Johan Sørensen @ 2009-03-12 15:48 UTC (permalink / raw) To: git; +Cc: Johan Sørensen The parameter for filter-path is an executable that will receive the service name, the client hostname and path to the repos the client requests as as arguments. It is then the responsibility of the script to return a zero terminated string on its stdout with the real path of the target repository. Signed-off-by: Johan Sørensen <johan@johansorensen.com> --- Documentation/git-daemon.txt | 15 +++++++++++ daemon.c | 54 +++++++++++++++++++++++++++++++++++++++++- 2 files changed, 68 insertions(+), 1 deletions(-) diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt index 36f00ae..bf8d31f 100644 --- a/Documentation/git-daemon.txt +++ b/Documentation/git-daemon.txt @@ -13,6 +13,7 @@ SYNOPSIS [--strict-paths] [--base-path=path] [--base-path-relaxed] [--user-path | --user-path=path] [--interpolated-path=pathtemplate] + [--path-filter=executable] [--reuseaddr] [--detach] [--pid-file=file] [--enable=service] [--disable=service] [--allow-override=service] [--forbid-override=service] @@ -71,6 +72,20 @@ OPTIONS After interpolation, the path is validated against the directory whitelist. +--path-filter=executable:: + To support a more flexible directory layout a path filter script + can be used. The executable will receive the service name (upload-pack, + upload-archive or receive-pack), the client hostname and the request git + directory as arguments. The executable must return a zero-terminated + string on stdout which is the real path 'git-daemon' should serve. This + is useful when --interpolated-path doesn't buy you enough flexibility. + You could for instance keep support for old clone urls if you rename your + repository, or fetch a custom url-mapping from a third-party repo manager + application, or map deeply nested repository directories to a more + sensible layout for the outside world. + Please be aware that the executable spawned will have the same privileges + as the user you are running the git-daemon under. + --export-all:: Allow pulling from all directories that look like GIT repositories (have the 'objects' and 'refs' subdirectories), even if they diff --git a/daemon.c b/daemon.c index d93cf96..e865e78 100644 --- a/daemon.c +++ b/daemon.c @@ -1,6 +1,7 @@ #include "cache.h" #include "pkt-line.h" #include "exec_cmd.h" +#include "run-command.h" #include <syslog.h> @@ -22,6 +23,7 @@ static const char daemon_usage[] = " [--strict-paths] [--base-path=path] [--base-path-relaxed]\n" " [--user-path | --user-path=path]\n" " [--interpolated-path=path]\n" +" [--path-filter=path]\n" " [--reuseaddr] [--detach] [--pid-file=file]\n" " [--[enable|disable|allow-override|forbid-override]=service]\n" " [--inetd | [--listen=host_or_ipaddr] [--port=n]\n" @@ -58,6 +60,11 @@ static char *canon_hostname; static char *ip_address; static char *tcp_port; +/* if defined, the script will be executed with the service name, hostname, + * and requested path on stdin and _must_ return with a successful exitcode + * and the new path on stdout */ +static char *path_filter_script; + static void logreport(int priority, const char *err, va_list params) { if (log_syslog) { @@ -287,6 +294,42 @@ static int git_daemon_config(const char *var, const char *value, void *cb) return 0; } +static char *run_path_filter_script(struct daemon_service *s, char *host, + char *dir) { + struct strbuf result_buf = STRBUF_INIT; + struct child_process filter_cmd; + const char *args[] = { path_filter_script, s->name, host, dir, NULL }; + + loginfo("Executing path filter script: '%s %s %s %s'", + path_filter_script, s->name, host, dir); + memset(&filter_cmd, 0, sizeof(filter_cmd)); + filter_cmd.argv = args; + filter_cmd.out = -1; + + if (start_command(&filter_cmd)) { + logerror("path filter: unable to fork path_filter_script"); + return dir; + } + + if (strbuf_read(&result_buf, filter_cmd.out, PATH_MAX) < 0) { + strbuf_release(&result_buf); + close(filter_cmd.out); + logerror("path filter: script read returned %s", strerror(errno)); + return dir; + } + + close(filter_cmd.out); + if (finish_command(&filter_cmd)) { + logerror("path filter script died with strange error"); + return dir; + } + + if (result_buf.len > 0) + dir = strbuf_detach(&result_buf, NULL); + + return dir; +} + static int run_service(char *dir, struct daemon_service *service) { const char *path; @@ -557,7 +600,12 @@ static int execute(struct sockaddr *addr) * Note: The directory here is probably context sensitive, * and might depend on the actual service being performed. */ - return run_service(line + namelen + 5, s); + if (path_filter_script) { + return run_service(run_path_filter_script(s, hostname, + line + namelen + 5), s); + } else { + return run_service(line + namelen + 5, s); + } } } @@ -1018,6 +1066,10 @@ int main(int argc, char **argv) pid_file = arg + 11; continue; } + if (!prefixcmp(arg, "--path-filter=")) { + path_filter_script = arg + 14; + continue; + } if (!strcmp(arg, "--detach")) { detach = 1; log_syslog = 1; -- 1.6.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-12 15:48 ` Johan Sørensen @ 2009-03-12 16:50 ` Johannes Schindelin 0 siblings, 0 replies; 14+ messages in thread From: Johannes Schindelin @ 2009-03-12 16:50 UTC (permalink / raw) To: Johan Sørensen; +Cc: git [-- Attachment #1: Type: TEXT/PLAIN, Size: 575 bytes --] Hi, On Thu, 12 Mar 2009, Johan Sørensen wrote: > The parameter for filter-path is an executable that will receive the service > name, the client hostname and path to the repos the client requests as as > arguments. It is then the responsibility of the script to return a zero > terminated string on its stdout with the real path of the target repository. > > Signed-off-by: Johan Sørensen <johan@johansorensen.com> I see that you addressed some, but not all of my concerns. Do you think that I am wrong? Then please, by all means, increase my knowledge. Ciao, Dscho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-12 11:29 ` Johannes Schindelin 2009-03-12 15:48 ` Johan Sørensen @ 2009-03-12 19:06 ` Johan Sørensen 2009-03-14 6:58 ` Junio C Hamano 1 sibling, 1 reply; 14+ messages in thread From: Johan Sørensen @ 2009-03-12 19:06 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git On Thu, Mar 12, 2009 at 12:29 PM, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: (all my comments below refer to my latest patch) > More importantly, you might want to point out the security concerns of > running a script with the full permissions of git-daemon. (AFAICT from > your patch you are not dropping any privileges at any point.) Do you really think this is needed? It doesn't seem like running the hook scripts does anything more than trusting the script author and permissions of the hook scripts (?). I see the path-filter script exactly the same way, with the exception of having to double-check the user supplied path the script receives. > Which brings me to another idea: we have quite a few places in Git where > we use regular expressions. Would they not be enough for your use case? Hmm, please do elaborate on your idea. If you mean being able to supply a bunch of regexp mappings when starting the daemon then it wouldn't cut it for me; I'm in need of something more dynamic. >> diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt >> index 36f00ae..efd1687 100644 >> --- a/Documentation/git-daemon.txt >> +++ b/Documentation/git-daemon.txt >> @@ -71,6 +72,18 @@ OPTIONS [snip] > Please keep the lines shorter than 81 characters. I believe the longest line I've added in the docs is 77. > But there is more: what about concurrent accesses? The external path-filter script is run from the execute(), which is forked+exec'ed for each incoming connection to the daemon, so that would mean a concurrency of one in that child-process, unless I've missed something in the code path? >> + read(filter_cmd.out, result, sizeof(result) - 1); > > No error checking? > > BTW we do have strbuf_read(), which would solve your "static char *" > problem nicely. I'm using strbuf_read() now, but this being my very first git patch, I may have misunderstood the api slightly? > And your code would be even easier to read if your > run_path_filter_script() would never return NULL, but the unchanged path > instead. Done. Thanks for giving my patch the run-through. I'm still curious to hear what people think about the idea in general though! > > Ciao, > Dscho > Cheers, JS ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-12 19:06 ` Johan Sørensen @ 2009-03-14 6:58 ` Junio C Hamano 2009-03-14 14:39 ` Johan Sørensen 0 siblings, 1 reply; 14+ messages in thread From: Junio C Hamano @ 2009-03-14 6:58 UTC (permalink / raw) To: Johan Sørensen; +Cc: Johannes Schindelin, git Johan Sørensen <johan@johansorensen.com> writes: >> More importantly, you might want to point out the security concerns of >> running a script with the full permissions of git-daemon. (AFAICT from >> your patch you are not dropping any privileges at any point.) > > Do you really think this is needed? It doesn't seem like running the > hook scripts does anything more than trusting the script author and > permissions of the hook scripts (?). I see the path-filter script > exactly the same way, with the exception of having to double-check the > user supplied path the script receives. If I am not misreading the patch (I only skimmed it), the script is what is given to the git-daemon process from its command line, so it is under total control of the site owner. It is much much much less problematic than the security worry of allowing random hook scripts to be installed in the repositories hosted at a hosting site. I think Dscho is being a bit too paranoid in this particular case. However, being paranoid is a good thing when we talk about instructions we give to the end users. The site owner who uses this facility needs to be aware that the script is run as the same user that runs git-daemon, and that more than one instances of the script can be run at the same time. The script writer needs to be careful about using the same scratchpad location for the temporary files the script uses and not letting multiple instances of scripts stomping on each other's toes. These things need to be documented. Do you run git-daemon from inetd, or standalone, by the way? I am wondering how well it would scale if you spawn an external "filter path" script (by the way, "filter path" sounds as if it checks and conditionally denies access to, or something like that, which is not what you are using it for. It is more about rewriting paths, a la mod_rewrite, and I think the option is misnamed) every time you get a request. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-14 6:58 ` Junio C Hamano @ 2009-03-14 14:39 ` Johan Sørensen 2009-03-14 18:23 ` Junio C Hamano 0 siblings, 1 reply; 14+ messages in thread From: Johan Sørensen @ 2009-03-14 14:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, git On Sat, Mar 14, 2009 at 7:58 AM, Junio C Hamano <gitster@pobox.com> wrote: > However, being paranoid is a good thing when we talk about instructions we > give to the end users. The site owner who uses this facility needs to be > aware that the script is run as the same user that runs git-daemon, and > that more than one instances of the script can be run at the same time. > The script writer needs to be careful about using the same scratchpad > location for the temporary files the script uses and not letting multiple > instances of scripts stomping on each other's toes. These things need to > be documented. Will expand the docs further. > Do you run git-daemon from inetd, or standalone, by the way? Standalone. > I am wondering how well it would scale if you spawn an external "filter path" > script every time you get a request. A quick test of 250 consecutive requests with ls-remote to localhost (all without the --verbose flag), slowest run: - Baseline (no --filter-path agument): 3.39s $ cat filter.c #import "stdio.h" int main (int argc, char const *argv[]) { printf("%s", "/existing.git\0"); return 0; } - 3.84s $ cat filter.rb #!/usr/bin/ruby print "/existing.git\0" - 4.76s So, obviously highly dependent on how long it takes the script to launch and how much work it does. And yes, neither of the above really does anything :) nor takes any increased cpu load into account Another approach is to keep the external script running and feed it on stdin, but that would involve a bit more micro-management of the external process. I will revisit that idea if I find out that's needed. > (by the way, "filter path" sounds as if it checks and conditionally > denies access to, or something like that, which is not what you are using > it for. It is more about rewriting paths, a la mod_rewrite, and I think > the option is misnamed) Maybe --rewrite-script or --rewrite-command instead? Cheers, JS ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-14 14:39 ` Johan Sørensen @ 2009-03-14 18:23 ` Junio C Hamano 2009-03-19 0:15 ` Johannes Schindelin 0 siblings, 1 reply; 14+ messages in thread From: Junio C Hamano @ 2009-03-14 18:23 UTC (permalink / raw) To: Johan Sørensen; +Cc: Johannes Schindelin, git Johan Sørensen <johan@johansorensen.com> writes: >> Do you run git-daemon from inetd, or standalone, by the way? > > Standalone. > >> I am wondering how well it would scale if you spawn an external "filter path" >> script every time you get a request. > > A quick test of 250 consecutive requests with ls-remote to localhost > (all without the --verbose flag), slowest run: > - Baseline (no --filter-path agument): 3.39s > > $ cat filter.c > #import "stdio.h" > int main (int argc, char const *argv[]) { > printf("%s", "/existing.git\0"); > return 0; > } > - 3.84s > > $ cat filter.rb > #!/usr/bin/ruby > print "/existing.git\0" > - 4.76s > > So, obviously highly dependent on how long it takes the script to > launch and how much work it does. And yes, neither of the above really > does anything :) nor takes any increased cpu load into account > > Another approach is to keep the external script running and feed it on > stdin, but that would involve a bit more micro-management of the > external process. I will revisit that idea if I find out that's > needed. I actually was hoping (especially we have Dscho on Cc: list) that somebody like you would start suggesting a "plug in" approach to load .so files, which would lead to a easy-to-port dso support with the help from msysgit folks we can use later in other parts of the system (e.g. customizable filters used for diff textconv, clean/smudge, etc.) >> (by the way, "filter path" sounds as if it checks and conditionally >> denies access to, or something like that, which is not what you are using >> it for. It is more about rewriting paths, a la mod_rewrite, and I think >> the option is misnamed) > > Maybe --rewrite-script or --rewrite-command instead? Perhaps. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-14 18:23 ` Junio C Hamano @ 2009-03-19 0:15 ` Johannes Schindelin 2009-03-19 13:02 ` Johan Sørensen 0 siblings, 1 reply; 14+ messages in thread From: Johannes Schindelin @ 2009-03-19 0:15 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johan Sørensen, git [-- Attachment #1: Type: TEXT/PLAIN, Size: 2072 bytes --] Hi, On Sat, 14 Mar 2009, Junio C Hamano wrote: > Johan Sørensen <johan@johansorensen.com> writes: > > >> Do you run git-daemon from inetd, or standalone, by the way? > > > > Standalone. > > > >> I am wondering how well it would scale if you spawn an external > >>"filter path" script every time you get a request. > > > > A quick test of 250 consecutive requests with ls-remote to localhost > > (all without the --verbose flag), slowest run: > > - Baseline (no --filter-path agument): 3.39s > > > > $ cat filter.c > > #import "stdio.h" > > int main (int argc, char const *argv[]) { > > printf("%s", "/existing.git\0"); > > return 0; > > } > > - 3.84s > > > > $ cat filter.rb > > #!/usr/bin/ruby > > print "/existing.git\0" > > - 4.76s > > > > So, obviously highly dependent on how long it takes the script to > > launch and how much work it does. And yes, neither of the above really > > does anything :) nor takes any increased cpu load into account > > > > Another approach is to keep the external script running and feed it on > > stdin, but that would involve a bit more micro-management of the > > external process. I will revisit that idea if I find out that's > > needed. > > I actually was hoping (especially we have Dscho on Cc: list) that somebody > like you would start suggesting a "plug in" approach to load .so files, > which would lead to a easy-to-port dso support with the help from msysgit > folks we can use later in other parts of the system (e.g. customizable > filters used for diff textconv, clean/smudge, etc.) I do not like that at all. Dynamic libraries -- especially on Windows -- are a major hassle. However, I cannot think of anything Johan might want to do that would not be possible using a bunch of regular expressions together with substitions. FWIW I have experimental code in my personal tree that sports strbuf_regsub(), a function to replace matches of a regular expression (possibly with groups) by a given string (which may contain \0 .. \9, being replaced with the respective group's contents). Ciao, Dscho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-19 0:15 ` Johannes Schindelin @ 2009-03-19 13:02 ` Johan Sørensen 2009-03-20 22:27 ` Johannes Schindelin 0 siblings, 1 reply; 14+ messages in thread From: Johan Sørensen @ 2009-03-19 13:02 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Junio C Hamano, git 2009/3/19 Johannes Schindelin <Johannes.Schindelin@gmx.de>: >> I actually was hoping (especially we have Dscho on Cc: list) that somebody >> like you would start suggesting a "plug in" approach to load .so files, >> which would lead to a easy-to-port dso support with the help from msysgit >> folks we can use later in other parts of the system (e.g. customizable >> filters used for diff textconv, clean/smudge, etc.) > > I do not like that at all. Dynamic libraries -- especially on Windows -- > are a major hassle. > > However, I cannot think of anything Johan might want to do that would not > be possible using a bunch of regular expressions together with > substitions. Let me reiterate my use-case then: I need to dynamically substitute one path with another. Perhaps "map" paints a better picture than "substitute" here. Please refer to my second mail in this thread for more details. The only way I can see regexps work, is that if they're read, on a per-request basis (reloading git-daemon every time they change is just silly), from somewhere outside the git-daemon. Then, you might as well take the full-on approach this patch provides. Cheers, JS > > FWIW I have experimental code in my personal tree that sports > strbuf_regsub(), a function to replace matches of a regular expression > (possibly with groups) by a given string (which may contain \0 .. \9, > being replaced with the respective group's contents). > > Ciao, > Dscho > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-19 13:02 ` Johan Sørensen @ 2009-03-20 22:27 ` Johannes Schindelin 0 siblings, 0 replies; 14+ messages in thread From: Johannes Schindelin @ 2009-03-20 22:27 UTC (permalink / raw) To: Johan Sørensen; +Cc: Junio C Hamano, git [-- Attachment #1: Type: TEXT/PLAIN, Size: 2242 bytes --] Hi, On Thu, 19 Mar 2009, Johan Sørensen wrote: > 2009/3/19 Johannes Schindelin <Johannes.Schindelin@gmx.de>: > >> I actually was hoping (especially we have Dscho on Cc: list) that > >> somebody like you would start suggesting a "plug in" approach to load > >> .so files, which would lead to a easy-to-port dso support with the > >> help from msysgit folks we can use later in other parts of the system > >> (e.g. customizable filters used for diff textconv, clean/smudge, > >> etc.) > > > > I do not like that at all. Dynamic libraries -- especially on Windows > > -- are a major hassle. > > > > However, I cannot think of anything Johan might want to do that would > > not be possible using a bunch of regular expressions together with > > substitions. > > Let me reiterate my use-case then: I need to dynamically substitute one > path with another. Perhaps "map" paints a better picture than > "substitute" here. Please refer to my second mail in this thread for > more details. > > The only way I can see regexps work, is that if they're read, on a > per-request basis (reloading git-daemon every time they change is just > silly), from somewhere outside the git-daemon. Then, you might as well > take the full-on approach this patch provides. > > > FWIW I have experimental code in my personal tree that sports > > strbuf_regsub(), a function to replace matches of a regular expression > > (possibly with groups) by a given string (which may contain \0 .. \9, > > being replaced with the respective group's contents). Do not get me wrong, I can see your use case. But I have been cautioning against other possibly regrettably things, and it gave me _no_ pleasure at all to be proven correct in hindsight. I'd rather be called grumpy old Git, be ridiculed and insulted, but at the same time have precautions in git.git that prevent having to admit mournfully that some change was not so brilliant after all. So if some rules consisting of regular expressions with appropriate substitutions, even if they will have to be updated from time to time, solve your case, I'd rather have that than allow a server to run external programs that are not exactly well audited against all kinds of attacks. Ciao, Dscho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations 2009-03-11 15:58 ` Johannes Sixt 2009-03-12 10:13 ` Johan Sørensen @ 2009-03-12 10:26 ` Johan Sørensen 1 sibling, 0 replies; 14+ messages in thread From: Johan Sørensen @ 2009-03-12 10:26 UTC (permalink / raw) To: Johannes Sixt; +Cc: git On Wed, Mar 11, 2009 at 4:58 PM, Johannes Sixt <j.sixt@viscovery.net> wrote: > Johan Sørensen schrieb: >> This buys us a lot of flexibility when it comes to managing different >> repositories, possibly located in many different dirs, but with a uniform >> url-structure to the outside world. > > It's the first time that I see a deamon with this feature - except perhaps > Apache's ModRewrite. Are you sure you are not working around your problem > at the wrong place? > > Doesn't --interpolated-path already solve your problem? If not, then you > at least you must describe in the documentation the use-cases when > --path-filter should be preferred. Maybe I am barking up the wrong tree, but here's my real-world use case: I'm currently working on some bigger changes for gitorious.org, where the repository url-structure could potentially change over time, as a consequence of various features. Using the path-filter script I can keep the old urls around and still working, and I can map any url to a on-disk uniquely hashed path, so I don't have to move the files around, maintain symlinks and so forth for information the gitorious application already has nicely structured and easy to lookup. I know these may be highly specialized needs, but so is interpolated-path for the common user. I think this patch could be useful for anyone else wanting to set up a flexible repo hosting system. I think the url-structure is a major part of the UI for any app exposing them, even for a git-daemon, so the mod_rewrite comparison isn't too far fetched in my opinion... > Your implementation does not pass the target hostname to the script, but > it should; otherwise you lose flexibility (for virtual hosting). Good point. I've added the hostname as well as the service name as arguments for the script. >> + switch ((pid = fork())) { [snip] > > Use start_command()/finish_command() instead of rolling your own fork/exec > combo. Ah nice! I'm sending an updated patch. > > -- Hannes > Cheers, JS ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-03-20 22:29 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-03-11 15:17 [PATCH] Introduce a filter-path argument to git-daemon, for doing custom path transformations Johan Sørensen 2009-03-11 15:58 ` Johannes Sixt 2009-03-12 10:13 ` Johan Sørensen 2009-03-12 11:29 ` Johannes Schindelin 2009-03-12 15:48 ` Johan Sørensen 2009-03-12 16:50 ` Johannes Schindelin 2009-03-12 19:06 ` Johan Sørensen 2009-03-14 6:58 ` Junio C Hamano 2009-03-14 14:39 ` Johan Sørensen 2009-03-14 18:23 ` Junio C Hamano 2009-03-19 0:15 ` Johannes Schindelin 2009-03-19 13:02 ` Johan Sørensen 2009-03-20 22:27 ` Johannes Schindelin 2009-03-12 10:26 ` Johan Sørensen
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.