* [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
@ 2004-02-16 13:34 Hansjoerg Lipp
From: Hansjoerg Lipp @ 2004-02-16 13:34 UTC
To: linux-kernel; +Cc: hjlipp

Hi!

In a newsgroup about Unix shells we had a discussion about why it is not
possible to pass more than one argument to an interpreter using the
shebang line of a script. We found that this behaviour is rather
OS dependent. See Sven Mascheck's page for details:

http://www.in-ulm.de/~mascheck/various/shebang/

As I'm really missing this feature in Linux and changing this would not
break anything (unless someone uses the rather unportable "#!cmd x y" to
pass _one_ argument "x y" containing spaces), I'd like to know if it's
possible to apply the patch below to the kernel.

The patch also allows passing whitespace by using '\' as an escape character:

  "\t" => TAB
  "\n" => LF
  "\ " => SPC
  "\\" => backslash

All other backslashes are discarded; the character that follows is kept
unchanged.

This allows something like

  #!/usr/bin/awk -F \t -f

This part could break old scripts if the interpreter's path/filename or
the arguments contain backslashes. Although I don't consider this a real
problem, this feature can be deactivated by removing the

  if (c=='\\') { ... }

part from the patch.
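
The splitting and escape rules described above can be tried outside the kernel. The following is a minimal userspace sketch, not the patch itself: the function name `split_shebang`, the `MAX_ARGS` limit and the -1 error return are assumptions made for illustration (the patch returns -ENOEXEC and sizes its array from BINPRM_BUF_SIZE). The buffer passed in is the text that follows "#!".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 64   /* arbitrary limit chosen for this sketch */

/*
 * Split the text after "#!" in place, following the rules of the patch:
 * blanks separate arguments, '\' escapes the next character, and the
 * line must end in '\n' or NUL within the buffer.
 * Returns the argument count, or -1 on error (line too long, dangling
 * backslash, no interpreter, or more than MAX_ARGS arguments).
 */
int split_shebang(char *buf, size_t len, char *argv[MAX_ARGS])
{
    char *end = buf + len;
    char *dest = buf;          /* write position; never overtakes cp */
    char *cp;
    int argc = 0;
    int in_arg = 0;

    assert(buf != NULL && argv != NULL);

    for (cp = buf; cp < end; ++cp) {
        char c = *cp;

        if (c == ' ' || c == '\t' || c == '\n' || c == '\0') {
            if (in_arg) {          /* terminate the current argument */
                in_arg = 0;
                *dest++ = '\0';
            }
            if (c == '\n' || c == '\0')
                break;             /* end of the shebang line */
        } else {
            if (c == '\\') {       /* escape: look at the next character */
                if (++cp >= end)
                    return -1;
                c = *cp;
                if (c == '\n' || c == '\0')
                    return -1;
                if (c == 't')
                    c = '\t';
                else if (c == 'n')
                    c = '\n';
                /* any other character is kept as is */
            }
            if (!in_arg) {         /* first character of a new argument */
                if (argc == MAX_ARGS)
                    return -1;
                in_arg = 1;
                argv[argc++] = dest;
            }
            *dest++ = c;
        }
    }

    /* No terminator found inside the buffer, or no interpreter at all. */
    if (cp >= end || argc == 0)
        return -1;
    return argc;
}
```

Fed the text after "#!" from the awk example, this sketch produces four arguments, the third being a single TAB character. A line with no newline or NUL inside the buffer is rejected instead of being truncated.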

Another change: -ENOEXEC is returned if the shebang line is too long,
so excess characters are no longer dropped silently.

The patch is tested against 2.6.1, but also applies cleanly to 2.6.2. I can
also send a tested patch for 2.4.24.

[ CC me on replies, please, as I'm not subscribed. ]

Kind regards

Hansjoerg Lipp

--- linux-2.6.1/fs/binfmt_script.c.orig 2004-02-06 22:21:30.000000000 +0100
+++ linux-2.6.1/fs/binfmt_script.c 2004-02-06 22:21:30.000000000 +0100
@@ -18,10 +18,16 @@
static int load_script(struct linux_binprm *bprm,struct pt_regs *regs)
{
- char *cp, *i_name, *i_arg;
+ char *cp;
struct file *file;
char interp[BINPRM_BUF_SIZE];
int retval;
+ char *argv[(BINPRM_BUF_SIZE-1)/2];
+ char **cur_arg;
+ unsigned argc;
+ int in_arg;
+ char *end, *dest;
+ char c;
if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!') || (bprm->sh_bang))
return -ENOEXEC;
@@ -35,51 +41,47 @@
fput(bprm->file);
bprm->file = NULL;
- bprm->buf[BINPRM_BUF_SIZE - 1] = '\0';
- if ((cp = strchr(bprm->buf, '\n')) == NULL)
- cp = bprm->buf+BINPRM_BUF_SIZE-1;
- *cp = '\0';
- while (cp > bprm->buf) {
- cp--;
- if ((*cp == ' ') || (*cp == '\t'))
- *cp = '\0';
- else
- break;
+ in_arg=0;
+ cur_arg=argv;
+ argc=0;
+ dest=bprm->buf+2;
+ end=bprm->buf+BINPRM_BUF_SIZE;
+ for (cp=bprm->buf+2;cp<end;++cp) {
+ c=*cp;
+ if (c==' '|| c=='\t' || c=='\n' || !c) {
+ if (in_arg) {
+ in_arg=0;
+ *dest++=0;
+ }
+ if (c=='\n' || !c) break;
+ } else {
+ if (c=='\\') {
+ if (++cp>=end) return -ENOEXEC;
+ c=*cp;
+ if (c=='\n' || !c) return -ENOEXEC;
+ if (c=='t')
+ c='\t';
+ else if (c=='n')
+ c='\n';
+ }
+ if (!in_arg) {
+ in_arg=1;
+ argc++;
+ *cur_arg++=dest;
+ }
+ *dest++=c;
+ }
}
- for (cp = bprm->buf+2; (*cp == ' ') || (*cp == '\t'); cp++);
- if (*cp == '\0')
- return -ENOEXEC; /* No interpreter name found */
- i_name = cp;
- i_arg = 0;
- for ( ; *cp && (*cp != ' ') && (*cp != '\t'); cp++)
- /* nothing */ ;
- while ((*cp == ' ') || (*cp == '\t'))
- *cp++ = '\0';
- if (*cp)
- i_arg = cp;
- strcpy (interp, i_name);
- /*
- * OK, we've parsed out the interpreter name and
- * (optional) argument.
- * Splice in (1) the interpreter's name for argv[0]
- * (2) (optional) argument to interpreter
- * (3) filename of shell script (replace argv[0])
- *
- * This is done in reverse order, because of how the
- * user environment and arguments are stored.
- */
+ if (cp>=end||!argc) return -ENOEXEC;
+
+ strcpy (interp, argv[0]);
remove_arg_zero(bprm);
retval = copy_strings_kernel(1, &bprm->interp, bprm);
- if (retval < 0) return retval;
- bprm->argc++;
- if (i_arg) {
- retval = copy_strings_kernel(1, &i_arg, bprm);
- if (retval < 0) return retval;
- bprm->argc++;
- }
- retval = copy_strings_kernel(1, &i_name, bprm);
- if (retval) return retval;
+ if (retval < 0) return retval;
bprm->argc++;
+ retval = copy_strings_kernel(argc, argv, bprm);
+ if (retval < 0) return retval;
+ bprm->argc += argc;
bprm->interp = interp;
/*

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-22 10:09 UTC
To: Hansjoerg Lipp; +Cc: linux-kernel, hjlipp

In addition to the incompatible changes you note:

 1) "#! cmd x y" to pass single arg "x y" with embedded space broken
 2) Use of '\' char changed
 3) Handling of long line changed

doesn't this also

 4) risk breaking shells that look to argv[2] for the name of the
    shell script file for error messages? This argument
    has moved out to argv[argc-1], for some value of argc.

I'll wager you have to make a better case for this than simply:

    As I'm really missing this feature in Linux and changing this
    would not break any (unless ...

before the above incompatibilities in a critical piece of code are
overcome with the compelling need to change these details.

Perhaps you can handle any such special argument specification by
wrapping the user level command, as in:

Instead of:

    #!/usr/bin/awk -F \t -f
    ... my awk code ...

rather do:

    #!myawk
    ... my awk code ...

where myawk is a compiled program that essentially does

    /usr/bin/awk -F '\t' -f argv[2]

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
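
A wrapper of the kind suggested in the message above might look roughly like the following C sketch. The names `build_awk_argv` and `myawk_main` are hypothetical. The sketch assumes the kernel hands the wrapper the script path as argv[1], which is what happens when the shebang line carries no argument after the interpreter name. The field separator is passed as the two characters backslash and t, the same bytes a shell passes for '\t', leaving awk to interpret the escape.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Build {"awk", "-F", "\t", "-f", script, user args..., NULL} from the
 * wrapper's own argv, where argv[1] is assumed to be the script path.
 * Returns a malloc'ed vector (caller frees) or NULL if allocation fails.
 */
char **build_awk_argv(int argc, char **argv)
{
    char **out;
    int n = 0;
    int i;

    assert(argc >= 2 && argv != NULL);

    /* 4 fixed entries + (argc - 1) forwarded entries + NULL terminator */
    out = malloc(((size_t)argc + 4) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[n++] = "awk";
    out[n++] = "-F";
    out[n++] = "\\t";              /* backslash + 't', as '\t' in a shell */
    out[n++] = "-f";
    for (i = 1; i < argc; i++)     /* script path, then the user's args */
        out[n++] = argv[i];
    out[n] = NULL;
    return out;
}

/* Body of the hypothetical wrapper; a real myawk would call this from main(). */
int myawk_main(int argc, char **argv)
{
    char **av;

    if (argc < 2) {
        fprintf(stderr, "usage: myawk script [args...]\n");
        return 2;
    }
    av = build_awk_argv(argc, argv);
    if (av == NULL) {
        perror("malloc");
        return 1;
    }
    execv("/usr/bin/awk", av);
    perror("execv");               /* only reached if the exec failed */
    free(av);
    return 127;
}
```

Keeping the argument assembly in its own function makes the resulting vector easy to inspect without actually executing awk.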

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Hansjoerg Lipp @ 2004-02-22 15:54 UTC
To: Paul Jackson; +Cc: linux-kernel

On Sun, Feb 22, 2004 at 02:09:11AM -0800, Paul Jackson wrote:
> In addition to the incompatible changes you note:
> 1) "#! cmd x y" to pass single arg "x y" with embedded space broken
> 2) Use of '\' char changed

Well, as noted, this part can be removed easily. As I consider this part
least important, I maybe should have deleted it before sending the patch
(some "#ifdef CONFIG_xxxx" could be used instead). But as I sent the
patch also because I wanted to know what other people think about the
issue, I did not change it (passing \t was the topic of the newsgroup
discussion I mentioned).

> 3) Handling of long line changed
> doesn't this also
> 4) risk breaking shells that look to argv[2] for the name of the
>    shell script file for error messages? This argument
>    has moved out to argv[argc-1], for some value of argc.

Well, if the shell can't handle some parameters, you shouldn't add them
to the shebang line.

If you have some example.script

  #!cmd -x

executed as "example.script -a -b", exec will still pass

  {"cmd", "-x", "example.script", "-a", "-b"}

as argv to cmd. The patch just allows

  #!cmd -x -y

to become

  {"cmd", "-x", "-y", "example.script", "-a", "-b"}.

If I understand you right, your argument could be used to say: passing
arguments is not good at all, because some interpreter expects the name
of the script in argv[1] (as is usual with "normal" "#!/bin/sh"
scripts). In my opinion, you just can't use a shebang line
"#!interpreter argument" in this case. And it's the same with my
proposal: you don't have to pass two arguments -- and you shouldn't if
the interpreter can't handle it.

BTW, which shell expects the name of the script in argv[2]?

> I'll wager you have to make a better case for this than simply:
>
>     As I'm really missing this feature in Linux and changing this
>     would not break any (unless ...
>
> before the above incompatibilities in a critical piece of code are
> overcome with the compelling need to change these details.

Yes, you may be right. But please note that the "incompatibilities" are
rather theoretical, in my opinion (please correct me if I'm wrong):

- I don't think there are many scripts with "#!cmd -a -b" that must be
  parsed like {"cmd", "-a -b"}. And scripts like this would not be
  portable among the Unices, anyway.
- I think it's much better to get an error on a shebang line that is too
  long. It's rather dangerous to drop excess characters silently as this
  can change the meaning of the command totally.

It's just a pain to have to use wrappers; they make a system
unnecessarily complex and error-prone, and the arguments needed by the
interpreter cannot be found where it's most logical to search.

I think handling the shebang line "my" way (as it's already done by
FreeBSD) makes writing complex scripts easier and cleaner and has no
real disadvantages.

Thanks for the response,

Hansjoerg Lipp
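
The argv layouts listed in the message above follow one rule: the words taken from the shebang line come first, then the script path, then whatever the user typed after the script name. A small C sketch of that assembly is given below. The function name `splice_argv` and its signature are hypothetical and only model the resulting vector; they are not how the kernel builds it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assemble the argument vector the interpreter ends up with:
 *   words[]  - interpreter name plus arguments from the shebang line
 *   script   - path of the script being executed
 *   user[]   - arguments the user passed to the script
 * Writes a NULL-terminated vector into out[] (capacity cap entries).
 * Returns the number of arguments, or -1 if out[] is too small.
 */
int splice_argv(char *const words[], int nwords,
                char *script,
                char *const user[], int nuser,
                char *out[], int cap)
{
    int n = 0;
    int i;

    assert(nwords >= 1 && nuser >= 0);

    if (nwords + 1 + nuser + 1 > cap)   /* +1 script, +1 NULL terminator */
        return -1;

    for (i = 0; i < nwords; i++)
        out[n++] = words[i];
    out[n++] = script;                  /* always lands at index nwords */
    for (i = 0; i < nuser; i++)
        out[n++] = user[i];
    out[n] = NULL;
    return n;
}
```

With the unsplit words {"cmd", "-x -y"} the script lands at index 2; with the split words {"cmd", "-x", "-y"} it lands at index 3. That shift is the only difference an interpreter would observe.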

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-22 20:53 UTC
To: Hansjoerg Lipp; +Cc: linux-kernel

> BTW, which shell expects the name of the script in argv[2]?

Which ones don't?

The burden is on you, not me. The Bourne like shells
that I happen to try just now _do_ display syntax error messages in
shell scripts with the name of the shell script file in the error
message. Look and see how they are getting that script file name.

What's theoretical on one person's machine is very real and painful
on a million persons' machines. Incompatible changes in documented
interfaces have a high threshold to overcome.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Jamie Lokier @ 2004-02-22 22:57 UTC
To: Paul Jackson; +Cc: Hansjoerg Lipp, linux-kernel

Paul Jackson wrote:
> > BTW, which shell expects the name of the script in argv[2]?
>
> Which ones don't?

I believe the question was "which shell expects the name in argv[2]
regardless of any options given before the name". That rules out all the
ordinary shell programs.

> The burden is on you, not me. The Bourne like shells
> that I happen to try just now _do_ display syntax error messages in
> shell scripts with the name of the shell script file in the error
> message. Look and see how they are getting that script file name.

The standard shell programs all get the name from the first non-option
argument.

> What's theoretical on one person's machine is very real and painful
> on a million persons' machines. Incompatible changes in documented
> interfaces have a high threshold to overcome.

I'll be astonished if the change to split the arguments breaks any
script which actually exists, except for the rare and convoluted
possibility where the interpreter is a C program specially written to
work around the fact that Linux doesn't split the arguments.

The backslash functionality (\t) may be more of a problem.

-- Jamie

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-23 5:44 UTC
To: Jamie Lokier; +Cc: hjlipp, linux-kernel

> I believe the question was "which shell expects the name in argv[2]

The question is more like: examine each shell's argument parsing code to
determine which ones will or will not be affected by this. For a change
like this, someone needs to actually look at the code for each major
shell, and verify their reading of the code with a little experimentation.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Jamie Lokier @ 2004-02-23 14:22 UTC
To: Paul Jackson; +Cc: hjlipp, linux-kernel

Paul Jackson wrote:
> > I believe the question was "which shell expects the name in argv[2]
>
> The question is more like: examine each shell's argument parsing code to
> determine which ones will or will not be affected by this. For a change
> like this, someone needs to actually look at the code for each major
> shell, and verify their reading of the code with a little experimentation.

Eh? We do know what the major shells do: They either look at the first
non-option argument for the script name, or they do not accept options
at all.

Anyway that's irrelevant: the splitting change only affects shell
_scripts_ which already have multiple options on the #! line, and which
depend on a space not splitting the argument. If a script doesn't have
that, the shell's behaviour isn't affected by this change.

Such scripts are non-portable because that behaviour isn't universal
(although I have a feeling the current Linux behaviour was done to mimic
some existing system - as it was never hard to implement argument
splitting if the original author had wanted to.)

In other words, what's relevant is which shell _scripts_ would be
affected, not which shells. To find those scripts, do:

    find /bin /sbin /usr/bin /usr/sbin /usr/X11R6/bin /usr/local/bin \
         /etc /usr/lib -type f \
    | xargs perl -ne 'print "$ARGV\n" if /^#! ?.+ .+ /; close ARGV'

(Or choose your own directories). I didn't find any on my system.

-- Jamie
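
The check behind the find/perl scan in the message above can also be written as a small C helper that inspects the first line of a file. This is only a sketch: the function names `shebang_word_count` and `affected_by_splitting` are hypothetical, and the test is slightly stricter than the regular expression, since it counts words and therefore ignores trailing blanks after a single argument.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the blank-separated words that follow "#!" on the given line
 * (interpreter name included). Returns 0 if the line is not a shebang.
 */
int shebang_word_count(const char *line)
{
    const char *p;
    int words = 0;
    int in_word = 0;

    assert(line != NULL);

    if (line[0] != '#' || line[1] != '!')
        return 0;

    for (p = line + 2; *p != '\0' && *p != '\n'; ++p) {
        if (*p == ' ' || *p == '\t') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;
        }
    }
    return words;
}

/*
 * A script is only affected by argument splitting if its shebang line
 * holds the interpreter plus at least two further words.
 */
int affected_by_splitting(const char *line)
{
    return shebang_word_count(line) >= 3;
}
```

Lines such as "#!/bin/sh" or "#!/bin/sh -x" are reported as unaffected, while "#!/usr/bin/awk -F : -f" is flagged.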

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Andries Brouwer @ 2004-02-23 17:34 UTC
To: Jamie Lokier; +Cc: Paul Jackson, hjlipp, linux-kernel

On Mon, Feb 23, 2004 at 02:22:15PM +0000, Jamie Lokier wrote:
> Paul Jackson wrote:
...
> Such scripts are non-portable because that behaviour isn't universal

There are several websites with information.
I once collected #! info. See
http://homepages.cwi.nl/~aeb/std/hashexclam-1.html

  .. argi, consists of the 0 or 1 or perhaps more arguments to the
  interpreter found in the #! line. Thus, this group is empty if
  there is no nonblank text following the interpreter name in the
  #! line. If there is such nonblank text then for SysVR4,
  SunOS, Solaris, IRIX, HPUX, AIX, Unixware, Linux, OpenBSD, Tru64
  this group consists of precisely one argument.
  FreeBSD, BSD/OS, BSDI split the text following the interpreter name
  into zero or more arguments.

Andries

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-23 20:13 UTC
To: Andries Brouwer; +Cc: jamie, hjlipp, linux-kernel

  If there is such nonblank text then for SysVR4,
  SunOS, Solaris, IRIX, HPUX, AIX, Unixware, Linux, OpenBSD, Tru64
  this group consists of precisely one argument.
  FreeBSD, BSD/OS, BSDI split the text

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-23 21:46 UTC
To: Andries Brouwer; +Cc: jamie, hjlipp, linux-kernel

Andries Brouwer wrote:
> If there is such nonblank text then for SysVR4,
> SunOS, Solaris, IRIX, HPUX, AIX, Unixware, Linux, OpenBSD, Tru64
> this group consists of precisely one argument.
> FreeBSD, BSD/OS, BSDI split the text

Interesting - I notice that 9 Operating Systems, in addition to Linux,
don't split the optional shebang argument, and 3 do.

All else equal, I am not enthusiastic about a somewhat arbitrary change
that could be done either way, that is actually done more often in other
operating systems the current way, and that potentially affects both
script files and their interpreters (shells, awk, perl, python, guile,
tcl, bc, ...).

I will acknowledge however that if there was a shell or interpreter
that allowed at most one '-' prefixed option before the path to the
script file to be interpreted, then that shell or interpreter would be
poorly coded.

And, to be truthful, the usual way that I code awk scripts is not as
a shbang script with an interpreter of awk,

    #!/bin/awk
    BEGIN ...

but rather as a quoted awk script within a shell script:

    #!/bin/sh
    awk '
        BEGIN ...
    '

It is then trivial to supply one or several options to 'awk', and (as
the tclsh man page notes) to cope with possible diverse locations along
$PATH of the interpreter.

This is especially useful in the case of awk, since it is not a
convenient language for many things that are easily done in a shell.
That is, I don't write awk scripts, so much as I write shell scripts
that might make use of awk.

This is a portable habit, that avoids all the aforementioned
limitations and inconsistencies in shbang handling.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Hansjoerg Lipp @ 2004-02-24 1:13 UTC
To: Paul Jackson; +Cc: Andries Brouwer, jamie, linux-kernel

On Mon, Feb 23, 2004 at 01:46:10PM -0800, Paul Jackson wrote:
> Andries Brouwer wrote:
> > If there is such nonblank text then for SysVR4,
> > SunOS, Solaris, IRIX, HPUX, AIX, Unixware, Linux, OpenBSD, Tru64
> > this group consists of precisely one argument.
> > FreeBSD, BSD/OS, BSDI split the text
>
> Interesting - I notice that 9 Operating Systems, in addition to Linux,
> don't split the optional shebang argument, and 3 do.

Yes. And this shows that Linux would not be the first OS which splits
arguments. One more reason why I'm sure this change won't cause lots of
problems.

> All else equal, I am not enthusiastic about a somewhat arbitrary change
> that could be done either way, that is actually done more often in other
> operating systems the current way, and that potentially affects both
> script files and their interpreters (shells, awk, perl, python, guile,
> tcl, bc, ...).

[...]

As written in my previous mail, it only affects scripts that already
have multiple arguments in the shebang line. So, I don't see a lot of
problems here.

> And, to be truthful, the usual way that I code awk scripts is not as
> a shbang script with an interpreter of awk,
>
>     #!/bin/awk
>     BEGIN ...
>
> but rather as a quoted awk script within a shell script:
>
>     #!/bin/sh
>     awk '
>         BEGIN ...
>     '
>

[...]

This may be right for awk, although I still consider wrapper scripts to
be somewhat awkward. But your argument is not true for shells, perl,
python, ... And I still think it's somewhat strange that perl has to
parse the shebang line of the scripts because the OS can't do it. And as
other interpreters don't act this way, there are totally unnecessary
restrictions on writing certain scripts...

> This is a portable habit, that avoids all the aforementioned
> limitations and inconsistencies in shbang handling.

If you write scripts for several OSes you are right. On the other hand,
I don't see any reason why one should stick to the limits of some other
operating systems when it's not necessary. Acting this way will never
change these limitations. If the three OSes mentioned above _and_ Linux
handle the shebang line in a more sensible way, this could be one step
to get rid of these inconsistencies.

Regards,

Hansjoerg Lipp

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-24 1:29 UTC
To: Hansjoerg Lipp; +Cc: aebr, jamie, linux-kernel

> I don't see any reason why one should stick to the limits of some other
> operating systems when it's not necessary.

If I make it a habit to write portable code, then over the years, I
cause fewer problems for myself and others. More things "just work".
I've got scripts that I use that are 10 or 20 years old, and have been
used on all manner of environments that could not have been anticipated
when the script was first written.

Also my habits need not change - the more things I can do without
thinking, the more thinking I have left over to do something useful.

Some days, boring beats fine tuning.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Hansjoerg Lipp @ 2004-02-25 23:13 UTC
To: Paul Jackson; +Cc: aebr, jamie, linux-kernel

On Mon, Feb 23, 2004 at 05:29:42PM -0800, Paul Jackson wrote:
> > I don't see any reason why one should stick to the limits of some other
> > operating systems when it's not necessary.
>
> If I make it a habit to write portable code, then over the years, I
> cause fewer problems for myself and others. More things "just work".
> I've got scripts that I use that are 10 or 20 years old, and have been
> used on all manner of environments that could not have been anticipated
> when the script was first written.

Yes, it's true that this is often sensible. But I also think that
sometimes we must get rid of old restrictions that don't make much
sense.

The patch does not prevent you from writing portable scripts, but it
allows us to write scripts that can't be written without this change
(or you need some workaround like wrappers or an interpreter parsing
the shebang line on its own).

And because you could see this patch as a step towards other operating
systems to reduce the chaos the web pages mentioned in this thread
show[1], this patch might even make scripts written for other operating
systems work under Linux. So, there are not only disadvantages with
regard to portability.

Regards,

Hansjoerg Lipp

[1] http://www.in-ulm.de/~mascheck/various/shebang/
    http://homepages.cwi.nl/~aeb/std/hashexclam-1.html

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-23 20:12 UTC
To: Jamie Lokier; +Cc: hjlipp, linux-kernel

> Anyway that's irrelevant: the splitting change only affects shell _scripts_

Well, I wouldn't say 'irrelevant'. Some might claim that this question
(what the major shells do) is already known, but surely it does matter.
The shells _do_ need to find the path to the script file in the argv[]
passed to them, and the proposed change does alter the parsing of that
argv[].

The splitting does not affect only the scripts. It also affects the
argv[] array presented to the shells, which may or may not deal with
such as we would like.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Jamie Lokier @ 2004-02-23 20:16 UTC
To: Paul Jackson; +Cc: hjlipp, linux-kernel

Paul Jackson wrote:
> > Anyway that's irrelevant: the splitting change only affects shell _scripts_
>
> The splitting does not affect only the scripts. It also affects the
> argv[] array presented to the shells, which may or may not deal with
> such as we would like.

You misread what I wrote. This is a rephrasing of what I wrote:

The splitting does not affect any shells when called by scripts with
<= 1 argument - because the splitting change doesn't affect anything in
those cases.

Therefore the shell behaviour is not relevant, except for such scripts.
On my system there are no such scripts.

-- Jamie

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-23 22:08 UTC
To: Jamie Lokier; +Cc: hjlipp, linux-kernel

> Therefore the shell behaviour is not relevant, except for such scripts.

So we agree that the shell behaviour is relevant for such scripts.

I don't think I missed a thing, and I think we are in agreement, except
on the relative value of this change, versus the risk of breaking a
shell.

If a shell is coded to allow for at most one option before the script
file path, and if a script is presented to it with a shebang option
having an embedded space, then ... oops.

You're just discounting the risk of either such scripts or of such
stupidly coded shells more than I am discounting such, and you are
valuing the usefulness of the proposed change more than I value it.

I accept that there is no shell, nor script, on your system that would
break, and to be honest, I can't find any such shell, or script, on my
system either.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Hansjoerg Lipp @ 2004-02-23 20:25 UTC
To: Paul Jackson; +Cc: Jamie Lokier, linux-kernel

On Sun, Feb 22, 2004 at 09:44:57PM -0800, Paul Jackson wrote:
> > I believe the question was "which shell expects the name in argv[2]
>
> The question is more like: examine each shell's argument parsing code to
> determine which ones will or will not be affected by this. For a change
> like this, someone needs to actually look at the code for each major
> shell, and verify their reading of the code with a little experimentation.

I still don't understand your argument... If there is a shell having
those problems, nobody would use something like

  #!/shell -foo -bar

And the "old"

  #!/shell -foo

or

  #!/shell

still work as usual (if there are no whitespace characters in the
parameter).

Regards,

Hansjoerg Lipp

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-23 22:00 UTC
To: Hansjoerg Lipp; +Cc: jamie, linux-kernel

Hansjoerg wrote:
> I still don't understand your argument... If there is a shell having
> those problems, nobody would use something like

I will acknowledge that while one _could_ code a shell so that your
proposed change would break it, it would be a stupid, silly and ugly
way to code a shell.

That is, one _could_ code a shell to say:

 1) If argv[1] starts with a '-', consume and handle as an option
    (or possibly as a space separated list of options).
 2) Presume the next argument, if any, is a shell script file.

I would be surprised if any of the major shells are coded this way.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
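
The hypothetical fragile shell from the two-step description above, and the first-non-option lookup that the standard shells are said to use elsewhere in this thread, can be contrasted in a few lines of C. Both function names are invented for this sketch, and real shells are more involved (options that take their own arguments, "--", and so on), so this only models where the script path is looked for.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fragile scheme: treat argv[1] as the only possible option, then
 * presume the next argument is the script. Returns the index of the
 * presumed script path, or -1 if there is none.
 */
int script_index_fragile(int argc, char **argv)
{
    int i = 1;

    assert(argc >= 1 && argv != NULL);

    if (i < argc && argv[i][0] == '-')
        i++;                       /* step 1: consume one option */
    return i < argc ? i : -1;      /* step 2: next argument is the script */
}

/*
 * Robust scheme: skip every leading '-' argument and take the first
 * non-option argument as the script. Returns its index, or -1.
 */
int script_index_first_nonoption(int argc, char **argv)
{
    int i;

    assert(argc >= 1 && argv != NULL);

    for (i = 1; i < argc; i++) {
        if (argv[i][0] != '-')
            return i;
    }
    return -1;
}
```

For {"sh", "-x -y", "script"} both schemes pick index 2. For the split form {"sh", "-x", "-y", "script"} the fragile scheme picks "-y" at index 2, while the first-non-option scheme still finds the script at index 3. With zero or one shebang argument the two agree.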

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Jamie Lokier @ 2004-02-23 23:59 UTC
To: Paul Jackson; +Cc: Hansjoerg Lipp, linux-kernel

Paul Jackson wrote:
> 1) If argv[1] starts with a '-', consume and handle as an option
>    (or possibly as a space separated list of options).
> 2) Presume the next argument, if any, is a shell script file.
>
> I would be surprised if any of the major shells are coded this way.

It would have been a "smart" thing for Perl to do, extra friendly for
programmers, auto-configuring with a test at installation time of
course.

I doubt Perl does that but it wouldn't surprise me - it seems like quite
a good idea - Perl scripts using the capability would even be
portable :)

-- Jamie

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Hansjoerg Lipp @ 2004-02-24 0:13 UTC
To: Paul Jackson; +Cc: jamie, linux-kernel

On Mon, Feb 23, 2004 at 02:00:27PM -0800, Paul Jackson wrote:
> Hansjoerg wrote:
> > I still don't understand your argument... If there is a shell having
> > those problems, nobody would use something like
>
> I will acknowledge that while one _could_ code a shell so that your
> proposed change would break it, it would be a stupid, silly and ugly
> way to code a shell.
>
> That is, one _could_ code a shell to say:
>
> 1) If argv[1] starts with a '-', consume and handle as an option
>    (or possibly as a space separated list of options).
> 2) Presume the next argument, if any, is a shell script file.

There is no problem with such a shell if you use scripts beginning with

  #!/some/shell

or

  #!/some/shell -some_arg

if some_arg does not contain whitespace characters. In both cases, argv
will be the same as it is with the current code.

  /some/script param1 param2

will become

  /some/shell /some/script param1 param2

or

  /some/shell -some_arg /some/script param1 param2

as it has been before.

There is a problem with a shebang line like

  #!/some/shell -x -y

_but_ this was most probably an error, before. (Unless this shell
accepts _one_ parameter "-x -y" containing a space.)

So, I really can't see any problem with such a shell...

Regards,

Hansjoerg Lipp

* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
From: Paul Jackson @ 2004-02-24 1:32 UTC
To: Hansjoerg Lipp; +Cc: jamie, linux-kernel

> So, I really can't see any problem with such a shell...

I think we are agreeing on the technical details.

But not on the relative weight of the potential problems
versus the value of the change you propose.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-24 1:32 ` Paul Jackson
@ 2004-02-25 23:14 ` Hansjoerg Lipp
2004-02-25 23:24 ` Paul Jackson
0 siblings, 1 reply; 31+ messages in thread
From: Hansjoerg Lipp @ 2004-02-25 23:14 UTC (permalink / raw)
To: Paul Jackson; +Cc: aebr, jamie, linux-kernel

On Mon, Feb 23, 2004 at 05:32:46PM -0800, Paul Jackson wrote:
> > So, I really can't see any problem with such a shell...
>
> I think we are agreeing on the technical details.
>
> But not on the relative weight of the potential problems
> versus the value of the change you propose.

Okay. So the "result" of this discussion seems to be:

We agree that existing scripts whose shebang line has one argument
containing spaces are not likely to cause a lot of problems. But you
still consider this too risky, whereas Jamie Lokier (if I understood
him right) and I think the risk is low enough.

The '\'-part seems to be more problematic and not that useful. So, this
part could be removed from the patch.

Andries Brouwer's web page shows me that there are operating systems
that already split arguments, which seems to work without a lot of
problems, while you emphasize the fact that there are more operating
systems parsing the shebang line the "old" way.

So, talking about the same facts, we still disagree, and I don't think
further discussion will change this.

How should we proceed? Should we still wait for other comments (I'd
really like to know what people think about it)? Should I ask Andrew
Morton what he thinks about it, also with regard to our discussion? Or
do you still see facts we should talk about?

Regards,

Hansjoerg Lipp
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-25 23:14 ` Hansjoerg Lipp
@ 2004-02-25 23:24 ` Paul Jackson
0 siblings, 0 replies; 31+ messages in thread
From: Paul Jackson @ 2004-02-25 23:24 UTC (permalink / raw)
To: Hansjoerg Lipp; +Cc: aebr, jamie, linux-kernel

> How should we proceed?

Good summary - thanks.

Clearly others will have to chime in.

In particular, Andrew Morton holds the final vote.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-22 20:53 ` Paul Jackson
2004-02-22 22:57 ` Jamie Lokier
@ 2004-02-23 20:13 ` Hansjoerg Lipp
2004-02-23 22:24 ` Paul Jackson
1 sibling, 1 reply; 31+ messages in thread
From: Hansjoerg Lipp @ 2004-02-23 20:13 UTC (permalink / raw)
To: Paul Jackson; +Cc: linux-kernel

On Sun, Feb 22, 2004 at 12:53:12PM -0800, Paul Jackson wrote:
> > BTW, which shell expects the name of the script in argv[2]?
>
> Which ones don't? The burden is on you, not me. The Bourne like shells
> that I happen to try just now _do_ display syntax error messages in
> shell scripts with the name of the shell script file in the error
> message. Look and see how they are getting that script file name.

Although I still don't think this is relevant (because scripts for
interpreters having these problems don't have to pass multiple
arguments on the shebang line), I just tested some example scripts like
this:

----
#!/bin/zsh -v -x
echo "argv0: $0"
/foo/bar
----

(the last line is there to get an error message). Everything works as
expected using those shells:

ksh: PD KSH v5.2.14
GNU bash: 2.05b
ash: 0.2
zsh: 4.1.1
tcsh: 6.12.00

I could have a look at the sources, but as this is the behaviour the
man pages and susv3 describe, this should be "evidence" enough(?).

Regards,

Hansjoerg Lipp
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-23 20:13 ` Hansjoerg Lipp
@ 2004-02-23 22:24 ` Paul Jackson
2004-02-24 0:21 ` Hansjoerg Lipp
0 siblings, 1 reply; 31+ messages in thread
From: Paul Jackson @ 2004-02-23 22:24 UTC (permalink / raw)
To: Hansjoerg Lipp; +Cc: linux-kernel

Hansjoerg wrote:
> #!/bin/zsh -v -x
> ...
> this should be "evidence" enough(?)

This testing was done on a system with your patch applied, right?

Because on a stock kernel, the various shells are of course
confused by the "-v -x" argv[1].

I will grant that ksh, bash, ash, tcsh and zsh are likely ok
(willing to see > 1 option before the script file name.)

An alternative way to test the same thing, that works even on
a stock kernel:

    $ echo 'echo "$*"' > ./d
    $ ash -e -e ./d 1 2 3
    $ tcsh -v -v ./d 1 2 3
    $ zsh -e -e ./d 1 2 3
    $ ksh -e -e ./d 1 2 3
    $ bash -e -e ./d 1 2 3

The thing being tested: will a shell handle > 1 option before a script
file name. Each shell invocation of the "./d" script should echo the
script file arguments "1 2 3".

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-23 22:24 ` Paul Jackson
@ 2004-02-24 0:21 ` Hansjoerg Lipp
0 siblings, 0 replies; 31+ messages in thread
From: Hansjoerg Lipp @ 2004-02-24 0:21 UTC (permalink / raw)
To: Paul Jackson; +Cc: linux-kernel

On Mon, Feb 23, 2004 at 02:24:51PM -0800, Paul Jackson wrote:
> Hansjoerg wrote:
> > #!/bin/zsh -v -x
> > ...
> > this should be "evidence" enough(?)
>
> This testing was done on a system with your patch applied, right?

Yes. The results are the same if you use a stock kernel and call the
script manually.

> Because on a stock kernel, the various shells are of course
> confused by the "-v -x" argv[1].

Yes, of course.

> I will grant that ksh, bash, ash, tcsh and zsh are likely ok
> (willing to see > 1 option before the script file name.)
>
> An alternative way to test the same thing, that works even on
> a stock kernel:
>
> $ echo 'echo "$*"' > ./d
> $ ash -e -e ./d 1 2 3
> $ tcsh -v -v ./d 1 2 3
> $ zsh -e -e ./d 1 2 3
> $ ksh -e -e ./d 1 2 3
> $ bash -e -e ./d 1 2 3
>
> The thing being tested: will a shell handle > 1 option before a script
> file name. Each shell invocation of the "./d" script should echo the
> script file arguments "1 2 3".

This works as expected.

Regards,

Hansjoerg Lipp
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-22 15:54 ` Hansjoerg Lipp
2004-02-22 20:53 ` Paul Jackson
@ 2004-02-23 5:49 ` Paul Jackson
2004-02-23 5:50 ` Paul Jackson
2 siblings, 0 replies; 31+ messages in thread
From: Paul Jackson @ 2004-02-23 5:49 UTC (permalink / raw)
To: Hansjoerg Lipp; +Cc: linux-kernel

> BTW, which shell expects the name of the script in argv[2]?

I don't know. Someone needs to actually examine the shell code, and see
what it does, for various shells. My jaw boning neither proves nor
disproves anything.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-22 15:54 ` Hansjoerg Lipp
2004-02-22 20:53 ` Paul Jackson
2004-02-23 5:49 ` Paul Jackson
@ 2004-02-23 5:50 ` Paul Jackson
2 siblings, 0 replies; 31+ messages in thread
From: Paul Jackson @ 2004-02-23 5:50 UTC (permalink / raw)
To: Hansjoerg Lipp; +Cc: linux-kernel

> It's just a pain to have to use wrappers

true

> "my" way (as it's already done by
> FreeBSD) makes writing complex scripts easier

likely true as well

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-16 13:34 [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c Hansjoerg Lipp
2004-02-22 10:09 ` Paul Jackson
@ 2004-02-23 5:42 ` Paul Jackson
2004-02-23 20:24 ` Hansjoerg Lipp
1 sibling, 1 reply; 31+ messages in thread
From: Paul Jackson @ 2004-02-23 5:42 UTC (permalink / raw)
To: Hansjoerg Lipp; +Cc: linux-kernel, hjlipp

> #!/usr/bin/awk -F \t -f

If your primary need is to set the awk field separator, how about
setting FS (or IFS, depending on which awk) in a BEGIN section
in the script?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-23 5:42 ` Paul Jackson
@ 2004-02-23 20:24 ` Hansjoerg Lipp
2004-02-23 21:55 ` Paul Jackson
0 siblings, 1 reply; 31+ messages in thread
From: Hansjoerg Lipp @ 2004-02-23 20:24 UTC (permalink / raw)
To: Paul Jackson; +Cc: linux-kernel

On Sun, Feb 22, 2004 at 09:42:55PM -0800, Paul Jackson wrote:
> > #!/usr/bin/awk -F \t -f
>
> If your primary need is to set the awk field separator, how about
> setting FS (or IFS, depending on which awk) in a BEGIN section
> in the script?

Well, this was just the example we used in the discussion I mentioned.
In this case you are right. But what about

    #!/usr/bin/awk --posix -f

to enable expressions like [0-9]{1,2}? There are really useful
parameters for awk, shells, ... that you can't easily use in scripts
(IIRC, perl has to parse the shebang line on its own because of this,
although this is really not the job of an interpreter).

The "\" part: Yes, there are not many examples where you really need
this, because it's not that likely to have filenames or parameters
containing spaces. That's why I said this part could get some
"#ifdef CONFIG_SHEBANG_ESCAPE" or could even be deleted from the patch.
Here, I'd like to know what people consider more important:
compatibility for old scripts with shebang lines containing backslashes,
or the possibility to have file names or parameters containing
whitespace characters.

Regards,

Hansjoerg Lipp
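To make the trade-off around the "\" part concrete, here is a small userspace sketch of splitting a shebang line at unescaped blanks while decoding the escapes listed in the original posting ("\t", "\n", "\ ", "\\"). It is not the patch itself; `split_escaped` is a hypothetical name. It also rests on two assumptions: "all other backslashes are discarded" is read as "the backslash is dropped and the following character is kept", and a lone backslash at the very end of the line is dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split `line` in place at unescaped spaces and tabs and decode backslash
 * escapes. Pointers to the resulting words are stored in out[]. Returns
 * the number of words, or -1 if there are more than `max` of them.
 */
int split_escaped(char *line, char *out[], int max)
{
    char *src = line; /* read position */
    char *dst = line; /* write position; words only ever shrink */
    int n = 0;

    assert(line != NULL && out != NULL);

    for (;;) {
        src += strspn(src, " \t"); /* skip blanks between words */
        if (*src == '\0')
            break;
        if (n >= max)
            return -1;
        out[n++] = dst;

        while (*src != '\0' && *src != ' ' && *src != '\t') {
            char c = *src++;
            if (c == '\\') {
                c = *src;
                if (c == '\0')
                    break; /* assumption: trailing backslash is dropped */
                src++;
                if (c == 't')
                    c = '\t';
                else if (c == 'n')
                    c = '\n';
                /* "\ ", "\\" and, by assumption, any other "\x"
                 * yield the character after the backslash unchanged. */
            }
            *dst++ = c;
        }

        if (*src != '\0')
            src++; /* step over the separator before terminating the word */
        *dst++ = '\0';
    }
    return n;
}
```

Applied to `/usr/bin/awk -F \t -f`, this yields four words, the third being a single TAB character. It also shows the compatibility concern raised in the thread: an existing shebang line whose path or argument contains a literal backslash would be altered by this decoding.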
* Re: [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c
2004-02-23 20:24 ` Hansjoerg Lipp
@ 2004-02-23 21:55 ` Paul Jackson
0 siblings, 0 replies; 31+ messages in thread
From: Paul Jackson @ 2004-02-23 21:55 UTC (permalink / raw)
To: Hansjoerg Lipp; +Cc: linux-kernel

Hansjoerg wrote:
> But what about
> #!/usr/bin/awk --posix -f

What I would actually code, in this case, and as I just noted a minute
ago in a parallel reply, would be:

    #!/bin/sh
    awk --posix -f ' ... '

I basically never put awk in the shebang line. Rather I invoke it on
quoted scripts inside of a shell script. This habit has served me well
for some 25 years now, on a variety of systems.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
end of thread, other threads:[~2004-02-25 23:28 UTC | newest]

Thread overview: 31+ messages
2004-02-16 13:34 [PATCH] Linux 2.6: shebang handling in fs/binfmt_script.c Hansjoerg Lipp
2004-02-22 10:09 ` Paul Jackson
2004-02-22 15:54 ` Hansjoerg Lipp
2004-02-22 20:53 ` Paul Jackson
2004-02-22 22:57 ` Jamie Lokier
2004-02-23 5:44 ` Paul Jackson
2004-02-23 14:22 ` Jamie Lokier
2004-02-23 17:34 ` Andries Brouwer
2004-02-23 20:13 ` Paul Jackson
2004-02-23 21:46 ` Paul Jackson
2004-02-24 1:13 ` Hansjoerg Lipp
2004-02-24 1:29 ` Paul Jackson
2004-02-25 23:13 ` Hansjoerg Lipp
2004-02-23 20:12 ` Paul Jackson
2004-02-23 20:16 ` Jamie Lokier
2004-02-23 22:08 ` Paul Jackson
2004-02-23 20:25 ` Hansjoerg Lipp
2004-02-23 22:00 ` Paul Jackson
2004-02-23 23:59 ` Jamie Lokier
2004-02-24 0:13 ` Hansjoerg Lipp
2004-02-24 1:32 ` Paul Jackson
2004-02-25 23:14 ` Hansjoerg Lipp
2004-02-25 23:24 ` Paul Jackson
2004-02-23 20:13 ` Hansjoerg Lipp
2004-02-23 22:24 ` Paul Jackson
2004-02-24 0:21 ` Hansjoerg Lipp
2004-02-23 5:49 ` Paul Jackson
2004-02-23 5:50 ` Paul Jackson
2004-02-23 5:42 ` Paul Jackson
2004-02-23 20:24 ` Hansjoerg Lipp
2004-02-23 21:55 ` Paul Jackson