public inbox for linux-kernel@vger.kernel.org
  • * Re: [Patch] Support UTF-8 scripts
    @ 2005-09-17  6:45             ` "Martin v. Löwis"
      1 sibling, 0 replies; 80+ messages in thread
    From: "Martin v. Löwis" @ 2005-09-17  6:45 UTC
      To: 7eggert; +Cc: linux-kernel
    
    Bodo Eggert wrote:
    > BTW2: However, I don't like the patch.
    > 
    > I'd first check for a utf-8 signature, and if it's found, adjust the
    > buffer offset by 3. Then I'd run the old code checking for the sh_bang.
    > OTOH, I just read the patch and not the .c file, maybe (unlikely) my idea
    > wouldn't work correctly.
    
    I believe this wouldn't work. binfmt_script currently has the code
    
            for (cp = bprm->buf+2; (*cp == ' ') || (*cp == '\t'); cp++);
    
    to get out the (start of the) interpreter file name. This knows
    implicitly that you need to skip two bytes #!; for UTF-8 signatures,
    it would be 5 bytes.
    
    Now, if you meant to suggest that bprm->buf should be adjusted (e.g.
    through 'bprm->buf += 3'): this cannot work, either. It would break
    subsequent binfmt modules which assume that bprm->buf is the first
    1KiB (or so) of the file to be executed.
    
    If you suggest that the patch should merely check for the signature,
    and then skip it: this is what the patch does.
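
    As a rough userspace sketch of "check for the signature, then skip
    it" without ever adjusting the buffer pointer: interp_offset below is
    a hypothetical helper invented for illustration, not code from the
    patch or from binfmt_script.c. It only computes an offset, so any
    later consumer still sees the buffer starting at byte 0 of the file.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: returns the offset of the
 * first byte of the interpreter name, or -1 if buf does not start with
 * "#!" (optionally preceded by the UTF-8 signature EF BB BF).
 * buf itself is never modified or advanced. */
static int interp_offset(const char *buf, size_t len)
{
    static const char sig[3] = { '\xef', '\xbb', '\xbf' };
    size_t off = 0;

    if (len >= 3 && memcmp(buf, sig, 3) == 0)
        off = 3;                        /* skip the UTF-8 signature */
    if (len < off + 2 || buf[off] != '#' || buf[off + 1] != '!')
        return -1;                      /* not a script header */
    off += 2;                           /* skip "#!" */
    while (off < len && (buf[off] == ' ' || buf[off] == '\t'))
        off++;                          /* skip blanks before the name */
    return (int)off;
}
```

    With this shape the 2-versus-5 byte difference lives in one place,
    and the caller indexes buf + offset instead of changing buf.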
    
    Regards,
    Martin
    
    P.S. I just noticed there is a
    
    bprm->buf[BINPRM_BUF_SIZE - 1] = '\0';
    
    which seems incorrect: it puts a null-byte into the buffer data,
    thus (slightly) corrupting the data for subsequent binfmt modules
    (although it already knows the file starts with #!, so the
     subsequent modules will fail, anyway)
    
    I also think the above loop should terminate for
    
     *cp == '\0'
    
    if there is neither a space nor a tab in the file.
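
    One possible way to address both points, sketched in plain userspace
    C under stated assumptions: HDR_BUF_SIZE (128 here) merely stands in
    for BINPRM_BUF_SIZE, and copy_header and skip_blanks are hypothetical
    names, not kernel functions. The idea is to terminate a private copy
    rather than the shared buffer, and to make the scan stop explicitly
    at a NUL byte or at the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

#define HDR_BUF_SIZE 128   /* assumed size; stands in for BINPRM_BUF_SIZE */

/* Hypothetical helper: copy the header into a private buffer and
 * NUL-terminate the copy, leaving the shared source buffer intact. */
static void copy_header(char *dst, const char *src)
{
    memcpy(dst, src, HDR_BUF_SIZE);
    dst[HDR_BUF_SIZE - 1] = '\0';
}

/* Hypothetical helper: skip spaces and tabs, stopping explicitly at a
 * NUL byte or at 'end', whichever comes first. */
static const char *skip_blanks(const char *cp, const char *end)
{
    while (cp < end && *cp != '\0' && (*cp == ' ' || *cp == '\t'))
        cp++;
    return cp;
}
```

    The copy costs one memcpy of the header, which seems a small price
    for not handing altered data to whatever looks at the buffer next.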
    
    * [Patch] Support UTF-8 scripts
    @ 2005-08-13 12:07 "Martin v. Löwis"
      2005-08-13 16:35 ` Stephen Pollei
      2005-08-31 23:27 ` H. Peter Anvin
      0 siblings, 2 replies; 80+ messages in thread
    From: "Martin v. Löwis" @ 2005-08-13 12:07 UTC
      To: linux-kernel
    
    This patch adds support for UTF-8 signatures (aka BOM, byte order
    mark) to binfmt_script. Files that start with EF BB BF # ! are now
    recognized as scripts (in addition to files starting with # !).
    
    With such support, creating scripts that reliably carry non-ASCII
    characters is simplified. Editors and the script interpreter can
    easily agree on what the encoding of the script is, and the
    interpreter can then render strings appropriately. Currently,
    Python supports source files that start with the UTF-8 signature;
    the approach would naturally extend to Perl to enhance/replace
    the "use utf8" pragma. Likewise, Tcl could use the UTF-8 signature
    to reliably identify UTF-8 source code (instead of assuming
    [encoding system] for source code).
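
    To make the interpreter side concrete, here is a minimal sketch of
    how a program might classify the first bytes of a file. The enum and
    classify_header are hypothetical names chosen for illustration; no
    existing interpreter is claimed to do it this way. A result of
    SCRIPT_UTF8_SIG would be the cue to treat the source as UTF-8.

```c
#include <stddef.h>
#include <string.h>

enum script_kind { NOT_SCRIPT, SCRIPT_PLAIN, SCRIPT_UTF8_SIG };

/* Hypothetical helper: classify a file by its leading bytes.
 * EF BB BF is the UTF-8 encoding of U+FEFF, the UTF-8 signature. */
static enum script_kind classify_header(const unsigned char *buf, size_t len)
{
    static const unsigned char sig_bang[5] = { 0xEF, 0xBB, 0xBF, '#', '!' };

    if (len >= 5 && memcmp(buf, sig_bang, 5) == 0)
        return SCRIPT_UTF8_SIG;         /* signature followed by "#!" */
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return SCRIPT_PLAIN;            /* traditional "#!" script */
    return NOT_SCRIPT;
}
```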
    
    Please find the patch attached below.
    
    Regards,
    Martin
    
    Signed-off-by: Martin v. Löwis <martin@v.loewis.de>
    
    diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c
    --- a/fs/binfmt_script.c
    +++ b/fs/binfmt_script.c
    @@ -1,7 +1,7 @@
     /*
      *  linux/fs/binfmt_script.c
      *
    - *  Copyright (C) 1996  Martin von Löwis
    + *  Copyright (C) 1996, 2005  Martin von Löwis
      *  original #!-checking implemented by tytso.
      */
    
    @@ -23,7 +23,16 @@ static int load_script(struct linux_binp
            char interp[BINPRM_BUF_SIZE];
            int retval;
    
    -       if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!') || (bprm->sh_bang))
    +       /* It is a recursive invocation. */
    +       if (bprm->sh_bang)
    +               return -ENOEXEC;
    +
    +       /* It starts neither with #!, nor with #! preceded by
    +          the UTF-8 signature. */
    +       if (!(((bprm->buf[0] == '#') && (bprm->buf[1] == '!'))
    +             || ((bprm->buf[0] == '\xef') && (bprm->buf[1] == '\xbb')
    +                 && (bprm->buf[2] == '\xbf') && (bprm->buf[3] == '#')
    +                 && (bprm->buf[4] == '!'))))
                    return -ENOEXEC;
            /*
             * This section does the #! interpretation.
    @@ -46,7 +55,8 @@ static int load_script(struct linux_binp
                    else
                            break;
            }
    -       for (cp = bprm->buf+2; (*cp == ' ') || (*cp == '\t'); cp++);
    +       cp = (bprm->buf[0]=='\xef') ? bprm->buf+5 : bprm->buf+2;
    +       while ((*cp == ' ') || (*cp == '\t')) cp++;
            if (*cp == '\0')
                    return -ENOEXEC; /* No interpreter name found */
            i_name = cp;
    

