* suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Pádraig Brady @ 2012-08-04 15:42 UTC
To: util-linux

There was a recent change in df in coreutils to sanitize output of paths:

http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd

The essential issue fixed there is that control chars in a path will be
converted to '?' (this works in all locales), and doing so will mean
'\n' for example is not output. You could even consider this a potential
security improvement so that arbitrary users couldn't influence the
output of these commands for all users.

I suggest using the simple inplace replacement function from above.

cheers,
Pádraig.

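A minimal shell sketch of the '?' replacement described in this message. It is not the coreutils function from the linked commit; `sanitize_ctrl` is a hypothetical name, and the sketch assumes a `tr` that supports the `[c*]` repeat notation (GNU and BSD `tr` do).

```shell
# Hypothetical helper: print $1 with every control character replaced by '?'.
sanitize_ctrl() {
    # LC_ALL=C makes tr classify single bytes, independent of the user's locale.
    # '[?*]' repeats '?' as often as needed to match the size of the first set.
    printf '%s' "$1" | LC_ALL=C tr '[:cntrl:]' '[?*]'
}

# A directory name with an embedded newline stays on one output line.
sanitize_ctrl "$(printf 'co\ntrol')"; echo    # prints: co?trol
```

The mapping is lossy: a consumer cannot tell a replaced control character from a literal '?', which is the trade-off raised in the reply below.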
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Dave Reisner @ 2012-08-04 15:57 UTC
To: Pádraig Brady; +Cc: util-linux

On Sat, Aug 04, 2012 at 04:42:10PM +0100, Pádraig Brady wrote:
> There was a recent change in df in coreutils to sanitize output of paths:
>
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd
>
> The essential issue fixed there is that control chars in a path will be
> converted to '?' (this works in all locales), and doing so will mean
> '\n' for example is not output. You could even consider this a potential
> security improvement so that arbitrary users couldn't influence the
> output of these commands for all users.
>
> I suggest using the simple inplace replacement function from above.

Why replace with a bogus character when you could instead use an octal
or hex escape? Wouldn't this still address the underlying problem?
Munging the content of a string could break a script consuming the
output with no way for the script to recover.

* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Pádraig Brady @ 2012-08-05 2:02 UTC
To: util-linux

On 08/04/2012 04:57 PM, Dave Reisner wrote:
> On Sat, Aug 04, 2012 at 04:42:10PM +0100, Pádraig Brady wrote:
>> There was a recent change in df in coreutils to sanitize output of paths:
>>
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd
>>
>> The essential issue fixed there is that control chars in a path will be
>> converted to '?' (this works in all locales), and doing so will mean
>> '\n' for example is not output. You could even consider this a potential
>> security improvement so that arbitrary users couldn't influence the
>> output of these commands for all users.
>>
>> I suggest using the simple inplace replacement function from above.
>
> Why replace with a bogus character when you could instead use an octal
> or hex escape? Wouldn't this still address the underlying problem?
> Munging the content of a string could break a script consuming the
> output with no way for the script to recover.

Yes true.
I suppose you could use octal escapes, so
0x00 -> 0x1F are mapped to \000 -> \037
and '\' is mapped to \134
That's more invasive though.

For df for now at least it was thought that
requiring unambiguous output for these names was overkill.
The names were adjusted just so as to avoid processing
issues for other sanely named items.

Also related is the issue of non printable characters.
For example if you view the unicode line separator char (\u2028)
in certain places (like a pango editor like gedit for example)
it will appear as a normal new line.
It might be appropriate to replace or escape all non printable chars.
That's complicated though (mbsalign in util-linux already
does this to some extent).

So you could have complex mappings, but I was thinking
at least for these utils a simple method is appropriate
to avoid the immediate issue.

cheers,
Pádraig.

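The octal mapping mentioned in this message (0x00-0x1F to \000-\037, and '\' to \134) could be sketched in shell roughly as follows. This is an illustration only, not code from df or util-linux; `escape_octal` is a hypothetical name, and the sketch assumes `od` and `awk` behave as POSIX describes.

```shell
# Hypothetical helper: print $1 with control bytes and '\' as octal escapes.
escape_octal() {
    # od dumps each byte as a decimal number; awk rebuilds the string,
    # emitting \ooo for bytes 0x00-0x1F and for the backslash (92).
    printf '%s' "$1" | LC_ALL=C od -An -v -tu1 | LC_ALL=C awk '
        {
            for (i = 1; i <= NF; i++) {
                b = $i + 0
                if (b < 32 || b == 92)
                    printf "\\%03o", b
                else
                    printf "%c", b
            }
        }'
}

escape_octal 'back\slash'; echo               # prints: back\134slash
escape_octal "$(printf 'co\ntrol')"; echo     # prints: co\012trol
```

Because the backslash itself is always escaped, the output is unambiguous: every '\' in it starts an escape sequence.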
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Karel Zak @ 2012-08-06 8:15 UTC
To: Pádraig Brady; +Cc: util-linux

On Sat, Aug 04, 2012 at 04:42:10PM +0100, Pádraig Brady wrote:
> There was a recent change in df in coreutils to sanitize output of paths:
>
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd

Thanks!

> The essential issue fixed there is that control chars in a path will be
> converted to '?' (this works in all locales), and doing so will mean
> '\n' for example is not output. You could even consider this a potential
> security improvement so that arbitrary users couldn't influence the
> output of these commands for all users.
>
> I suggest using the simple inplace replacement function from above.

All our new utils (based on lib/tt.c) already uses hex encoding for
ascii non-printable when export mode (e.g. findmnt -P) or blank chars
when raw mode (e.g. findmnt -r) is specified.

The default output does not escape problematic chars :-( I'll fix it
to use iscntrl() and \x?? hex (to be consistent our another outputs).

    Karel

--
 Karel Zak <kzak@redhat.com>
 http://karelzak.blogspot.com

* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Karel Zak @ 2012-08-06 11:10 UTC
To: Pádraig Brady; +Cc: util-linux

On Mon, Aug 06, 2012 at 10:15:33AM +0200, Karel Zak wrote:
> I'll fix it to use iscntrl() and \x?? hex (to be consistent our
> another outputs).

Fixed:

 - mount(8) uses '?' like coreutils for control chars (note that
   listing mode in mount(8) is in maintenance mode, use findmnt(8) if
   you want something better)

 - \x<code> is used in findmnt, lsblk, partx, ... for control and non-printable
   chars

 - in the raw and export (NAME=data) output are also replaced already existing
   \x<code> sequences (aaa\x20bbb --> aaa\x5cx20bbb).

   This is not used in the default output to keep it human readable (\x?? is
   pretty common in /dev/disk/by-*).


I have also fixed the way how lib/tt.c counts cells, it's possible
that old findmnt, lsblk, ... versions have a problem with some languages
(e.g JP) where more than one cell is necessary to print one multibyte.

    Karel

* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Pádraig Brady @ 2012-08-06 14:44 UTC
To: Karel Zak; +Cc: util-linux

On 08/06/2012 12:10 PM, Karel Zak wrote:
> On Mon, Aug 06, 2012 at 10:15:33AM +0200, Karel Zak wrote:
>> I'll fix it to use iscntrl() and \x?? hex (to be consistent our
>> another outputs).
>
> Fixed:
>
> - mount(8) uses '?' like coreutils for control chars (note that
>   listing mode in mount(8) is in maintenance mode, use findmnt(8) if
>   you want something better)
>
> - \x<code> is used in findmnt, lsblk, partx, ... for control and non-printable
>   chars
>
> - in the raw and export (NAME=data) output are also replaced already existing
>   \x<code> sequences (aaa\x20bbb --> aaa\x5cx20bbb).
>
>   This is not used in the default output to keep it human readable (\x?? is
>   pretty common in /dev/disk/by-*).
>
>
> I have also fixed the way how lib/tt.c counts cells, it's possible
> that old findmnt, lsblk, ... versions have a problem with some languages
> (e.g JP) where more than one cell is necessary to print one multibyte.

Cool.

I did a quick test...

$ echo $LANG
en_US.utf8
$ mkdir tst && cd tst
$ truncate -s10M img
$ mkfs.ext2 -F img
$ mkdir ascii "$(printf 'co\ntrol')" 'back\slash' 'es\x63aped' "$(printf 'nonútf8' | iconv -t iso-8859-15)" '日一二三四五六'
$ for mnt in ascii "$(printf 'co\ntrol')" 'back\slash' 'es\x63aped' "$(printf 'nonútf8' | iconv -t iso-8859-15)" '日一二三四五六'; do
>   sudo mount img "$mnt"
>   ~/git/util-linux/findmnt -l /dev/loop1
>   ~/git/util-linux/findmnt -rn /dev/loop1 | cut -d' ' -f1
>   sleep 1
>   sudo umount /dev/loop1
> done
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/ascii
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/co\x0atrol
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/back\slash
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/es\x5cx63aped
TARGET SOURCE FSTYPE OPTIONS

 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/non\xfffffffatf8
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/\xffffffe6\xffffff97...

So two questions.

1. Should the back\slash case be back\x5cslash in both cases?

2. The nonútf8 one produces an errant new line.
Also in this case could you fall back to using \x escapes for the whole string?

cheers,
Pádraig.

* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Karel Zak @ 2012-08-07 8:09 UTC
To: Pádraig Brady; +Cc: util-linux

On Mon, Aug 06, 2012 at 03:44:21PM +0100, Pádraig Brady wrote:
> I did a quick test...

Thanks, I'll use it in regression tests ;-) (I was busy yesterday to
write any reg.tests.)

> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/ascii
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/co\x0atrol
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/back\slash
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/es\x5cx63aped
> TARGET SOURCE FSTYPE OPTIONS
>
>  /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/non\xfffffffatf8
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/\xffffffe6\xffffff97...
>
> So two questions.
>
> 1. Should the back\slash case be back\x5cslash in both cases?

back\slash is not \x<xdigit> sequence, so escape is unnecessary

Note that \\server\path is pretty common for cifs and use \x5c
for all '\' will make the findmnt output unreadable in many cases.
IMHO is better to be "smart" and use escape sequences only when it's
really necessary.

> 2. The nonútf8 one produces an errant new line.
> Also in this case could you fall back to using \x escapes for the whole string?

Yeah, nonútf8 output seems strange, I'll fix it.


I'll also update findmnt (and others) man pages to explain when and
how we use \x escapes. Is there any elegant way how to convert \x
sequences back to the native strings in shell? Maybe we can add some
hint to the man pages too.

    Karel

--
 Karel Zak <kzak@redhat.com>
 http://karelzak.blogspot.com

* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Dave Reisner @ 2012-08-07 11:35 UTC
To: Karel Zak; +Cc: Pádraig Brady, util-linux

On Tue, Aug 07, 2012 at 10:09:39AM +0200, Karel Zak wrote:
> On Mon, Aug 06, 2012 at 03:44:21PM +0100, Pádraig Brady wrote:
> > I did a quick test...
>
> Thanks, I'll use it in regression tests ;-) (I was busy yesterday to
> write any reg.tests.)
>
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/ascii
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/co\x0atrol
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/back\slash
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/es\x5cx63aped
> > TARGET SOURCE FSTYPE OPTIONS
> >
> >  /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/non\xfffffffatf8
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/\xffffffe6\xffffff97...
> >
> > So two questions.
> >
> > 1. Should the back\slash case be back\x5cslash in both cases?
>
> back\slash is not \x<xdigit> sequence, so escape is unnecessary
>
> Note that \\server\path is pretty common for cifs and use \x5c
> for all '\' will make the findmnt output unreadable in many cases.
> IMHO is better to be "smart" and use escape sequences only when it's
> really necessary.
>
> > 2. The nonútf8 one produces an errant new line.
> > Also in this case could you fall back to using \x escapes for the whole string?
>
> Yeah, nonútf8 output seems strange, I'll fix it.
>
>
> I'll also update findmnt (and others) man pages to explain when and
> how we use \x escapes. Is there any elegant way how to convert \x
> sequences back to the native strings in shell? Maybe we can add some
> hint to the man pages too.
>

You can do something as simple as:

  printf '%b' "$mangled_content"

And it works even in POSIX shell. However, the %b formatter unescapes
simple control character sequences as well, so "foo\bar" will end up as
"foar" because the '\b' is consumed as a backspace. I ended up writing
my own unmangle routine to properly, and only, handle octal and hex
escapes:

https://github.com/falconindy/arch-install-scripts/blob/master/common#L70

Dave

> Karel
>
> --
> Karel Zak <kzak@redhat.com>
> http://karelzak.blogspot.com

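A decoder that touches only \xHH sequences, and so avoids the `foo\bar` problem described in this message, might look like the sketch below. It is not the routine behind the GitHub link (that one is not reproduced here, and it also handles octal); `unescape_hex` is a hypothetical name. It uses awk rather than `printf '%b'`, partly because POSIX specifies only octal `\0ddd` escapes for `%b`, so `\x` handling there depends on the shell.

```shell
# Hypothetical helper: decode \xHH sequences in $1 and leave every other
# backslash sequence (such as \b) untouched. Expects single-line input,
# which is what escaped output looks like.
unescape_hex() {
    printf '%s' "$1" | LC_ALL=C awk '
        BEGIN { hex = "0123456789abcdef" }
        {
            out = ""
            s = $0
            # Consume the input left to right so decoded bytes are never rescanned.
            while (match(s, /\\x[0-9a-fA-F][0-9a-fA-F]/)) {
                code = tolower(substr(s, RSTART + 2, 2))
                n = (index(hex, substr(code, 1, 1)) - 1) * 16 \
                    + index(hex, substr(code, 2, 1)) - 1
                out = out substr(s, 1, RSTART - 1) sprintf("%c", n)
                s = substr(s, RSTART + RLENGTH)
            }
            printf "%s", out s
        }'
}

unescape_hex 'es\x5cx63aped'; echo    # prints: es\x63aped
unescape_hex 'foo\bar'; echo          # prints: foo\bar
```

The single left-to-right pass matters: the backslash produced by `\x5c` is not combined with the following `x63` into a second escape.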
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Pádraig Brady @ 2012-08-07 23:36 UTC
To: Karel Zak; +Cc: util-linux

On 08/07/2012 09:09 AM, Karel Zak wrote:
> On Mon, Aug 06, 2012 at 03:44:21PM +0100, Pádraig Brady wrote:
>> I did a quick test...
>
> Thanks, I'll use it in regression tests ;-) (I was busy yesterday to
> write any reg.tests.)
>
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/ascii
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/co\x0atrol
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/back\slash
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/es\x5cx63aped
>> TARGET SOURCE FSTYPE OPTIONS
>>
>>  /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/non\xfffffffatf8
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/\xffffffe6\xffffff97...
>>
>> So two questions.
>>
>> 1. Should the back\slash case be back\x5cslash in both cases?
>
> back\slash is not \x<xdigit> sequence, so escape is unnecessary
>
> Note that \\server\path is pretty common for cifs and use \x5c
> for all '\' will make the findmnt output unreadable in many cases.
> IMHO is better to be "smart" and use escape sequences only when it's
> really necessary.

Better for humans, but awkward for scripts to parse.
What I was thinking was perhaps --raw or -P would
do unconditional escaping of '\' so unescaping can be
done with just `printf %b`?

With the conditional escaping you'd have to do something like:

unmangle() {
  printf '%b' $(
    sed '
      s/\\\([^x]\)/\\x5c\1/g;
      s/\\\(x[^0-9a-f]\)/\\x5c\1/g;
      s/\\\(x[0-9a-f][^0-9a-f]\)/\\x5c\1/g;
    '
  )
}

>
>> 2. The nonútf8 one produces an errant new line.
>> Also in this case could you fall back to using \x escapes for the whole string?
>
> Yeah, nonútf8 output seems strange, I'll fix it.
>
>
> I'll also update findmnt (and others) man pages to explain when and
> how we use \x escapes. Is there any elegant way how to convert \x
> sequences back to the native strings in shell? Maybe we can add some
> hint to the man pages too.

cheers,
Pádraig.

* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Karel Zak @ 2012-08-13 12:39 UTC
To: Pádraig Brady; +Cc: util-linux

On Wed, Aug 08, 2012 at 12:36:05AM +0100, Pádraig Brady wrote:
> > back\slash is not \x<xdigit> sequence, so escape is unnecessary
> >
> > Note that \\server\path is pretty common for cifs and use \x5c
> > for all '\' will make the findmnt output unreadable in many cases.
> > IMHO is better to be "smart" and use escape sequences only when it's
> > really necessary.
>
> Better for humans, but awkward for scripts to parse.
> What I was thinking was perhaps --raw or -P would
> do unconditional escaping of '\' so unescaping can be
> done with just `printf %b`?

Good point. Fixed, all '\' will be replaced with \x5c.

> With the conditional escaping you'd have to do something like:
>
> unmangle() {
>   printf '%b' $(
>     sed '
>       s/\\\([^x]\)/\\x5c\1/g;
>       s/\\\(x[^0-9a-f]\)/\\x5c\1/g;
>       s/\\\(x[0-9a-f][^0-9a-f]\)/\\x5c\1/g;
>     '
>   )
> }

ugly, I see, let's use printf '%b' without the sed stuff. Thanks.

    Karel

--
 Karel Zak <kzak@redhat.com>
 http://karelzak.blogspot.com

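The unconditional scheme agreed on here can be sketched in shell as follows. This only illustrates the idea and is not the lib/tt.c code; `escape_raw` is a hypothetical name, and the exact byte set (controls, DEL, space and backslash) is an assumption based on the earlier remarks about control chars and about blanks in raw mode.

```shell
# Hypothetical helper: print $1 with control bytes, space and every '\'
# written as \xHH, roughly in the spirit of the raw output discussed above.
escape_raw() {
    printf '%s' "$1" | LC_ALL=C od -An -v -tu1 | LC_ALL=C awk '
        {
            for (i = 1; i <= NF; i++) {
                b = $i + 0
                # Assumed set: 0x00-0x20 (controls and space), 0x7f, and 0x5c.
                if (b < 33 || b == 127 || b == 92)
                    printf "\\x%02x", b
                else
                    printf "%c", b
            }
        }'
}

escape_raw 'es\x63aped'; echo                 # prints: es\x5cx63aped
escape_raw 'back\slash'; echo                 # prints: back\x5cslash
escape_raw "$(printf 'co\ntrol')"; echo       # prints: co\x0atrol
```

Since no bare backslash survives, a decoder never has to guess whether `\x63` came from the original name or from the escaper, which is what makes a plain `printf '%b'` sufficient in shells whose `%b` understands `\x`.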
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
From: Pádraig Brady @ 2012-08-20 0:31 UTC
To: Karel Zak; +Cc: util-linux

On 08/13/2012 01:39 PM, Karel Zak wrote:
> On Wed, Aug 08, 2012 at 12:36:05AM +0100, Pádraig Brady wrote:
>>> back\slash is not \x<xdigit> sequence, so escape is unnecessary
>>>
>>> Note that \\server\path is pretty common for cifs and use \x5c
>>> for all '\' will make the findmnt output unreadable in many cases.
>>> IMHO is better to be "smart" and use escape sequences only when it's
>>> really necessary.
>>
>> Better for humans, but awkward for scripts to parse.
>> What I was thinking was perhaps --raw or -P would
>> do unconditional escaping of '\' so unescaping can be
>> done with just `printf %b`?
>
> Good point. Fixed, all '\' will be replaced with \x5c.

Excellent.

I've tested using the previous test case,
and output is as expected (including the invalid UTF8 case).

cheers,
Pádraig.