* suggestion to avoid erroneous lines in findmnt/lslocks/...
@ 2012-08-04 15:42 Pádraig Brady
2012-08-04 15:57 ` Dave Reisner
2012-08-06 8:15 ` Karel Zak
0 siblings, 2 replies; 11+ messages in thread
From: Pádraig Brady @ 2012-08-04 15:42 UTC (permalink / raw)
To: util-linux
There was a recent change in df in coreutils to sanitize output of paths:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd
The essential issue fixed there is that control chars in a path will be
converted to '?' (this works in all locales), and doing so will mean
'\n' for example is not output. You could even consider this a potential
security improvement so that arbitrary users couldn't influence the
output of these commands for all users.
I suggest using the simple in-place replacement function from that commit.
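
A minimal shell sketch of this kind of replacement, assuming "control chars" means the bytes 0x00-0x1F plus 0x7F. The name `sanitize_ctrl` is made up for illustration, and this is not the coreutils code from the commit linked above.

```shell
# Hypothetical helper: replace each control byte in $1 with '?'.
sanitize_ctrl() {
    # LC_ALL=C makes tr operate on single bytes, so the bytes of
    # multibyte characters (all >= 0x80) pass through untouched.
    printf '%s' "$1" | LC_ALL=C tr '\000-\037\177' '[?*]'
}

sanitize_ctrl "$(printf 'co\ntrol')"; echo    # prints co?trol
```

The mapping is lossy by design: once a newline has become '?', a consumer cannot tell it apart from a literal '?' in the name.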
cheers,
Pádraig.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-04 15:42 suggestion to avoid erroneous lines in findmnt/lslocks/ Pádraig Brady
@ 2012-08-04 15:57 ` Dave Reisner
2012-08-05 2:02 ` Pádraig Brady
2012-08-06 8:15 ` Karel Zak
1 sibling, 1 reply; 11+ messages in thread
From: Dave Reisner @ 2012-08-04 15:57 UTC (permalink / raw)
To: Pádraig Brady; +Cc: util-linux
On Sat, Aug 04, 2012 at 04:42:10PM +0100, Pádraig Brady wrote:
> There was a recent change in df in coreutils to sanitize output of paths:
>
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd
>
> The essential issue fixed there is that control chars in a path will be
> converted to '?' (this works in all locales), and doing so will mean
> '\n' for example is not output. You could even consider this a potential
> security improvement so that arbitrary users couldn't influence the
> output of these commands for all users.
>
> I suggest using the simple inplace replacement function from above.
Why replace with a bogus character when you could instead use an octal
or hex escape? Wouldn't this still address the underlying problem?
Munging the content of a string could break a script consuming the
output with no way for the script to recover.
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-04 15:57 ` Dave Reisner
@ 2012-08-05 2:02 ` Pádraig Brady
0 siblings, 0 replies; 11+ messages in thread
From: Pádraig Brady @ 2012-08-05 2:02 UTC (permalink / raw)
To: util-linux
On 08/04/2012 04:57 PM, Dave Reisner wrote:
> On Sat, Aug 04, 2012 at 04:42:10PM +0100, Pádraig Brady wrote:
>> There was a recent change in df in coreutils to sanitize output of paths:
>>
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd
>>
>> The essential issue fixed there is that control chars in a path will be
>> converted to '?' (this works in all locales), and doing so will mean
>> '\n' for example is not output. You could even consider this a potential
>> security improvement so that arbitrary users couldn't influence the
>> output of these commands for all users.
>>
>> I suggest using the simple inplace replacement function from above.
>
> Why replace with a bogus character when you could instead use an octal
> or hex escape? Wouldn't this still address the underlying problem?
> Munging the content of a string could break a script consuming the
> output with no way for the script to recover.
Yes true.
I suppose you could use octal escapes, so 0x00 -> 0x1F are
mapped to \000 -> \037 and '\' is mapped to \134
That's more invasive though.
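
The octal mapping described here could be sketched in shell roughly as follows. The function name `octal_escape` is hypothetical, and the sketch assumes only the bytes 0x00-0x1F and '\' are escaped, as stated above; every other byte is passed through unchanged.

```shell
# Hypothetical helper: escape control bytes (0x00-0x1F) and '\' in $1
# as three-digit octal escapes, leaving all other bytes as they are.
octal_escape() {
    # od prints every byte as a three-digit octal number; tr puts one per line.
    printf '%s' "$1" | LC_ALL=C od -A n -v -t o1 | tr -s ' ' '\n' |
    while read -r o; do
        [ -n "$o" ] || continue
        case $o in
            0[0-3][0-7]|134) printf '\\%s' "$o" ;;    # 000-037 or backslash
            *)               printf '%b' "\\0$o" ;;   # re-emit the byte itself
        esac
    done
}

octal_escape "$(printf 'a\tb\\c')"; echo    # prints a\011b\134c
```

Because '\' itself is escaped, every backslash in the output starts an escape, which is what makes the output unambiguous to decode.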
For df for now at least it was thought that
requiring unambiguous output for these names was overkill.
The names were adjusted just so as to avoid processing issues
for other sanely named items.
Also related is the issue of non-printable characters.
For example, if you view the Unicode line separator char
(\u2028) in certain places (a Pango-based editor such as gedit, for example),
it will appear as a normal newline.
It might be appropriate to replace or escape all non-printable chars.
That's complicated though (mbsalign in util-linux already does
this to some extent).
So you could have complex mappings, but I was
thinking at least for these utils a simple method
is appropriate to avoid the immediate issue.
cheers,
Pádraig.
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-04 15:42 suggestion to avoid erroneous lines in findmnt/lslocks/ Pádraig Brady
2012-08-04 15:57 ` Dave Reisner
@ 2012-08-06 8:15 ` Karel Zak
2012-08-06 11:10 ` Karel Zak
1 sibling, 1 reply; 11+ messages in thread
From: Karel Zak @ 2012-08-06 8:15 UTC (permalink / raw)
To: Pádraig Brady; +Cc: util-linux
On Sat, Aug 04, 2012 at 04:42:10PM +0100, Pádraig Brady wrote:
> There was a recent change in df in coreutils to sanitize output of paths:
>
> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=3ed70fd
Thanks!
> The essential issue fixed there is that control chars in a path will be
> converted to '?' (this works in all locales), and doing so will mean
> '\n' for example is not output. You could even consider this a potential
> security improvement so that arbitrary users couldn't influence the
> output of these commands for all users.
>
> I suggest using the simple inplace replacement function from above.
All our new utils (based on lib/tt.c) already use hex encoding for
non-printable ASCII chars when export mode (e.g. findmnt -P) is specified,
or for blank chars when raw mode (e.g. findmnt -r) is specified.
The default output does not escape problematic chars :-(
I'll fix it to use iscntrl() and \x?? hex (to be consistent with our
other outputs).
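
As a rough shell illustration of \x?? escaping for control characters (not the lib/tt.c code, which is C): the sketch below assumes the C-locale meaning of iscntrl(), i.e. bytes 0x00-0x1F and 0x7F, and leaves all other bytes alone. The function name `hex_escape` is hypothetical.

```shell
# Hypothetical helper: write control bytes in $1 as \xHH, copy the rest.
hex_escape() {
    printf '%s' "$1" | LC_ALL=C od -A n -v -t o1 | tr -s ' ' '\n' |
    while read -r o; do
        [ -n "$o" ] || continue
        case $o in
            # 000-037 and 177 are the C-locale control bytes; the leading
            # "0" makes printf read the number as octal.
            0[0-3][0-7]|177) printf '\\x%02x' "0$o" ;;
            *)               printf '%b' "\\0$o" ;;
        esac
    done
}

hex_escape "$(printf 'co\ntrol')"; echo    # prints co\x0atrol
```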
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-06 8:15 ` Karel Zak
@ 2012-08-06 11:10 ` Karel Zak
2012-08-06 14:44 ` Pádraig Brady
0 siblings, 1 reply; 11+ messages in thread
From: Karel Zak @ 2012-08-06 11:10 UTC (permalink / raw)
To: Pádraig Brady; +Cc: util-linux
On Mon, Aug 06, 2012 at 10:15:33AM +0200, Karel Zak wrote:
> I'll fix it to use iscntrl() and \x?? hex (to be consistent our
> another outputs).
Fixed:
- mount(8) uses '?' like coreutils for control chars (note that
the listing mode in mount(8) is in maintenance mode; use findmnt(8) if
you want something better)
- \x<code> is used in findmnt, lsblk, partx, ... for control and non-printable
chars
- in the raw and export (NAME=data) output, already existing \x<code>
sequences are also escaped (aaa\x20bbb --> aaa\x5cx20bbb).
This is not done in the default output, to keep it human readable (\x?? is
pretty common in /dev/disk/by-*).
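
The third rule (escaping only a '\' that already starts a \x<code> sequence) can be approximated with a single sed substitution. This is a sketch under the assumption that <code> is exactly two hex digits; the name `escape_existing` is invented here.

```shell
# Hypothetical helper: turn a literal "\xHH" in $1 into "\x5cxHH",
# leaving any other backslash untouched.
escape_existing() {
    printf '%s\n' "$1" | sed 's/\\\(x[0-9a-fA-F][0-9a-fA-F]\)/\\x5c\1/g'
}

escape_existing 'aaa\x20bbb'    # prints aaa\x5cx20bbb
escape_existing 'back\slash'    # prints back\slash
```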
I have also fixed the way lib/tt.c counts cells; it's possible
that old findmnt, lsblk, ... versions have a problem with some languages
(e.g. Japanese) where more than one cell is necessary to print one multibyte
character.
Karel
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-06 11:10 ` Karel Zak
@ 2012-08-06 14:44 ` Pádraig Brady
2012-08-07 8:09 ` Karel Zak
0 siblings, 1 reply; 11+ messages in thread
From: Pádraig Brady @ 2012-08-06 14:44 UTC (permalink / raw)
To: Karel Zak; +Cc: util-linux
On 08/06/2012 12:10 PM, Karel Zak wrote:
> On Mon, Aug 06, 2012 at 10:15:33AM +0200, Karel Zak wrote:
>> I'll fix it to use iscntrl() and \x?? hex (to be consistent our
>> another outputs).
>
> Fixed:
>
> - mount(8) uses '?' like coreutils for control chars (note that
> listing mode in mount(8) is in maintenance mode, use findmnt(8) if
> you want something better)
>
> - \x<code> is used in findmnt, lsblk, partx, ... for control and non-printable
> chars
>
> - in the raw and export (NAME=data) output are also replaced already existing
> \x<code> sequences (aaa\x20bbb --> aaa\x5cx20bbb).
>
> This is not used in the default output to keep it human readable (\x?? is
> pretty common in /dev/disk/by-*).
>
>
> I have also fixed the way how lib/tt.c counts cells, it's possible
> that old findmnt, lsblk, ... versions have a problem with some languages
> (e.g JP) where more than one cell is necessary to print one multibyte.
Cool.
I did a quick test...
$ echo $LANG
en_US.utf8
$ mkdir tst && cd tst
$ truncate -s10M img
$ mkfs.ext2 -F img
$ mkdir ascii "$(printf 'co\ntrol')" 'back\slash' 'es\x63aped' "$(printf 'nonútf8' | iconv -t iso-8859-15)" '日一二三四五六'
$ for mnt in ascii "$(printf 'co\ntrol')" 'back\slash' 'es\x63aped' "$(printf 'nonútf8' | iconv -t iso-8859-15)" '日一二三四五六'; do
> sudo mount img "$mnt"
> ~/git/util-linux/findmnt -l /dev/loop1
> ~/git/util-linux/findmnt -rn /dev/loop1 | cut -d' ' -f1
> sleep 1
> sudo umount /dev/loop1
> done
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/ascii
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/co\x0atrol
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/back\slash
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/es\x5cx63aped
TARGET SOURCE FSTYPE OPTIONS
/dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/non\xfffffffatf8
TARGET SOURCE FSTYPE OPTIONS
/home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
/home/padraig/tst/\xffffffe6\xffffff97...
So two questions.
1. Should the back\slash case be back\x5cslash in both cases?
2. The nonútf8 one produces an errant new line.
Also in this case could you fall back to using \x escapes for the whole string?
cheers,
Pádraig.
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-06 14:44 ` Pádraig Brady
@ 2012-08-07 8:09 ` Karel Zak
2012-08-07 11:35 ` Dave Reisner
2012-08-07 23:36 ` Pádraig Brady
0 siblings, 2 replies; 11+ messages in thread
From: Karel Zak @ 2012-08-07 8:09 UTC (permalink / raw)
To: Pádraig Brady; +Cc: util-linux
On Mon, Aug 06, 2012 at 03:44:21PM +0100, Pádraig Brady wrote:
> I did a quick test...
Thanks, I'll use it in regression tests ;-) (I was too busy yesterday to
write any reg. tests.)
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/ascii
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/co\x0atrol
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/back\slash
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/es\x5cx63aped
> TARGET SOURCE FSTYPE OPTIONS
>
> /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/non\xfffffffatf8
> TARGET SOURCE FSTYPE OPTIONS
> /home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> /home/padraig/tst/\xffffffe6\xffffff97...
>
> So two questions.
>
> 1. Should the back\slash case be back\x5cslash in both cases?
back\slash is not a \x<xdigit> sequence, so escaping is unnecessary.
Note that \\server\path is pretty common for cifs, and using \x5c
for all '\' would make the findmnt output unreadable in many cases.
IMHO it is better to be "smart" and use escape sequences only when it's
really necessary.
> 2. The nonútf8 one produces an errant new line.
> Also in this case could you fall back to using \x escapes for the whole string?
Yeah, the nonútf8 output seems strange, I'll fix it.
I'll also update the findmnt (and other) man pages to explain when and
how we use \x escapes. Is there any elegant way to convert \x
sequences back to the native strings in shell? Maybe we can add a
hint to the man pages too.
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-07 8:09 ` Karel Zak
@ 2012-08-07 11:35 ` Dave Reisner
2012-08-07 23:36 ` Pádraig Brady
1 sibling, 0 replies; 11+ messages in thread
From: Dave Reisner @ 2012-08-07 11:35 UTC (permalink / raw)
To: Karel Zak; +Cc: Pádraig Brady, util-linux
On Tue, Aug 07, 2012 at 10:09:39AM +0200, Karel Zak wrote:
> On Mon, Aug 06, 2012 at 03:44:21PM +0100, Pádraig Brady wrote:
> > I did a quick test...
>
> Thanks, I'll use it in regression tests ;-) (I was busy yesterday to
> write any reg.tests.)
>
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/ascii
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/co\x0atrol
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/back\slash
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/es\x5cx63aped
> > TARGET SOURCE FSTYPE OPTIONS
> >
> > /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/non\xfffffffatf8
> > TARGET SOURCE FSTYPE OPTIONS
> > /home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
> > /home/padraig/tst/\xffffffe6\xffffff97...
> >
> > So two questions.
> >
> > 1. Should the back\slash case be back\x5cslash in both cases?
>
> back\slash is not \x<xdigit> sequence, so escape is unnecessary
>
> Note that \\server\path is pretty common for cifs and use \x5c
> for all '\' will make the findmnt output unreadable in many cases.
> IMHO is better to be "smart" and use escape sequences only when it's
> really necessary.
>
> > 2. The nonútf8 one produces an errant new line.
> > Also in this case could you fall back to using \x escapes for the whole string?
>
> Yeah, nonútf8 output seems strange, I'll fix it.
>
>
> I'll also update findmnt (and others) man pages to explain when and
> how we use \x escapes. Is there any elegant way how to convert \x
> sequences back to the native strings in shell? Maybe we can add some
> hint to the man pages too.
>
You can do something as simple as:
printf '%b' "$mangled_content"
And it works even in POSIX shell. However, the %b formatter unescapes
simple control character sequences as well, so "foo\bar" will end up as
"foar" because the '\b' is consumed as a backspace. I ended up writing
my own unmangle routine to properly, and only, handle octal and hex
escapes:
https://github.com/falconindy/arch-install-scripts/blob/master/common#L70
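
For illustration, a decoder that touches only \xHH sequences might look like the sketch below. It is not the routine from the link above (which also handles octal); the name `unmangle_hex` and its variables are hypothetical, and only two-digit hex escapes are assumed. It relies on POSIX features only, since \x support in `printf '%b'` is an extension that not every shell's printf provides.

```shell
# Hypothetical helper: decode \xHH sequences in $1 and nothing else.
# Uses the global variables s, pre, rest and hh (POSIX sh has no "local").
unmangle_hex() {
    s=$1
    while :; do
        case $s in
            *'\x'[0-9a-fA-F][0-9a-fA-F]*)
                pre=${s%%'\x'[0-9a-fA-F][0-9a-fA-F]*}   # text before first escape
                rest=${s#"$pre"}                        # now starts with \xHH
                hh=${rest#??}; hh=${hh%"${hh#??}"}      # the two hex digits
                printf '%s' "$pre"
                # Convert hex to octal, then emit the byte via POSIX \0ddd.
                printf '%b' "\\0$(printf '%o' "0x$hh")"
                s=${rest#????}                          # continue after \xHH
                ;;
            *)  printf '%s' "$s"; return ;;
        esac
    done
}

unmangle_hex 'foo\bar'; echo          # prints foo\bar (no backspace)
unmangle_hex 'aaa\x5cx20bbb'; echo    # prints aaa\x20bbb
```

Decoded bytes are written out immediately and never rescanned, so an escaped backslash followed by "x20" stays literal instead of being decoded a second time.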
Dave
> Karel
>
> --
> Karel Zak <kzak@redhat.com>
> http://karelzak.blogspot.com
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-07 8:09 ` Karel Zak
2012-08-07 11:35 ` Dave Reisner
@ 2012-08-07 23:36 ` Pádraig Brady
2012-08-13 12:39 ` Karel Zak
1 sibling, 1 reply; 11+ messages in thread
From: Pádraig Brady @ 2012-08-07 23:36 UTC (permalink / raw)
To: Karel Zak; +Cc: util-linux
On 08/07/2012 09:09 AM, Karel Zak wrote:
> On Mon, Aug 06, 2012 at 03:44:21PM +0100, Pádraig Brady wrote:
>> I did a quick test...
>
> Thanks, I'll use it in regression tests ;-) (I was busy yesterday to
> write any reg.tests.)
>
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/ascii /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/ascii
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/co\x0atrol /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/co\x0atrol
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/back\slash /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/back\slash
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/es\x63aped /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/es\x5cx63aped
>> TARGET SOURCE FSTYPE OPTIONS
>>
>> /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/non\xfffffffatf8
>> TARGET SOURCE FSTYPE OPTIONS
>> /home/padraig/tst/日一二三四五六 /dev/loop1 ext2 rw,relatime,seclabel,errors=continue
>> /home/padraig/tst/\xffffffe6\xffffff97...
>>
>> So two questions.
>>
>> 1. Should the back\slash case be back\x5cslash in both cases?
>
> back\slash is not \x<xdigit> sequence, so escape is unnecessary
>
> Note that \\server\path is pretty common for cifs and use \x5c
> for all '\' will make the findmnt output unreadable in many cases.
> IMHO is better to be "smart" and use escape sequences only when it's
> really necessary.
Better for humans, but awkward for scripts to parse.
What I was thinking was perhaps --raw or -P would
do unconditional escaping of '\' so unescaping can be
done with just `printf %b`?
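With the conditional escaping you'd have to do something like:
unmangle() {
  # Reads mangled text on stdin: rewrite a '\' that is not followed by
  # x and two hex digits as \x5c, then let printf %b decode the result.
  printf '%b' "$(
    sed '
      s/\\\([^x]\)/\\x5c\1/g;
      s/\\\(x[^0-9a-f]\)/\\x5c\1/g;
      s/\\\(x[0-9a-f][^0-9a-f]\)/\\x5c\1/g;
    '
  )"
}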
>
>> 2. The nonútf8 one produces an errant new line.
>> Also in this case could you fall back to using \x escapes for the whole string?
>
> Yeah, nonútf8 output seems strange, I'll fix it.
>
>
> I'll also update findmnt (and others) man pages to explain when and
> how we use \x escapes. Is there any elegant way how to convert \x
> sequences back to the native strings in shell? Maybe we can add some
> hint to the man pages too.
cheers,
Pádraig.
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-07 23:36 ` Pádraig Brady
@ 2012-08-13 12:39 ` Karel Zak
2012-08-20 0:31 ` Pádraig Brady
0 siblings, 1 reply; 11+ messages in thread
From: Karel Zak @ 2012-08-13 12:39 UTC (permalink / raw)
To: Pádraig Brady; +Cc: util-linux
On Wed, Aug 08, 2012 at 12:36:05AM +0100, Pádraig Brady wrote:
> > back\slash is not \x<xdigit> sequence, so escape is unnecessary
> >
> > Note that \\server\path is pretty common for cifs and use \x5c
> > for all '\' will make the findmnt output unreadable in many cases.
> > IMHO is better to be "smart" and use escape sequences only when it's
> > really necessary.
>
> Better for humans, but awkward for scripts to parse.
> What I was thinking was perhaps --raw or -P would
> do unconditional escaping of '\' so unescaping can be
> done with just `printf %b`?
Good point. Fixed, all '\' will be replaced with \x5c.
> With the conditional escaping you'd have to do something like:
>
> unmangle() {
> printf '%b' $(
> sed '
> s/\\\([^x]\)/\\x5c\1/g;
> s/\\\(x[^0-9a-f]\)/\\x5c\1/g;
> s/\\\(x[0-9a-f][^0-9a-f]\)/\\x5c\1/g;
> '
> )
> }
ugly, I see, let's use printf '%b' without the sed stuff. Thanks.
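
A sketch of the round trip this enables, under two assumptions: control characters have already been written as \xHH by the tool, and the printf in use accepts \xHH inside %b (bash does; POSIX only requires octal \0ddd there). Both function names are hypothetical.

```shell
# Hypothetical helper: escape every backslash in $1 as \x5c.
escape_backslashes() {
    printf '%s\n' "$1" | sed 's/\\/\\x5c/g'
}

# Decode with printf %b alone. Assumes %b understands \xHH, which is
# an extension rather than POSIX behaviour.
unescape_field() {
    printf '%b' "$1"
}

escaped=$(escape_backslashes 'foo\bar')
printf '%s\n' "$escaped"          # prints foo\x5cbar
unescape_field "$escaped"; echo
```

With every '\' in the output guaranteed to start a \x escape, %b can no longer misread a sequence such as "\b" as a backspace.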
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: suggestion to avoid erroneous lines in findmnt/lslocks/...
2012-08-13 12:39 ` Karel Zak
@ 2012-08-20 0:31 ` Pádraig Brady
0 siblings, 0 replies; 11+ messages in thread
From: Pádraig Brady @ 2012-08-20 0:31 UTC (permalink / raw)
To: Karel Zak; +Cc: util-linux
On 08/13/2012 01:39 PM, Karel Zak wrote:
> On Wed, Aug 08, 2012 at 12:36:05AM +0100, Pádraig Brady wrote:
>>> back\slash is not \x<xdigit> sequence, so escape is unnecessary
>>>
>>> Note that \\server\path is pretty common for cifs and use \x5c
>>> for all '\' will make the findmnt output unreadable in many cases.
>>> IMHO is better to be "smart" and use escape sequences only when it's
>>> really necessary.
>>
>> Better for humans, but awkward for scripts to parse.
>> What I was thinking was perhaps --raw or -P would
>> do unconditional escaping of '\' so unescaping can be
>> done with just `printf %b`?
>
> Good point. Fixed, all '\' will be replaced with \x5c.
Excellent.
I've tested using the previous test case,
and output is as expected (including the
invalid UTF8 case).
cheers,
Pádraig.