* msdos filesystem ignores codepage argument?
From: Phillip Susi @ 2014-12-26 22:50 UTC
To: hirofumi; +Cc: linux-fsdevel
I'm investigating a bug report about the msdos filesystem and the fact
that it appears to ignore the codepage argument. The user wants to
use msdos instead of vfat to avoid adding long filenames to the fs and
to store the short filenames using code page 850. I verified that the
on-disk directory entry contains 0x8E, which is an umlauted A in
cp850, but when mounting with mount -t msdos, no matter what codepage
I specify in -o, ls outputs the raw 0x8E rather than using the
specified codepage to translate the on-disk string to utf8. The msdos
filesystem refuses to accept the utf8 or iocharset options, so how do
you get it to do correct translation using the specified codepage?
Also, in the process I noticed some odd behavior of ls. If I set my
terminal to use cp850 and run ls | cat, I see the umlauted A, but
without piping the output through cat, it comes out as a question
mark. Why is that?
* Re: msdos filesystem ignores codepage argument?
From: OGAWA Hirofumi @ 2014-12-28 17:00 UTC
To: Phillip Susi; +Cc: linux-fsdevel
Phillip Susi <psusi@ubuntu.com> writes:
> I'm investigating a bug report about the msdos filesystem and the fact
> that it appears to ignore the codepage argument. The user wants to
> use msdos instead of vfat to avoid adding long filenames to the fs and
> to store the short filenames using code page 850. I verified that the
> on-disk directory entry contains 0x8E, which is an umlauted A in
> cp850, but when mounting with mount -t msdos, no matter what codepage
> I specify in -o, ls outputs the raw 0x8E rather than using the
> specified codepage to translate the on-disk string to utf8. The msdos
> filesystem refuses to accept the utf8 or iocharset options, so how do
> you get it to do correct translation using the specified codepage?
The codepage option specifies which codepage is used as the on-disk
encoding in FAT, not which encoding the names are converted to for
display.
The msdos driver has no feature for encoding conversion between on-disk
and user (the codepage is basically used only for upper/lower case
conversion). IOW, msdos assumes the user and on-disk encodings are the
same.
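Since the driver passes the on-disk bytes through unchanged, any conversion to UTF-8 would have to happen in userspace. The following is only a minimal sketch of that idea, assuming the names really are cp850 on disk; the function name `cp850_name_to_utf8` is hypothetical and nothing like it exists in the msdos driver.

```python
def cp850_name_to_utf8(raw: bytes) -> bytes:
    """Re-encode a raw cp850 file name as UTF-8 (userspace only)."""
    # Decode the on-disk bytes with the cp850 codec, then encode as UTF-8.
    return raw.decode("cp850").encode("utf-8")


# 0x8E is an A with umlaut in cp850, as in the report above.
raw_name = b"\x8e.TXT"
print(cp850_name_to_utf8(raw_name))
```

A tool built this way would still have to pass the raw bytes back to the kernel when opening the file, because the msdos mount only knows the unconverted name.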
> Also in the process I noticed some odd behavior of ls. If I set my
> terminal to use cp850 and ls | cat, I see the umlauted A, but without
> piping the output through cat, it comes out as a question mark. Why
> is that?
That is what "ls" does. The following options will probably show the
raw string:
$ ls -N --show-control-chars
Thanks.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
* Re: msdos filesystem ignores codepage argument?
From: Phillip Susi @ 2014-12-28 22:13 UTC
To: OGAWA Hirofumi; +Cc: linux-fsdevel
On 12/28/2014 12:00 PM, OGAWA Hirofumi wrote:
> The codepage option specifies which codepage is used as the on-disk
> encoding in FAT, not which encoding the names are converted to for
> display.
>
> The msdos driver has no feature for encoding conversion between
> on-disk and user (the codepage is basically used only for
> upper/lower case conversion). IOW, msdos assumes the user and
> on-disk encodings are the same.
Umm... so you are saying the argument does nothing on purpose? What
is the use of specifying the on-disk code page if not so that it can
be translated to utf8?
>> Also in the process I noticed some odd behavior of ls. If I set
>> my terminal to use cp850 and ls | cat, I see the umlauted A, but
>> without piping the output through cat, it comes out as a question
>> mark. Why is that?
>
> That is what "ls" does. The following options will probably show
> the raw string:
>
> $ ls -N --show-control-chars
What exactly is it doing that causes its output to differ when sent to
a tty vs a pipe?
* Re: msdos filesystem ignores codepage argument?
From: OGAWA Hirofumi @ 2014-12-28 23:06 UTC
To: Phillip Susi; +Cc: linux-fsdevel
Phillip Susi <psusi@ubuntu.com> writes:
> On 12/28/2014 12:00 PM, OGAWA Hirofumi wrote:
>> The codepage option specifies which codepage is used as the on-disk
>> encoding in FAT, not which encoding the names are converted to for
>> display.
>>
>> The msdos driver has no feature for encoding conversion between
>> on-disk and user (the codepage is basically used only for
>> upper/lower case conversion). IOW, msdos assumes the user and
>> on-disk encodings are the same.
>
> Umm... so you are saying the argument does nothing on purpose? What
> is the use of specifying the on disk code page if not so that it can
> be translated to utf8?
As I said, the codepage option is used for upper/lower conversion.
>>> Also in the process I noticed some odd behavior of ls. If I set
>>> my terminal to use cp850 and ls | cat, I see the umlauted A, but
>>> without piping the output through cat, it comes out as a question
>>> mark. Why is that?
>>
>> That is what "ls" does. The following options will probably show
>> the raw string:
>>
>> $ ls -N --show-control-chars
>
> What exactly is it doing that causes its output to differ when sent to
> a tty vs a pipe?
See the man page for "ls".
"ls" changes its output depending on the output target (tty or not).
This is entirely about "ls", not the fat driver.
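For illustration, the tty-dependent behavior can be sketched roughly as below. This is not the actual ls code; it is a simplified model that assumes any byte outside printable ASCII counts as nongraphic (about what a C locale would do), and the name `render_name` is hypothetical.

```python
import sys


def render_name(raw: bytes, to_tty: bool) -> bytes:
    """Model of tty-dependent output: mask nongraphic bytes on a tty only."""
    if not to_tty:
        # Pipes and files get the raw bytes untouched.
        return raw
    # On a terminal, replace every byte outside printable ASCII with '?'.
    return bytes(b if 0x20 <= b <= 0x7E else ord("?") for b in raw)


name = b"\x8e.TXT"
# isatty() is False when stdout is a pipe, as in "ls | cat".
print(render_name(name, sys.stdout.isatty()))
```

In this model, piping through cat takes the raw branch, so a cp850 terminal gets the 0x8E byte and draws the umlauted A, while direct tty output is masked first.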
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
* Re: msdos filesystem ignores codepage argument?
From: Phillip Susi @ 2014-12-29 0:51 UTC
To: OGAWA Hirofumi; +Cc: linux-fsdevel
On 12/28/2014 06:06 PM, OGAWA Hirofumi wrote:
> Phillip Susi <psusi@ubuntu.com> writes:
>
>> On 12/28/2014 12:00 PM, OGAWA Hirofumi wrote:
>>> The codepage option specifies which codepage is used as the
>>> on-disk encoding in FAT, not which encoding the names are
>>> converted to for display.
>>>
>>> The msdos driver has no feature for encoding conversion between
>>> on-disk and user (the codepage is basically used only for
>>> upper/lower case conversion). IOW, msdos assumes the user and
>>> on-disk encodings are the same.
>>
>> Umm... so you are saying the argument does nothing on purpose?
>> What is the use of specifying the on-disk code page if not so
>> that it can be translated to utf8?
>
> As I said, the codepage option is used for upper/lower conversion.
But userspace tools like ls and the terminal expect all filenames to
be in utf8, not in some random codepage that varies from fs to fs, so
how is it not a bug that the names aren't translated?
* Re: msdos filesystem ignores codepage argument?
From: OGAWA Hirofumi @ 2014-12-29 1:00 UTC
To: Phillip Susi; +Cc: linux-fsdevel
Phillip Susi <psusi@ubuntu.com> writes:
>>> Umm... so you are saying the argument does nothing on purpose?
>>> What is the use of specifying the on disk code page if not so
>>> that it can be translated to utf8?
>>
>> As I said, the codepage option is used for upper/lower conversion.
>
> But userspace tools like ls and the terminal expect all filenames to
> be in utf8, not in some random codepage that varies from fs to fs, so
> how is it not a bug that the names aren't translated?
The kernel doesn't know about $LANG at all. utf8 is purely a userland
thing, and normal UNIX FSes don't care about encoding *at all*; they
just save names as-is. That is why it feels like those FSes work in
utf8.
Well, anyway, if you want to add encoding conversion to msdos, we need
a patch.
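The point that ordinary UNIX filesystems store names as-is can be checked from userspace. The sketch below assumes Linux and a temporary directory on a native filesystem such as ext4 or tmpfs that accepts arbitrary bytes in names; the helper `roundtrip_raw_name` is hypothetical.

```python
import os
import tempfile


def roundtrip_raw_name(raw: bytes) -> list:
    """Create a file with a raw byte name and list the directory as bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        # A bytes path is handed to the kernel without any re-encoding.
        with open(os.path.join(tmp_bytes, raw), "wb"):
            pass
        # A bytes argument makes listdir return the stored names as bytes.
        return os.listdir(tmp_bytes)


# 0x8E is not valid UTF-8 on its own, yet the name comes back unchanged.
print(roundtrip_raw_name(b"\x8e.TXT"))
```

Whether such a name displays as an umlauted A, a question mark, or garbage is then decided only by the terminal and the tools reading it.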
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>