From: Arnout Vandecappelle <arnout@mind.be>
To: buildroot@busybox.net
Subject: [Buildroot] User question UTF-8
Date: Tue, 15 Sep 2015 23:21:48 +0200
Message-ID: <55F88BEC.1060306@mind.be>
In-Reply-To: <CAJJ6jxvdrYbALJfdd=oKMO4xwsg9o6MfwR_YGLyvOsCgfV77Hw@mail.gmail.com>
On 15-09-15 19:11, Steve Calfee wrote:
> Hi,
>
> I am trying to port a python application to buildroot/busybox. It
> needs to read disk files from removable drives. The filenames may
> contain utf-8 chars.
>
> Currently ls from busybox prints ? for the utf-8 non-ascii chars. Both
> from console on minicom and from ssh (which should handle utf-8).
Busybox ls prints every non-ASCII character as ? unless UNICODE_SUPPORT is
enabled, and our default busybox config leaves it disabled. So run
'make busybox-menuconfig' and enable UNICODE_SUPPORT. You also need WCHAR
support in the toolchain, but since you use glibc, which always has WCHAR
enabled, that part is already covered.
>
> There seems to be lots of config knobs.
>
> I assume utf-8 chars are somehow related to locales? I enabled locales
> in the internal glibc toolchain.
>
> BR2_arm=y
> BR2_TOOLCHAIN_BUILDROOT_GLIBC=y
> BR2_TOOLCHAIN_BUILDROOT_CXX=y
> BR2_ENABLE_LOCALE_PURGE=y
> BR2_GENERATE_LOCALE="en_US.UTF-8"
> BR2_TARGET_OPTIMIZATION="-Os -pipe"
> # BR2_TARGET_GENERIC_GETTY is not set
> # BR2_TARGET_GENERIC_REMOUNT_ROOTFS_RW is not set
> BR2_PACKAGE_LIBPTHREAD_STUBS=y
> # BR2_TARGET_ROOTFS_TAR is not set
> BR2_TARGET_SHEEVAPLUG=y
>
>
> Busybox also has locale settings:
> grep LOCAL output/build/busybox-1.23.2/.config
> CONFIG_LOCALE_SUPPORT=y
> # CONFIG_UNICODE_USING_LOCALE is not set
> # CONFIG_FEATURE_UNIX_LOCAL is not set
> # CONFIG_HUSH_LOCAL is not set
>
> From googling, Linux always supports anything for filenames, since it
> just uses bytes not unicode for filenames.
>
> But I seem to be missing something. My generated system does not seem
> to properly handle utf-8. I am guessing until that works the python os
> module is also not going to handle utf-8. And indeed it does not work
> now.
Busybox and Python are completely unrelated, so fixing ls will not change what
Python does. In Python 2, you have to explicitly encode/decode the filenames
with the appropriate character set; the default character set is ASCII, not
UTF-8. In Python 3, though, there is an environment variable that you can set
to make it default to UTF-8.
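To illustrate the explicit decoding, here is a minimal sketch. It is written
for Python 3 and assumes the names on the removable drive really are UTF-8
encoded. The function name list_utf8_names is made up for this example. It asks
the OS for raw bytes names and decodes them by hand, so the result does not
depend on the locale the interpreter was started with. The same idea should
carry over to Python 2, where listing a str path already yields bytes names.

```python
import os
import tempfile


def list_utf8_names(directory):
    """Return the entries of `directory` as text, decoding names as UTF-8.

    Hypothetical helper for illustration; assumes names are UTF-8 encoded.
    """
    # Passing a bytes path makes os.listdir() return raw bytes names,
    # so no locale-dependent decoding happens behind our back.
    raw_names = os.listdir(os.fsencode(directory))
    # errors="replace" keeps names that are not valid UTF-8 visible
    # (as U+FFFD) instead of raising UnicodeDecodeError.
    return sorted(name.decode("utf-8", errors="replace") for name in raw_names)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Create a file whose name contains a non-ASCII character,
        # using a bytes path so that no locale setting is involved.
        raw_path = os.path.join(os.fsencode(tmp), "str\u00f8m.txt".encode("utf-8"))
        with open(raw_path, "wb"):
            pass
        # ascii() escapes non-ASCII characters, so this prints safely even
        # on a console that is not set up for UTF-8.
        print(ascii(list_utf8_names(tmp)))
```

Working on bytes and decoding explicitly is only one option. If the locale on
the target is set up correctly, passing ordinary str paths may be all you need.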
Regards,
Arnout
>
> Regards, Steve
> _______________________________________________
> buildroot mailing list
> buildroot at busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot
>
--
Arnout Vandecappelle arnout at mind be
Senior Embedded Software Architect +32-16-286500
Essensium/Mind http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint: 7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF