From: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH/next v2 1/1] package/icu: Add support to generate a subset of ICU data
Date: Sun, 18 Jul 2021 23:11:20 +0200
Message-ID: <20210718231120.76adc4e3@windsurf>
In-Reply-To: <20210601060608.5531-1-bernd.kuhls@t-online.de>
Hello Bernd,
On Tue, 1 Jun 2021 08:06:08 +0200
Bernd Kuhls <bernd.kuhls@t-online.de> wrote:
> This would reduce the size of libicudata.so to 12M.
>
> [1] https://github.com/unicode-org/icu/blob/master/docs/userguide/icu_data/buildtool.md
This is a really good and useful contribution, but (as usual) I have
a couple of questions/requests.
> +config BR2_PACKAGE_ICU_DATA_FILTER_FILE
> + string "Path to custom data configuration file"
> + help
> + The ICU Data Build Tool enables you to write a configuration
> + file that specifies what features and locales to include in a
> + custom data bundle:
> + https://github.com/unicode-org/icu/blob/main/docs/userguide/icu_data/buildtool.md
> + Leave empty to not use this functionality.
We already have BR2_PACKAGE_ICU_CUSTOM_DATA_PATH, which was also meant
to allow building ICU with a smaller dataset, based on a pre-compiled
.dat file. It points to using http://apps.icu-project.org/datacustom/,
which in fact seems to no longer exist: it redirects to
https://unicode-org.atlassian.net/browse/ICU-12977/, which says that
the feature has not been available since ICU 58.x.
So I guess we should first drop BR2_PACKAGE_ICU_CUSTOM_DATA_PATH, which
is already unusable today.
Then the second question is whether we can provide/generate some kind
of default filter file in Buildroot, so that a default Buildroot
configuration generates a more minimal ICU dataset. For example, by
building only the support for English? Or based on the value of
BR2_GENERATE_LOCALE?
I know this would break backward compatibility, but it would really
make ICU a lot more reasonable in size for most of our users. Indeed,
they are unlikely to know that they can reduce the size of ICU by using
this new feature.
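
To make the BR2_GENERATE_LOCALE idea a bit more concrete, here is a
rough, untested sketch written as a standalone Makefile rather than as
actual package/icu/icu.mk code. The variable names ICU_DEFAULT_LOCALES,
ICU_DEFAULT_LANGUAGES and ICU_DEFAULT_LANGUAGE_LIST, and the output file
name icu-default-filter.json, are hypothetical. The JSON keys
("localeFilter", "filterType", "includelist") follow my reading of the
buildtool.md document linked in the help text and should be checked
against the ICU version in use. It falls back to English when
BR2_GENERATE_LOCALE is empty.

```make
# Standalone sketch: in Buildroot, BR2_GENERATE_LOCALE comes from .config
# and qstrip is already provided by the infrastructure.
BR2_GENERATE_LOCALE ?= "en_US.UTF-8 fr_FR"
qstrip = $(strip $(subst ",,$(1)))

comma := ,
empty :=
space := $(empty) $(empty)

# Hypothetical variables, not part of the patch under review.
ICU_DEFAULT_LOCALES := $(call qstrip,$(BR2_GENERATE_LOCALE))

# Keep only the language part of each locale: en_US.UTF-8 -> en
ICU_DEFAULT_LANGUAGES := $(sort $(foreach l,$(ICU_DEFAULT_LOCALES),\
	$(firstword $(subst _, ,$(subst ., ,$(l))))))

# Fall back to English only when no locale is requested.
ICU_DEFAULT_LANGUAGES := $(if $(ICU_DEFAULT_LANGUAGES),$(ICU_DEFAULT_LANGUAGES),en)

# Turn "en fr" into the JSON list body: "en", "fr"
ICU_DEFAULT_LANGUAGE_LIST := $(subst $(space),$(comma) ,\
	$(patsubst %,"%",$(ICU_DEFAULT_LANGUAGES)))

# Assumed filter file layout, see buildtool.md for the exact schema.
icu-default-filter.json:
	printf '{\n  "localeFilter": {\n    "filterType": "language",\n    "includelist": [%s]\n  }\n}\n' \
		'$(strip $(ICU_DEFAULT_LANGUAGE_LIST))' > $@
```

This does not handle locale names that are not language codes (for
example "C" or "POSIX"), so a real implementation would need to filter
those out.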
> +ICU_DATA_FILTER_FILE = $(call qstrip,$(BR2_PACKAGE_ICU_DATA_FILTER_FILE))
> +
> +ifneq ($(ICU_DATA_FILTER_FILE),)
> +HOST_ICU_DATA_SOURCE = $(subst src.tgz,data.zip,$(ICU_SOURCE))
> +HOST_ICU_EXTRA_DOWNLOADS += $(HOST_ICU_SITE)/$(HOST_ICU_DATA_SOURCE)
> +
> +define HOST_ICU_EXTRACT_DATA
> + rm -rf $(@D)/$(HOST_ICU_SUBDIR)/data
> + $(UNZIP) $(ICU_DL_DIR)/$(HOST_ICU_DATA_SOURCE) -d $(@D)/$(HOST_ICU_SUBDIR)
> +endef
> +HOST_ICU_POST_EXTRACT_HOOKS += HOST_ICU_EXTRACT_DATA
> +
> +HOST_ICU_CONF_ENV = ICU_DATA_FILTER_FILE=$(ICU_DATA_FILTER_FILE)
> +HOST_ICU_CONF_OPTS += --with-data-packaging=archive
> +
> +define ICU_COPY_CUSTOM_DATA
> + $(INSTALL) -D -m 644 $(HOST_ICU_DIR)/$(HOST_ICU_SUBDIR)/data/out/icudt$(ICU_VERSION_MAJOR)l.dat $(@D)/$(ICU_SUBDIR)/data/in/
> +endef
> +ICU_PRE_CONFIGURE_HOOKS += ICU_COPY_CUSTOM_DATA
> +endif
It took me quite some time to understand what is going on here. My
understanding is as follows:
 * In a normal build, the pre-compiled source/data/in/icudt69l.dat
   provided in the ICU tarball is used.
 * When a custom dataset needs to be generated thanks to your new
   option, we need to download the source of this dataset as an extra
   download for the host-icu package, replace the source/data/ subdir
   with this source dataset, and ask the host-icu package to generate
   icudt69l.dat. Then the target icu package grabs this
   host-generated icudt69l.dat.
Is that correct? If so, then I'd say it would be useful to add a
comment above this code, as I find the logic not that trivial.
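
Assuming my understanding above is correct, the comment could look
something like the sketch below. The wording is only a suggestion, and
the body of the conditional is meant to stay exactly as in your patch.

```make
# By default, ICU is built with the pre-compiled
# source/data/in/icudt<major>l.dat shipped in the ICU tarball.
#
# When BR2_PACKAGE_ICU_DATA_FILTER_FILE is set, a reduced data file is
# generated instead:
#  - host-icu fetches the ICU data sources (data.zip) as an extra
#    download and replaces its source/data/ directory with them;
#  - host-icu is configured with ICU_DATA_FILTER_FILE in its
#    environment, so that its build generates a filtered
#    data/out/icudt<major>l.dat;
#  - the target icu package copies this generated file into its own
#    source/data/in/ directory before being configured.
ifneq ($(ICU_DATA_FILTER_FILE),)
# (hooks and variables from the patch go here, unchanged)
endif
```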
Thanks!
Thomas
--
Thomas Petazzoni, co-owner and CEO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com