From: Jiri Slaby <jirislaby@kernel.org>
To: Nicolas Pitre <nico@fluxnic.net>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nicolas Pitre <npitre@baylibre.com>, Dave Mielke <Dave@mielke.cc>,
linux-serial@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 06/11] vt: introduce gen_ucs_recompose.py to create ucs_recompose.c
Date: Mon, 14 Apr 2025 09:08:04 +0200
Message-ID: <b6fbf76f-bebb-4754-9415-d7e4dc4066cc@kernel.org>
In-Reply-To: <20250410011839.64418-7-nico@fluxnic.net>
On 10. 04. 25, 3:13, Nicolas Pitre wrote:
> From: Nicolas Pitre <npitre@baylibre.com>
>
> The generated code includes a table that maps base character + combining
> mark pairs to their precomposed equivalents using Python's unicodedata
> module. It also provides the ucs_recompose() function to query that
> table.
>
> The default script behavior is to create a table with most commonly used
> Latin, Greek, and Cyrillic recomposition pairs only. It is much smaller
> than the table with all possible recomposition pairs (71 entries vs 1000
> entries). But if one needs/wants the full table then simply running the
> script with the --full argument will generate it.
>
> Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
> ---
> drivers/tty/vt/gen_ucs_recompose.py | 321 ++++++++++++++++++++++++++++
> 1 file changed, 321 insertions(+)
> create mode 100755 drivers/tty/vt/gen_ucs_recompose.py
>
> diff --git a/drivers/tty/vt/gen_ucs_recompose.py b/drivers/tty/vt/gen_ucs_recompose.py
> new file mode 100755
> index 0000000000..64418803e4
> --- /dev/null
> +++ b/drivers/tty/vt/gen_ucs_recompose.py
...
> +struct compare_key {{
> + uint16_t base;
> + uint16_t combining;
> +}};
> +
> +static int recomposition_compare(const void *key, const void *element)
> +{{
> + const struct compare_key *search_key = key;
> + const struct recomposition *table_entry = element;
> +
> + /* Compare base character first */
> + if (search_key->base < table_entry->base)
> + return -1;
> + if (search_key->base > table_entry->base)
> + return 1;
> +
> + /* Base characters match, now compare combining character */
> + if (search_key->combining < table_entry->combining)
> + return -1;
> + if (search_key->combining > table_entry->combining)
> + return 1;
> +
> + /* Both match */
> + return 0;
> +}}
> +
> +/**
> + * Attempt to recompose two Unicode characters into a single character.
> + *
> + * @param previous: Previous Unicode code point (UCS-4)
> + * @param current: Current Unicode code point (UCS-4)
> + * Return: Recomposed Unicode code point, or 0 if no recomposition is possible
> + */
> +uint32_t ucs_recompose(uint32_t base, uint32_t combining)
> +{{
> + /* Check if characters are within the range of our table */
> + if (base < MIN_BASE_CHAR || base > MAX_BASE_CHAR ||
> + combining < MIN_COMBINING_CHAR || combining > MAX_COMBINING_CHAR)
> + return 0;
> +
> + struct compare_key key = {{ base, combining }};
> +
> + struct recomposition *result =
> + __inline_bsearch(&key, recomposition_table,
> + ARRAY_SIZE(recomposition_table),
> + sizeof(*recomposition_table),
> + recomposition_compare);
> +
> + return result ? result->recomposed : 0;
> +}}
Again, I see no reason to maintain C functions in Python.
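
For illustration only, here is a minimal sketch of a generator that emits just the table rows, which would let the comparison and lookup functions live in a hand-maintained C file. It is not the code from the patch: the names recomposition_pairs() and format_table() are hypothetical, the row layout is assumed to be (base, combining, recomposed), and the 16-bit limit is assumed from the uint16_t fields in the quoted struct. It derives pairs from canonical two-part decompositions in Python's unicodedata module and keeps only those that NFC actually recomposes.

```python
import unicodedata


def recomposition_pairs(max_cp=0xFFFF):
    """Return sorted (base, combining, recomposed) tuples up to max_cp.

    Hypothetical helper: the 16-bit default mirrors the uint16_t fields
    in the quoted struct, which is an assumption about the table layout.
    """
    pairs = []
    for cp in range(max_cp + 1):
        decomp = unicodedata.decomposition(chr(cp))
        # Skip code points with no decomposition, and compatibility
        # decompositions, which start with a tag such as "<compat>".
        if not decomp or decomp.startswith("<"):
            continue
        parts = [int(p, 16) for p in decomp.split()]
        if len(parts) != 2:
            continue
        base, combining = parts
        if base > max_cp or combining > max_cp:
            continue
        # Keep only pairs that NFC really recomposes to cp; this drops
        # code points listed as composition exclusions.
        if unicodedata.normalize("NFC", chr(base) + chr(combining)) != chr(cp):
            continue
        pairs.append((base, combining, cp))
    # Sort by (base, combining) so a binary search can be used on the C side.
    pairs.sort()
    return pairs


def format_table(pairs):
    """Format the pairs as C initializer rows, one per line."""
    return "\n".join(
        f"\t{{ 0x{b:04X}, 0x{c:04X}, 0x{r:04X} }}," for b, c, r in pairs
    )


if __name__ == "__main__":
    print(format_table(recomposition_pairs()))
```

With output like this, the script would only need to own the data, and the C side could include or paste the rows into a static array.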
thanks,
--
js
suse labs