public inbox for linux-kernel@vger.kernel.org
From: Jiri Slaby <jirislaby@kernel.org>
To: Nicolas Pitre <nico@fluxnic.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nicolas Pitre <npitre@baylibre.com>, Dave Mielke <Dave@mielke.cc>,
	linux-serial@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 06/11] vt: introduce gen_ucs_recompose.py to create ucs_recompose.c
Date: Mon, 14 Apr 2025 09:08:04 +0200
Message-ID: <b6fbf76f-bebb-4754-9415-d7e4dc4066cc@kernel.org>
In-Reply-To: <20250410011839.64418-7-nico@fluxnic.net>

On 10. 04. 25, 3:13, Nicolas Pitre wrote:
> From: Nicolas Pitre <npitre@baylibre.com>
> 
> The generated code includes a table that maps base character + combining
> mark pairs to their precomposed equivalents using Python's unicodedata
> module. It also provides the ucs_recompose() function to query that
> table.
> 
> The default script behavior is to create a table with most commonly used
> Latin, Greek, and Cyrillic recomposition pairs only. It is much smaller
> than the table with all possible recomposition pairs (71 entries vs 1000
> entries). But if one needs/wants the full table then simply running the
> script with the --full argument will generate it.
> 
> Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
> ---
>   drivers/tty/vt/gen_ucs_recompose.py | 321 ++++++++++++++++++++++++++++
>   1 file changed, 321 insertions(+)
>   create mode 100755 drivers/tty/vt/gen_ucs_recompose.py
> 
> diff --git a/drivers/tty/vt/gen_ucs_recompose.py b/drivers/tty/vt/gen_ucs_recompose.py
> new file mode 100755
> index 0000000000..64418803e4
> --- /dev/null
> +++ b/drivers/tty/vt/gen_ucs_recompose.py
...
> +struct compare_key {{
> +	uint16_t base;
> +	uint16_t combining;
> +}};
> +
> +static int recomposition_compare(const void *key, const void *element)
> +{{
> +	const struct compare_key *search_key = key;
> +	const struct recomposition *table_entry = element;
> +
> +	/* Compare base character first */
> +	if (search_key->base < table_entry->base)
> +		return -1;
> +	if (search_key->base > table_entry->base)
> +		return 1;
> +
> +	/* Base characters match, now compare combining character */
> +	if (search_key->combining < table_entry->combining)
> +		return -1;
> +	if (search_key->combining > table_entry->combining)
> +		return 1;
> +
> +	/* Both match */
> +	return 0;
> +}}
> +
> +/**
> + * Attempt to recompose two Unicode characters into a single character.
> + *
> + * @param previous: Previous Unicode code point (UCS-4)
> + * @param current: Current Unicode code point (UCS-4)
> + * Return: Recomposed Unicode code point, or 0 if no recomposition is possible
> + */
> +uint32_t ucs_recompose(uint32_t base, uint32_t combining)
> +{{
> +	/* Check if characters are within the range of our table */
> +	if (base < MIN_BASE_CHAR || base > MAX_BASE_CHAR ||
> +	    combining < MIN_COMBINING_CHAR || combining > MAX_COMBINING_CHAR)
> +		return 0;
> +
> +	struct compare_key key = {{ base, combining }};
> +
> +	struct recomposition *result =
> +		__inline_bsearch(&key, recomposition_table,
> +				 ARRAY_SIZE(recomposition_table),
> +				 sizeof(*recomposition_table),
> +				 recomposition_compare);
> +
> +	return result ? result->recomposed : 0;
> +}}

Again, I see no reason to maintain C functions inside the Python script.
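
One possible reading of that suggestion, as a minimal sketch and not the patch author's actual layout: the script emits only the table rows, and recomposition_compare() and ucs_recompose() live in a hand-maintained C file that includes the generated rows. The function names recomposition_pairs() and emit_table(), and the default code point range, are assumptions made for illustration; only Python's unicodedata module is taken from the patch description.

```python
import unicodedata


def recomposition_pairs(first=0x00C0, last=0x017F):
    """Collect (base, combining, recomposed) triples for a code point range.

    The default range is an arbitrary illustration, not the patch's selection.
    """
    pairs = []
    for cp in range(first, last + 1):
        decomp = unicodedata.decomposition(chr(cp)).split()
        # Keep canonical two-code-point decompositions only; compatibility
        # decompositions start with a "<tag>" token.
        if len(decomp) != 2 or decomp[0].startswith("<"):
            continue
        base, combining = (int(x, 16) for x in decomp)
        # Keep the pair only if NFC really recomposes it back to cp.
        if unicodedata.normalize("NFC", chr(base) + chr(combining)) != chr(cp):
            continue
        pairs.append((base, combining, cp))
    # Sort by (base, combining) so a binary search over the table works.
    return sorted(pairs)


def emit_table(pairs):
    """Format the triples as C initializer rows, one per line."""
    rows = [f"\t{{ 0x{b:04X}, 0x{c:04X}, 0x{r:04X} }}," for b, c, r in pairs]
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    print(emit_table(recomposition_pairs()), end="")
```

With this split the script no longer needs doubled braces around whole C function bodies; only the row formatting remains in Python, and the C helpers can be reviewed and edited as ordinary C.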

thanks,
-- 
js
suse labs

Thread overview (27+ messages):
2025-04-10  1:13 [PATCH 00/11] vt: implement proper Unicode handling Nicolas Pitre
2025-04-10  1:13 ` [PATCH 01/11] vt: minor cleanup to vc_translate_unicode() Nicolas Pitre
2025-04-10  1:13 ` [PATCH 02/11] vt: move unicode processing to a separate file Nicolas Pitre
2025-04-14  6:47   ` Jiri Slaby
2025-04-15 19:03     ` Nicolas Pitre
2025-04-10  1:13 ` [PATCH 03/11] vt: properly support zero-width Unicode code points Nicolas Pitre
2025-04-14  6:51   ` Jiri Slaby
2025-04-15 19:06     ` Nicolas Pitre
2025-04-10  1:13 ` [PATCH 04/11] vt: introduce gen_ucs_width.py to create ucs_width.c Nicolas Pitre
2025-04-14  7:04   ` Jiri Slaby
2025-04-15 19:13     ` Nicolas Pitre
2025-04-10  1:13 ` [PATCH 05/11] vt: update ucs_width.c using gen_ucs_width.py Nicolas Pitre
2025-04-11  3:47   ` kernel test robot
2025-04-10  1:13 ` [PATCH 06/11] vt: introduce gen_ucs_recompose.py to create ucs_recompose.c Nicolas Pitre
2025-04-14  7:08   ` Jiri Slaby [this message]
2025-04-10  1:13 ` [PATCH 07/11] vt: create ucs_recompose.c using gen_ucs_recompose.py Nicolas Pitre
2025-04-11  6:00   ` kernel test robot
2025-04-10  1:14 ` [PATCH 08/11] vt: support Unicode recomposition Nicolas Pitre
2025-04-10  1:14 ` [PATCH 09/11] vt: update gen_ucs_width.py to produce more space efficient tables Nicolas Pitre
2025-04-14  7:14   ` Jiri Slaby
2025-04-15 19:16     ` Nicolas Pitre
2025-04-10  1:14 ` [PATCH 10/11] vt: update ucs_width.c following latest gen_ucs_width.py Nicolas Pitre
2025-04-14  7:17   ` Jiri Slaby
2025-04-10  1:14 ` [PATCH 11/11] vt: pad double-width code points with a zero-white-space Nicolas Pitre
2025-04-14  7:18   ` Jiri Slaby
2025-04-10 19:38 ` [PATCH 12/11] vt: remove zero-white-space handling from conv_uni_to_pc() Nicolas Pitre
2025-04-11 14:49 ` [PATCH 00/11] vt: implement proper Unicode handling Greg Kroah-Hartman
