From: Jiri Slaby <jirislaby@kernel.org>
To: Nicolas Pitre <nico@fluxnic.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nicolas Pitre <npitre@baylibre.com>, Dave Mielke <Dave@mielke.cc>,
	linux-serial@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 06/11] vt: introduce gen_ucs_recompose.py to create ucs_recompose.c
Date: Mon, 14 Apr 2025 09:08:04 +0200
Message-ID: <b6fbf76f-bebb-4754-9415-d7e4dc4066cc@kernel.org>
In-Reply-To: <20250410011839.64418-7-nico@fluxnic.net>

On 10. 04. 25, 3:13, Nicolas Pitre wrote:
> From: Nicolas Pitre <npitre@baylibre.com>
> 
> The generated code includes a table that maps base character + combining
> mark pairs to their precomposed equivalents using Python's unicodedata
> module. It also provides the ucs_recompose() function to query that
> table.
> 
> The default script behavior is to create a table with most commonly used
> Latin, Greek, and Cyrillic recomposition pairs only. It is much smaller
> than the table with all possible recomposition pairs (71 entries vs 1000
> entries). But if one needs/wants the full table then simply running the
> script with the --full argument will generate it.
> 
> Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
> ---
>   drivers/tty/vt/gen_ucs_recompose.py | 321 ++++++++++++++++++++++++++++
>   1 file changed, 321 insertions(+)
>   create mode 100755 drivers/tty/vt/gen_ucs_recompose.py
> 
> diff --git a/drivers/tty/vt/gen_ucs_recompose.py b/drivers/tty/vt/gen_ucs_recompose.py
> new file mode 100755
> index 0000000000..64418803e4
> --- /dev/null
> +++ b/drivers/tty/vt/gen_ucs_recompose.py
...
> +struct compare_key {{
> +	uint16_t base;
> +	uint16_t combining;
> +}};
> +
> +static int recomposition_compare(const void *key, const void *element)
> +{{
> +	const struct compare_key *search_key = key;
> +	const struct recomposition *table_entry = element;
> +
> +	/* Compare base character first */
> +	if (search_key->base < table_entry->base)
> +		return -1;
> +	if (search_key->base > table_entry->base)
> +		return 1;
> +
> +	/* Base characters match, now compare combining character */
> +	if (search_key->combining < table_entry->combining)
> +		return -1;
> +	if (search_key->combining > table_entry->combining)
> +		return 1;
> +
> +	/* Both match */
> +	return 0;
> +}}
> +
> +/**
> + * Attempt to recompose two Unicode characters into a single character.
> + *
> + * @param base: Base Unicode code point (UCS-4)
> + * @param combining: Combining mark Unicode code point (UCS-4)
> + * Return: Recomposed Unicode code point, or 0 if no recomposition is possible
> + */
> +uint32_t ucs_recompose(uint32_t base, uint32_t combining)
> +{{
> +	/* Check if characters are within the range of our table */
> +	if (base < MIN_BASE_CHAR || base > MAX_BASE_CHAR ||
> +	    combining < MIN_COMBINING_CHAR || combining > MAX_COMBINING_CHAR)
> +		return 0;
> +
> +	struct compare_key key = {{ base, combining }};
> +
> +	struct recomposition *result =
> +		__inline_bsearch(&key, recomposition_table,
> +				 ARRAY_SIZE(recomposition_table),
> +				 sizeof(*recomposition_table),
> +				 recomposition_compare);
> +
> +	return result ? result->recomposed : 0;
> +}}

Again, I see no reason to maintain C functions inside a Python script.
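
For illustration only, a minimal sketch of what a table-only generator
could look like. It derives the (base, combining, recomposed) triples with
Python's unicodedata module, as the commit message describes, and emits
nothing but the table rows. The names collect_pairs() and emit_table() are
hypothetical and do not come from the patch; limiting the table to 16-bit
code points is an assumption taken from the uint16_t fields of the quoted
compare_key.

```python
import unicodedata


def collect_pairs(max_cp=0xFFFF):
    """Return sorted (base, mark, composed) triples that NFC recomposes.

    Hypothetical helper, not the code from the patch.
    """
    pairs = []
    for cp in range(max_cp + 1):
        decomp = unicodedata.decomposition(chr(cp))
        # Skip code points with no decomposition, and compatibility
        # decompositions, which start with a "<tag>".
        if not decomp or decomp.startswith("<"):
            continue
        parts = [int(p, 16) for p in decomp.split()]
        if len(parts) != 2:
            continue
        base, mark = parts
        # Assumption: the table stores 16-bit keys only.
        if base > 0xFFFF or mark > 0xFFFF:
            continue
        # Keep a pair only if NFC really recomposes it; this drops
        # singleton-like cases and composition exclusions.
        if unicodedata.normalize("NFC", chr(base) + chr(mark)) != chr(cp):
            continue
        pairs.append((base, mark, cp))
    # Sorted by (base, mark) so the C side can binary-search the table.
    pairs.sort()
    return pairs


def emit_table(pairs):
    """Format the triples as C initializer rows; no C functions are emitted."""
    rows = [f"\t{{ 0x{b:04X}, 0x{m:04X}, 0x{c:04X} }}," for b, m, c in pairs]
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    print(emit_table(collect_pairs()), end="")
```

With output of this shape, recomposition_compare() and ucs_recompose()
could stay in an ordinary, hand-maintained C file that includes the
generated rows as the initializer of recomposition_table.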

thanks,
-- 
js
suse labs

Thread overview: 27+ messages
2025-04-10  1:13 [PATCH 00/11] vt: implement proper Unicode handling Nicolas Pitre
2025-04-10  1:13 ` [PATCH 01/11] vt: minor cleanup to vc_translate_unicode() Nicolas Pitre
2025-04-10  1:13 ` [PATCH 02/11] vt: move unicode processing to a separate file Nicolas Pitre
2025-04-14  6:47   ` Jiri Slaby
2025-04-15 19:03     ` Nicolas Pitre
2025-04-10  1:13 ` [PATCH 03/11] vt: properly support zero-width Unicode code points Nicolas Pitre
2025-04-14  6:51   ` Jiri Slaby
2025-04-15 19:06     ` Nicolas Pitre
2025-04-10  1:13 ` [PATCH 04/11] vt: introduce gen_ucs_width.py to create ucs_width.c Nicolas Pitre
2025-04-14  7:04   ` Jiri Slaby
2025-04-15 19:13     ` Nicolas Pitre
2025-04-10  1:13 ` [PATCH 05/11] vt: update ucs_width.c using gen_ucs_width.py Nicolas Pitre
2025-04-11  3:47   ` kernel test robot
2025-04-10  1:13 ` [PATCH 06/11] vt: introduce gen_ucs_recompose.py to create ucs_recompose.c Nicolas Pitre
2025-04-14  7:08   ` Jiri Slaby [this message]
2025-04-10  1:13 ` [PATCH 07/11] vt: create ucs_recompose.c using gen_ucs_recompose.py Nicolas Pitre
2025-04-11  6:00   ` kernel test robot
2025-04-10  1:14 ` [PATCH 08/11] vt: support Unicode recomposition Nicolas Pitre
2025-04-10  1:14 ` [PATCH 09/11] vt: update gen_ucs_width.py to produce more space efficient tables Nicolas Pitre
2025-04-14  7:14   ` Jiri Slaby
2025-04-15 19:16     ` Nicolas Pitre
2025-04-10  1:14 ` [PATCH 10/11] vt: update ucs_width.c following latest gen_ucs_width.py Nicolas Pitre
2025-04-14  7:17   ` Jiri Slaby
2025-04-10  1:14 ` [PATCH 11/11] vt: pad double-width code points with a zero-white-space Nicolas Pitre
2025-04-14  7:18   ` Jiri Slaby
2025-04-10 19:38 ` [PATCH 12/11] vt: remove zero-white-space handling from conv_uni_to_pc() Nicolas Pitre
2025-04-11 14:49 ` [PATCH 00/11] vt: implement proper Unicode handling Greg Kroah-Hartman
