Linux Sound subsystem development
 help / color / mirror / Atom feed
From: Takashi Iwai <tiwai@suse.de>
To: Jaroslav Kysela <perex@perex.cz>
Cc: "Barnabás Pőcze" <pobrn@protonmail.com>,
	"linux-sound@vger.kernel.org" <linux-sound@vger.kernel.org>,
	"Takashi Iwai" <tiwai@suse.de>
Subject: Re: Incorrect automatic ALSA card ID when unicode is at play
Date: Wed, 02 Oct 2024 21:36:49 +0200	[thread overview]
Message-ID: <874j5ul2jy.wl-tiwai@suse.de> (raw)
In-Reply-To: <506d56d3-540e-4675-a126-347e176b8a0d@perex.cz>

On Wed, 02 Oct 2024 20:00:15 +0200,
Jaroslav Kysela wrote:
> 
> On 02. 10. 24 18:54, Barnabás Pőcze wrote:
> > Hi
> > 
> > 
> > When `snd_card::id` is not specified explicitly it is determined automatically
> > in `sound/core/init.c:snd_card_register()` as follows:
> > 
> >    if (*card->id) {
> >      // ...
> >    } else {
> >      /* create an id from either shortname or longname */
> >      const char *src;
> > 
> >      src = *card->shortname ? card->shortname : card->longname;
> >      snd_card_set_id_no_lock(card, src,
> >                                    retrieve_id_from_card_name(src));
> >    }
> > 
> > However, `snd_card_set_id_no_lock()`, or more specifically the `copy_valid_id_string()`
> > function that it calls does not seem to copy very well with utf-8.
> > 
> > For example, based on the report at https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4135
> > where `card->shortname="Redmi 电脑音箱"`, `card->id` will be set to \xE7\xE8\xE9\xE7,
> > which are the first bytes of the symbols in the suffix "电脑音箱" because only those
> > bytes of the string satisfy the `isalnum()` check in `copy_valid_id_string()`.
> > 
> > I am not sure what kind of 8-bit character set `isalnum()` is basing these results on, but
> > it certainly does not mix well with utf-8 in this scenario at least.
> 
> It seems that isalnum() also returns true for characters above 0x80
> (see lib/ctype.c):
> 
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                        /* 128-143 */
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                        /* 144-159 */
> _S|_SP,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,    /* 160-175 */
> _P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,        /* 176-191 */
> _U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,        /* 192-207 */
> _U,_U,_U,_U,_U,_U,_U,_P,_U,_U,_U,_U,_U,_U,_U,_L,        /* 208-223 */
> _L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,        /* 224-239 */
> _L,_L,_L,_L,_L,_L,_L,_P,_L,_L,_L,_L,_L,_L,_L,_L};       /* 240-255 */
> 
> We should probably add isacii() check like:
> 
> diff --git a/sound/core/init.c b/sound/core/init.c
> index b92aa7103589..114fb87de990 100644
> --- a/sound/core/init.c
> +++ b/sound/core/init.c
> @@ -654,13 +654,19 @@ void snd_card_free(struct snd_card *card)
>  }
>  EXPORT_SYMBOL(snd_card_free);
> 
> +/* check, if the character is in the valid ASCII range */
> +static inline bool safe_ascii_char(char c)
> +{
> +	return isascii(c) && isalnum(c);
> +}
> +
>  /* retrieve the last word of shortname or longname */
>  static const char *retrieve_id_from_card_name(const char *name)
>  {
>  	const char *spos = name;
> 
>  	while (*name) {
> -		if (isspace(*name) && isalnum(name[1]))
> +		if (isspace(*name) && safe_ascii_char(name[1]))
>  			spos = name + 1;
>  		name++;
>  	}
> @@ -687,12 +693,12 @@ static void copy_valid_id_string(struct snd_card
> *card, const char *src,
>  {
>  	char *id = card->id;
> 
> -	while (*nid && !isalnum(*nid))
> +	while (*nid && !safe_ascii_char(*nid))
>  		nid++;
>  	if (isdigit(*nid))
>  		*id++ = isalpha(*src) ? *src : 'D';
>  	while (*nid && (size_t)(id - card->id) < sizeof(card->id) - 1) {
> -		if (isalnum(*nid))
> +		if (safe_ascii_char(*nid))
>  			*id++ = *nid;
>  		nid++;
>  	}
> @@ -787,7 +793,7 @@ static ssize_t id_store(struct device *dev, struct
> device_attribute *attr,
> 
>  	for (idx = 0; idx < copy; idx++) {
>  		c = buf[idx];
> -		if (!isalnum(c) && c != '_' && c != '-')
> +		if (!safe_ascii_char(c) && c != '_' && c != '-')
>  			return -EINVAL;
>  	}
>  	memcpy(buf1, buf, copy);
> 
> 
> Takashi, do you have a better idea?

I believe your change is fine.  Care to submit a proper fix patch?


thanks,

Takashi

      reply	other threads:[~2024-10-02 19:35 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-02 16:54 Incorrect automatic ALSA card ID when unicode is at play Barnabás Pőcze
2024-10-02 18:00 ` Jaroslav Kysela
2024-10-02 19:36   ` Takashi Iwai [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874j5ul2jy.wl-tiwai@suse.de \
    --to=tiwai@suse.de \
    --cc=linux-sound@vger.kernel.org \
    --cc=perex@perex.cz \
    --cc=pobrn@protonmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox