From: Tobias Waldekranz <tobias@waldekranz.com>
To: Ahmad Fatoum <a.fatoum@pengutronix.de>, barebox@lists.infradead.org
Subject: Re: [PATCH 1/5] string: add strtok/strtokv
Date: Thu, 04 Sep 2025 15:35:30 +0200
Message-ID: <87wm6e2vdp.fsf@waldekranz.com>
In-Reply-To: <96be67c8-e10e-40f7-9945-76ccfa8d0aac@pengutronix.de>

On Thu, Sep 04, 2025 at 13:00, Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
> Hello Tobias,
>
> On 8/28/25 5:05 PM, Tobias Waldekranz wrote:
>> Add an implementation of libc's standard strtok(3), which is useful
>> for tokenizing strings.
>
> strtok was previously removed in favor of strsep as it doesn't suffer
> from re-entrancy issues (poller and bthreads can run during delays). If
> you want to allow escapes, there's also strsep_unescaped.

Aha, my bad. I did not realize that there was more than one thread of
execution.

strsep() is not quite the same thing though: I am really after
strtok()'s behavior of skipping empty tokens. How would you feel about
adding strtok_r() instead?
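
To illustrate the difference: for the input "a,,b" and the delimiter set
",", strsep() yields "a", "" and "b", while strtok()-style tokenizing
yields only "a" and "b". A minimal sketch of a reentrant variant follows.
It is not taken from the patch; the name tok_r() is hypothetical, and it
is written against plain ISO C (strspn/strcspn) rather than the in-tree
helpers. All state lives in the caller-provided pointer, so nothing is
shared between callers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of a reentrant tokenizer; tok_r() is a made-up name.
 * The cursor is kept in *saveptr instead of a static variable.
 */
static char *tok_r(char *str, const char *delim, char **saveptr)
{
	char *tok;

	if (str)
		*saveptr = str;
	if (!*saveptr)
		return NULL;

	/* Skip leading delimiters, so empty tokens are never returned. */
	tok = *saveptr + strspn(*saveptr, delim);
	if (*tok == '\0') {
		*saveptr = NULL;
		return NULL;
	}

	/* Terminate the token at the next delimiter, if there is one. */
	*saveptr = tok + strcspn(tok, delim);
	if (**saveptr != '\0')
		*(*saveptr)++ = '\0';
	else
		*saveptr = NULL;

	return tok;
}
```

As with strtok(), the first call passes the string and later calls pass
NULL, but each caller supplies its own char * cursor.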

>> Also, add a version that will collect all tokens from a string into an
>> array, which is useful in situations where you need to know how many
>> tokens there are, and when a token's relative position in the order is
>> significant.
>
> We have the inverse as strjoin, but not this. Maybe call it strsplit
> instead?

If you accept my strtok_r() suggestion, do you still think strsplit() is
a better name, or is there value in signaling the underlying strtok()
behavior?
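
For reference, the array-collecting helper does not need any hidden
state either. The following is only a sketch under assumptions: the name
split_tokens() is hypothetical, and it uses realloc() with an explicit
error return instead of the xrealloc() used in the patch. It keeps the
strtok() semantics of skipping empty tokens.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical split_tokens(): collect all nonempty tokens of str into a
 * heap-allocated array. Returns the token count, or -1 if allocation
 * fails (in which case *vecp is left untouched). The tokens point into
 * str, which is modified in place; the caller frees only the array.
 */
static int split_tokens(char *str, const char *delim, char ***vecp)
{
	char **vec = NULL;
	int cnt = 0;

	for (;;) {
		char **tmp;

		str += strspn(str, delim);	/* skip delimiters */
		if (*str == '\0')
			break;

		tmp = realloc(vec, (cnt + 1) * sizeof(*vec));
		if (!tmp) {
			free(vec);
			return -1;
		}
		vec = tmp;
		vec[cnt++] = str;

		str += strcspn(str, delim);	/* advance to end of token */
		if (*str != '\0')
			*str++ = '\0';
	}

	*vecp = vec;
	return cnt;
}
```

If no tokens are found, the function returns 0 and stores NULL in *vecp.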

> Cheers,
> Ahmad
>
>> 
>> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>> ---
>>  include/string.h |  2 ++
>>  lib/string.c     | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 68 insertions(+)
>> 
>> diff --git a/include/string.h b/include/string.h
>> index 71affe48b6..c8df8540d8 100644
>> --- a/include/string.h
>> +++ b/include/string.h
>> @@ -8,6 +8,8 @@
>>  void *mempcpy(void *dest, const void *src, size_t count);
>>  int strtobool(const char *str, int *val);
>>  char *strsep_unescaped(char **, const char *, char *);
>> +char *strtok(char *str, const char *delim);
>> +int strtokv(char *str, const char *delim, char ***vecp);
>>  char *stpcpy(char *dest, const char *src);
>>  bool strends(const char *str, const char *postfix);
>>  
>> diff --git a/lib/string.c b/lib/string.c
>> index 73637cd971..be7e65eb45 100644
>> --- a/lib/string.c
>> +++ b/lib/string.c
>> @@ -593,6 +593,72 @@ char *strsep_unescaped(char **s, const char *ct, char *delim)
>>          return sbegin;
>>  }
>>  
>> +/**
>> + * strtok - extract tokens from string
>> + * @str:	string to split
>> + * @delim:	set of delimiter characters
>> + *
>> + * The strtok() function breaks up a string into zero or more nonempty
>> + * tokens.  On the first call, the string to be parsed should be
>> + * specified in @str.  In each subsequent call that should parse the
>> + * same string, @str must be NULL.
>> + *
>> + * @delim specifies a set of bytes that delimit the tokens in the
>> + * string.
>> + *
>> + * Each call to strtok() returns a pointer to a string containing the
>> + * next token.  This is done by replacing the first delimiter with a
>> + * NUL character, the operation is thus destructive to the string. If
>> + * no more tokens are found, strtok() returns NULL.
>> + */
>> +char *strtok(char *str, const char *delim)
>> +{
>> +	static char *cursor;
>> +
>> +	if (str)
>> +		cursor = str;
>> +
>> +	if (!cursor)
>> +		return NULL;
>> +
>> +	cursor += strspn(cursor, delim);
>> +	if (*cursor == '\0') {
>> +		cursor = NULL;
>> +		return NULL;
>> +	}
>> +
>> +	return strsep(&cursor, delim);
>> +}
>> +EXPORT_SYMBOL(strtok);
>> +
>> +/**
>> + * strtokv - split string into array of tokens based on a delimiter set
>> + * @str:	string to split
>> + * @delim:	set of delimiter characters
>> + * @vecp:	array of tokens
>> + *
>> + * Split @str into tokens delimited by @delim, using strtok(), and
>> + * store the allocated token array in @vecp, which the caller is
>> + * responsible for freeing.
>> + *
>> + * Return: The number of tokens in the array.
>> + */
>> +int strtokv(char *str, const char *delim, char ***vecp)
>> +{
>> +	char *tok, **vec = NULL;
>> +	int cnt = 0;
>> +
>> +
>> +	for (tok = strtok(str, delim); tok; tok = strtok(NULL, delim)) {
>> +		vec = xrealloc(vec, (cnt + 1) * sizeof(*vec));
>> +		vec[cnt++] = tok;
>> +	}
>> +
>> +	*vecp = vec;
>> +	return cnt;
>> +}
>> +EXPORT_SYMBOL(strtokv);
>> +
>>  #ifndef __HAVE_ARCH_STRSWAB
>>  /**
>>   * strswab - swap adjacent even and odd bytes in %NUL-terminated string
>
> -- 
> Pengutronix e.K.                  |                             |
> Steuerwalder Str. 21              | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany         | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686  | Fax:   +49-5121-206917-5555 |



