From: Benjamin Marzinski <bmarzins@redhat.com>
To: Abhinav Jain <jain.abhinav177@gmail.com>
Cc: agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com,
dm-devel@lists.linux.dev, linux-kernel@vger.kernel.org,
skhan@linuxfoundation.org, javier.carrasco.cruz@gmail.com
Subject: Re: [PATCH v2] dm: Add support for escaped characters in str_field_delimit()
Date: Fri, 14 Jun 2024 16:12:25 -0400
Message-ID: <ZmykKYjVP1xU4J3d@redhat.com>
In-Reply-To: <20240613162632.38065-1-jain.abhinav177@gmail.com>
On Thu, Jun 13, 2024 at 04:26:32PM +0000, Abhinav Jain wrote:
> Remove all the escape characters that come before separator.
> Tested this code by writing a dummy program containing the two
> functions and testing it on below input, sharing results:
>
> Original string: "field1\,with\,commas,field2\,with\,more\,commas"
> Field: "field1"
> Field: "with"
> Field: "commas"
> Field: "field2"
> Field: "with"
> Field: "more"
> Field: "commas"
But that's not the output that you want here. The purpose of escaping
the separator is so that the separator character remains in the field
without the escape character and without acting as a separator.
The output you would want is:
Field: "field1,with,commas"
Field: "field2,with,more,commas"
>
> Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
> ---
> PATCH v1:
> https://lore.kernel.org/all/20240609141721.52344-1-jain.abhinav177@gmail.com/
>
> Changes since v1:
> - Modified the str_field_delimit function as per shared feedback
> - Added remove_escaped_characters function
> ---
> ---
> drivers/md/dm-init.c | 53 +++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 47 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/md/dm-init.c b/drivers/md/dm-init.c
> index 2a71bcdba92d..0e31ecf1b48e 100644
> --- a/drivers/md/dm-init.c
> +++ b/drivers/md/dm-init.c
> @@ -76,6 +76,24 @@ static void __init dm_setup_cleanup(struct list_head *devices)
> }
> }
>
> +/* Remove escape characters from a given field string. */
> +static void __init remove_escape_characters(char *field)
> +{
This means that there is no way to have the escape character in a field,
which is a valid character in device-mapper names and UUIDs. "bad\name"
is a valid device-mapper name. So is "badname\".
This brings up a different point: both the separator characters and the
escape character aren't valid for udev names. If you want this to work
correctly with udev and let users enter these unfortunate names, there
is more mangling that will need to get done later. I'm not sure all this
is super useful just to let people use poorly chosen device names/uuids.
Is there some other purpose for this work that I'm missing?
Assuming there are reasons to do this work, the only strings that need
to be changed by this function are
<start_of_string>\<separator><rest_of_string>
which needs to be changed to
<start_of_string><separator><rest_of_string>
and
<start_of_string>\\
which needs to be changed to
<start_of_string>\
This is assuming that "\\<separator>" is what you would use to end your
field in a \, escaping the escape, so that it didn't interfere with the
separator.
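To make the two rewrite rules above concrete, here is a minimal userspace
sketch. The name unescape_field() and the extra separator parameter are
hypothetical, not part of the patch or of dm-init.c, and the sketch assumes
the field has already been split off and null-terminated. Every other
backslash is left alone, so "bad\name" and "badname\" pass through unchanged.

```c
/*
 * Hypothetical helper (not from the patch): unescape one field in place.
 * Assumes the field was already delimited and is null-terminated.
 */
static void unescape_field(char *field, char separator)
{
	char *src = field;
	char *dest = field;

	while (*src) {
		if (src[0] == '\\' && src[1] == separator) {
			/* "\<separator>" becomes "<separator>" */
			src++;
		} else if (src[0] == '\\' && src[1] == '\\' && src[2] == '\0') {
			/* a trailing "\\" becomes a single "\" */
			src++;
		}
		/* copy the current character, including lone backslashes */
		*dest++ = *src++;
	}
	*dest = '\0';
}
```

With ',' as the separator, this turns "field1\,with\,commas" into
"field1,with,commas" and "badname\\" into "badname\".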
> + char *src = field;
> + char *dest = field;
> +
> + while (*src) {
> + if (*src == '\\') {
> + src++;
> + if (*src)
> + *dest++ = *src++;
> + } else {
> + *dest++ = *src++;
> + }
> + }
> + *dest = '\0';
> +}
> +
> /**
> * str_field_delimit - delimit a string based on a separator char.
> * @str: the pointer to the string to delimit.
> @@ -87,16 +105,39 @@ static void __init dm_setup_cleanup(struct list_head *devices)
> */
> static char __init *str_field_delimit(char **str, char separator)
> {
> - char *s;
> + char *s, *escaped, *field;
>
> - /* TODO: add support for escaped characters */
> *str = skip_spaces(*str);
> s = strchr(*str, separator);
> - /* Delimit the field and remove trailing spaces */
> - if (s)
> +
> + /* Check for escaped character */
> + escaped = strchr(*str, '\\');
> + while (escaped && (s == NULL || escaped < s)) {
> + /*
> + * Move the separator search ahead if escaped
> + * character comes before.
> + */
> + s = strchr(escaped + 1, separator);
> + escaped = strchr(escaped + 1, '\\');
> + }
> +
This code still splits the string at every separator. It should probably
just scan for separators, and split the string when it finds the first
one that does not have exactly one escape character before it.
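One possible shape for that scan, as a userspace sketch only: the helper
name find_unescaped_separator() is hypothetical, and the sketch assumes that
"exactly one escape character" means exactly one backslash in the run
immediately preceding the separator, so "\\<separator>" still ends the field.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): return a pointer to the first
 * separator that is not preceded by exactly one backslash, or NULL if the
 * string contains no such separator.
 */
static char *find_unescaped_separator(char *s, char separator)
{
	size_t backslashes = 0;	/* length of the backslash run just before *s */

	for (; *s; s++) {
		if (*s == separator && backslashes != 1)
			return s;
		backslashes = (*s == '\\') ? backslashes + 1 : 0;
	}
	return NULL;
}
```

For "a\,b,c" this skips the first comma and returns the second one; for
"a\\,b" it returns the only comma, since two backslashes precede it.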
> + /* If we found a separator, we need to handle escape characters */
> + if (s) {
> + *s = '\0';
> +
> + remove_escape_characters(*str);
> + field = *str;
> + *str = s + 1;
> + } else {
> + /* Handle the last field when no separator is present */
If no separator is present, there's nothing to do. strlen() only works
on strings that are already null-terminated.
> + s = *str + strlen(*str);
> *s = '\0';
> - *str = strim(*str);
Why skip trimming the string?
> - return s ? ++s : NULL;
> +
> + remove_escape_characters(*str);
> + field = *str;
> + *str = s;
> + }
This function is supposed to return the rest of the string after the
separator, and *str is supposed to point to the start of the field
after skipping the initial spaces.
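As a rough illustration of that contract, here is a userspace sketch of an
escape-aware variant. All names ending in _sketch are hypothetical; the two
small helpers only approximate the kernel's skip_spaces() and the
trailing-space trimming done by strim(), and removing the escape characters
from the field is deliberately left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's skip_spaces(). */
static char *skip_spaces_sketch(char *s)
{
	while (isspace((unsigned char)*s))
		s++;
	return s;
}

/* Userspace stand-in for the trailing-space trimming that strim() does. */
static void trim_trailing_sketch(char *s)
{
	size_t len = strlen(s);

	while (len && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';
}

/*
 * Hypothetical variant that keeps the original contract: *str ends up at
 * the start of the trimmed field, and the return value is the rest of the
 * string after the separator, or NULL if this was the last field.
 */
static char *str_field_delimit_sketch(char **str, char separator)
{
	size_t backslashes = 0;
	char *s;

	*str = skip_spaces_sketch(*str);
	for (s = *str; *s; s++) {
		/* split only where the separator is not singly escaped */
		if (*s == separator && backslashes != 1)
			break;
		backslashes = (*s == '\\') ? backslashes + 1 : 0;
	}
	if (*s)
		*s++ = '\0';	/* end the field, step past the separator */
	else
		s = NULL;	/* no separator left: last field */
	trim_trailing_sketch(*str);
	return s;
}
```

Called repeatedly on "  name1 , uuid\,x,rest" with ',' as the separator, the
fields come out as "name1", "uuid\,x" and "rest", and the final call returns
NULL.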
-Ben
> + return field;
> }
>
> /**
> --
> 2.34.1