linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jessica Yu <jeyu@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Kees Cook <keescook@chromium.org>,
	linux-kernel@vger.kernel.org, Jessica Yu <jeyu@redhat.com>
Subject: [PATCH v4] sscanf: implement basic character sets
Date: Fri, 26 Feb 2016 15:20:59 -0500	[thread overview]
Message-ID: <1456518059-7472-1-git-send-email-jeyu@redhat.com> (raw)

Implement basic character sets for the '%[' conversion specifier.

The '%[' conversion specifier matches a nonempty sequence of characters
from the specified set of accepted (or with '^', rejected) characters
between the brackets. The substring matched is to be made up of characters
in (or not in) the set. This is useful for matching substrings that are
delimited by something other than spaces.

This implementation differs from its glibc counterpart in the following ways:
(1) No support for character ranges (e.g., 'a-z' or '0-9')
(2) The hyphen '-' is not a special character
(3) The closing bracket ']' cannot be matched
(4) No support (yet) for discarding matching input ('%*[')

Signed-off-by: Jessica Yu <jeyu@redhat.com>
---
Patch based on linux-next-20160226.

v4:
 - To avoid allocations (i.e. kstrndup), use a bitmap to represent
   the character set (suggested by Rasmus Villemoes)
 - Add a comment to document non-glibc behavior and provide example usage
 - Check for '[' in the '*' (discard) case. Since it is not supported
   yet, it is considered malformed input

v3:
 - Fix memory leak in error path (kfree() before returning)
 - Remove redundant condition in while loop
 - Style fix (*op)() -> op()

v2:
 - Use kstrndup() to copy the character set from fmt instead of using a
   statically allocated array

 lib/vsprintf.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 525c8e1..9a3b860 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2640,8 +2640,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
 		if (*fmt == '*') {
 			if (!*str)
 				break;
-			while (!isspace(*fmt) && *fmt != '%' && *fmt)
+			while (!isspace(*fmt) && *fmt != '%' && *fmt) {
+				/* '%*[' not yet supported, invalid format */
+				if (*fmt == '[')
+					return num;
 				fmt++;
+			}
 			while (!isspace(*str) && *str)
 				str++;
 			continue;
@@ -2714,6 +2718,57 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
 			num++;
 		}
 		continue;
+		/*
+		 * Warning: This implementation of the '[' conversion specifier
+		 * deviates from its glibc counterpart in the following ways:
+		 * (1) It does NOT support ranges i.e. '-' is NOT a special character
+		 * (2) It cannot match the closing bracket ']' itself
+		 * (3) A field width is required
+		 * (4) '%*[' (discard matching input) is currently not supported
+		 *
+		 * Example usage:
+		 * ret = sscanf("00:0a:95","%2[^:]:%2[^:]:%2[^:]", buf1, buf2, buf3);
+		 * if (ret < 3)
+		 *    // etc..
+		 */
+		case '[':
+		{
+			char *s = (char *)va_arg(args, char *);
+			DECLARE_BITMAP(set, 256) = {0};
+			unsigned int len = 0;
+			bool negate = (*fmt == '^');
+
+			/* field width is required */
+			if (field_width == -1)
+				return num;
+
+			if (negate)
+				++fmt;
+
+			for ( ; *fmt && *fmt != ']'; ++fmt, ++len)
+				set_bit((u8)*fmt, set);
+
+			/* no ']' or no character set found */
+			if (!*fmt || !len)
+				return num;
+			++fmt;
+
+			if (negate) {
+				bitmap_complement(set, set, 256);
+				/* exclude null '\0' byte */
+				clear_bit(0, set);
+			}
+
+			/* match must be non-empty */
+			if (!test_bit((u8)*str, set))
+				return num;
+
+			while (test_bit((u8)*str, set) && field_width--)
+				*s++ = *str++;
+			*s = '\0';
+			++num;
+		}
+		continue;
 		case 'o':
 			base = 8;
 			break;
-- 
2.4.3

             reply	other threads:[~2016-02-26 20:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-26 20:20 Jessica Yu [this message]
2016-02-26 20:28 ` sscanf: implement basic character sets Jessica Yu
2016-03-07 23:12   ` Jessica Yu
2016-03-07 23:24     ` Andrew Morton
2016-03-07 23:32       ` Rasmus Villemoes
2016-03-08  1:07       ` Jessica Yu
2016-03-02 23:49 ` [PATCH v4] " Rasmus Villemoes
2016-03-07 23:09   ` Jessica Yu
  -- strict thread matches above, loose matches on Subject: below --
2016-04-05 14:32 [PATCH v4] " Shahbaz Youssefi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1456518059-7472-1-git-send-email-jeyu@redhat.com \
    --to=jeyu@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).