From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755264AbcBZUVQ (ORCPT ); Fri, 26 Feb 2016 15:21:16 -0500 Received: from mx1.redhat.com ([209.132.183.28]:58383 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755238AbcBZUVM (ORCPT ); Fri, 26 Feb 2016 15:21:12 -0500 From: Jessica Yu To: Andrew Morton Cc: Rasmus Villemoes , Andy Shevchenko , Kees Cook , linux-kernel@vger.kernel.org, Jessica Yu Subject: [PATCH v4] sscanf: implement basic character sets Date: Fri, 26 Feb 2016 15:20:59 -0500 Message-Id: <1456518059-7472-1-git-send-email-jeyu@redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Implement basic character sets for the '%[' conversion specifier. The '%[' conversion specifier matches a nonempty sequence of characters from the specified set of accepted (or with '^', rejected) characters between the brackets. The substring matched is to be made up of characters in (or not in) the set. This is useful for matching substrings that are delimited by something other than spaces. This implementation differs from its glibc counterpart in the following ways: (1) No support for character ranges (e.g., 'a-z' or '0-9') (2) The hyphen '-' is not a special character (3) The closing bracket ']' cannot be matched (4) No support (yet) for discarding matching input ('%*[') Signed-off-by: Jessica Yu --- Patch based on linux-next-20160226. v4: - To avoid allocations (i.e. kstrndup), use a bitmap to represent the character set (suggested by Rasmus Villemoes) - Add a comment to document non-glibc behavior and provide example usage - Check for '[' in the '*' (discard) case. Since it is not supported yet, it is considered malformed input v3: - Fix memory leak in error path (kfree() before returning) - Remove redundant condition in while loop - Style fix (*op)() -> op() v2: - Use kstrndup() to copy the character set from fmt instead of using a statically allocated array lib/vsprintf.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 56 insertions(+), 1 deletion(-) diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 525c8e1..9a3b860 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -2640,8 +2640,12 @@ int vsscanf(const char *buf, const char *fmt, va_list args) if (*fmt == '*') { if (!*str) break; - while (!isspace(*fmt) && *fmt != '%' && *fmt) + while (!isspace(*fmt) && *fmt != '%' && *fmt) { + /* '%*[' not yet supported, invalid format */ + if (*fmt == '[') + return num; fmt++; + } while (!isspace(*str) && *str) str++; continue; @@ -2714,6 +2718,57 @@ int vsscanf(const char *buf, const char *fmt, va_list args) num++; } continue; + /* + * Warning: This implementation of the '[' conversion specifier + * deviates from its glibc counterpart in the following ways: + * (1) It does NOT support ranges i.e. '-' is NOT a special character + * (2) It cannot match the closing bracket ']' itself + * (3) A field width is required + * (4) '%*[' (discard matching input) is currently not supported + * + * Example usage: + * ret = sscanf("00:0a:95","%2[^:]:%2[^:]:%2[^:]", buf1, buf2, buf3); + * if (ret < 3) + * // etc.. + */ + case '[': + { + char *s = (char *)va_arg(args, char *); + DECLARE_BITMAP(set, 256) = {0}; + unsigned int len = 0; + bool negate = (*fmt == '^'); + + /* field width is required */ + if (field_width == -1) + return num; + + if (negate) + ++fmt; + + for ( ; *fmt && *fmt != ']'; ++fmt, ++len) + set_bit((u8)*fmt, set); + + /* no ']' or no character set found */ + if (!*fmt || !len) + return num; + ++fmt; + + if (negate) { + bitmap_complement(set, set, 256); + /* exclude null '\0' byte */ + clear_bit(0, set); + } + + /* match must be non-empty */ + if (!test_bit((u8)*str, set)) + return num; + + while (test_bit((u8)*str, set) && field_width--) + *s++ = *str++; + *s = '\0'; + ++num; + } + continue; case 'o': base = 8; break; -- 2.4.3