From: Petr Vorel <pvorel@suse.cz>
To: ltp@lists.linux.it
Subject: [LTP] [PATCH v4 1/1] docparse: Handle special characters in JSON
Date: Thu, 6 May 2021 21:35:01 +0200
Message-ID: <YJRE5TQQfezcPeKr@pevik>
In-Reply-To: <YJQAukLCEqSX1X/9@yuki>

Hi Cyril,

Looking at your code, I'm not sure if it's needed.

> > +static inline void data_fprintf_esc(FILE *f, unsigned int padd, const char *str)
> > +{
> > +	while (padd-- > 0)
> > +		fputc(' ', f);
> > +
> > +	fputc('"', f);

> 	int was_backslash = 0;

> > +	while (*str) {
> > +		switch (*str) {
> > +		case '\\':
> > +		break;
> > +		case '"':
> > +			fputs("\\\"", f);
> 			was_backslash = 0;
> > +			break;
> > +		case '\t':
> > +			fputs("        ", f);
> > +			break;
> > +		default:
> > +			/* RFC 8259 specifies chars before 0x20 as invalid */
> > +			if (*str >= 0x20)
> > +				putc(*str, f);
> > +			else
> > +				fprintf(stderr, "%s:%d %s(): invalid character for JSON: %x\n",
> > +						__FILE__, __LINE__, __func__, *str);
> > +			break;
> > +		}

> 		if (was_backslash)
> 			fputs("\\\\", f);

> 		was_backslash = (*str == '\\');
> > +		str++;
> > +	}
> > +
> > +	fputc('"', f);
> > +}

> This should avoid "unescaping" an escaped double quote. We defer
> printing the backslash until we know the character after it and we make
> sure that we do not escape backslash before ".

> Consider what would happen if someone did put a "\"text\"" into options
> strings, the original code would escape the backslashes and we would end
> up with "\\"text"\\" which would break parser again.

> This way we can at least avoid parsing errors until we fix the problem
> one level down in the parser where we have the context required for a
> proper fix.

It looks to me like it works the same with and without was_backslash.

Trying to escape \" results in first escaping \ (=> \\), then " (=> \").

Example C code:

/*\
 * [Description]
 * "expected" \\ behaviour "\"text\""
 */

static struct tst_test test = {
	.options = (struct tst_option[]) {
		{"a:", &can_dev_name, "\"text \\ \""},
		{}
	},
};

The results from both the original code and yours with was_backslash are
valid JSON, but was_backslash adds extra backslashes.

result from original code:

  "testfile": {
   "options": [
     [
      "a:",
      "can_dev_name",
      "\\\"text \\\\ \\\""
     ]
    ],
   "doc": [
     "[Description]",
     "\"expected\" \\\\ behaviour \"\\\"text\\\"\""
    ],
   "fname": "testfile.c"
  }

result from was_backslash:
  "testfile": {
   "options": [
     [
      "a:",
      "can_dev_name",
      "\\\"text \\\\\\ \\\\\""
     ]
    ],
   "doc": [
     "[Description]",
     "\"expected\" \\\\\\ \\behaviour \"\\\"text\\\"\""
    ],
   "fname": "testfile.c"
  }

What am I missing?

Kind regards,
Petr
