public inbox for ltp@lists.linux.it
From: Petr Vorel <pvorel@suse.cz>
To: ltp@lists.linux.it
Subject: [LTP] [PATCH v4 1/1] docparse: Handle special characters in JSON
Date: Thu, 6 May 2021 20:21:37 +0200
Message-ID: <YJQzsUBhaD4wQyrF@pevik>
In-Reply-To: <YJQAukLCEqSX1X/9@yuki>

Hi Cyril,
> Hi!
> > * escape backslash (/) and double quote (")
>                       ^
> 		      \
+1

> >   escaping backslash effectively escapes other C escaped strings (\t,
> >   \n, ...), which we sometimes want (in the comment) but sometimes not
> >   (in .option we want to have them interpreted)
> > * replace tab with 8x space
> > * skip and TWARN invalid chars (< 0x20, i.e. anything before space)
>              ^
> 	     warn on? We are not actually using TWARN here, right?
Yep, I didn't update the commit message (at first I included tst_test.h with
TST_NO_DEFAULT_MAIN, but the include path was missing, so plain stderr is enough).

> >   defined by RFC 8259 (https://tools.ietf.org/html/rfc8259#page-9)

> > NOTE: atm fix is required only for ", but tab was problematic in the past.

> > TODO: This is just a "hot fix" solution before release. Proper solution
> > would be to check if chars needed to be escaped (", \, /) aren't already
> > escaped.

> > Also for correct decision whether \n, \t should be escaped or interpreted
> > we should decide in the parser which has the context. C string should be
> > probably interpreted (thus nothing needed to be done as it escapes in
> > a compatible way with JSON), but comments probably should display \n, \t
> > thus add extra \.

> > Fixes: c39b29f0a ("bpf: Check truncation on 32bit div/mod by zero")

> > Suggested-by: Cyril Hrubis <chrubis@suse.cz>
> > Co-developed-by: Cyril Hrubis <chrubis@suse.cz>
> > Signed-off-by: Petr Vorel <pvorel@suse.cz>
> > ---
> >  docparse/data_storage.h | 36 +++++++++++++++++++++++++++++++++++-
> >  1 file changed, 35 insertions(+), 1 deletion(-)

> > diff --git a/docparse/data_storage.h b/docparse/data_storage.h
> > index ef420c08f..9f36dd6f0 100644
> > --- a/docparse/data_storage.h
> > +++ b/docparse/data_storage.h
> > @@ -256,6 +256,40 @@ static inline void data_fprintf(FILE *f, unsigned int padd, const char *fmt, ...
> >  	va_end(va);
> >  }

> > +
> > +static inline void data_fprintf_esc(FILE *f, unsigned int padd, const char *str)
> > +{
> > +	while (padd-- > 0)
> > +		fputc(' ', f);
> > +
> > +	fputc('"', f);

> 	int was_backslash = 0;

> > +	while (*str) {
> > +		switch (*str) {
> > +		case '\\':
> > +			break;
> > +		case '"':
> > +			fputs("\\\"", f);
> 			was_backslash = 0;
> > +			break;
> > +		case '\t':
> > +			fputs("        ", f);
> > +			break;
> > +		default:
> > +			/* RFC 8259 specify  chars before 0x20 as invalid */
> > +			if (*str >= 0x20)
> > +				putc(*str, f);
> > +			else
> > +				fprintf(stderr, "%s:%d %s(): invalid character for JSON: %x\n",
> > +						__FILE__, __LINE__, __func__, *str);
> > +			break;
> > +		}

> 		if (was_backslash)
> 			fputs("\\\\", f);

> 		was_backslash = (*str == '\\');
> > +		str++;
> > +	}
> > +
> > +	fputc('"', f);
> > +}

> This should avoid "unescaping" an escaped double quote. We defer
> printing the backslash until we know the character after it and we make
> sure that we do not escape a backslash before ".

> Consider what would happen if someone did put a "\"text\"" into options
> strings, the original code would escape the backslashes and we would end
> up with "\\"text"\\" which would break parser again.

> This way we can at least avoid parsing errors until we fix the problem
> one level down in the parser where we have the context required for a
> proper fix.

+1.
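
For reference, a minimal standalone sketch of the deferred-backslash idea
discussed above. It is not the code from the patch: json_escape_deferred()
and buf_append() are hypothetical names, it writes into a caller-supplied
buffer instead of a FILE * so it is easy to check, it emits the pending
backslash before the character that follows it, and it compares bytes as
unsigned char (an assumption, so that bytes >= 0x80 pass through).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated dst, truncating at cap. */
static void buf_append(char *dst, size_t cap, const char *src)
{
	size_t len = strlen(dst);

	while (*src && len + 1 < cap)
		dst[len++] = *src++;
	dst[len] = '\0';
}

/*
 * Hypothetical sketch, not LTP code: write str to out as a quoted JSON
 * string. A backslash is held back until the next character is known,
 * so an already escaped \" is kept as is instead of being escaped twice.
 */
static void json_escape_deferred(const char *str, char *out, size_t cap)
{
	int pending_backslash = 0;
	char ch[2] = { 0, 0 };

	if (cap == 0)
		return;

	out[0] = '\0';
	buf_append(out, cap, "\"");

	for (; *str; str++) {
		if (pending_backslash) {
			pending_backslash = 0;
			if (*str == '"') {
				/* input already had \" : pass it through */
				buf_append(out, cap, "\\\"");
				continue;
			}
			/* lone backslash: escape it, then handle *str below */
			buf_append(out, cap, "\\\\");
		}

		switch (*str) {
		case '\\':
			pending_backslash = 1;
			break;
		case '"':
			buf_append(out, cap, "\\\"");
			break;
		case '\t':
			buf_append(out, cap, "        ");
			break;
		default:
			/* RFC 8259: characters below 0x20 are not allowed unescaped */
			if ((unsigned char)*str >= 0x20) {
				ch[0] = *str;
				buf_append(out, cap, ch);
			}
			/* else: skipped here (the patch warns on stderr) */
			break;
		}
	}

	/* a trailing backslash has nothing after it, so escape it */
	if (pending_backslash)
		buf_append(out, cap, "\\\\");

	buf_append(out, cap, "\"");
}
```

With this sketch the input \"text\" comes out as "\"text\"", while a lone
backslash followed by n comes out as \\n. An input such as \\" (an escaped
backslash followed by a bare quote) is still misread, which is the part that
needs the parser context mentioned above.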

I'll test it and merge it under your name, as it's basically your work :).
Thanks!

Kind regards,
Petr

