From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:42277)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1SuQf9-00032x-4Q
	for qemu-devel@nongnu.org; Thu, 26 Jul 2012 12:11:47 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1SuQf3-0001gs-38
	for qemu-devel@nongnu.org; Thu, 26 Jul 2012 12:11:43 -0400
Received: from mx1.redhat.com ([209.132.183.28]:25378)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1SuQf2-0001go-R8
	for qemu-devel@nongnu.org; Thu, 26 Jul 2012 12:11:37 -0400
From: Markus Armbruster
References: <1343235256-26310-1-git-send-email-lcapitulino@redhat.com>
	<1343235256-26310-8-git-send-email-lcapitulino@redhat.com>
	<20120725161813.538012f0@doriath.home>
	<878ve6epif.fsf@blackfin.pond.sub.org>
	<20120726104749.5be67b6a@doriath.home>
Date: Thu, 26 Jul 2012 18:11:30 +0200
In-Reply-To: <20120726104749.5be67b6a@doriath.home> (Luiz Capitulino's
	message of "Thu, 26 Jul 2012 10:47:49 -0300")
Message-ID: <87394ev6x9.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [PATCH 07/11] qapi: qapi.py: allow the "'" character be escaped
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Luiz Capitulino
Cc: Peter Maydell, aliguori@us.ibm.com, pbonzini@redhat.com,
	qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com

Luiz Capitulino writes:

> On Thu, 26 Jul 2012 13:22:00 +0200
> Markus Armbruster wrote:
>
>> Peter Maydell writes:
>>
>> > On 25 July 2012 20:18, Luiz Capitulino wrote:
>> >> Peter Maydell wrote:
>> >>> On 25 July 2012 17:54, Luiz Capitulino wrote:
>> >>> > --- a/scripts/qapi.py
>> >>> > +++ b/scripts/qapi.py
>> >>> > @@ -21,7 +21,9 @@ def tokenize(data):
>> >>> >          elif data[0] == "'":
>> >>> >              data = data[1:]
>> >>> >              string = ''
>> >>> > -            while data[0] != "'":
>> >>> > +            while True:
>> >>> > +                if data[0] == "'" and string[len(string)-1] != "\\":
>> >>> > +                    break
>> >>> >                  string += data[0]
>> >>> >                  data = data[1:]
>> >>> >              data = data[1:]
>> >>>
>> >>> Won't this cause us to look at string[-1] if
>> >>> the input data has two ' characters in a row?
>> >>
>> >> Non escaped? If you meant '' that's a zero length string and
>> >> should work, but
>> >> if you meant 'foo '' bar' that's illegal, because ' characters
>> >> should be escaped.
>> >
>> > I meant the zero length string case. yes. We come in with data = "''",
>> > strip the first ' and set string to empty. Then in the first time
>> > in the while loop data[0] is "'" but len(string) is 0 and so we'll
>> > do string[-1] which I think will throw an exception.
>> >
>> > ...and yep, quick test of a nobbbled qapi-schema.json confirms:
>> > $ python /home/pm215/src/qemu/qemu/scripts/qapi-types.py -h -o "." <
>> > /home/pm215/src/qemu/qemu/qapi-schema.json
>> > Traceback (most recent call last):
>> >   File "/home/pm215/src/qemu/qemu/scripts/qapi-types.py", line 260, in
>> >     exprs = parse_schema(sys.stdin)
>> >   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 78, in parse_schema
>> >     expr_eval = evaluate(expr)
>> >   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 64, in evaluate
>> >     return parse(map(lambda x: x, tokenize(string)))[0]
>> >   File "/home/pm215/src/qemu/qemu/scripts/qapi.py", line 25, in tokenize
>> >     if data[0] == "'" and string[len(string)-1] != "\\":
>> > IndexError: string index out of range
>> >
>> > Try this (very lightly tested but seems to work):
>> > (feel free to do something nicer than raising an exception on
>> > the syntax error, and sorry I'm feeling too lazy to make this
>> > an actual patch email)
>> >
>> > Signed-off-by: Peter Maydell
>
> Peter, I've replaced my original 07/11 patch with your patch below.
>
>> >
>> > --- a/scripts/qapi.py
>> > +++ b/scripts/qapi.py
>> > @@ -21,10 +21,16 @@ def tokenize(data):
>> >          elif data[0] == "'":
>> >              data = data[1:]
>> >              string = ''
>> > -            while data[0] != "'":
>> > -                string += data[0]
>> > -                data = data[1:]
>> > -            data = data[1:]
>> > +            while True:
>> > +                pos = data.find("'")
>> > +                if pos == -1:
>> > +                    raise Exception("Mismatched quotes")
>> > +                string += data[0:pos]
>> > +                data = data[pos+1:]
>> > +                if len(string) == 0 or string[-1] != "\\":
>> > +                    # found a ' and it wasn't escaped
>> > +                    break
>> > +                string = string[0:-1] + "'"
>> >              yield string
>> >
>> >  def parse(tokens):
>> >
>> > (if anybody wants to be able to use '\\' to escape escapes then
>> > this approach is a bit stuffed, of course.)

An escape mechanism that can't be escaped sucks :)

>> For what it's worth, the orthodox way to lexically analyze strings is a
>> finite automaton.  Utterly untested sketch:
>
> Feel free to send a patch if you're strong about this.

I'll leave that to the poor guy who first needs to escape escapes.

[...]
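
For readers who want to see what the finite-automaton approach could look
like, here is a minimal illustration.  It is not the sketch from the quoted
message (that one is trimmed above), and it is not code from
scripts/qapi.py.  The function name lex_quoted, the two state names, and
the choice of ValueError are all assumptions made for this example.  The
scanner keeps one bit of state, "normal" or "escape", so a backslash can
escape both the quote and another backslash, and the empty string '' never
triggers an index into an empty buffer.

```python
def lex_quoted(data):
    """Scan one single-quoted string at the start of data.

    Returns (contents, remaining_input).  Hypothetical helper for
    illustration only; not part of scripts/qapi.py.
    """
    if not data.startswith("'"):
        raise ValueError("expected opening quote")
    out = []
    state = "normal"          # becomes "escape" right after a backslash
    for pos in range(1, len(data)):
        ch = data[pos]
        if state == "escape":
            out.append(ch)    # take the escaped character literally
            state = "normal"
        elif ch == "\\":
            state = "escape"  # drop the backslash, remember we saw it
        elif ch == "'":
            # unescaped quote: the string ends here
            return "".join(out), data[pos + 1:]
        else:
            out.append(ch)
    # ran out of input without seeing an unescaped closing quote
    raise ValueError("mismatched quotes")
```

With this shape, lex_quoted("''") yields an empty string, and an input
ending in an escaped quote with no closing quote is reported as mismatched
rather than silently accepted.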