Re: [PATCH 4/4] DTC: Begin the path to sane literals and expressions.

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jon Loeliger <jdl@jdl.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: linuxppc-dev@ozlabs.org
Subject: Re: [PATCH 4/4] DTC: Begin the path to sane literals and expressions.
Date: Fri, 26 Oct 2007 08:07:49 -0500	[thread overview]
Message-ID: <E1IlOuj-0000lw-7E@jdl.com> (raw)
In-Reply-To: Your message of "Fri, 26 Oct 2007 11:28:32 +1000." <20071026012832.GC457@localhost.localdomain>

So, like, the other day David Gibson mumbled:
> 
> Ah... I think I see the source of our misunderstanding.  Sorry if I
> was unclear.  I'm not saying that the version token would be
> invisible to the parser, just that it would be recognized by the lexer
> first.

Ah! Right.  OK, I see what you are saying now.

> The nice thing about having a token, is that if necessary we can
> completely change the grammar for each version, without having to have
> tangled rules that have to generate yyerror()s in some circumstances
> depending on the version variable.  The alternate grammars can be
> encoded directly into the yacc rules:
> 	startsymbol : version0_file
> 		    | V1_TOKEN version1_file
> 		    | V2_TOKEN version2_file
> 		    ;

Hmmm...  Now that I see that your symbol is still in the grammar,
I can see this part as well.  OK.  I'll buy it.

> > > I'm also inclined to leave the syntax for bytestrings as it is, in
> > 
> > Why?  Why not be allowed to form up a series of expressions
> > that make up a byte string? Am I missing something obvious here?
> 
> Because part of the point of bytestrings is to provide representation
> for binary data.  For a MAC address, say
> 	[0x00 0x0a 0xe4 0x2c 0x23 0x1f]
> is way bulkier than
> 	[000ae42c231f]

No, I think you misuderstand what I was after.  I'm not after the
the latter [000ae4...].  In that case, there would be multiple
expressions, each no bigger than 8 bits wide:

    [ expr expr expr    expr  expr      expr ]
    [ 0x00   10 0x4  0x20+12 '0'+3  0x20 - 1 ]

or whatever seemed appropriate.  It would not be one giant value.

> And in bytestring context, I suspect having every expression result be
> truncated to bytesize will be way more of a gotcha than in cell
> context.

Which is why we run a semantic checking as well and warn on
values not fitting in container sizes.

> I suspect we can get the expression flexibility we want here by
> providing the right operators to act *on* bytestrings, rather than
> within bytestrings.

That too.  No problem.  I suspect some may be functional, though.
Haven't thought about that a bunch yet.  I just want to get
basis stuff in first.

> Hrm.  I think just exprval or intval would be better.  Actually
> probably intval, since last we spoke I though we were planning on
> having expressions of string and bytestring types as well.

Except I think we want more generalized than that.

> Incidentally, there's another problem here: we haven't solved the
> problem about having to allow property names with initial digits.

I know.

> That's a particular problem here, because although we can make
> literals scanned in preference to propnames of the same length, in
> this case
> 	0x1234..0xabcd
> Will be scanned as one huge propname.

I know.  White space is mandatory right now.

> This might work for you at the moment, if you've still got all the
> lexer states, but I was really hoping we could ditch most of them with
> the new literals.

Which is really why they are all still there.  Longer term,
I want to _quit_ supporting "version 0" and remove the cruft...

> But you haven't actually addressed my concern about this.  Actually
> it's worse that I said then, because
> 	<0x10000000 -999>
> is ambiguous.  Is it a single subtraction expression, or one literal
> cell followed by an expression cell with a unary '-'?

Gah.

Paren'ed expressions may be the thing to do.
How do you feel about comma separation?

Anyone else care to chime in?

> > > > +unsigned int dts_version = 0;

> Yeah, I figured this out after.  Youch, an even tighter and harder to
> follow coupling between lexer and parser execution order.  I can think
> of at least two better ways to do this.

I'm listening... :-)

> 1) handle d# b# etc. at the lexer lexel, with a regex like
> (d#{WS}*[0-9]+).  Strictly speaking that changes the language, but I
> don't think anyone's been insane enough to do something like "d#
> /*stupid comment*/ 999".  That would remove the whole ugly
> opt_cell_base tangle from the grammar.

That seems like it could work...

> 2) Have the lexer just pass up literals as strings, and let the parser
> do the conversion to integer, based on the grammatical context.  I
> think this is preferable because it has other advantages: we can do
> the distinction between 64-bit values for memreserve and 32-bit values
> for cell at the grammatical level.  It can also be used to handle the
> propname/literal ambiguity without lexer states (I had a patch a while
> back which removed the MEMRESERVE and CELLDATA lex states using this
> technique).

I'm not so keen on that approach, I don't think.

> > The same call to set_dts_version() as any other case.
> 
> Erm... which same call to set_dts_version()?  Surely not the one in
> the parser..

I'm clearly not understanding your point, I'm afraid.  There are
static default values here:

    /*
     * DTS sourcefile version.
     */
    unsigned int dts_version = 0;
    unsigned int expr_default_base = 10;

And there is a call to set_dts_version() made when any DTS file
is parsed, which happens before any -O option is even handled.

What am I missing?

jdl

next prev parent reply	other threads:[~2007-10-26 13:07 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-19 17:43 [PATCH 4/4] DTC: Begin the path to sane literals and expressions Jon Loeliger
2007-10-20  8:47 ` David Gibson
2007-10-21  5:32   ` Segher Boessenkool
2007-10-22  0:37     ` David Gibson
2007-10-25 18:24   ` Jon Loeliger
2007-10-26  1:28     ` David Gibson
2007-10-26 13:07       ` Jon Loeliger [this message]
2007-10-26 14:03         ` David Gibson
2007-10-21  5:30 ` Segher Boessenkool
2007-10-22  0:51   ` David Gibson
2007-10-22 12:36   ` Jon Loeliger
2007-10-23  0:33     ` David Gibson
2007-10-23  0:57       ` Segher Boessenkool

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E1IlOuj-0000lw-7E@jdl.com \
    --to=jdl@jdl.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=linuxppc-dev@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.