Re: [PATCH] c2xml - Josh Triplett

linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Josh Triplett <josht@linux.vnet.ibm.com>
To: Rob Taylor <rob.taylor@codethink.co.uk>
Cc: linux-sparse@vger.kernel.org
Subject: Re: [PATCH] c2xml
Date: Wed, 27 Jun 2007 11:49:47 -0700	[thread overview]
Message-ID: <1182970187.8970.145.camel@josh-work.beaverton.ibm.com> (raw)
In-Reply-To: <46826B68.5040302@codethink.co.uk>

On Wed, 2007-06-27 at 14:51 +0100, Rob Taylor wrote:
> Here's something I've hacked up for my work on gobject-introspection
> [1]. It basically dumps the parse tree for a given file as simplistic
> xml, suitable for further transformation by something else (in my case,
> some python).
> 
> I'd expect this to also be useful for code navigation in editors and c
> refactoring tools, but I've really only focused on my needs for c api
> description.
> 
> There are 3 patches here. The first introduces a field in the symbol
> struct for the end position of the symbol. I've added this in my case
> for documentation generation, but again I think it'd be useful in other
> cases. The next introduces a sparse_keep_tokens, which parses a file,
> but doesn't free the tokens after parsing. The final one adds c2xml and
> the DTD for the xml format. It builds conditionally on whether libxml2
> is available.
> 
> All feedback appreciated!

Wow.  Very nice.  I can already think of several other uses for this.

A few suggestions:

      * Please sign off your patches.  See
        http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;hb=HEAD;f=Documentation/SubmittingPatches , section "Sign your work", for details on the Developer's Certificate of Origin and the Signed-off-by convention.  I really need to include some documentation in the Sparse source tree, though.
      * Rather than specifying start="line:col" end="line:col", how
        about splitting those up into start-line, start-col, end-line,
        and end-col?  That would avoid the need to do string parsing
        after reading the XML.
      * Positions have file information associated with them.  A symbol
        might potentially start in one file and end in another, if
        people play crazy games with #include.  start-file and end-file?
      * Typo in examine_namespace: "Unregonized namespace".
      * get_type_name seems generally useful, and several other parts of
        Sparse (such as in evaluate.c and show-parse.c) could become
        simpler by using it.  How about putting it in symbol.c and
        exposing it via symbol.h?  Can you do that in a separate patch,
        please?
      * Also, should get_type_name perhaps look up the string in an
        array rather than using switch?  (I don't know which makes more
        sense.)
      * I don't know how much work this would require, but it doesn't
        seem like c2xml gets much value out of using libxml, so would it
        make things very painful to just print XML directly?  It would
        certainly make things like BAD_CAST and having to snprintf to
        local buffers go away.  If you count on libxml for some form of
        escaping or similar, please ignore this; however, as far as I
        can tell, all of the strings that c2xml works with (such as
        identifiers) can't have unusual characters in them.
      * Please don't include vim modelines in source files.  (Same goes
        for emacs and similar.)
      * Please explicitly limit the possible values of the type
        attribute to those that Sparse produces, rather than allowing
        any arbitrary CDATA.  The same goes for a few other 
      * Please consider including information from the context and
        address space attributes.
      * In examine_modifiers, please use C99-style designated assignment
        for the modifiers array, for clarity and robustness.
      * I suspect several of the modifiers in examine_modifiers don't
        need to generate output; I think you want to ignore everything
        in MOD_IGNORE.
      * Rather than the current base-type and base-type-builtin
        mechanism, you might consider having designated IDs for the base
        types and using those in base-type.  You could even output the
        builtin types if you want.  I don't know if this makes things
        easier or harder for consumers of the output; what do you think?
      * I don't know if sparse_keep_symbols seems like the right API.
        Sparse's approach to memory management (or lack thereof) bugs me
        a bit.  More importantly, though, it makes the hierarchy of
        functions sparse(), then __sparse(), then sparse_keep_symbols(),
        which seems strange.  I don't know a better solution offhand,
        though; don't worry too much about addressing this.

Note that you don't need to address all of these before resending.  In
particular, I'd love to merge the first patch, and I just need a signoff
for it.

Thanks again for this work; it looks great, and highly useful.

- Josh Triplett

next prev parent reply	other threads:[~2007-06-27 18:49 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-27 13:51 [PATCH] c2xml Rob Taylor
2007-06-27 18:49 ` Josh Triplett [this message]
2007-06-28  5:45   ` Josh Triplett
2007-06-28 11:00     ` Rob Taylor
2007-07-02 12:32   ` Rob Taylor
2007-07-13 15:50     ` Rob Taylor
2007-07-13 17:55       ` Josh Triplett
2007-07-14  6:24         ` Josh Triplett
2007-07-14 23:54           ` Rob Taylor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1182970187.8970.145.camel@josh-work.beaverton.ibm.com \
    --to=josht@linux.vnet.ibm.com \
    --cc=linux-sparse@vger.kernel.org \
    --cc=rob.taylor@codethink.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).