From: Josh Triplett <josht@linux.vnet.ibm.com>
To: Rob Taylor <rob.taylor@codethink.co.uk>
Cc: linux-sparse@vger.kernel.org
Subject: Re: [PATCH] c2xml
Date: Wed, 27 Jun 2007 11:49:47 -0700 [thread overview]
Message-ID: <1182970187.8970.145.camel@josh-work.beaverton.ibm.com> (raw)
In-Reply-To: <46826B68.5040302@codethink.co.uk>
On Wed, 2007-06-27 at 14:51 +0100, Rob Taylor wrote:
> Here's something I've hacked up for my work on gobject-introspection
> [1]. It basically dumps the parse tree for a given file as simplistic
> xml, suitable for further transformation by something else (in my case,
> some python).
>
> I'd expect this to also be useful for code navigation in editors and c
> refactoring tools, but I've really only focused on my needs for c api
> description.
>
> There are 3 patches here. The first introduces a field in the symbol
> struct for the end position of the symbol. I've added this in my case
> for documentation generation, but again I think it'd be useful in other
> cases. The next introduces a sparse_keep_tokens, which parses a file,
> but doesn't free the tokens after parsing. The final one adds c2xml and
> the DTD for the xml format. It builds conditionally on whether libxml2
> is available.
>
> All feedback appreciated!
Wow. Very nice. I can already think of several other uses for this.
A few suggestions:
* Please sign off your patches. See
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;hb=HEAD;f=Documentation/SubmittingPatches , section "Sign your work", for details on the Developer's Certificate of Origin and the Signed-off-by convention. I really need to include some documentation in the Sparse source tree, though.
* Rather than specifying start="line:col" end="line:col", how
about splitting those up into start-line, start-col, end-line,
and end-col? That would avoid the need to do string parsing
after reading the XML.
* Positions have file information associated with them. A symbol
might potentially start in one file and end in another, if
people play crazy games with #include. start-file and end-file?
* Typo in examine_namespace: "Unregonized namespace".
* get_type_name seems generally useful, and several other parts of
Sparse (such as in evaluate.c and show-parse.c) could become
simpler by using it. How about putting it in symbol.c and
exposing it via symbol.h? Can you do that in a separate patch,
please?
* Also, should get_type_name perhaps look up the string in an
array rather than using switch? (I don't know which makes more
sense.)
* I don't know how much work this would require, but it doesn't
seem like c2xml gets much value out of using libxml, so would it
make things very painful to just print XML directly? It would
certainly make things like BAD_CAST and having to snprintf to
local buffers go away. If you count on libxml for some form of
escaping or similar, please ignore this; however, as far as I
can tell, all of the strings that c2xml works with (such as
identifiers) can't have unusual characters in them.
* Please don't include vim modelines in source files. (Same goes
for emacs and similar.)
* Please explicitly limit the possible values of the type
attribute to those that Sparse produces, rather than allowing
any arbitrary CDATA. The same goes for a few other
* Please consider including information from the context and
address space attributes.
* In examine_modifiers, please use C99-style designated assignment
for the modifiers array, for clarity and robustness.
* I suspect several of the modifiers in examine_modifiers don't
need to generate output; I think you want to ignore everything
in MOD_IGNORE.
* Rather than the current base-type and base-type-builtin
mechanism, you might consider having designated IDs for the base
types and using those in base-type. You could even output the
builtin types if you want. I don't know if this makes things
easier or harder for consumers of the output; what do you think?
* I don't know if sparse_keep_symbols seems like the right API.
Sparse's approach to memory management (or lack thereof) bugs me
a bit. More importantly, though, it makes the hierarchy of
functions sparse(), then __sparse(), then sparse_keep_symbols(),
which seems strange. I don't know a better solution offhand,
though; don't worry too much about addressing this.
Note that you don't need to address all of these before resending. In
particular, I'd love to merge the first patch, and I just need a signoff
for it.
Thanks again for this work; it looks great, and highly useful.
- Josh Triplett
next prev parent reply other threads:[~2007-06-27 18:49 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-06-27 13:51 [PATCH] c2xml Rob Taylor
2007-06-27 18:49 ` Josh Triplett [this message]
2007-06-28 5:45 ` Josh Triplett
2007-06-28 11:00 ` Rob Taylor
2007-07-02 12:32 ` Rob Taylor
2007-07-13 15:50 ` Rob Taylor
2007-07-13 17:55 ` Josh Triplett
2007-07-14 6:24 ` Josh Triplett
2007-07-14 23:54 ` Rob Taylor
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1182970187.8970.145.camel@josh-work.beaverton.ibm.com \
--to=josht@linux.vnet.ibm.com \
--cc=linux-sparse@vger.kernel.org \
--cc=rob.taylor@codethink.co.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).