On 05/12/2012 01:02 PM, Christopher Li wrote:
> On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele wrote:
>>
>> This seems ok. expanding_macro has to be global not static to be
>> used... (?)
>
> The expand_macro callback uses the parent argument, which it gets
> from the expanding_macro list. The caller should be able to create the
> tree from the leaf node using the parent pointer.
>
> Feel free to change it to use expanding_macro instead if that makes
> building the tree easier.
>
>> I think the fact that argument expansion is recursive and
>> body expansion is non-recursive is one of the things that
>> make the preprocessor kind of hard to grasp.
>
> The body expansion can't be recursive on the same macro, otherwise
> it can result in unlimited expansion. The C standard specifies
> that macros expand this way.
>
>>
>> I cannot say this before I've tried it.
>>
>> I'd like to straighten things out a bit: My last emails
>> were a bit too harsh and I'd like to apologize. Sorry
>> for that.
>
> No problem at all. I figure you just want the patch to
> get included.
>
>> The next step then is: I'll write a patch to add a
>> test-prog that uses this api to trace the token generation
>> and generate a tree for it.
>> For a start I'll print out for all tokens of a preprocessor
>> run all macro expansions that generated them.
>
> That is great. I have a test-macro program in that
> branch which is very close to printing out all the tokens.

Appended is a test patch that adds a test-mdep test case.
The file mdep.c is used to record the macro expansion; each token
will have a reference to its source.
test-mdep.c preprocesses (like test-macro.c), then prints out the
token trace through the macros for each token. @{ } is used to mark
the active path.
An example file is added: a.h

$test-mdep a.h
...
0004: 8
body in D1 :4 @{8} 10 9 5
arg0 in D1 :@{8} 10 9
body in D0 :1 @{D1}(8 10 9) 2 D2(11) 3
a.h:6:6
...

Token nr 4 of the preprocess stream is "8". The generation path
of "8" is marked @{8}...
Not 100%, but I think it is already readable. (Actually the printout
order should be reversed, starting from file scope and drilling down
through the macro expansions.)
I still don't handle empty expansions. I'll see whether I can come up
with something here...

>
>> Now, I've learned not to run too fast towards the
>> goal, (which is still "dependency tree from c parser entities down to
>> token"), maybe you can think about how to achieve the next steps
>> in an API :
>> - An #include #ifdef #else #endif pushdown-stack
>>   to record the nestings for each token
>
> Let me think about this. Just thinking out loud,
> the #include and #ifdef can be considered a special kind
> of predefined macro as well.

No, only a linked list that models the nesting levels. Then
a preprocessor hook that can register lookup_macro() macro lookups
inside # preprocessor lines. An example makes it clear:

#if defined(a) && defined(b)
 #if defined(c)
 #endif
 #if defined(e)
 #endif
#endif

results in:

[a b]+<-[c]
     +<-[e]

This can easily be done with push-pop brackets and a callback
in lookup_macro().
Also:

#if defined(a)
#elif defined(c)
#endif

[a]+<-[c]

#if defined(a)
#else
#endif

<-[empty]<-[a]
...

Another point: I also need an option so that inside
do_handle_define() the symbol structures are never reused, but
alloc_symbol() is always used for undef and define. This is because
I need to be able to track the undef and define history of a macro
at a certain position. I think this should be easy to add because
you just need to stack define and undef on top of each other...

>
>> - How to connect all this to the AST.
>
> For symbol, it is relatively easy because symbol has a pos range
> and an aux pointer.

I thought about taking "struct symbol_list *syms = sparse(file)"
as the root. Then mark all elements that are used by them as
dependent. I don't have enough insight to say how I can determine
things like which "static inline" functions are used or how to
traverse the "typedef" dependency.
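
To make the #if nesting record from the example above a bit more
concrete, here is a minimal sketch. All names in it (struct cond_level,
cond_push(), cond_pop(), cond_note_lookup(), cond_format()) are
hypothetical and do not exist in sparse; it only assumes that the
preprocessor could call a hook on #if, on #endif, and on each macro
lookup inside a # line. #elif and #else are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAMES 8	/* arbitrary limit, only for this sketch */

/* One #if level; 'parent' points to the enclosing level (hypothetical). */
struct cond_level {
	struct cond_level *parent;
	const char *names[MAX_NAMES];	/* macros looked up in the condition */
	int nr_names;
};

static struct cond_level *cond_top;	/* innermost open level, or NULL */

/* Would be called on #if/#ifdef: open a new level under the current one. */
static struct cond_level *cond_push(void)
{
	struct cond_level *l = calloc(1, sizeof(*l));
	assert(l);
	l->parent = cond_top;
	cond_top = l;
	return l;
}

/* Would be called on #endif. The node is not freed, so tokens that
 * recorded it can still refer to it later. */
static void cond_pop(void)
{
	assert(cond_top);
	cond_top = cond_top->parent;
}

/* Hypothetical hook, called for each macro lookup inside a # line. */
static void cond_note_lookup(const char *name)
{
	if (cond_top && cond_top->nr_names < MAX_NAMES)
		cond_top->names[cond_top->nr_names++] = name;
}

/* Append s to buf without writing past len bytes. */
static void append(char *buf, size_t len, const char *s)
{
	size_t used = strlen(buf);
	if (used + 1 < len)
		strncat(buf, s, len - used - 1);
}

/* Format the chain outermost first, e.g. "[a b]<-[c]".
 * The caller has to pass an empty string in buf. */
static void cond_format(const struct cond_level *l, char *buf, size_t len)
{
	int i;

	if (!l)
		return;
	cond_format(l->parent, buf, len);
	if (l->parent)
		append(buf, len, "<-");
	append(buf, len, "[");
	for (i = 0; i < l->nr_names; i++) {
		if (i)
			append(buf, len, " ");
		append(buf, len, l->names[i]);
	}
	append(buf, len, "]");
}
```

Each token would then only need to store the cond_top pointer that was
current when it was read; the parent links give the whole nesting.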
The goal is to have a "shrink" application that can strip away all
C lines (at pre-preprocess level) that are not used by a specific
command invocation of the compiler.
Also a tool that can quickly show, for a specific identifier,
everything that is connected to it, again at pre-preprocessor source
level. Something like:

...
func1() {
	struct string_list *filelist = NULL;
	int i;
}
..

I point to "string_list" and then all lines that are related to
struct string_list (#ifdef nestings, macros, all member typedefs, etc.)
are shown and all the rest is stripped away, again at human-readable
C source level.

>
> Do you need to attach the dependency for the statement and
> expression as well?
>
> Chris
>