On 05/12/2012 07:46 PM, Konrad Eisele wrote:
> On 05/12/2012 01:02 PM, Christopher Li wrote:
>> On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele wrote:
>>>
>>> This seems ok. expanding_macro has to be global not static to be
>>> used... (?)
>>
>> The expand_macro callback uses the parent argument, which it gets
>> from the expanding_macro list. The caller should be able to create
>> the tree from the leaf node using the parent pointer.
>>
>> Feel free to change it to use expanding_macro instead if that makes
>> building the tree easier.
>>
>>> I think the fact that argument expansion is recursive and
>>> body expansion is non-recursive is one of the things that
>>> make the preprocessor kind of hard to grasp.
>>
>> The body expansion can't be recursive on the same macro, otherwise
>> it can result in unlimited expansion. The C standard specifies
>> that macros expand this way.
>>
>>>
>>> I cannot say this before I've tried it.
>>>
>>> I'd like to straighten things out a bit: My last emails
>>> were a bit too harsh and I'd like to apologize. Sorry
>>> for that.
>>
>> No problem at all. I figure you just want the patch to
>> get included.
>>
>>> The next step then is: I'll write a patch to add a
>>> test-prog that uses this api to trace the token generation
>>> and generate a tree for it.
>>> For a start I'll print out, for all tokens of a preprocessor
>>> run, all macro expansions that generated them.
>>
>> That is great. I have a test-macro program in that
>> branch which is very close to printing out all the tokens.
>
> Appended is a test-patch that adds a test-mdep testcase.
> The file mdep.c is used to record the macro
> expansion; each token will have a reference to its
> source.
> test-mdep.c does the pre-processing (as test-macro.c does) and then
> prints out the token trace through macros for each
> token: @{ } is used to mark the active path.
>
> An example file is added: a.h
> $test-mdep a.h
> ...
> 0004: 8
> body in D1 :4 @{8} 10 9 5
> arg0 in D1 :@{8} 10 9
> body in D0 :1 @{D1}(8 10 9) 2 D2(11) 3
> a.h:6:6
> ...
> Token nr 4 of the preprocess stream is "8". The
> generation path of "8" is marked @{8}...
> Not 100%, still, I think already readable. (Actually, the
> printout order should be reversed: starting from file scope
> and drilling down the macro expansions...)
>
> I still don't handle empty expansions. I'll see whether I can come up
> with something here...

I have thought about how to implement empty expansion tracing
without introducing a new token type. I came up with a solution,
however I need one callback; I called it substitute_arg(), see the
attached patch. What do you think, can it be applied?
I think I can use the address of the pointer to the token
(struct token **, which is normally &tok->next) as a hash key to
propagate the empty expansions... I'm not 100% sure it works, but I
need the extra hook to be able to propagate the empty expansion from
the arguments into the substitution body...

>
>
>>
>>> Now, I've learned not to run too fast towards the
>>> goal (which is still "dependency tree from c parser entities down to
>>> token"), maybe you can think about how to achieve the next steps
>>> in an API:
>>> - An #include #ifdef #else #endif pushdown-stack
>>>   to record the nestings for each token
>>
>> Let me think about this. Just thinking out loud,
>> the #include and #ifdef can be considered a special kind
>> of predefined macro as well.
>
> No, only a linked list that models the nesting levels.
> Then a preprocessor hook that can register lookup_macro()
> macro lookups inside # preprocessor lines. An example
> makes it clear:
>
> #if defined(a) && defined(b)
> #if defined(c)
> #endif
> #if defined(e)
> #endif
> #endif
>
> Results in:
> [a b]+<-[c]
>      +<-[e]
>
> This can easily be done with push-pop brackets
> and a callback in lookup_macro().
>
>
> Also:
> #if defined(a)
> #elif defined(c)
> #endif
>
> [a]+<-[c]
>
> #if defined(a)
> #else
> #endif
>
> <-[empty]<-[a]
>
> ...
>
>
> Another point: I also need an option so that inside
> do_handle_define() the symbol structures are never reused but
> alloc_symbol() is always used for undef and define. This is
> because I need to be able to also track the undef and define
> history for a macro at a certain position. I think this should be
> easy to add because you just need to stack the defines and undefs
> on top of each other...
>
>
>>
>>> - How to connect all this to the AST.
>>
>> For symbols, it is relatively easy because a symbol has a pos range
>> and an aux pointer.
>
> I thought about taking "struct symbol_list *syms = sparse(file)"
> as the root. Then mark all elements that are used by them as dependent.
> I don't have enough insight to say how I can determine things like
> which "static inline" functions are used or how to traverse the
> "typedef" dependency.
> The goal is to have a "shrink" application that can strip away
> all C lines (pre-pre-process level) that are not used by a specific
> command invocation of the compiler. Also a tool that can quickly show
> for a specific identifier everything that is connected to it, again on
> pre-preprocessor source level. Kind of something like:
> ...
> func1() {
>     struct string_list *filelist = NULL; int i;
> }
> ..
> I point to "string_list" and then all lines that are related
> to struct string_list (#ifdef nestings, macros, all member typedefs,
> etc.) are shown and all the rest is stripped away, again on human
> readable C source level.
>
>
>>
>> Do you need to attach the dependency for the statement and
>> expression as well?
>>
>> Chris
>>
>
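
To make the push-pop idea for the #if nesting a bit more concrete,
here is a minimal standalone sketch. Nothing in it is existing sparse
code: struct cond_level, cond_push(), cond_pop(), cond_note_lookup()
and cond_format() are hypothetical names, and I only assume that a
hook could be called from lookup_macro() while a # line is evaluated.
Each level keeps a pointer to its enclosing level, so a token would
only need to remember the level that was innermost when it was read.
The printout goes from the inner level outwards, with "->" pointing
to the parent.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAMES 8

/* One #if/#ifdef level; 'parent' links to the enclosing level. */
struct cond_level {
    struct cond_level *parent;
    const char *names[MAX_NAMES];  /* macros looked up on the # line */
    int nr_names;
};

/* Innermost open conditional, NULL at file scope. */
struct cond_level *cond_top;

/* Would be called on #if / #ifdef: open a new nesting level. */
struct cond_level *cond_push(void)
{
    struct cond_level *lvl = calloc(1, sizeof(*lvl));

    assert(lvl);
    lvl->parent = cond_top;
    cond_top = lvl;
    return lvl;
}

/*
 * Would be called on #endif: close the innermost level. The node is
 * not freed, so tokens recorded earlier can still point to it.
 */
void cond_pop(void)
{
    assert(cond_top);
    cond_top = cond_top->parent;
}

/* Hypothetical hook: record a macro name looked up on a # line. */
void cond_note_lookup(const char *name)
{
    if (cond_top && cond_top->nr_names < MAX_NAMES)
        cond_top->names[cond_top->nr_names++] = name;
}

/* Append s to the string in buf, truncating if it does not fit. */
void cond_append(char *buf, size_t size, const char *s)
{
    size_t used = strlen(buf);

    if (used + 1 < size)
        snprintf(buf + used, size - used, "%s", s);
}

/*
 * Format the chain from lvl up to file scope, e.g. "[c]->[a b]".
 * size must be at least 1.
 */
void cond_format(const struct cond_level *lvl, char *buf, size_t size)
{
    int i;

    buf[0] = '\0';
    for (; lvl; lvl = lvl->parent) {
        cond_append(buf, size, "[");
        for (i = 0; i < lvl->nr_names; i++) {
            if (i)
                cond_append(buf, size, " ");
            cond_append(buf, size, lvl->names[i]);
        }
        cond_append(buf, size, lvl->parent ? "]->" : "]");
    }
}
```

For the first example above, the sequence would be: cond_push() and
lookups of "a" and "b" for the outer #if, then cond_push() and a
lookup of "c" for the inner one. Formatting the inner level then
gives "[c]->[a b]". The #elif and #else cases are not covered here.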