From: Konrad Eisele
Subject: Re: Fwd: dependency tee from c parser entities downto token
Date: Sat, 12 May 2012 19:57:16 +0200
Message-ID: <4FAEA47C.6080308@gmail.com>
To: Christopher Li
Cc: Konrad Eisele, Linux-Sparse
List-Id: linux-sparse@vger.kernel.org

On 05/12/2012 07:46 PM, Konrad Eisele wrote:
> On 05/12/2012 01:02 PM, Christopher Li wrote:
>> On Fri, May 11, 2012 at 2:48 PM, Konrad Eisele wrote:
>>>
>>> This seems ok. expanding_macro has to be global, not static, to be
>>> used... (?)
>>
>> The expand_macro callback uses the parent argument, which it gets
>> from the expanding_macro list. The caller should be able to create the
>> tree from the leaf node using the parent pointer.
>>
>> Feel free to change it to use expanding_macro instead if that makes
>> building the tree easier.
>>
>>> I think the fact that argument expansion is recursive and
>>> body expansion is non-recursive is one of the things that
>>> make the preprocessor kind of hard to grasp.
>>
>> The body expansion can't be recursive on the same macro, otherwise
>> it can result in unlimited expansion. The C standard specifies
>> that macros expand this way.
>>
>>>
>>> I cannot say this before I've tried it.
>>>
>>> I'd like to straighten things out a bit: My last emails
>>> were a bit too harsh and I'd like to apologize. Sorry
>>> for that.
>>
>> No problem at all. I figure you just want the patch to
>> get included.
>>
>>> The next step then is: I'll write a patch to add a
>>> test-prog that uses this api to trace the token generation
>>> and generate a tree for it.
>>> For a start I'll print out, for all tokens of a preprocessor
>>> run, all macro expansions that generated them.
>>
>> That is great. I have a test-macro program in that
>> branch which is very close to printing out all the tokens.
>
> Appended is a test-patch that adds a test-mdep testcase.
> The file mdep.c is used to record the macro
> expansion; each token will have a reference to its
> source.
> test-mdep.c does the pre-processing (as test-macro.c does), then
> prints out the token trace through macros for each
> token: @{ } is used to mark the active path.
>

To explain mdep.c: There are in fact only 3 lines that are of interest:

...
137: n->from = list->pos;
...
...
143: list->pos.line = id;
144: list->pos.stream = pps;
...

Line 137 saves the last token.pos; lines 143 and 144 insert a new id into
token.pos. This generates the path for each token through the
expansions. mdep_trace() traverses the path...

> An example file is added: a.h
> $test-mdep a.h
> ...
> 0004: 8
> body in D1 :4 @{8} 10 9 5
> arg0 in D1 :@{8} 10 9
> body in D0 :1 @{D1}(8 10 9) 2 D2(11) 3
> a.h:6:6
> ...
> Token nr 4 of the preprocess stream is "8". The
> generation path of "8" is marked @{8}...
> Not 100%, still, I think already readable. (Actually
> the printout order should be reversed, starting from file scope
> and drilling down the macro expansions...)
>
> I still don't handle empty expansions.
> I'll see whether I can come up
> with something here...
>
>
>>
>>> Now, I've learned not to run too fast towards the
>>> goal (which is still "dependency tee from c parser entities downto
>>> token"); maybe you can think about how to achieve the next steps
>>> in an API:
>>> - An #include #ifdef #else #endif pushdown-stack
>>>   to record the nestings for each token
>>
>> Let me think about this. Just thinking out loud:
>> the #include and #ifdef can be considered a special kind
>> of predefined macro as well.
>
> No, only a linked list that models the nesting levels.
> Then a preprocessor hook that can register lookup_macro()
> macro lookups inside # preprocessor lines. An example
> makes it clear:
>
> #if defined(a) && defined(b)
>  #if defined(c)
>  #endif
>  #if defined(e)
>  #endif
> #endif
>
> Results in:
> [a b]+<-[c]
>      +<-[e]
>
> This can easily be done with push-pop brackets
> and a callback in lookup_macro().
>
>
> Also:
> #if defined(a)
> #elif defined(c)
> #endif
>
> [a]+<-[c]
>
> #if defined(a)
> #else
> #endif
>
> <-[empty]<-[a]
>
> ...
>
>
> Another thing I need is an option so that inside
> do_handle_define() the symbol structures are never reused but
> alloc_symbol() is always used for undef and define. This is
> because I need to be able to also track the undef and define
> history for a macro at a certain position. I think this should be
> easy to add because you just need to define define-undef on
> top of each other...
>
>
>>
>>> - How to connect all this to the AST.
>>
>> For symbols it is relatively easy, because a symbol has a pos range
>> and an aux pointer.
>
> I thought about taking "struct symbol_list *syms = sparse(file)"
> as the root, then marking all elements that are used by them as dependent.
> I don't have enough insight to say how I can determine things like
> which "static inline" are used or how to traverse the
> "typedef" dependency.
> The goal is to have a "shrink" application that can strip away
> all C lines (pre-pre-process level) that are not used by a specific
> command invocation of the compiler. Also a tool that can quickly show,
> for a specific identifier, everything that is connected to it, again at
> the pre-preprocessor source level. Something like:
> ...
> func1() {
> struct string_list *filelist = NULL; int i;
> }
> ..
> I point to "string_list" and then all lines that are related
> to struct string_list (#ifdef nestings, macros, all member typedefs,
> etc.) are shown and all the rest is stripped away, again at the human
> readable C source level.
>
>
>>
>> Do you need to attach the dependency for the statement and
>> expression as well?
>>
>> Chris
>>
>
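
To make the push/pop idea for the #if nesting a bit more concrete, here is a rough sketch in C. Nothing in it exists in sparse: cond_push(), cond_pop(), cond_note_lookup() and cond_format() are names made up for illustration, the fixed-size arrays are a simplification, and #elif/#else handling is left out. The assumption is that the preprocessor would call cond_push() on #if, cond_pop() on #endif, and that a callback in lookup_macro() would call cond_note_lookup() while a # line is being evaluated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 32	/* illustrative limit on #if nesting */
#define MAX_NAMES 8	/* illustrative limit on lookups per condition */

/* One open #if level and the macro names looked up in its condition. */
struct cond_level {
	const char *names[MAX_NAMES];
	int nr_names;
};

static struct cond_level cond_stack[MAX_DEPTH];
static int cond_depth;

/* Hypothetical hook for #if: open a new, empty nesting level. */
static int cond_push(void)
{
	if (cond_depth >= MAX_DEPTH)
		return -1;
	cond_stack[cond_depth++].nr_names = 0;
	return 0;
}

/* Hypothetical hook for #endif: close the innermost level. */
static int cond_pop(void)
{
	if (cond_depth == 0)
		return -1;
	cond_depth--;
	return 0;
}

/* Hypothetical callback from lookup_macro() inside a # line. */
static void cond_note_lookup(const char *name)
{
	struct cond_level *top;

	if (cond_depth == 0)
		return;
	top = &cond_stack[cond_depth - 1];
	if (top->nr_names < MAX_NAMES)
		top->names[top->nr_names++] = name;
}

/* Append s to buf; silently truncates when buf is full. */
static void append(char *buf, size_t size, const char *s)
{
	size_t len = strlen(buf);

	if (len + 1 < size)
		snprintf(buf + len, size - len, "%s", s);
}

/* Format the current nesting, outermost level first: "[a b]<-[c]". */
static void cond_format(char *buf, size_t size)
{
	int i, j;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < cond_depth; i++) {
		append(buf, size, i ? "<-[" : "[");
		for (j = 0; j < cond_stack[i].nr_names; j++) {
			if (j)
				append(buf, size, " ");
			append(buf, size, cond_stack[i].names[j]);
		}
		append(buf, size, "]");
	}
}
```

Walking the first example with this: push for the outer #if and note a and b, push for the inner #if and note c; cond_format() then gives [a b]<-[c]. After a pop and another push with e noted it gives [a b]<-[e]. A token recorded at that moment would store a copy of this path; the "+" branch markers of the tree printout are not produced here.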