* Interrupt context
@ 2008-03-23 21:44 Codrin Alexandru Grajdeanu
  2008-03-24 21:00 ` Christopher Li
  0 siblings, 1 reply; 6+ messages in thread

From: Codrin Alexandru Grajdeanu @ 2008-03-23 21:44 UTC (permalink / raw)
To: linux-sparse; +Cc: Octavian Purdila

Hi all,

I am a student from Politehnica University of Bucharest studying Computer
Science. I would like to add some new kernel source checks based on sparse.

The first idea would be to test whether functions that may sleep are called
from interrupt context. To test this, sparse would be required to run twice.
The first run would collect all interrupt-context functions, by checking
which functions are passed as irq_handler_t arguments and which values are
assigned to the function pointers in struct timer_list, struct softirq_action
and struct tasklet_struct. The second run would generate the call graph for
these functions and verify whether schedule() is reachable anywhere inside it.

What do you think about this?

Thank you,
Codrin Grajdeanu
* Re: Interrupt context
  2008-03-23 21:44 Interrupt context Codrin Alexandru Grajdeanu
@ 2008-03-24 21:00 ` Christopher Li
  2008-03-25  1:34 ` Octavian Purdila
  0 siblings, 1 reply; 6+ messages in thread

From: Christopher Li @ 2008-03-24 21:00 UTC (permalink / raw)
To: Codrin Alexandru Grajdeanu; +Cc: linux-sparse, Octavian Purdila

On Sun, Mar 23, 2008 at 2:44 PM, Codrin Alexandru Grajdeanu
<grcodal@gmail.com> wrote:
> To test this, sparse would be required to run twice. First to get all
> interrupt context functions, by verifying what arguments are passed to
> irq_handler_t() and what values are passed to the function pointers in

You can identify an interrupt handler by its return type being "irqreturn_t".

> struct timer_list, softirq_action and tasklet_struct. The second run

That is harder.

> would generate the call graph for this function and would verify if
> schedule() is called inside their call graph.

I don't think two passes are enough. You need to build the call graph for
pretty much every function, because the irq handler might call some other
function, which calls another function, which calls schedule().

I also don't think you can go very far without doing any control flow and
data flow analysis. e.g. kmalloc() can go to sleep or not depending on the
allocation flags (GFP_ATOMIC).

Which points back to the proposal of:
a) allowing sparse to access functions from different files.
b) building the call graph for every function in the kernel.

Chris
* Re: Interrupt context
  2008-03-24 21:00 ` Christopher Li
@ 2008-03-25  1:34 ` Octavian Purdila
  2008-03-25  2:57 ` Christopher Li
  0 siblings, 1 reply; 6+ messages in thread

From: Octavian Purdila @ 2008-03-25 1:34 UTC (permalink / raw)
To: Christopher Li; +Cc: Codrin Alexandru Grajdeanu, linux-sparse

On Monday 24 March 2008, Christopher Li wrote:
>
> I don't think two pass is enough. You need to build the call graph
> for pretty much every function. Because the irq handler function might
> call other function which calls other function which calls schedule().
>
> I don't think you can go very far without doing any control flow
> and data flow analyze. e.g. kmalloc() can go to sleep or not depend
> on the allocation flag (GFP_ATOMIC).
>
> Which points back to the proposal of:
> a) allow sparse to access function from different files.
> b) building the call graph for every function in the kernel.
>

Hi Chris,

Yes, you are right, we need to have the complete call graph of the whole
kernel and of the kernel modules we want to check. We developed a prototype
some time ago, but we never managed to move from the prototype to something
that could be used out in the real world.

The idea we explored for the prototype was to serialize the sparse state
and save it into the object files in a private section. The linker would
then take care of aggregating the sparse state into the kernel image or
kernel modules. The second stage loads the saved state, creates the call
graph, propagates the interrupt/softirq context around and finally checks
whether schedule() was called from interrupt context -- the check itself
was really broad, as we did not do any data flow analysis.

The nice thing about this approach is that, at least in theory, it would
allow all sorts of global analyses, not only this particular (sleeping in
interrupt) check. We actually started with the idea of using sparse itself
to generate the serializer, but we ended up patching the generated code
manually -- we abandoned the idea of adding sparse annotations to the
sparse code to get things right, as we realized we were moving away from
our goal.

But what we obtained after the first stage was a vmlinux over 2GB in size,
which could not be processed by the ELF utilities (we assumed that we hit
some limitations of the ELF32 format). So in the end it's not so practical.

For this second try, we were thinking about replacing the serializer with
a thinner layer which would just save the call graph information, together
with the associated interrupt-context-function / sleeping-function
attributes, in the object files.

Any comments / suggestions are greatly appreciated.

Thanks,
tavi
* Re: Interrupt context
  2008-03-25  1:34 ` Octavian Purdila
@ 2008-03-25  2:57 ` Christopher Li
  2008-03-26 12:43 ` Octavian Purdila
  0 siblings, 1 reply; 6+ messages in thread

From: Christopher Li @ 2008-03-25 2:57 UTC (permalink / raw)
To: Octavian Purdila; +Cc: Codrin Alexandru Grajdeanu, linux-sparse

On Mon, Mar 24, 2008 at 6:34 PM, Octavian Purdila <tavi@cs.pub.ro> wrote:
> Yes, you are right, we need to have the complete call graph of the whole
> kernel and kernel modules we want to check. We developed a prototype
> some time ago, but we never manage to move from the prototype to something
> that could be used out in the real world.

Interesting. I did some hacking on serializing the sparse output as well.

> The idea which we explored for the prototype was to serialize the sparse
> state and save it into the object files in a private section. The linker
> would than take care of aggregating the sparse state into the kernel image
> or kernel modules. The second stage loads the saved state, create the call
> graph, propagate the interrupt/softirq context around and finally check if
> schedule was called from interrupt context -- the check itself was really
> broad as we did not do any data flow analysis.
>
> The nice thing about this approach is that at least in theory would allow
> all sorts of global analysis, not only this particular (sleeping in
> interrupt) check. And we actually started with the idea of using sparse
> itself to generate the serializer, but we ended up patching the generate
> code manually - we abandoned the idea of adding sparse annotations to
> sparse code to get things right as we realized that we are moving away
> from our goal.

I want to have sparse generate the serializer code as well. One problem I
ran into is that a lot of the sparse C structure members are inside unions.
The serializer code needs to understand the object type in order to access
the members specific to that type inside the union. It needs some fairly
complicated data flow analysis to trace how the sparse code itself accesses
the union members. I ended up doing it by hand as well :-)

> But, what we obtain after the first stage was a vmlinux over 2GB in size,
> which could not be processed by ELF utilities (we assumed that we hit some
> limitations in the ELF32 format). So in the end its not so practical.

My plan is to write a symbol map to perform the symbol lookup. For each
extern symbol, you can look up an object file name and an offset within
that object file to locate the symbol. Then, with the help of the
serializer, you can load that object file into memory.

BTW, besides linking, what else does the ELF format buy you? Whatever the
file format, I want it to store the linearized byte code rather than the
machine code.

> For this second try, we were thinking about replacing the serializer with
> a thiner layer which would just save the call graph information together
> with the associated interrupt context function / sleeping function
> attributes in the object files.

I would like to see something more general. For each file, it saves this
information:

1) What symbols it provides as extern.
2) What symbols it accesses.
3) The linearized byte code for each function it emits (serialized from
   the entrypoint of each function).

And then you would be able to perform a lot of checking on this.

Chris
* Re: Interrupt context
  2008-03-25  2:57 ` Christopher Li
@ 2008-03-26 12:43 ` Octavian Purdila
  2008-03-26 21:53 ` Christopher Li
  0 siblings, 1 reply; 6+ messages in thread

From: Octavian Purdila @ 2008-03-26 12:43 UTC (permalink / raw)
To: Christopher Li; +Cc: Codrin Alexandru Grajdeanu, linux-sparse

On Tuesday 25 March 2008, Christopher Li wrote:
>
> BTW, besides linking what else do the ELF format buys you?
>

You don't need to change the build system, and you get all the information
you need in the final deliverable of the build (vmlinux or the kernel
module).

> > For this second try, we were thinking about replacing the serializer
> > with a thiner layer which would just save the call graph information
> > together with the associated interrupt context function / sleeping
> > function attributes in the object files.
>
> I would like to see some thing more general. For each file, it saves
> the information:
>
> 1) What symbol does it provide as extern.
> 2) What symbol does it accessed.
> 3) The linearized byte code for each function emits. (serialized of
> the entrypoint for each function).
>
> And then, you would be able to perform a lot of checking on this.
>

OK, so you think the serializer approach is still the way to go, then?

Thanks,
tavi
* Re: Interrupt context
  2008-03-26 12:43 ` Octavian Purdila
@ 2008-03-26 21:53 ` Christopher Li
  0 siblings, 0 replies; 6+ messages in thread

From: Christopher Li @ 2008-03-26 21:53 UTC (permalink / raw)
To: Octavian Purdila; +Cc: Codrin Alexandru Grajdeanu, linux-sparse

On Wed, Mar 26, 2008 at 5:43 AM, Octavian Purdila <tavi@cs.pub.ro> wrote:
> You don't need to change the build system, you got all the information you
> need in the final deliverable of the build (vmlinux or the kernel module).

How do you write to the ELF object? If you can write to ELF64, then you
shouldn't have the size limit.

I am not a big fan of loading everything into one big ELF file. I imagine
it would be useful to load only the modules you need.

>
> OK, so you think that the serializer approach is still the way to go then?
>

It depends on what you actually serialize. In the ideal world, you can save
and load the compiled linearized byte code with symbol information. That
way you don't lose any information during the serialization.

Chris