* Re: [Qemu-devel] TCG flow vs dyngen
@ 2011-01-16 14:46 Raphael Lefevre
  2011-01-16 15:21 ` Stefano Bonifazi
  0 siblings, 1 reply; 36+ messages in thread
From: Raphael Lefevre @ 2011-01-16 14:46 UTC
To: stefboombastic; +Cc: blauwirbel, qemu-devel

On Wed, Dec 15, 2010 at 4:17 AM, Stefano Bonifazi <stefboombastic@gmail.com> wrote:
> On 12/11/2010 03:44 PM, Blue Swirl wrote:
>
> Hi!
> Thank you very much! Knowing exactly where I should check, in a so big
> project helped me very much!!
> Anyway after having spent more than 2 days on that code I still can't
> understand how it works the real execution:
>
> in cpu-exec.c : cpu_exec_nocache I find:
>
>> /* execute the generated code */
>> next_tb = tcg_qemu_tb_exec(tb->tc_ptr);
>
> and in cpu-exec.c : cpu_exec
>
>> /* execute the generated code */
>> next_tb = tcg_qemu_tb_exec(tc_ptr);
>
> so I thought tcg_qemu_tb_exec "function" should do the work of executing the
> translated binary in the host.
> But then I found out it is just a define in tcg.h:
>
>> #define tcg_qemu_tb_exec(tb_ptr) ((long REGPARM (*)(void *))code_gen_prologue)(tb_ptr)
>
> and again in exec.c
>
>> uint8_t code_gen_prologue[1024] code_gen_section;
>
> Maybe I have some problems with that C syntax, but I really don't understand
> what happens there.. how the execution happens!
>
> Here instead with QEMU/TCG I understood that at runtime the target binary
> is translated into host binary (somehow) .. but then.. how can this new host
> binary be run? Shall the host code at runtime do some sort of (assembly
> speaking) branch jump to an area of memory with new host binary instructions
> .. and then jump back to the old process binary code?

1. As far as I know, the host code translated from the target instructions is
plain host machine code written into a buffer in memory, which is why it can be
executed directly.

2. I think you have the right idea: one part of QEMU's internals does exactly
this kind of jump-and-return.

> If so, can you explain me how this happens in those lines of code?

I can only give a rough outline. The code you listed does one simple thing: the
macro casts code_gen_prologue to a function pointer and calls it with tb_ptr,
so the host processor continues executing at that address.

> I am just a student.. unluckily at university they just tell you that a cpu
> follows some sort of "fetch ->decode->execute" flow .. but then you open
> QEMU.. and wow there is a huge gap for understanding it, and no books where
> to study it! ;)

QEMU does not simulate every detail of how a processor behaves; it only
approximates the operations a machine needs to perform!
The "fetch->decode->execute" flow only matters once you get into hardware design.

Raphaël Lefèvre

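The C syntax in the quoted tcg_qemu_tb_exec macro may be easier to read in isolation. The sketch below is not QEMU code: fake_translated_block and run_once are hypothetical names, and the "generated code" here is an ordinary compiled function rather than bytes emitted at runtime. It only illustrates that an address held as plain data can be cast to the type long (*)(void *) and then called, which is the shape of the cast in the macro (minus the REGPARM attribute).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a block of generated host code. In QEMU the
 * bytes are produced at runtime; here it is ordinary compiled C. */
static long fake_translated_block(void *tb_ptr)
{
    /* Pretend the block ran and returned a value derived from its argument. */
    return *(long *)tb_ptr + 1;
}

/* Call an address stored as plain data, using the same cast shape as the
 * quoted macro: ((long (*)(void *))address)(argument). */
long run_once(long start)
{
    /* Keep the code address in a data pointer, much as the macro starts
     * from a uint8_t array. (POSIX permits this conversion; strict ISO C
     * does not guarantee it.) */
    void *code_entry = (void *)fake_translated_block;
    assert(code_entry != NULL);

    return ((long (*)(void *))code_entry)(&start);
}
```

With these definitions, run_once(41) calls the block through the cast pointer and returns 42. The real macro differs in that the address points into a buffer filled by the code generator.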
* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-16 15:21 UTC
To: Raphael Lefevre; +Cc: blauwirbel, qemu-devel

On 01/16/2011 03:46 PM, Raphael Lefevre wrote:
> [...]

Thank you very much!
I've already solved this problem. Right now I am struggling with changing the
qemu-user code to make it run several binaries in succession, but it seems to
remember the first translated code. Nobody answered my post about it; do you
have any idea?

* Re: [Qemu-devel] TCG flow vs dyngen
From: Raphaël Lefèvre @ 2011-01-16 16:01 UTC
To: Stefano Bonifazi; +Cc: qemu-devel

On Sun, Jan 16, 2011 at 11:21 PM, Stefano Bonifazi <stefboombastic@gmail.com> wrote:
>
> Thank you very much!
> I've already solved this problem.. Right now I am fighting with the possibility of changing qemu-user code for making it run several binaries in succession .. But it seems to remember the first translated code.. Nobody answered to my post about it, do you have any idea?
>

Sorry for joining this discussion late. After searching for the topics you
posted, it seems two main problems are still unsolved? (Am I right? I'm not
sure...)

1. "I edited QEMU user, more exactly qemu-ppc launching the main function
(inside main.c) from another c function I created, passing it the
appropriate parameters. ... blah blah" (Jan 2011)

2. "how can I check the number of target cpu cycles or target
instructions executed inside qemu-user (i.e. qemu-ppc)?
Is there any variable I can inspect for such informations?" (Dec 2010)

If I'm wrong, please let me know where the problem is.

Raphaël Lefèvre

* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-16 16:43 UTC
To: Raphaël Lefèvre; +Cc: qemu-devel

> Sorry for my belated on this discussion, after I searched for the
> topics you posted, it seems two main problems are unsolved? (Am I
> right?? I'm not sure...)
>
> 1. "I edited QEMU user, more exactly qemu-ppc launching the main function
> (inside main.c) from another c function I created, passing it the
> appropriate parameters. ...balabala" at Jan, 2011
>
> 2. "how can I check the number of target cpu cycles or target
> instructions executed inside qemu-user (i.e. qemu-ppc)?
> Is there any variable I can inspect for such informations?" at Dec, 2010
>
> If I'm not correct, please let me know where the problem is.
>
> Raphaël Lefèvre

Hi!
Thank you very much for your concern! Honestly, I had lost hope of getting any
help; I even contacted some developers on this mailing list directly, without
luck!
I am a student who needs to use qemu for a project, where it will be used for
its ability to run PowerPC code.
As you can imagine, qemu goes far beyond a student's knowledge of electronics
and computer science. Nevertheless I have to do it! I have been studying all
the technical documents available on the internet, but there is really not
much at all, not enough to pick up the code and be able to understand it ..
It is in C, not even modular C++.
Anyway, with some help from this mailing list and a lot of studying about
assembly, loaders, compilers.. I am making progress, though there are still big
problems due to the nature of the QEMU code..

First of all, I am starting from qemu-user, more specifically qemu-ppc, as I
don't need the full system capabilities, and it is easier for me to control the
target binary's memory with qemu-user.
Originally I started with a lot of work on libqemu .. until a developer here
told me it was deprecated (though still in the source) and not working well.
I edited the code of qemu-ppc so that another function of mine calls the
qemu-user main with the appropriate parameters.. The goal was to launch it
several times with different target binaries in succession.. For some reason I
still can't figure out, qemu remembers the old code and runs it instead of the
newly loaded binary.. and if I flush the cache of translated code before
loading a new binary, it stops and can't go on!
My workaround for this problem was compiling qemu-ppc as a dynamic library and
loading it at runtime.. I also managed to load multiple copies of it (with
dlmopen each at a different address space) .. in fact I need to run more than
one qemu-ppc at the same time, but a new big problem has popped up now:
the target binary is always loaded at a fixed address.. no matter whether
another qemu-ppc has already loaded code there.. it is as if the internal elf
loader can't tell that those addresses are not available and then relocate the
code ..
I tried to link (ld) the target elf binary as position independent code, but
then qemu-ppc complains it can't find /usr/lib/libc.so.1 and /usr/lib/ld.so.1

To sum up, the problems are (in order of importance):
- making the elf loader relocate the target code to other addresses when the
  default ones (I guess those embedded into the target binary when it is not
  compiled as position independent code) are taken
- making qemu-user able to run more than one target binary in succession
- counting the instructions executed by qemu-user

My university is a public one, so my project will be open to the community. I
will also upload the documentation I am writing about qemu, based on the
knowledge I am acquiring while working on it, so that, I hope, other people
will find the first steps into developing qemu less frustrating!
Any help will be more than welcome!
Thank you in advance!
Stefano B.

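The address clash described above can be illustrated outside QEMU. The sketch below uses plain mmap and makes no claim about how QEMU's elf loader actually maps the target (whether it forces fixed addresses is an assumption here); map_near and second_request_is_relocated are hypothetical names. It shows that when an address is only passed as a hint, the kernel relocates a second request for a range that is already occupied, which is the behaviour the message wishes the loader had.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Ask for an anonymous mapping at `hint` WITHOUT MAP_FIXED: if the range is
 * already in use, the kernel is free to choose a different address. */
void *map_near(void *hint, size_t len)
{
    void *p = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Map a region, then request the very same address again.
 * Returns 1 if the second request was placed somewhere else. */
int second_request_is_relocated(void)
{
    size_t len = 16 * 4096;
    void *first = map_near(NULL, len);
    assert(first != NULL);

    void *second = map_near(first, len);    /* same address, already taken */
    int relocated = (second != NULL && second != first);

    if (second != NULL)
        munmap(second, len);
    munmap(first, len);
    return relocated;
}
```

A request made with MAP_FIXED would instead replace whatever was mapped there, which would match the symptom of one instance's code landing on top of another's.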
* Re: [Qemu-devel] TCG flow vs dyngen
From: Peter Maydell @ 2011-01-16 18:29 UTC
To: Stefano Bonifazi; +Cc: Raphaël Lefèvre, qemu-devel

2011/1/16 Stefano Bonifazi <stefboombastic@gmail.com>:
> My workaround to this problem was compiling qemu-ppc as a dynamic library
> and load it at runtime.. I also managed to load multiple copies of it (with
> dlmopen each at a different address space) ..in fact I need to run more than
> one qemu-ppc at the same time

This approach seems very unlikely to work -- in general qemu
in both system and user mode assumes that there is only one
instance running in the host process address space, and things
are bound to clash. (Linux doesn't seem to have dlmopen
but google suggests that it puts the library in its own namespace
but not its own address space.)

Running each qemu as its own process and using interprocess
communication for whatever coordination you need between the
various instances seems more likely to be workable to me.
This will also fix your "can't run more than one binary in
succession" problem, because you can just have the first qemu
run and exit as normal and launch a second qemu to run the
second binary.

-- PMM

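The "one process per run" suggestion can be sketched with fork, execvp and waitpid. This is a minimal illustration rather than anything taken from QEMU: run_child and run_in_succession are hypothetical names, and the commands are whatever argv arrays the caller supplies (for the case discussed here that might be something like qemu-ppc followed by the path of a target binary, which is an assumption).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one command in a child process and return its exit status, or -1 if
 * it could not be started or did not exit normally. */
int run_child(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Run several commands one after another. Every command starts in a fresh
 * process, so no state from an earlier run can leak into a later one.
 * Returns the number of commands that did not exit with status 0. */
int run_in_succession(char *const *commands[], int count)
{
    int failures = 0;
    for (int i = 0; i < count; i++) {
        if (run_child(commands[i]) != 0)
            failures++;
    }
    return failures;
}
```

Because each run gets its own address space, the translated-code cache and loader mappings of one run are gone by the time the next one starts.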
* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-16 19:02 UTC
To: Peter Maydell; +Cc: Raphaël Lefèvre, qemu-devel

Thank you very much for your fast reply!

On 01/16/2011 07:29 PM, Peter Maydell wrote:
> Linux doesn't seem to have dlmopen

http://www.unix.com/man-page/All/3c/dlmopen/

#define _GNU_SOURCE   /* documented feature macro; __USE_GNU is glibc-internal */
#include <dlfcn.h>

lib_handle1 = dlmopen(LM_ID_NEWLM, "./libqemu-ppc.so", RTLD_NOW);

I am developing that on a clean ubuntu 10.10

> but google suggests that it puts the library in its own namespace
> but not its own address space.

I need to make the different instances of qemu-user exchange data ..
obviously keeping all of them in the same address space would be the easiest
way (unless I have to change all the qemu code ;) )

> Running each qemu as its own
> process and using interprocess communication for whatever
> coordination you need between the various instances seems
> more likely to be workable to me. This will also fix your "can't run
> more than one binary in succession" problem, because you can
> just have the first qemu run and exit as normal and launch a
> second qemu to run the second binary.
>
> -- PMM

Exactly, it was the easiest way for me too.. and I've already done it, it works
smoothly .. the only big problem is that it is not good enough for my teacher..
he says it should work the dynamic library way o.O
Working with libraries even solved the problem of consecutive runs, though in
my opinion software is not good when you must restart it to make it run
correctly again.. sounds more Windows style :D
Clearly it leaves memory "dirty" and does not clean up after the target process
completes its execution.. leaving the OS to take care of it.
I tried zeroing all global variables before starting a new execution, without
results (other than making it stall) .. After a very long time spent trying to
find a solution, I think the problem must be with the mmap'ing stuff in the
loader .. in my opinion the same reason why 2 different libraries with their
own namespaces clash.. the elf loaders work globally within the single address
space .. I think for a guru of loaders and linkers it should not be so
difficult to patch.. but not for a student who has just heard about them for
the first time ;)
Any help is very appreciated :)
Thank you again!
Stefano B.

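The difference between "own namespace" and "own address space" can be observed directly with dlmopen. The sketch below assumes glibc (dlmopen and LM_ID_NEWLM are GNU extensions, and older glibc versions need linking with -ldl); copies_are_distinct is a hypothetical helper name, and any shared library can stand in for the libqemu-ppc.so mentioned above. It loads the same library into two fresh namespaces and compares where one symbol ended up.

```c
#define _GNU_SOURCE             /* exposes dlmopen and LM_ID_NEWLM in glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Load `lib` twice, each time into a new link-map namespace, and compare the
 * address of `sym` in the two copies.
 * Returns 1 if the copies sit at different addresses, 0 if they coincide,
 * and -1 if loading or symbol lookup failed. */
int copies_are_distinct(const char *lib, const char *sym)
{
    assert(lib != NULL && sym != NULL);

    void *h1 = dlmopen(LM_ID_NEWLM, lib, RTLD_NOW | RTLD_LOCAL);
    void *h2 = dlmopen(LM_ID_NEWLM, lib, RTLD_NOW | RTLD_LOCAL);
    int result = -1;

    if (h1 != NULL && h2 != NULL) {
        void *s1 = dlsym(h1, sym);
        void *s2 = dlsym(h2, sym);
        if (s1 != NULL && s2 != NULL)
            result = (s1 != s2);
    }

    if (h2 != NULL)
        dlclose(h2);
    if (h1 != NULL)
        dlclose(h1);
    return result;
}
```

When both loads succeed, each namespace has its own copy of the library's code and globals at a different address, but both copies live in the one address space of the calling process. Anything that insists on a particular fixed address is therefore still shared between them.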
* Re: [Qemu-devel] TCG flow vs dyngen
From: Peter Maydell @ 2011-01-16 19:24 UTC
To: Stefano Bonifazi; +Cc: Raphaël Lefèvre, qemu-devel

2011/1/16 Stefano Bonifazi <stefboombastic@gmail.com>:
> I need to make the different instances of qemu-user exchange data ..
> obviously keeping all of them in the same address space would be the easiest
> way (unless I have to change all qemu code ;) )

The problem is that you're trying to break a fundamental
assumption made by a lot of qemu code. That's a large
job which involves understanding, checking and possibly
changing lots of already written code. In contrast, the
code you need to exchange data between the instances is
going to be fairly small and self contained and you'll already
understand it because you've written it/will write it. I think
it's pretty clear which one is going to be easier.

>> Running each qemu as its own
>> process and using interprocess communication for whatever
>> coordination you need between the various instances seems
>> more likely to be workable to me.

> Exactly, it was the easiest way also for me.. and I've already done it,
> works smoothly .. the only big problem is that it is not good for my
> teacher.. he says it should work the dynamic library way o.O

I think he's wrong. (You might like to think about what happens
if the program being emulated in qemu user-mode does a fork()).

Basically you're trying to do things the hard way; maybe
you can get something that sort of works in the subset of
cases you care about, but why on earth put in that much
time and effort on something irrelevant to the actual problem
you're trying to work on?

-- PMM

* [Qemu-devel] Re: TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-24 13:20 UTC
To: Peter Maydell; +Cc: Raphaël Lefèvre, qemu-devel

On 01/16/2011 08:24 PM, Peter Maydell wrote:
> [...]
> Basically you're trying to do things the hard way; maybe
> you can get something that sort of works in the subset of
> cases you care about, but why on earth put in that much
> time and effort on something irrelevant to the actual problem
> you're trying to work on?
>
> -- PMM

Well, my teacher's answer was that doing that is pointless, as there are
already plenty of solutions based on IPC .. they are interested in testing this
other approach .. They are not interested in how difficult it can be for a
student, or how long it can take.. :(

Best regards,
Stefano B.

* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-16 20:50 UTC
To: Peter Maydell; +Cc: Raphaël Lefèvre, qemu-devel

Hi!
In case you are interested in helping me, I'll give you a big piece of news
I've just got (even my teacher is not informed yet! :) )
I've just managed to make more than one instance of qemu-user run at the same
time, by linking the target code with a specified address for the code section
(the -Ttext address option of ld).
It works fine, and this supports my idea that the problem is within the elf
loader..
Making it relocate the target code properly would fix the problem ;)
Now let's work on it :)
Regards,
Stefano B.

On 01/16/2011 08:02 PM, Stefano Bonifazi wrote:
> [...]

* Re: [Qemu-devel] TCG flow vs dyngen
From: Raphaël Lefèvre @ 2011-01-16 21:08 UTC
To: Stefano Bonifazi; +Cc: Peter Maydell, qemu-devel

2011/1/17 Stefano Bonifazi <stefboombastic@gmail.com>:
> Hi!
> In case you are interested in helping me, I'll give you a big piece of news
> I've just got (even my teacher is not informed yet! :) )
> I've just managed to make more than one instance of qemu-user run at the
> same time linking the target code with a specified address for the code
> section (-Ttext address of ld).
> It works fine and this proves my idea that the problem is within the elf
> loader..
> Making it relocate the target code properly would fix the problem ;)
> Now let's work on it :)
> Regards,
> Stefano B.
>

Congratulations~ just keep going~!

Raphaël Lefèvre

* [Qemu-devel] Re: TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-24 12:35 UTC
To: Raphaël Lefèvre; +Cc: Peter Maydell, qemu-devel

On 01/16/2011 10:08 PM, Raphaël Lefèvre wrote:
> [...]
> Congratulation~ just keep going on~!
>
> Raphaël Lefèvre

Thank you! While working on the elf loader I found many problems in that code..
If you are interested, you can have a look at my last post!
Best regards!
Stefano B.

* Re: [Qemu-devel] TCG flow vs dyngen
From: Lluís @ 2011-01-17 11:59 UTC
To: qemu-devel

Stefano Bonifazi writes:

> Hi!
> In case you are interested in helping me, I'll give you a big piece of news
> I've just got (even my teacher is not informed yet! :) )

I still don't understand what your high-level objective is...

Lluis

--
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

* [Qemu-devel] Re: TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-24 12:31 UTC
To: qemu-devel

On 01/17/2011 12:59 PM, Lluís wrote:
> Stefano Bonifazi writes:
>
>> Hi!
>> In case you are interested in helping me, I'll give you a big piece of news
>> I've just got (even my teacher is not informed yet! :) )
>
> I still don't understand what is your high-level objective...
>
> Lluis

Hi!
Sorry, I noticed your reply only now (dunno why I was not notified by email!)
Do you mean what is my final goal?

* Re: [Qemu-devel] Re: TCG flow vs dyngen
From: Lluís @ 2011-01-24 13:36 UTC
To: Stefano Bonifazi; +Cc: qemu-devel

Stefano Bonifazi writes:

> Do you mean what is my final goal?

Exactly. A higher-level view of what your ultimate goal is might
help others figure out better ways to do it.

Right now I don't remember what you posted you were technically trying
to do, but I do remember it looked convoluted to me.

Lluis

* Re: [Qemu-devel] Re: TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-24 14:00 UTC
To: qemu-devel

On 01/24/2011 02:36 PM, Lluís wrote:
> Stefano Bonifazi writes:
>
>> Do you mean what is my final goal?
> Exactly. A higher level perspective of what is our ultimate goal might
> help others figure out better ways to do it.
>
> Right now I don't remember what you posted your where technically trying
> to do, but I do remember it looked convoluted to me.
>
> Lluis

Sorry if I could not explain it better before, but it was not totally clear to
me either at the beginning, as I get new specs from my teacher along the way,
according to what I manage to do and where I find big obstacles!
Now, the final goal is to get multiple instances of qemu-ppc driven by a
systemc project executing on an x86 machine, with the different qemu-ppc
instances used as emulators for PowerPC binaries.. I would get the results of
running the various ppc binaries back into the systemc project and then work
with those results.
I've already managed to integrate systemc with qemu-ppc, and I managed to load
multiple instances of qemu together by loading it as a dynamic library.
I think much of the confusion about my goals came from the fact that the first
(failed) attempt was to use qemu-user to load many target binaries one after
the other.. Then I switched to having many instances of qemu-user at the same
time inside the same process..
The current problem is making qemu-user able to load target code at a different
address than the one chosen by the link editor when creating the binary..
If you are interested in that, I've just created a new post about it:
http://lists.nongnu.org/archive/html/qemu-devel/2011-01/msg02361.html
Best regards,
Stefano B.

* Re: [Qemu-devel] Re: TCG flow vs dyngen
From: Lluís @ 2011-01-24 15:06 UTC
To: Stefano Bonifazi; +Cc: qemu-devel

Stefano Bonifazi writes:

> Now, the final goal is to get multiple instances of qemu-ppc driven by a systemc
> project executing on a x86 machine, with the different qemu-ppc instances used
> as emulators for power-pc binaries.. I would get the results of the run of the
> various ppc binaries back to the systemc project and work with the results then.

If I understand this correctly, the execution of one of your PPC cores
is oblivious of the others (they share no guest physical memory).

If that's true, it would be much simpler to make systemc launch
a separate process with a modified version of linux-user. This modified
version would just change the main loop so that it can communicate
with systemc, e.g., using shared memory.

Of course, depending on the granularity of synchronization that you
want, this might not prove useful: if systemc and the linux-user
processes all execute in the finest possible synchronization (e.g.,
communicating after each TB or even after each instruction), the
communication overhead might prove too high.

Still, if you want a somewhat looser synchronization, you could take a
look at how COTSon [1] does it.

[1] http://sites.google.com/site/hplabscotson/

Lluis

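The shared-memory arrangement suggested above can be sketched with an anonymous MAP_SHARED mapping that survives fork. Everything here is illustrative: struct mailbox and exchange_via_shared_memory are hypothetical names, the child stands in for an emulator process, and the "work" it does is just doubling a number. Synchronization is the coarsest possible (the parent waits for the child to exit), which sidesteps the overhead concern raised for fine-grained synchronization.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical mailbox shared between the driver and one worker process. */
struct mailbox {
    long result;                /* value produced by the worker */
    int  done;                  /* set to 1 once result is valid */
};

/* Fork a worker that stores value * 2 in shared memory. The parent waits for
 * the worker to exit and returns what it finds there, or -1 on error. */
long exchange_via_shared_memory(long value)
{
    struct mailbox *box = mmap(NULL, sizeof *box, PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (box == MAP_FAILED)
        return -1;
    box->result = 0;
    box->done = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(box, sizeof *box);
        return -1;
    }
    if (pid == 0) {             /* worker: write the result and leave */
        box->result = value * 2;
        box->done = 1;
        _exit(0);
    }

    long out = -1;
    if (waitpid(pid, NULL, 0) == pid && box->done) {
        out = box->result;
        assert(out == value * 2);   /* parent sees exactly what was written */
    }
    munmap(box, sizeof *box);
    return out;
}
```

With these definitions, exchange_via_shared_memory(21) returns 42. A setup that exchanges data while the worker is still running would need a real synchronization primitive, such as a process-shared semaphore, instead of waiting for exit.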
* Re: [Qemu-devel] Re: TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-24 17:23 UTC
To: qemu-devel; +Cc: xscript

Hi!
Thank you for answering me!

> If I understand this correctly, the execution of one of your PPC cores
> is oblivious of the others (they share no guest physical memory).
>

No! They do share the same address space.. the way I am loading the different
qemu-ppc instances separates their namespaces, allowing them to coexist, but
they share the same address space anyway (which is also that of the caller
process); that is what I want for communication.
The problem is that they are at the same time oblivious of each other, and
each of them wants to map its target binary at the same virtual address
(again, see my last post about relocating target code)..
I successfully tried the IPC (interprocess communication) way, with each
qemu-ppc spawned by systemc as a separate process, then using shared memory and
signals to communicate.. pretty easy and working well, but the specs of my
(university) project do not let me use IPC..

> [1] http://sites.google.com/site/hplabscotson/

Thank you, I am a student of digital electronics without much knowledge of
developing on linux, but this project is very interesting for my field.. some
sort of alternative to systemc, if I understand correctly! Thanks, I'll surely
have a look at it!
Best regards!
Stefano B.

* Re: [Qemu-devel] Re: TCG flow vs dyngen 2011-01-24 17:23 ` Stefano Bonifazi @ 2011-01-24 18:12 ` Lluís 0 siblings, 0 replies; 36+ messages in thread From: Lluís @ 2011-01-24 18:12 UTC (permalink / raw) To: qemu-devel Stefano Bonifazi writes: > Hi! > Thank you for answering me! >> If I understand this correctly, the execution of one of your PPC cores >> is oblivious of the others (they share no guest physical memory). >> > No! They do share the same address space.. the way I am loading the different > qemu-ppc instances divides their namespaces allowing them to coexist, but they > share the same address space anyway (that is the same of the caller process > too), that is what I want for a communication. Sure, but each core runs a separate (guest) process. It's not that they are (guest) threads of the same (guest) application. > The problem is that they are at the same time oblivious of the others and each > of them wants to map its target binary at the same unique virtual address (again > see my last post about relocating target code).. Right. The important point here is that each qemu core runs a separate (guest) process. > I tried successfully the way of IPC (interprocess communication) having a > different qemu-ppc spawned by systemc as a process, then using shared memory and > signals for communicating.. pretty easy and well working, but the specs of my > project (university) do not let me using IPC.. That sounds silly, but there's no way out if this is in the rules of the project. >> [1] http://sites.google.com/site/hplabscotson/ > Thank you, I am a student of digital electronics, with not big knowledge about > developing in linux but this project is very interesting for my field.. some > sort of alternative to systemc if I understand fine! Thanks surely I'll have a > look at that! Well, if you don't know about available simulators, there's a plethora of them.
Off the top of my head, I can give you these other names:

* SESC http://sesc.sourceforge.net/
* M5 http://www.m5sim.org/wiki/index.php/Main_Page
* GEMS http://www.cs.wisc.edu/gems/
* SimpleScalar http://www.simplescalar.com/
* Graphite http://groups.csail.mit.edu/carbon/?page_id=111

Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-16 16:43 ` Stefano Bonifazi 2011-01-16 18:29 ` Peter Maydell @ 2011-01-16 19:16 ` Raphaël Lefèvre 1 sibling, 0 replies; 36+ messages in thread From: Raphaël Lefèvre @ 2011-01-16 19:16 UTC (permalink / raw) To: Stefano Bonifazi; +Cc: qemu-devel 2011/1/17 Stefano Bonifazi <stefboombastic@gmail.com>: > > Hi! > Thank you very much for Your concern! > Honestly I had lost hope in any help, I even contacted directly some > developers in this mailing list without luck! I guess many good developers on the mailing list are still trying their best to solve your problems, such as Blue Swirl, Paolo Bonzini, Stefan Weil, Peter Maydell, Mulyadi Santosa, Andreas Färber and Alexander Graf (I hope I haven't left out anyone who has helped you, and the order of the names has no meaning), etc. Every developer has his own expertise, and it is hard to keep track of all the activity in qemu. Please trust one thing: you are not alone :). > I am a student who needs to use qemu for a project where it will be used for > its capabilities of running PowerPC code. > As you can imagine qemu goes far beyond the knowledge in electronics and > computer science of a student. Nevertheless I have to do that! > I have been studying all the possible technical documents available in the > internet, but it is really not much at all , not sufficient for getting the > code and being able of understanding it .. It is in C, even not modular C++ Given the lack of technical documentation for qemu, and since you are a student (maybe studying for a master's/PhD degree?), some of the literature published by IEEE/ACM may give you some inspiration and help (assuming your university has paid for download access). As far as I know, although qemu is a relatively new topic for academia, there is already literature that discusses it. Maybe you can find which research area is closest to your work.
If any literature has inspired you or is related to your research, don't hesitate to discuss it. > Anyway with some help from this mailing list, and a lot of studying about > assembly, loaders, compilers.. I am going on, though there are still big > problems due of the nature of the QEMU code.. > First of all, I am starting from qemu-user, more specifically, qemu-ppc as I > don't need the full system capabilities, and it is easier for me to control > the binary target memory with qemu-user. Is there any reason why you should use the user mode of qemu rather than the system mode? Sometimes the system mode of qemu will free you from the nightmare of managing the memory hierarchy. Maybe you can start by describing the original goal of the project instead of falling into the hell of code tracing. > Originally I started with a lot of work on libqemu .. until some developer > here told me it was deprecated (though still in the source) and not working > fine. > I edited the code of qemu-ppc so that another function of mine calls > qemu-user main, with the appropriate parameters.. The pursued goal was to > launch it several times with different target binaries in succession.. > For some reason, I still can't find out, qemu code remembers the old code, > running it instead of the new loaded binary.. and if I flush the cache of > translated code before loading a new binary it stops and can't go on! > My workaround to this problem was compiling qemu-ppc as a dynamic library > and load it at runtime.. I also managed to load multiple copies of it (with > dlmopen each at a different address space) ..in fact I need to run more than > one qemu-ppc at the same time but a new big problem popped up now: the I need to thank Peter Maydell, who explained a principle that I'm not familiar with. And from your description, do you want to emulate multiple cores? I cannot imagine which application needs to run multiple qemu-ppc instances at the same time.
> target binary is loaded always at a fixed address.. no matter if another > qemu-ppc already loaded code there.. it is like the internal elf loader > can't understand those addresses are not available, and then relocate them > .. > I tried to link (ld) the binary target elf as position independent code, but > then qemu-ppc complains it can't find /usr/lib/libc.so.1 and > /usr/lib/ld.so.1 > The above description seems to be outside what I can answer, because I have only studied the system mode of qemu. > To sum up the problems are (in order of importance): > - making the elf loader relocate the target code into other addresses when > the default ones (I guess those embedded into the target binary when it is > not compiled as position independent code) are taken Maybe the problem can only be solved by rewriting the loader if you insist on using user mode (just as in your response to Peter). > - making qemu-user able of running more than one target binary in > succession Could "more than one target binary in succession" (assume A then B then C) be achieved by "compiling A, B and C into one binary in sequence"? > - counting qemu-user executed instructions I guess all the work before this is for the goal of "counting qemu-user executed instructions", am I right? If so, a paper published by IEEE in 2010 may be of some help (I guess): http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5475901 (Make sure that your university can access it) Raphaël Lefèvre ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-16 16:01 ` Raphaël Lefèvre 2011-01-16 16:43 ` Stefano Bonifazi @ 2011-01-23 21:50 ` Rob Landley 2011-01-23 22:25 ` Stefano Bonifazi 2011-01-24 14:32 ` Peter Maydell 1 sibling, 2 replies; 36+ messages in thread From: Rob Landley @ 2011-01-23 21:50 UTC (permalink / raw) To: Raphaël Lefèvre; +Cc: Stefano Bonifazi, qemu-devel On 01/16/2011 10:01 AM, Raphaël Lefèvre wrote: > On Sun, Jan 16, 2011 at 11:21 PM, Stefano Bonifazi > <stefboombastic@gmail.com> wrote: > 2. "how can I check the number of target cpu cycles or target > instructions executed inside qemu-user (i.e. qemu-ppc)? > Is there any variable I can inspect for such informations?" at Dec, 2010 Keep in mind I'm a bit rusty and not an expert, but I'll give a stab at answering: You can't, because QEMU doesn't work that way. QEMU isn't an instruction level emulator, it's closer to a Java JIT. It doesn't translate one instruction at a time but instead translates large blocks of code all at once, and keeps a cache of translated blocks around. Execution jumps into each block and either waits for it to exit again (meaning it jumped out of that page and QEMU's main execution loop has to look up what page to execute next, possibly translating it first if it's not in the cache yet), or else QEMU interrupts it after while to fake an IRQ of some kind (such as a timer interrupt). You may want to read Fabrice Bellard's original paper on the QEMU design: http://www.usenix.org/event/usenix05/tech/freenix/full_papers/bellard/bellard.pdf Since that was written, dyngen was replaced with tcg, but that does the same thing in a slightly different way. Building a QEMU with dyngen support used to use the host compiler to compile chunks of code corresponding to the target operations it would see at runtime, and then strip the machine language out of the resulting .o files and save them in a table. 
Then at runtime dyngen could generate translated pages by gluing together the resulting saved machine language snippets the host compiler had produced when qemu was built. The problem was, beating the right kind of machine language snippets out of the .o files the compiler produced from the example code turned out to be VERY COMPILER DEPENDENT. This is why you couldn't build qemu with gcc 4.x for the longest time, gcc's code generator and the layout of the .o files changed in a bunch of subtle ways which broke dyngen's ability to extract usable machine code snippets to put 'em into the table so it could translate pages at runtime. TCG stands for "Tiny Code Generator". It just hardwires a code generator into QEMU. They wrote a mini-compiler in C, which knows what instructions to output for each host qemu supports. If QEMU understands target instructions well enough to _read_ them, it's not a big stretch to be able to _write_ them when running on that kind of host. (It's more or less the same operation in reverse.) This means that QEMU can no longer run on a type of host it can't execute target code for, but the solution is to just add support for all the interesting machines out there, on both sides. So, when QEMU executes code, the virtual MMU faults a new page into the virtual TLB, and goes "I can't execute this, fix it up!" And the fixup handler looks for a translation of the page in the cache of translated pages, and if it can't find it it calls the translator to convert the target code into a page of corresponding host code. Which may involve discarding an existing entry out of the cache, but this is how instruction caches work on real hardware anyway so the delays in QEMU are where they'd be on real hardware anyway, and optimizing for one is pretty close to optimizing for the other, so life is good. 
The chunk you found earlier is a function pointer typecast: #define tcg_qemu_tb_exec(tb_ptr) \ ((long REGPARM (*)(void *))code_gen_prologue)(tb_ptr) Which looks like it's calling code_gen_prologue() with tb_ptr as its argument (typecast to a void *), and it returns a long. That calls a translated page, and when the function returns that means the page of code needs to jump to code somewhere outside of that page, and we go back to the main loop to figure out where to go next. The reason QEMU is as fast as it is is because once it has a page of translated code, actually _running_ it is entirely native. It jumps into the page, and executes natively until it leaves the page. Control only goes back to QEMU to switch pages or to handle I/O and interrupts and such. So when you ask "how many clock cycles did that instruction take", the answer is "it doesn't work that way". QEMU emulates at memory page level (generally 4k of target code), not at individual instruction level. (Oh, and the worst thing you can do to QEMU from a performance perspective is self-modifying code. Because the virtual MMU has to strip the executable bit off the TLB entry and re-translate the entire page next time something tries to execute it. It _works_, it's just slow. But again, real hardware can hiccup a bit on this too.) Does that answer your question? Rob ^ permalink raw reply [flat|nested] 36+ messages in thread
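The "look up a translation, call it through a cast function pointer, come back to the main loop" cycle described in the message above can be sketched in plain C. This is only a toy model under stated assumptions: every name here (`DemoCpu`, `DemoTB`, `demo_translate`, `demo_tb_exec`, `demo_run`) is hypothetical, the "translated code" is two ordinary C functions rather than bytes emitted into a buffer like `code_gen_prologue`, and the call returns the next guest pc directly, which is simpler than what the real macro returns.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 16

typedef struct DemoCpu { uint32_t pc; long acc; } DemoCpu;

/* Real signature of a "translated block": run, then return the next pc. */
typedef uint32_t (*block_fn)(DemoCpu *cpu);
/* Opaque code pointer, standing in for an address inside a code buffer. */
typedef void (*opaque_code)(void);

typedef struct DemoTB {
    uint32_t guest_pc;     /* guest address this translation belongs to */
    opaque_code host_code; /* where the host code for it starts */
    int valid;
} DemoTB;

/* Same shape as the tcg_qemu_tb_exec macro: cast, then call. */
#define demo_tb_exec(code, cpu) (((block_fn)(code))(cpu))

/* Two hand-written "translations" of guest code at pc 0 and pc 4. */
static uint32_t block_at_0(DemoCpu *cpu) { cpu->acc += 1; return 4; }
static uint32_t block_at_4(DemoCpu *cpu)
{
    cpu->acc *= 2;
    return cpu->acc < 10 ? 0 : 8; /* loop back to 0 or leave for 8 */
}

/* Stand-in for the translator: returns NULL for unknown guest code. */
static opaque_code demo_translate(uint32_t pc)
{
    switch (pc) {
    case 0: return (opaque_code)block_at_0;
    case 4: return (opaque_code)block_at_4;
    default: return NULL;
    }
}

/*
 * Main loop: returns the number of blocks executed, or -1 if some pc
 * could not be translated. *translations counts cache misses only.
 */
static int demo_run(DemoCpu *cpu, uint32_t halt_pc, int *translations)
{
    DemoTB cache[CACHE_SLOTS] = {0};
    int executed = 0;

    *translations = 0;
    while (cpu->pc != halt_pc) {
        DemoTB *tb = &cache[(cpu->pc >> 2) % CACHE_SLOTS];
        if (!tb->valid || tb->guest_pc != cpu->pc) {
            opaque_code code = demo_translate(cpu->pc);
            if (code == NULL)
                return -1;
            tb->guest_pc = cpu->pc;
            tb->host_code = code;
            tb->valid = 1;
            (*translations)++;
        }
        /* Runs natively until the block returns control to the loop. */
        cpu->pc = demo_tb_exec(tb->host_code, cpu);
        executed++;
    }
    return executed;
}
```

Starting from pc 0 with acc 0 and halting at pc 8, the loop executes six blocks but translates only two, because the second and third passes through each block are served from the cache.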
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-23 21:50 ` Rob Landley @ 2011-01-23 22:25 ` Stefano Bonifazi 2011-01-23 23:40 ` Rob Landley 2011-01-24 14:32 ` Peter Maydell 1 sibling, 1 reply; 36+ messages in thread From: Stefano Bonifazi @ 2011-01-23 22:25 UTC (permalink / raw) To: Rob Landley; +Cc: Raphaël Lefèvre, qemu-devel On 01/23/2011 10:50 PM, Rob Landley wrote: > On 01/16/2011 10:01 AM, Raphaël Lefèvre wrote: >> On Sun, Jan 16, 2011 at 11:21 PM, Stefano Bonifazi >> <stefboombastic@gmail.com> wrote: >> 2. "how can I check the number of target cpu cycles or target >> instructions executed inside qemu-user (i.e. qemu-ppc)? >> Is there any variable I can inspect for such informations?" at Dec, 2010 > Keep in mind I'm a bit rusty and not an expert, but I'll give a stab at > answering: > > You can't, because QEMU doesn't work that way. QEMU isn't an > instruction level emulator, it's closer to a Java JIT. It doesn't > translate one instruction at a time but instead translates large blocks > of code all at once, and keeps a cache of translated blocks around. > Execution jumps into each block and either waits for it to exit again > (meaning it jumped out of that page and QEMU's main execution loop has > to look up what page to execute next, possibly translating it first if > it's not in the cache yet), or else QEMU interrupts it after while to > fake an IRQ of some kind (such as a timer interrupt). > > You may want to read Fabrice Bellard's original paper on the QEMU design: > > http://www.usenix.org/event/usenix05/tech/freenix/full_papers/bellard/bellard.pdf > > Since that was written, dyngen was replaced with tcg, but that does the > same thing in a slightly different way. > > Building a QEMU with dyngen support used to use the host compiler to > compile chunks of code corresponding to the target operations it would > see at runtime, and then strip the machine language out of the resulting > .o files and save them in a table. 
Then at runtime dyngen could > generate translated pages by gluing together the resulting saved machine > language snippets the host compiler had produced when qemu was built. > The problem was, beating the right kind of machine language snippets out > of the .o files the compiler produced from the example code turned out > to be VERY COMPILER DEPENDENT. This is why you couldn't build qemu with > gcc 4.x for the longest time, gcc's code generator and the layout of the > .o files changed in a bunch of subtle ways which broke dyngen's ability > to extract usable machine code snippets to put 'em into the table so it > could translate pages at runtime. > > TCG stands for "Tiny Code Generator". It just hardwires a code > generator into QEMU. They wrote a mini-compiler in C, which knows what > instructions to output for each host qemu supports. If QEMU understands > target instructions well enough to _read_ them, it's not a big stretch > to be able to _write_ them when running on that kind of host. (It's > more or less the same operation in reverse.) This means that QEMU can > no longer run on a type of host it can't execute target code for, but > the solution is to just add support for all the interesting machines out > there, on both sides. > > So, when QEMU executes code, the virtual MMU faults a new page into the > virtual TLB, and goes "I can't execute this, fix it up!" And the fixup > handler looks for a translation of the page in the cache of translated > pages, and if it can't find it it calls the translator to convert the > target code into a page of corresponding host code. Which may involve > discarding an existing entry out of the cache, but this is how > instruction caches work on real hardware anyway so the delays in QEMU > are where they'd be on real hardware anyway, and optimizing for one is > pretty close to optimizing for the other, so life is good. 
> > The chunk you found earlier is a function pointer typecast: > > #define tcg_qemu_tb_exec(tb_ptr) \ > ((long REGPARM (*)(void *))code_gen_prologue)(tb_ptr) > > Which looks like it's calling code_gen_prologue() with tp_ptr as its > argument (typecast to a void *), and it returns a long. That calls a > translated page, and when the function returns that means the page of > code needs to jump to code somewhere outside of that page, and we go > back to the main loop to figure out where to go next. > > The reason QEMU is as fast as it is is because once it has a page of > translated code, actually _running_ it is entirely native. It jumps > into the page, and executes natively until it leaves the page. Control > only goes back to QEMU to switch pages or to handle I/O and interrupts > and such. So when you ask "how many clock cycles did that instruction > take", the answer is "it doesn't work that way". QEMU emulates at > memory page level (generally 4k of target code), not at individual > instruction level. > > (Oh, and the worst thing you can do to QEMU from a performance > perspective is self-modifying code. Because the virtual MMU has to > strip the executable bit off the TLB entry and re-translate the entire > page next time something tries to execute it. It _works_, it's just > slow. But again, real hardware can hiccup a bit on this too.) > > Does that answer your question? > > Rob Wow! Thank you! That's an ANSWER! Gold for who's studying all of that! Though at the stage of my work I had to "understand" almost all of it, your perfect summary make everything much clearer.. About counting instructions I found that counting the instructions of each executed TB was a very good approximation, sure the cache represent a major problem, already translated TB can't be counted that way.. I'd like to disable the cache, but the parameter singlestep doesn't seem to work for qemu-user. Right now I am stuck with another problem .. 
maybe with your experience you can tell me whether it is possible at all.. I am trying to shift in memory the target executable .. now the code is "supposed" to be loaded by the elfloader at the exact start address set at link time .. Inside elfloader there is even a check for verifying whether that address range is busy.. but no action is taken in that case o.O Maybe I'll post a new thread about this problem (bug?) .. anyway if you think you can help me anyway I'll give you further details.. Thank you really very much again for your great explanation! Best Regards! Stefano B. ^ permalink raw reply [flat|nested] 36+ messages in thread
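The situation described in the message above (a preferred load address that may already be occupied) can be illustrated with a small C sketch. This is not QEMU's elfload.c: `map_image` is a hypothetical helper, and the idea shown is only the general POSIX behaviour that `mmap` without `MAP_FIXED` treats the address as a hint, so a loader can compare what it asked for with what it got and carry the difference around as a bias. Applying that bias only helps for code that can actually be relocated.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Try to map `len` bytes at `preferred`. Without MAP_FIXED the kernel
 * will not replace an existing mapping; it picks another address if the
 * preferred one is busy. Returns the mapped address (NULL on failure)
 * and stores "actual - preferred" in *bias.
 */
static void *map_image(void *preferred, size_t len, intptr_t *bias)
{
    void *got = mmap(preferred, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return NULL;
    /* A non-zero bias means the image did not land where it was linked. */
    *bias = (intptr_t)((uintptr_t)got - (uintptr_t)preferred);
    return got;
}
```

A caller would add `*bias` to every address taken from the image's headers. If the image was linked for a fixed address and is not position independent, a non-zero bias has to be treated as an error instead.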
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-23 22:25 ` Stefano Bonifazi @ 2011-01-23 23:40 ` Rob Landley 2011-01-24 10:17 ` Stefano Bonifazi 0 siblings, 1 reply; 36+ messages in thread From: Rob Landley @ 2011-01-23 23:40 UTC (permalink / raw) To: Stefano Bonifazi; +Cc: Raphaël Lefèvre, qemu-devel On 01/23/2011 04:25 PM, Stefano Bonifazi wrote: > I am trying to shift in memory the target executable .. now the code is > "supposed" to be loaded by the elfloader at the exact start address set > at link time .. Ah, elf loading. That's a whole 'nother bag of worms. Oddly enough, I was dealing with this last year trying to debug the uClibc dynamic linker. I blogged a bit about it at the time: http://landley.net/notes-2010.html#12-07-2010 (And the next few days. Sigh, I never did go back and fill in the holes, did I?) > Inside elfloader there is even a check for verifying whether that > address range is busy.. but no action is taken in that case o.O > Maybe I'll post a new thread about this problem (bug?) .. anyway if you > think you can help me anyway I'll give you further details.. Tired right now, but if you post a clearer question (what are you trying to _do_) and cc: me on it I'll try to respond. Maybe I can find some decent documentation to point you at, or maybe I'll write some... Rob ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-23 23:40 ` Rob Landley @ 2011-01-24 10:17 ` Stefano Bonifazi 2011-01-24 18:20 ` Rob Landley 0 siblings, 1 reply; 36+ messages in thread From: Stefano Bonifazi @ 2011-01-24 10:17 UTC (permalink / raw) To: Rob Landley; +Cc: Raphaël Lefèvre, qemu-devel On 01/24/2011 12:40 AM, Rob Landley wrote: > On 01/23/2011 04:25 PM, Stefano Bonifazi wrote: >> I am trying to shift in memory the target executable .. now the code is >> "supposed" to be loaded by the elfloader at the exact start address set >> at link time .. > Ah, elf loading. That's a whole 'nother bag of worms. > > Oddly enough, I was deling with this last year trying to debug the > uClibc dynamic linker. I blogged a bit about it at the time: > > http://landley.net/notes-2010.html#12-07-2010 > > (And the next few days. Sigh, I never did go back and fill in the > holes, did I?) > >> Inside elfloader there is even a check for verifying whether that >> address range is busy.. but no action is taken in that case o.O >> Maybe I'll post a new thread about this problem (bug?) .. anyway if you >> think you can help me anyway I'll give you further details.. > Tired right now, but if you post a clearer question (what are you trying > to _do_) and cc: me on it I'll try to respond. > > Maybe I can find some decent documentation to point you at, or maybe > I'll write some... > > Rob Thank you! I read your post, and yup you also noticed the weird of load_bias.. and wondered how it can work on x86.. But I think your work was on qemu-system.. I am working on qemu-user.. Yup better to post a new thread, I'll cc: you there! Thank you very much! Stefano B ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-24 10:17 ` Stefano Bonifazi @ 2011-01-24 18:20 ` Rob Landley 2011-01-24 21:16 ` Stefano Bonifazi 0 siblings, 1 reply; 36+ messages in thread From: Rob Landley @ 2011-01-24 18:20 UTC (permalink / raw) To: Stefano Bonifazi; +Cc: Raphaël Lefèvre, qemu-devel On 01/24/2011 04:17 AM, Stefano Bonifazi wrote: > I read your post, and yup you also noticed the weird of load_bias.. and > wondered how it can work on x86.. > But I think your work was on qemu-system.. I am working on qemu-user.. My post wasn't on qemu-anything, it was while I was trying to debug the uClibc dynamic loader on a new platform (the Qualcomm Hexagon) that Linux support still hasn't gone upstream for yet. The thing is, the kernel currently _does_ work, so studying the relevant kernel code (and possibly the dynamic loader code) is one way to learn how it currently works. Rob ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-24 18:20 ` Rob Landley @ 2011-01-24 21:16 ` Stefano Bonifazi 2011-01-25 1:19 ` Rob Landley 0 siblings, 1 reply; 36+ messages in thread From: Stefano Bonifazi @ 2011-01-24 21:16 UTC (permalink / raw) To: Rob Landley; +Cc: Raphaël Lefèvre, qemu-devel Hi! Thanks for replying me! > The thing is, the kernel currently _does_ work, so studying the relevant > kernel code (and possibly the dynamic loader code) is one way to learn > how it currently works. Sorry what kernel? Qemu's? Linux's? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-24 21:16 ` Stefano Bonifazi @ 2011-01-25 1:19 ` Rob Landley 2011-01-25 8:53 ` Stefano Bonifazi 0 siblings, 1 reply; 36+ messages in thread From: Rob Landley @ 2011-01-25 1:19 UTC (permalink / raw) To: Stefano Bonifazi; +Cc: Raphaël Lefèvre, qemu-devel On 01/24/2011 03:16 PM, Stefano Bonifazi wrote: > Hi! Thanks for replying me! >> The thing is, the kernel currently _does_ work, so studying the relevant >> kernel code (and possibly the dynamic loader code) is one way to learn >> how it currently works. > Sorry what kernel? Qemu's? Linux's? QEMU isn't a kernel, it's an emulator. Linux is a kernel. I meant Linux loads and runs Linux ELF executables. That's pretty much the definition of "how to do it". So if there's ever a conflict between "how qemu does it" and "how the Linux kernel does it", the Linux kernel is going to win. (And yes, this has come up before, for me it was http://www.mail-archive.com/qemu-devel@nongnu.org/msg25336.html ) That said, QEMU's currently working fairly well on this front too, so studying either should work pretty well... One advantage of the kernel is "cat /proc/$PID/maps" which lets you know what the mappings are, and then you can look up the appropriate chunks of the executable and read the elf spec: http://refspecs.freestandards.org/elf/elf.pdf And to be honest, the best way to get up to speed on this is to read this: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html Where some guy asked "ok, what do we actually NEED" and then set out to prove it. This book is pretty good too, although so dry it's almost unreadable. You might have better luck getting a paper copy out of the library: http://www.iecc.com/linker/ Rob ^ permalink raw reply [flat|nested] 36+ messages in thread
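The "cat /proc/$PID/maps" hint in the message above can also be followed from C. The sketch below assumes the usual Linux line layout, "start-end perms offset dev inode path" with hexadecimal addresses; the names `MapEntry`, `parse_map_line` and `find_mapping` are made up for this example and only the first three fields are parsed.

```c
#include <stdio.h>

typedef struct MapEntry {
    unsigned long long start; /* first address of the mapping */
    unsigned long long end;   /* one past the last address */
    char perms[5];            /* e.g. "r-xp" */
} MapEntry;

/* Parse the first three fields of one maps line; 0 on success. */
static int parse_map_line(const char *line, MapEntry *out)
{
    int n = sscanf(line, "%llx-%llx %4s",
                   &out->start, &out->end, out->perms);
    return n == 3 ? 0 : -1;
}

/*
 * Scan a maps file (for example "/proc/self/maps") for the mapping that
 * contains `addr`. Returns 0 and fills *out if found, -1 otherwise.
 */
static int find_mapping(const char *maps_path, unsigned long long addr,
                        MapEntry *out)
{
    char line[512];
    int found = -1;
    FILE *f = fopen(maps_path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        MapEntry e;
        if (parse_map_line(line, &e) == 0 &&
            addr >= e.start && addr < e.end) {
            *out = e;
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling `find_mapping` with the address a binary was linked for shows whether that range is already occupied in a given process, and by what kind of mapping.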
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-25 1:19 ` Rob Landley @ 2011-01-25 8:53 ` Stefano Bonifazi 0 siblings, 0 replies; 36+ messages in thread From: Stefano Bonifazi @ 2011-01-25 8:53 UTC (permalink / raw) To: Rob Landley; +Cc: Raphaël Lefèvre, qemu-devel > That said, QEMU's currently working fairly well on this front too, so > studying either should work pretty well... > Mr Richard Henderson's patch on elfload.c says I was right.. at least the version I am working on (qemu-0.13.0) had some bugs and weaknesses though it worked smoothly for most cases.. > And to be honest, the best way to get up to speed on this is to read this: > > http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html uhmm seems a good piece.. maybe one of the last I still didn't have :) Thank you!! Stefano B. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-23 21:50 ` Rob Landley 2011-01-23 22:25 ` Stefano Bonifazi @ 2011-01-24 14:32 ` Peter Maydell 2011-01-24 14:56 ` Stefano Bonifazi 1 sibling, 1 reply; 36+ messages in thread From: Peter Maydell @ 2011-01-24 14:32 UTC (permalink / raw) To: Rob Landley; +Cc: Raphaël Lefèvre, Stefano Bonifazi, qemu-devel 2011/1/23 Rob Landley <rob@landley.net>: > Keep in mind I'm a bit rusty and not an expert, but I'll give a stab at > answering: ...here's a couple of clarifications: >> 2. "how can I check the number of target cpu cycles or target >> instructions executed inside qemu-user (i.e. qemu-ppc)? > You can't, because QEMU doesn't work that way. QEMU isn't an > instruction level emulator, it's closer to a Java JIT. Being a JIT doesn't prohibit counting target instructions executed. It just means that counting them generally requires generating code to do the counting at runtime, so it's a more complicated change to make than it would be in a non-JIT emulator. The major reason for not counting cycles is that for an emulation of a modern CPU this is pretty nearly impossible: the number of cycles an instruction takes can depend on whether it causes a cache miss, which CPU internal pipeline it uses, whether it needs to stall waiting for a result from an earlier insn, whether the CPU correctly predicted the branch leading up to it or not, and on and on. You would need to precisely model all the internals of each variant of each CPU, which would be a mammoth undertaking requiring probably unpublished internal data, and if you ever managed to finish it then it would run incredibly slowly and would probably contain enough bugs you couldn't trust the data it gave you anyway. > This means that QEMU can > no longer run on a type of host it can't execute target code for This isn't correct; for instance there's hppa support in TCG for hppa hosts but no hppa target support, and there's sh4 target support but no TCG backend for it. 
The two ends are cleanly separated in qemu and don't generally depend on each other. -- PMM ^ permalink raw reply [flat|nested] 36+ messages in thread
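The point made in the message above, that counting executed instructions in a JIT means doing the counting when translated code runs rather than when it is translated, can be shown with a toy model in C. Nothing here is QEMU code: `CountedTB`, `Counters` and the two hooks are hypothetical, and the sketch assumes a block always runs to its end (a block left early by an exception would be overcounted).

```c
#include <stdint.h>

/* Hypothetical translated block that remembers how many guest
 * instructions it was built from. */
typedef struct CountedTB {
    uint32_t guest_pc;
    uint32_t insn_count;
} CountedTB;

typedef struct Counters {
    uint64_t translated_insns; /* bumped once, at translation time */
    uint64_t executed_insns;   /* bumped on every execution */
} Counters;

/* Counting here misses every re-execution of a cached block. */
static void on_translate(Counters *c, const CountedTB *tb)
{
    c->translated_insns += tb->insn_count;
}

/* Stand-in for code generated into the block itself, so it also runs
 * when the block comes out of the translation cache. */
static void on_execute(Counters *c, const CountedTB *tb)
{
    c->executed_insns += tb->insn_count;
}

/* Model a guest loop: one translation, many executions from the cache. */
static void run_loop(Counters *c, const CountedTB *body,
                     unsigned iterations)
{
    on_translate(c, body);
    for (unsigned i = 0; i < iterations; i++)
        on_execute(c, body);
}
```

For a five-instruction loop body run 100 times, the translation-time counter ends at 5 while the execution-time counter ends at 500, which is the gap a cache of translated blocks introduces.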
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-24 14:32 ` Peter Maydell @ 2011-01-24 14:56 ` Stefano Bonifazi 2011-01-24 15:15 ` Lluís 2011-01-24 18:02 ` Dushyant Bansal 0 siblings, 2 replies; 36+ messages in thread From: Stefano Bonifazi @ 2011-01-24 14:56 UTC (permalink / raw) To: Peter Maydell; +Cc: Raphaël Lefèvre, qemu-devel On 01/24/2011 03:32 PM, Peter Maydell wrote: > > Being a JIT doesn't prohibit counting target instructions executed. > It just means that counting them generally requires generating > code to do the counting at runtime, so it's a more complicated > change to make than it would be in a non-JIT emulator. > What do you mean? Should I change the code of qemu-user for counting the instructions, or should I add code into the target binaries? > The major reason for not counting cycles is that for an emulation > of a modern CPU this is pretty nearly impossible: the number > of cycles an instruction takes can depend on whether it causes > a cache miss, which CPU internal pipeline it uses, whether it > needs to stall waiting for a result from an earlier insn, whether > the CPU correctly predicted the branch leading up to it or not, > and on and on. You would need to precisely model all the > internals of each variant of each CPU, which would be a > mammoth undertaking requiring probably unpublished internal > data, and if you ever managed to finish it then it would run > incredibly slowly and would probably contain enough bugs you > couldn't trust the data it gave you anyway. > Yup, I think it was just a silly mistake of mine when in the first post I wrote cycles.. that was because for me anything that can estimate how long it takes to do the work would be fine.. I can't simply check the time because that is host machine dependent... Number of executed instructions would be fine.. 
>> This means that QEMU can >> no longer run on a type of host it can't execute target code for > This isn't correct; for instance there's hppa support in TCG for hppa > hosts but no hppa target support, and there's sh4 target support > but no TCG backend for it. The two ends are cleanly separated in > qemu and don't generally depend on each other. > Well, I experienced some strange behavior a while ago that initially made me think Mr Rob was right on that, though I knew host support and target support were separated in qemu: I tried to build qemu-ppc directly on an x86_64 machine from inside the ppc-linux-user folder (which works fine on an x86 machine) and it failed because there was no tcg/x86_64/tcg_target.h, whereas running make from within the main folder worked. So I do not understand very well.. is there some required header fix-up when using the main makefile? Best regards! Stefano B. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen 2011-01-24 14:56 ` Stefano Bonifazi @ 2011-01-24 15:15 ` Lluís 2011-01-24 18:02 ` Dushyant Bansal 1 sibling, 0 replies; 36+ messages in thread From: Lluís @ 2011-01-24 15:15 UTC (permalink / raw) To: qemu-devel Stefano Bonifazi writes: > On 01/24/2011 03:32 PM, Peter Maydell wrote: >> >> Being a JIT doesn't prohibit counting target instructions executed. >> It just means that counting them generally requires generating >> code to do the counting at runtime, so it's a more complicated >> change to make than it would be in a non-JIT emulator. >> > What do you mean? Should I change the code of qemu-user for counting the > instructions, or should I add code into the target binaries? If I recall this correctly, target-i386 has a generic function (whose name I don't remember) called whenever the rdtsc instruction is executed. This function rebuilds the counter that contains the number of executed instructions (more or less, this number can be tuned from a variety of sources). Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Qemu-devel] TCG flow vs dyngen
From: Dushyant Bansal @ 2011-01-24 18:02 UTC
To: Stefano Bonifazi; +Cc: qemu-devel

On Monday 24 January 2011 08:26 PM, Stefano Bonifazi wrote:
> On 01/24/2011 03:32 PM, Peter Maydell wrote:
>>
>> Being a JIT doesn't prohibit counting target instructions executed.
>> It just means that counting them generally requires generating
>> code to do the counting at runtime, so it's a more complicated
>> change to make than it would be in a non-JIT emulator.
>>
> What do you mean? Should I change the code of qemu-user for counting
> the instructions, or should I add code into the target binaries?

You should see this pdf
(www.ecs.syr.edu/faculty/yin/Teaching/TC2010/Proj4.pdf). It talks about
tracing the instructions.

--
Dushyant
* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-24 19:38 UTC
To: Dushyant Bansal; +Cc: qemu-devel

On 01/24/2011 07:02 PM, Dushyant Bansal wrote:
> On Monday 24 January 2011 08:26 PM, Stefano Bonifazi wrote:
>> On 01/24/2011 03:32 PM, Peter Maydell wrote:
>>>
>>> Being a JIT doesn't prohibit counting target instructions executed.
>>> It just means that counting them generally requires generating
>>> code to do the counting at runtime, so it's a more complicated
>>> change to make than it would be in a non-JIT emulator.
>>>
>> What do you mean? Should I change the code of qemu-user for counting
>> the instructions, or should I add code into the target binaries?
> You should see this pdf
> (www.ecs.syr.edu/faculty/yin/Teaching/TC2010/Proj4.pdf). It talks
> about tracing the instructions.
>
> --
> Dushyant

Wow, thank you! It sounds incredibly interesting!!

> What we really need is to insert a function call into the
> translated code, so when each instruction is executed at runtime, our
> inserted function will be
> executed.

Again wow!! Is that really possible? Some sort of callback triggered at
every instruction execution?
Do you have any other document explaining that?
This pdf just gives instructions on how to do it on an old version of qemu
(disas_insn doesn't exist at all in my code now), and it does not explain
what that function is or what is behind the suggested code.
The code for single stepping would also be of great help to me! I really
needed that, but when I tried it on qemu-user it didn't work at all.

Thank you very much!
Best regards,
Stefano B.
* Re: [Qemu-devel] TCG flow vs dyngen
From: Dushyant Bansal @ 2011-01-25 7:56 UTC
To: Stefano Bonifazi; +Cc: qemu-devel

>> You should see this pdf
>> (www.ecs.syr.edu/faculty/yin/Teaching/TC2010/Proj4.pdf). It talks
>> about tracing the instructions.
>>
>> --
>> Dushyant
> Wow thank you! It sounds incredibly interesting!!
>> What we really need is to insert a function call into the
>> translated code, so when each instruction is executed at runtime, our
>> inserted function will be
>> executed.
> Again wow!! Is that really possible? Some sort of callback triggered
> at every instruction execution?

Yes, this mechanism works. I have written code to count different kinds
of instructions.

> Do you have any another document explaining that?

No. But maybe you can try to understand this through the qemu source code.
Here are some resources for that:
http://stackoverflow.com/questions/4501173/a-call-to-those-who-have-worked-with-qemu

> This pdf just gives instructions on how to do it on an old version of
> qemu (disas_insn doesn't exist at all on my code now), and does not
> explain what it is, what's behind that suggested code ..
> Also the code for single step would be of great help to me! I really
> needed that.. but when I tried it on qemu-user didn't work at all..

disas_insn exists in the file qemu/target-i386/translate.c.
You are also talking about the qemu source code provided here
http://wiki.qemu.org/Download, right?

If you need it, I can give you the source code of the counting
implementation with some documentation.
Hope this helps.

--
Dushyant
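The mechanism discussed above (inserting a call into the translated code so
that it runs once per executed guest instruction) can be sketched with a toy
translator. This is not QEMU code and uses none of its APIs: the guest
instruction set (G_INC, G_DEC, G_HALT), the State struct, and the functions
translate, execute and helper_count are all hypothetical names made up for
illustration, and the "generated host code" is just an array of function
pointers rather than real machine code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy guest instruction set (hypothetical; not a real QEMU target). */
typedef enum { G_INC, G_DEC, G_HALT } GuestOp;

typedef struct {
    long acc;                 /* the single guest register */
    unsigned long insn_count; /* guest instructions executed so far */
} State;

/* "Host code" is an array of function pointers standing in for
 * generated machine code. */
typedef void (*HostOp)(State *);

static void host_inc(State *s) { s->acc++; }
static void host_dec(State *s) { s->acc--; }

/* The inserted callback: runs at execution time, not translation time. */
static void helper_count(State *s) { s->insn_count++; }

/* Translate a G_HALT-terminated guest program. For every guest
 * instruction, emit a call to helper_count followed by the operation
 * itself. Returns the number of host operations written to out. */
size_t translate(const GuestOp *guest, HostOp *out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; guest[i] != G_HALT; i++) {
        assert(n + 2 <= max); /* output buffer must be large enough */
        out[n++] = helper_count;
        out[n++] = (guest[i] == G_INC) ? host_inc : host_dec;
    }
    return n;
}

/* Run previously translated code; no translation happens here. */
void execute(const HostOp *code, size_t n, State *s)
{
    for (size_t i = 0; i < n; i++) {
        code[i](s);
    }
}
```

For the guest program { G_INC, G_INC, G_DEC, G_INC, G_HALT }, translating
once and calling execute twice on the same buffer leaves insn_count at 8 and
acc at 4. Because the counter is bumped by the emitted code rather than by
the translator, re-running an already translated block is counted again,
which a counter incremented during translation would miss.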
* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-25 9:04 UTC
To: Dushyant Bansal; +Cc: qemu-devel

Again wow!! Is that really possible? Some sort of callback triggered at
every instruction execution?
> Yes, this mechanism works. I have written a code to count different
> kinds of instructions.
Great! That opens up a lot of possibilities!
> It exists in file qemu/target-i386/translate.c
Oops, right! I checked target-ppc/translate.c, as I need PowerPC as the
target. I wonder what function replaces it there.
> You are also talking about qemu source code provided here
> http://wiki.qemu.org/Download, right?
Yes, I am using this: http://wiki.qemu.org/download/qemu-0.13.0.tar.gz
> If you need, I can give the source code of counting implementation
> with some documentation.
> Hope this helps.
>
Wow, that would be awesome! I'd really appreciate it very much! Thank you! :)
Feel free to send it to my address! :)

Best regards!!
Stefano B.
* Re: [Qemu-devel] TCG flow vs dyngen
From: Edgar E. Iglesias @ 2011-01-25 9:05 UTC
To: Stefano Bonifazi; +Cc: Dushyant Bansal, qemu-devel

On Tue, Jan 25, 2011 at 10:04:39AM +0100, Stefano Bonifazi wrote:
> Again wow!! Is that really possible? Some sort of callback triggered at
> every instruction execution?
> > Yes, this mechanism works. I have written a code to count different
> > kinds of instructions.
> Great! that opens a lot of possibilities!.
> > It exists in file qemu/target-i386/translate.c
> Oops right! I checked target-ppc/translate.c as I need Power-PC as
> target.. I wonder what function replaces it there..
> > You are also talking about qemu source code provided here
> > http://wiki.qemu.org/Download, right?
> Yes I am using this http://wiki.qemu.org/download/qemu-0.13.0.tar.gz
> > If you need, I can give the source code of counting implementation
> > with some documentation.
> > Hope this helps.
> >
> Wow that would be awesome! I'd really appreciate it very much! Thank you! :)
> You are free of sending it to my address! :)

Hi,

If you are interested in instruction counting maybe you should take
a look at the -icount option as well.

Cheers
* Re: [Qemu-devel] TCG flow vs dyngen
From: Stefano Bonifazi @ 2011-01-25 9:28 UTC
To: Edgar E. Iglesias; +Cc: Dushyant Bansal, qemu-devel

On 01/25/2011 10:05 AM, Edgar E. Iglesias wrote:
> On Tue, Jan 25, 2011 at 10:04:39AM +0100, Stefano Bonifazi wrote:
>> Again wow!! Is that really possible? Some sort of callback triggered at
>> every instruction execution?
>>> Yes, this mechanism works. I have written a code to count different
>>> kinds of instructions.
>> Great! that opens a lot of possibilities!.
>>> It exists in file qemu/target-i386/translate.c
>> Oops right! I checked target-ppc/translate.c as I need Power-PC as
>> target.. I wonder what function replaces it there..
>>> You are also talking about qemu source code provided here
>>> http://wiki.qemu.org/Download, right?
>> Yes I am using this http://wiki.qemu.org/download/qemu-0.13.0.tar.gz
>>> If you need, I can give the source code of counting implementation
>>> with some documentation.
>>> Hope this helps.
>>>
>> Wow that would be awesome! I'd really appreciate it very much! Thank you! :)
>> You are free of sending it to my address! :)
> Hi,
>
> If you are interested in instruction counting maybe you should take
> a look at the -icount option as well.
>
> Cheers

Thank you! I already tried it long ago; it doesn't work with qemu-user.
If I remember correctly, its core was in files not used by qemu-user :(

Regards,
Stefano B.