[Fwd: Re: [Qemu-devel] [PATCH] SVM support]

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "J. Mayer" <l_indien@magic.fr>
To: qemu-devel@nongnu.org
Subject: [Fwd: Re: [Qemu-devel] [PATCH] SVM support]
Date: Tue, 18 Sep 2007 22:54:16 +0200	[thread overview]
Message-ID: <1190148856.14938.247.camel@rapid> (raw)

Forwarded, as requested.

-------- Forwarded Message --------
> From: Alexander Graf <agraf@suse.de>
> To: J.Mayer <l_indien@magic.fr>
> Subject: Re: [Qemu-devel] [PATCH] SVM support
> Date: Tue, 18 Sep 2007 13:51:26 +0200
> 
> On Sep 18, 2007, at 12:09 PM, J. Mayer wrote:
> 
> > On Tue, 2007-09-18 at 03:02 +0200, Alexander Graf wrote:
> >> Jocelyn Mayer wrote:
> >>> On Mon, 2007-09-17 at 12:02 +0200, Alexander Graf wrote:
> >>>
> >>>> J. Mayer wrote:
> >>>>
> >>>>> On Thu, 2007-09-13 at 17:27 +0200, Alexander Graf wrote:
> >>>>>
> >>>>>
> >>>>>> Thiemo Seufer wrote:
> >>>>>>
> >>>>>>
> >>>>>>> Alexander Graf wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> Thanks to Michael Peter who tried the patch on an x86 host I  
> >>>>>>>> found some
> >>>>>>>> compilation problems.
> >>>>>>>> This updated patch addresses these issues and thus should  
> >>>>>>>> compile on
> >>>>>>>> every platform for every target available.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>> [...]
> >>>>>
> >>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> Wow sorry about that, looks like I missed these.
> >>>>>>
> >>>>>>
> >>>>> Index: qemu-0.9.0.cvs/exec-all.h
> >>>>> ================================================================== 
> >>>>> =
> >>>>> --- qemu-0.9.0.cvs.orig/exec-all.h
> >>>>> +++ qemu-0.9.0.cvs/exec-all.h
> >>>>> @@ -166,6 +166,7 @@ static inline int tlb_set_page(CPUState
> >>>>>  typedef struct TranslationBlock {
> >>>>>      target_ulong pc;   /* simulated PC corresponding to this  
> >>>>> block (EIP
> >>>>> + CS base) */
> >>>>>      target_ulong cs_base; /* CS base for this block */
> >>>>> +    uint64_t intercept; /* SVM intercept vector */
> >>>>>      unsigned int flags; /* flags defining in which context the  
> >>>>> code was
> >>>>> generated */
> >>>>>      uint16_t size;      /* size of target code for this block  
> >>>>> (1 <=
> >>>>>                             size <= TARGET_PAGE_SIZE) */
> >>>>> Index: qemu-0.9.0.cvs/cpu-all.h
> >>>>> ================================================================== 
> >>>>> =
> >>>>> --- qemu-0.9.0.cvs.orig/cpu-all.h
> >>>>> +++ qemu-0.9.0.cvs/cpu-all.h
> >>>>> @@ -715,6 +715,7 @@ extern int code_copy_enabled;
> >>>>>  #define CPU_INTERRUPT_HALT   0x20 /* CPU halt wanted */
> >>>>>  #define CPU_INTERRUPT_SMI    0x40 /* (x86 only) SMI interrupt  
> >>>>> pending
> >>>>> */
> >>>>>  #define CPU_INTERRUPT_DEBUG  0x80 /* Debug event occured.  */
> >>>>> +#define CPU_INTERRUPT_VIRQ   0x100 /* virtual interrupt  
> >>>>> pending.  */
> >>>>>
> >>>>>  void cpu_interrupt(CPUState *s, int mask);
> >>>>>  void cpu_reset_interrupt(CPUState *env, int mask);
> >>>>>
> >>>>> Those two patches look ugly to me as target specific features  
> >>>>> should
> >>>>> never go in generic code or structures.
> >>>>> The CPU_INTERRUPT flags should just contain information about the
> >>>>> emulator behavior, thus CPU_INTERRUPT_TIMER, CPU_INTERRUPT_SMI  
> >>>>> are to be
> >>>>> removed. Target specific informations about the exception  
> >>>>> nature should
> >>>>> go in the CPUState structure... Then, adding a  
> >>>>> CPU_INTERRUPT_VIRQ seems
> >>>>> not a good idea at all: it's outside of the generic emulator  
> >>>>> scope and
> >>>>> pointless for most targets.
> >>>>>
> >>>>>
> >>>> I don't see any practical reason not to do it this way. We could  
> >>>> define
> >>>> a CPU_INTERRUPT_TARGET interrupt, that stores the information in  
> >>>> a the
> >>>> target specific CPUState, but this would hit performance  
> >>>> (marginally
> >>>> though) and does not improve the situation. I don't think that  
> >>>> we should
> >>>> should forcefully try to seperate targets where we are not even  
> >>>> close to
> >>>> running out of constants. If everyone on this list sees the  
> >>>> situation as
> >>>> Jocelyn does, I would be fine with writing a patch that puts target
> >>>> specific interrupts to the specific targets.
> >>>>
> >>>
> >>> I was to do the same to add some functionality to the PowerPC  
> >>> target,
> >>> long time ago, and Fabrice Bellard convinced me not to do and  
> >>> agreed it
> >>> has been a bad idea to add the target specific CPU_INTERRUPT_xxx  
> >>> flags.
> >>> Then I did manage the PowerPC target to use only generic
> >>> CPU_INTERRUPT_xxx and put all other target specific informations  
> >>> in the
> >>> CPUState structure. It absolutelly changes nothing in terms of
> >>> functionality nor performance. The only thing is that it makes the
> >>> generic parts simpler and could even allow the cpu_exec function to
> >>> contain almost no target specific code, which would really great  
> >>> imho.
> >>>
> >>
> >> I can give that a try :-). Sounds reasonable for me.
> >
> >> [Next message]
> >> Oh well I just thought about this a bit more and while stumbling  
> >> across
> >> CPU_INTERRUPT_FIQ which does the same thing one major problem came  
> >> to me
> >> on that one: Priority. Real interrupts have priority over virtual
> >> interrupts so the VIRQs have to be processed after HARD IRQs, whereas
> >> SMIs and NMIs have to be taken care of before the HARD IRQs. It  
> >> simply
> >> wouldn't work out to generalize the IRQs, just as it would not  
> >> work with
> >> the VIRQ, as it has to be handled as a real IRQ but without accessing
> >> the APIC which has to be done for HARD IRQs.
> >>
> >> I guess we're stuck with what's there now.
> >
> > Priorities are not an issue, you can take a look to the  
> > ppc_hw_interrupt function
> > to see how the PowerPC emulation code manages priorities and  
> > masking between
> > 10 different hardware interruption sources. The code could be  
> > better (I wanted
> > to preserve existing code) but it's a solution that actually works...
> > But looking better the x86 emulation code, I agree that if you  
> > don't want to change it
> > too much, you have to add a flag. The other solution (and best,  
> > imho) would be
> > to do the same as what has been done for PowerPC: just let the  
> > generic code in
> > the cpu_exec loop and move the x86 specific code somewhere in  
> > target-i386 code.
> > Going this way, I see no reason why the interruption code in the  
> > cpu_exec loop
> > could not finally become:
> >                 interrupt_request = env->interrupt_request;
> >                 if (__builtin_expect(interrupt_request, 0)) {
> >                     if (interrupt_request & CPU_INTERRUPT_DEBUG) {
> >                         env->interrupt_request &=  
> > ~CPU_INTERRUPT_DEBUG;
> >                         env->exception_index = EXCP_DEBUG;
> >                         cpu_loop_exit();
> >                     }
> >                     if (interrupt_request & CPU_INTERRUPT_HALT) {
> >                         env->interrupt_request &= ~CPU_INTERRUPT_HALT;
> >                         env->halted = 1;
> >                         env->exception_index = EXCP_HLT;
> >                         cpu_loop_exit();
> >                     }
> >                     if (interrupt_request & CPU_INTERRUPT_HARD) {
> >                         hw_interrupt(env);
> >                         if (env->pending_interrupts == 0)
> >                             env->interrupt_request &=  
> > ~CPU_INTERRUPT_HARD;
> > #if defined(__sparc__) && !defined(HOST_SOLARIS)
> >                         tmp_T0 = 0;
> > #else
> >                         T0 = 0;
> > #endif
> >                     }
> >                    /* Don't use the cached interupt_request value,
> >                       do_interrupt may have updated the EXITTB  
> > flag. */
> >                     if (interrupt_request & CPU_INTERRUPT_EXITTB) {
> >                         env->interrupt_request &=  
> > ~CPU_INTERRUPT_EXITTB;
> >                         /* ensure that no TB jump will be modified as
> >                            the program flow was changed */
> > #if defined(__sparc__) && !defined(HOST_SOLARIS)
> >                         tmp_T0 = 0;
> > #else
> >                         T0 = 0;
> > #endif
> >                     }
> >                     if (interrupt_request & CPU_INTERRUPT_EXIT) {
> >                         env->interrupt_request &= ~CPU_INTERRUPT_EXIT;
> >                         env->exception_index = EXCP_INTERRUPT;
> >                         cpu_loop_exit();
> >                     }
> >                 }
> > All the targets specific tricks could be done in the hw_interrupt  
> > routine. And the generic code
> > would become much more readable. But this needs some works (not so  
> > much) and intensive tests...
> > And I guess nobody feels like taking this risk right now ;-)
> > But I think this will have to be done someday...
> 
> I definitely agree with you on that one. The current interrupt  
> handlers are filled with #ifdefs and not easily readable. This is a  
> different task though and maybe I will get the time to do this. I  
> believe it should be done "all archs at once" though, so either we  
> use the current method (adding non-gerneric interrupts) for now and  
> tidy up everything later on or tidy up the code now and port the svm  
> patch on that. As I don't see the cleanup coming in the near future  
> I'd vote for including it now and have it reworked at once when  
> someone feels dare enough to clean up the whole generic interrupt  
> handler.
> 
> Nevertheless I really do like the idea of having a cleaned up  
> interrupt handler. This is a different patch though and is rather a  
> TODO than anything that should be mangled with the svm patch.
> 
> >
> >>>
> >>>>> For the same reason, the intercept field in the TB structure  
> >>>>> seems not
> >>>>> acceptable, as TB specific target informations are already to  
> >>>>> be stored
> >>>>> in the flags field. As intercept seems only to be a bit field,  
> >>>>> it should
> >>>>> go, in a way or another, in tb flags. And as it seems that some
> >>>>>
> >>>>>
> >>>> TB flags is a 32-bit bitfield, where several bits are already used.
> >>>> Currently SVM supports 45 intercepts and each combination can  
> >>>> generate a
> >>>> different TB, as we do not want to reduce performance for the
> >>>> non-intercepted case. Think of the intercept field as flags2,  
> >>>> that could
> >>>> be used by other virtualization implementations on other  
> >>>> architectures
> >>>> too (though I do not know of any)
> >>>>
> >>>
> >>> So, why not make it generic and extend the flag field to as many  
> >>> bits as
> >>> you need ? Intercept is x86 specific and has no meaning outside  
> >>> of this
> >>> scope. Having more bits for flags is generic and may be useful  
> >>> for some
> >>> other targets... The easy way is to convert flag to 64 bits, if  
> >>> you have
> >>> enough bits. Another way is to make it a table, but this may need
> >>> changes in all targets code...
> >>>
> >>
> >> I think it might be quite a bad idea to extend flags until everything
> >> fits in there. I'm planning on implementing VMX as well and there are
> >> intercepts too so the meaning of the different fields change and I  
> >> would
> >> not want that happening with the normal flags vector. What would you
> >> think of an #ifdef TARGET_I386? This way intercepts are x86 only and
> >> don't interfere with any other target architecture, but are still
> >> seperated from the flags (which is semantically correct as far as it
> >> goes for me).
> >
> > I don't consider as a problem that one bit in the TB flags field can
> > have different meanings.
> > This field is something opaque from the generic code. I actually have
> > the same thing in
> > the PowerPC emulation (even if it's not fully done for now): TB flags
> > are just an
> > selection of the Machine State Register bits. Some of those bits can
> > have different
> > meanings depending on the PowerPC implementation (ie CPU model).  
> > But the
> > only thing important is to know if the current CPUState flags are  
> > equal
> > to the
> > TB flags, not the actual meaning of the bits. As the MSR is 32 bits on
> > 32 bits
> > PowerPC implementations, I have no hesitation to use those bits  
> > directly
> > even if their actual meaning, from the CPU "point of view" changes.
> > Then, imho, you should not hesitate to put those intercept bits in a
> > generic field...
> 
> 
> Ok, I will try to shift the intercepts in an uint_64 flags variable  
> in the TB.
> 
> >
> >>>
> >>>>> interceptions are related with functions implemented in helpers  
> >>>>> (not
> >>>>> micro-ops), you'd better check the interception in the helper at
> >>>>> runtime, which would add no visible overhead (as calling a  
> >>>>> helper is
> >>>>> slow compared to direct micro-ops execution), then you would  
> >>>>> not need to
> >>>>> store those infos in the TB structure. This may even make the  
> >>>>> emulation
> >>>>>
> >>>>>
> >>>> This was the case in an earlier version of the patch. The current
> >>>> implementation you see is an optimization to make it running  
> >>>> faster, as
> >>>> there is (nearly) no need for runtime detection.
> >>>>
> >>>
> >>> For all interceptions related to operations in helpers, it's not a
> >>> actual optimisation. Things done in helpers are slow and not  
> >>> widely used
> >>> (compared to the translated code, I mean), then adding a single  
> >>> test for
> >>> interception in a helper may not hurt global performances. You  
> >>> may need
> >>> TB flags only for operations you want to intercept that would be  
> >>> inlined
> >>> in the translated code, then maybe reduce the number of bits needed
> >>> without decreasing performanced. But, not knowing a lot about x86 VM
> >>> specification I may be wrong...
> >>>
> >>>
> >>
> >> I feel really uncomfortable with supporting hundreds of different  
> >> ways
> >> of interception. It took me days to take all the intercept checks  
> >> out of
> >> the helpers into the translate.c, as it is way easier to maintain and
> >> read that way. I do not like the idea of converting the intercept  
> >> vector
> >> to something internal. It's quite nice to be able to use the very  
> >> same
> >> constants as kvm does. This makes debugging really easy and you don't
> >> have to define everything twice. Additionally this way you can  
> >> just set
> >> a variable without calculating anything, which does come in handy  
> >> as well.
> >>
> >> Basically the reason I really want to have everyting in  
> >> translate.c is
> >> simplicity though.
> >
> > OK, I understand that using the same code for all interceptions can  
> > make
> > your coding
> > far easier...
> >
> >>>>> run faster has you won't fill the TB cache with multiple  
> >>>>> translation of
> >>>>> the same code each time the env->intercept changes, thus have  
> >>>>> chance to
> >>>>> avoid many TB caches flushes.
> >>>>>
> >>>>>
> >>>> This is actually intended, as I believe the emulated machine  
> >>>> should run
> >>>> as fast as before when not using any interceptions
> >>>>
> >>>
> >>> It depends if those flags are to be changed often or not, which  
> >>> depends
> >>> on the hypervisor software in use I guess. If the flags never  
> >>> change,
> >>> there'll be no TB flush, I agree...
> >>>
> >>
> >> Usually (at least that's the case with kvm) you have two different  
> >> sets
> >> of intercept flags. One for the real machine and one for the  
> >> virtualized
> >> one. There is only one minor situation where that might change but
> >> that's not worth mentioning here. Nevertheless TBs point to physical
> >> memory  (afaik) so since the virtual kernel is on a different page  
> >> than
> >> the real one, we could not reuse TBs anyway so having the intercept
> >> information in the TB only helps in finding out which intercepts to
> >> check for as only a minority of the possible intercepts actually  
> >> get used.
> >
> > So as far as the hypervisor do not provide shared routines to be  
> > used by
> > the OS,
> > you should never have any "collisison" between translated blocks in
> > different environments ?
> > I guess you need a complete TB flush when switching between the
> > virtualized environments.
> > Then, OK, the interception flags should never lead to fill the TB  
> > cache
> > with multiple translations
> > of the same blocks...
> 
> Exactly. The only possible "collision" would be two virtual machines  
> that share code on the same physical page somehow.
> 
> Cheers,
> 
> Alex
-- 
J. Mayer <l_indien@magic.fr>
Never organized

next             reply	other threads:[~2007-09-18 20:54 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-18 20:54 J. Mayer [this message]
2007-09-18 23:05 ` [Fwd: Re: [Qemu-devel] [PATCH] SVM support] J. Mayer
2007-09-18 23:28   ` Paul Brook
2007-09-19 10:55     ` Alexander Graf
2007-09-19 10:56     ` Alexander Graf
2007-09-19 14:39       ` Jocelyn Mayer
2007-09-19 14:55         ` Alexander Graf
2007-09-19 15:35         ` Paul Brook
2007-09-19 18:24           ` J. Mayer
2007-09-19 10:59   ` Alexander Graf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1190148856.14938.247.camel@rapid \
    --to=l_indien@magic.fr \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.