Re: [Qemu-devel] RFC: Code fetch optimisation

From: "J. Mayer" <l_indien@magic.fr>
To: Paul Brook <paul@codesourcery.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] RFC: Code fetch optimisation
Date: Mon, 15 Oct 2007 23:30:33 +0200
Message-ID: <1192483833.9976.456.camel@rapid>
In-Reply-To: <200710151701.17822.paul@codesourcery.com>

On Mon, 2007-10-15 at 17:01 +0100, Paul Brook wrote:
> > > > +    unsigned long phys_pc;
> > > > +    unsigned long phys_pc_start;
> > >
> > > These are ram offsets, not physical addresses. I recommend naming them as
> > > such to avoid confusion.
> >
> > Well, those are host addresses. Fabrice even suggested that I replace
> > them with void * to prevent confusion, but I kept using unsigned long
> > because the _p functions API does not use pointers. As those values are
> > defined as phys_ram_base + offset, those are likely to be host addresses,
> > not RAM offsets, and are used directly to dereference host pointers in
> > the ldxxx_p functions. Did I miss something?
> 
> You are correct, they are host addresses. I still think calling them phys_pc 
> is confusing. It took me a while to convince myself that "unsigned long" was 
> an appropriate type (ignoring 64-bit windows hosts for now).
> 
> How about host_pc?

It's OK for me.

> > > > +    /* Avoid softmmu access on next load */
> > > > +    /* XXX: dont: phys PC is not correct anymore
> > > > +     *      We could call get_phys_addr_code(env, pc); and remove the else
> > > > +     *      condition, here.
> > > > +     */
> > > > +    //*start_pc = phys_pc;
> > >
> > > The commented out code is completely bogus, please remove it. The comment
> > > is also somewhat misleading/incorrect. The else would still be required
> > > for accesses that span a page boundary.
> >
> > I guess trying to optimize this case by retrieving the physical address
> > would not gain much, as in fact only the last translated
> > instruction of a TB (so only a few code loads) may hit this case.
> 
> VLE targets (x86, m68k) can translate almost a full page of instructions, and 
> a page boundary can be anywhere within that block. Once we've spanned 
> multiple pages there's no point stopping translation immediately. We may as 
> well translate as many instructions as we can on the second page.
> 
> I'd guess most TB are much smaller than a page, so on average only a few 
> instructions are going to come after the page boundary.

This leads me to another thought. For fixed-length encoding targets,
we always stop translation when reaching a page boundary. If we keep
using the current model and optimize the slow case, it would be
possible to stop only when translation crosses two page boundaries,
which seems unlikely to happen. If, on the other hand, we
keep the current behavior, we could remove the second page_addr element
from the tb structure and maybe optimize parts of the tb management and
invalidation code.
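
To make the two stopping policies concrete, here is a minimal sketch. It
assumes 4 KiB pages and 32-bit guest addresses, and the names
page_crossings and must_stop are made up for illustration; they are not
existing QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS 12   /* assumption: 4 KiB guest pages */

/* Number of page boundaries between the start of the TB and the last
 * byte of the instruction located at pc. */
static inline unsigned page_crossings(uint32_t pc_start, uint32_t pc,
                                      unsigned insn_len)
{
    assert(insn_len > 0);
    uint32_t last_byte = pc + insn_len - 1;
    return (last_byte >> SKETCH_PAGE_BITS) - (pc_start >> SKETCH_PAGE_BITS);
}

/* max_crossings == 0: stop as soon as a page boundary is reached.
 * max_crossings == 1: stop only when a second boundary would be crossed. */
static inline int must_stop(uint32_t pc_start, uint32_t pc,
                            unsigned insn_len, unsigned max_crossings)
{
    return page_crossings(pc_start, pc, insn_len) > max_crossings;
}
```

With max_crossings set to 0 a TB never needs a second page address, which
is the case where the second page_addr element could go away.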

> > I'd like to keep a comment here to show that it may not be a good idea
> > (or may not be as simple as it seems at first sight) to try to do more
> > optimisation here, but you're right this comment is not correct.
> 
> Agreed.
> 
> > > The code itself looks ok, though I'd be surprised if it made a
> > > significant difference. We're always going to hit the fast-path TLB
> > > lookup case anyway.
> >
> > It seems that the generated code for the code fetch is much more
> > efficient than the code we get when using the softmmu
> > routines. But it's true we do not get any significant performance boost.
> > As it was previously mentioned, the idea of the patch is more a 'don't
> > do unneeded things during code translation' than a great performance
> > improvement.
> 
> OTOH it does make the code more complicated. I'm agnostic about whether 
> this patch should be applied.

I agree that this proposal was more an answer to a challenging idea I
received than a real need.
The worst thing in this patch, imho, is that you need to update two
values each time you want to change the PC. This is likely to introduce
bugs when someone forgets to update one of the two. I was thinking of
hiding the pc, host_pc and host_pc_start (and maybe also pc_start) in a
structure and adding inline helpers:
* get_pc would return the current virtual PC, as needed by the jump and
relative memory access functions.
* get_tb_len would return the difference between the virtual PC and the
virtual pc_start, as is done at the end of the gen_intermediate_code
functions.
* move_pc would add an offset to the virtual and the physical PC. This
has to be target dependent, due to the special case for Sparc.
* update_phys_pc would be a no-op for most targets, except for Sparc where
the phys_pc needs to be adjusted after the translation of each target
instruction.
and maybe more, if needed.
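
A rough sketch of what such a structure and its helpers might look like
follows. Everything here is hypothetical: the TranslationPC type, the
init_pc helper and the 32-bit target_ulong are assumptions made for the
example, and only the generic (non-Sparc) variants are shown.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t target_ulong;      /* assumption: 32-bit guest addresses */

/* Hypothetical container for the values that must stay in sync. */
typedef struct TranslationPC {
    target_ulong  pc;               /* current guest virtual PC */
    target_ulong  pc_start;         /* virtual PC of the first insn of the TB */
    unsigned long host_pc;          /* host address matching pc */
    unsigned long host_pc_start;    /* host address matching pc_start */
} TranslationPC;

static inline void init_pc(TranslationPC *t, target_ulong pc,
                           unsigned long host_pc)
{
    t->pc = t->pc_start = pc;
    t->host_pc = t->host_pc_start = host_pc;
}

static inline target_ulong get_pc(const TranslationPC *t)
{
    return t->pc;
}

static inline target_ulong get_tb_len(const TranslationPC *t)
{
    assert(t->pc >= t->pc_start);
    return t->pc - t->pc_start;
}

/* Generic version: both PCs always move together, so neither can be
 * forgotten. A target with special needs would supply its own variant. */
static inline void move_pc(TranslationPC *t, target_ulong offset)
{
    t->pc += offset;
    t->host_pc += offset;
}

/* No-op in this generic sketch. */
static inline void update_phys_pc(TranslationPC *t)
{
    (void)t;
}
```

Translation code would then call move_pc(&t, 4) after decoding a 4-byte
instruction instead of incrementing two variables by hand.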
This structure could also contain target-specific information. To
address the segment limit check problem reported by Fabrice Bellard,
we could for example add the address of the next segment limit for the x86
target and add a target-specific check at the start of the ldx_code_p
function. But I don't know much about the subtleties of segmentation on x86,
so this idea may not be appropriate to solve this problem.
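
Purely as an illustration of the kind of check meant here, the sketch
below bounds a code load by a stored limit. It assumes the limit can be
expressed as a host address, which may well not hold for real x86
segmentation; X86FetchCtx and ldl_code_checked are invented names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fetch context, not an actual QEMU structure. */
typedef struct X86FetchCtx {
    const uint8_t *host_pc;     /* host address of the next code byte */
    const uint8_t *host_limit;  /* one past the last byte that may be fetched */
} X86FetchCtx;

/* Load 32 bits of code in host byte order. Returns 0 on success, or -1
 * if the load would run past the limit (where a real translator would
 * have to arrange for a guest fault instead). */
static inline int ldl_code_checked(X86FetchCtx *c, uint32_t *out)
{
    assert(c != NULL && out != NULL);
    if (c->host_pc > c->host_limit ||
        (size_t)(c->host_limit - c->host_pc) < sizeof(*out)) {
        return -1;
    }
    memcpy(out, c->host_pc, sizeof(*out));
    c->host_pc += sizeof(*out);
    return 0;
}
```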

-- 
J. Mayer <l_indien@magic.fr>
Never organized
