xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Peter Maloney <peter.maloney@brockmann-consult.de>
To: Andres Lagar-Cavilla <andreslc@gridcentric.ca>
Cc: Tim Deegan <tim@xen.org>,
	Andres Lagar-Cavilla <andres@lagarcavilla.org>,
	xen-devel@lists.xen.org
Subject: Re: xen-unstable, winxp32 very poor performance on AMD FX-8150, I bisected and changeset is 24770:7f79475d3de7
Date: Tue, 13 Nov 2012 14:17:01 +0100	[thread overview]
Message-ID: <50A2484D.309@brockmann-consult.de> (raw)
In-Reply-To: <E9577EEC-5F2E-4DFB-913F-C470D77CCF5C@gridcentric.ca>

On 2012-11-01 18:28, Andres Lagar-Cavilla wrote:
> On Nov 1, 2012, at 1:00 PM, Tim Deegan <tim@xen.org> wrote:
>
>> Hi,
>>
>> At 14:59 +0100 on 22 Oct (1350917960), Tim Deegan wrote:
>>> At 19:21 +0200 on 20 Oct (1350760876), Peter Maloney wrote:
>>>> The change was 8 months ago
>>>>
>>>> changeset:   24770:7f79475d3de7
>>>> user:        Andres Lagar-Cavilla <andres@lagarcavilla.org>
>>>> date:        Fri Feb 10 16:07:07 2012 +0000
>>>> summary:     x86/mm: Make p2m lookups fully synchronized wrt modifications
>>> This change was bad for performnace across the board and most of it has
>>> since been either reverted or amended, but clearly we missed something
>>> here. 
>>>
>>> It's interesting that Win8 isn't slowed down.  I wonder whether that's to
>>> do with the way it drives the VGA card -- IIRC it uses a generic VESA
>>> driver rather than a Cirrus one. 
>> In fact this is to do with the APIC.  On my test system, a busy 2-vcpu
>> VM is making about 300k/s accesses to the APIC TPR.  These accesses are
>> all trapped and emulated by Xen, and that emulation has got more
>> expensive as part of this change.
>>
>> Later Windows OSes have a feature called 'lazy IRQL' which makes those
>> accesses go away, but sadly that's not been done for WinXP.  On modern
>> Intel CPUs, the hardware acceleration for TPR accesses works for XP; on
>> AMD it requires the OS to use 'MOV reg32, CR8' to access the TPR instead
>> of MMIO, which XP is clearly not doing. :(
>>
>> Peter: if you have the option, you might find that installing the PV
>> drivers that ship with Citrix XenServer 6.0 makes things work better.
>>
>> Andres: even though this load of APIC emulations is pretty extreme, it's
>> surprising that the VM runs faster on shadow pagetables!  Any ideas for
>> where this slowdown is coming from?
> Not any immediate ideas without profiling.
>
> However, most callers of hvmemul_do_io pass a stub zero ram_gpa address. We might be madly hitting the p2m locks for no reason there.
>
> How about the following patch, Peter, Tim?

Thanks,
I'll give it a try sometime this week I guess.

> diff -r 5171750d133e xen/arch/x86/hvm/emulate.c
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -60,24 +60,28 @@ static int hvmemul_do_io(
>      ioreq_t *p = get_ioreq(curr);
>      unsigned long ram_gfn = paddr_to_pfn(ram_gpa);
>      p2m_type_t p2mt;
> -    struct page_info *ram_page;
> +    struct page_info *ram_page = NULL;
>      int rc;
>  
>      /* Check for paged out page */
> -    ram_page = get_page_from_gfn(curr->domain, ram_gfn, &p2mt, P2M_UNSHARE);
> -    if ( p2m_is_paging(p2mt) )
> +    if ( ram_gpa != INVALID_MFN )
>      {
> -        if ( ram_page )
> -            put_page(ram_page);
> -        p2m_mem_paging_populate(curr->domain, ram_gfn);
> -        return X86EMUL_RETRY;
> -    }
> -    if ( p2m_is_shared(p2mt) )
> -    {
> -        if ( ram_page )
> -            put_page(ram_page);
> -        return X86EMUL_RETRY;
> -    }
> +        ram_page = get_page_from_gfn(curr->domain, ram_gfn, &p2mt, P2M_UNSHARE);
> +        if ( p2m_is_paging(p2mt) )
> +        {
> +            if ( ram_page )
> +                put_page(ram_page);
> +            p2m_mem_paging_populate(curr->domain, ram_gfn);
> +            return X86EMUL_RETRY;
> +        }
> +        if ( p2m_is_shared(p2mt) )
> +        {
> +            if ( ram_page )
> +                put_page(ram_page);
> +            return X86EMUL_RETRY;
> +        }
> +    } else
> +        value = 0; /* for pvalue */
>  
>      /*
>       * Weird-sized accesses have undefined behaviour: we discard writes
> @@ -455,7 +459,7 @@ static int __hvmemul_read(
>              return X86EMUL_UNHANDLEABLE;
>          gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
>          if ( (off + bytes) <= PAGE_SIZE )
> -            return hvmemul_do_mmio(gpa, &reps, bytes, 0,
> +            return hvmemul_do_mmio(gpa, &reps, bytes, INVALID_MFN,
>                                     IOREQ_READ, 0, p_data);
>      }
>  
> @@ -480,7 +484,8 @@ static int __hvmemul_read(
>              addr, &gpa, bytes, &reps, pfec, hvmemul_ctxt);
>          if ( rc != X86EMUL_OKAY )
>              return rc;
> -        return hvmemul_do_mmio(gpa, &reps, bytes, 0, IOREQ_READ, 0, p_data);
> +        return hvmemul_do_mmio(gpa, &reps, bytes, INVALID_MFN,
> +                                IOREQ_READ, 0, p_data);
>      case HVMCOPY_gfn_paged_out:
>          return X86EMUL_RETRY;
>      case HVMCOPY_gfn_shared:
> @@ -552,7 +557,7 @@ static int hvmemul_write(
>          unsigned int off = addr & (PAGE_SIZE - 1);
>          gpa = (((paddr_t)vio->mmio_gpfn << PAGE_SHIFT) | off);
>          if ( (off + bytes) <= PAGE_SIZE )
> -            return hvmemul_do_mmio(gpa, &reps, bytes, 0,
> +            return hvmemul_do_mmio(gpa, &reps, bytes, INVALID_MFN,
>                                     IOREQ_WRITE, 0, p_data);
>      }
>  
> @@ -573,7 +578,7 @@ static int hvmemul_write(
>              addr, &gpa, bytes, &reps, pfec, hvmemul_ctxt);
>          if ( rc != X86EMUL_OKAY )
>              return rc;
> -        return hvmemul_do_mmio(gpa, &reps, bytes, 0,
> +        return hvmemul_do_mmio(gpa, &reps, bytes, INVALID_MFN,
>                                 IOREQ_WRITE, 0, p_data);
>      case HVMCOPY_gfn_paged_out:
>          return X86EMUL_RETRY;
> @@ -804,7 +809,7 @@ static int hvmemul_read_io(
>  {
>      unsigned long reps = 1;
>      *val = 0;
> -    return hvmemul_do_pio(port, &reps, bytes, 0, IOREQ_READ, 0, val);
> +    return hvmemul_do_pio(port, &reps, bytes, INVALID_MFN, IOREQ_READ, 0, val);
>  }
>  
>  static int hvmemul_write_io(
> @@ -814,7 +819,7 @@ static int hvmemul_write_io(
>      struct x86_emulate_ctxt *ctxt)
>  {
>      unsigned long reps = 1;
> -    return hvmemul_do_pio(port, &reps, bytes, 0, IOREQ_WRITE, 0, &val);
> +    return hvmemul_do_pio(port, &reps, bytes, INVALID_MFN, IOREQ_WRITE, 0, &val);
>  }
>  
>  static int hvmemul_read_cr(
> diff -r 5171750d133e xen/arch/x86/hvm/io.c
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -231,7 +231,7 @@ int handle_pio(uint16_t port, int size, 
>      if ( dir == IOREQ_WRITE )
>          data = guest_cpu_user_regs()->eax;
>  
> -    rc = hvmemul_do_pio(port, &reps, size, 0, dir, 0, &data);
> +    rc = hvmemul_do_pio(port, &reps, size, INVALID_MFN, dir, 0, &data);
>  
>      switch ( rc )
>      {
>
>

  reply	other threads:[~2012-11-13 13:17 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-20 17:21 xen-unstable, winxp32 very poor performance on AMD FX-8150, I bisected and changeset is 24770:7f79475d3de7 Peter Maloney
2012-10-20 18:40 ` Peter Maloney
2012-10-22 13:56 ` Andres Lagar-Cavilla
2012-10-22 13:59 ` Tim Deegan
2012-10-23 22:17   ` Peter Maloney
2012-11-01 17:00   ` Tim Deegan
2012-11-01 17:28     ` Andres Lagar-Cavilla
2012-11-13 13:17       ` Peter Maloney [this message]
2012-11-22 18:54         ` Peter Maloney
2013-01-12 15:25           ` Peter Maloney
2013-01-17 20:57             ` Pasi Kärkkäinen
2013-01-18 14:22               ` George Dunlap
2013-01-18 14:40                 ` Andres Lagar-Cavilla
2013-01-21 12:07                   ` George Dunlap
2013-01-18 14:30             ` George Dunlap
2013-01-26 12:30               ` Peter Maloney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50A2484D.309@brockmann-consult.de \
    --to=peter.maloney@brockmann-consult.de \
    --cc=andres@lagarcavilla.org \
    --cc=andreslc@gridcentric.ca \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).