[RFC PATCH 0/2] ASID: Flush by ASID

* [RFC PATCH 0/2] ASID: Flush by ASID
@ 2011-01-11 17:55 Wei Wang2
  2011-01-12 10:17 ` Tim Deegan
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Wang2 @ 2011-01-11 17:55 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

Future AMD SVM hardware supports a new feature called flush by ASID. It 
allows the CPU to flush the TLB entries associated with the ASID assigned to a 
guest VM, so the hypervisor doesn't have to assign a new ASID in order to flush 
a guest VCPU's TLB entries.  Please review it.
Thanks,
Wei

Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Wang <wei.wang2@amd.com>
--
Advanced Micro Devices GmbH
Sitz: Dornach, Gemeinde Aschheim, 
Landkreis München Registergericht München, 
HRB Nr. 43632
WEEE-Reg-Nr: DE 12919551
Geschäftsführer:
Alberto Bozzo, Andrew Bowd
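
The difference between the two ways of flushing a guest can be sketched in a
few lines of Python. This is a toy model, not the Xen patch: `Vcpu` and
`flush_guest_tlb` are hypothetical names invented for illustration, and the
TLB_CONTROL values are the VMCB encodings as described in the AMD64
Architecture Programmer's Manual (treat them as an assumption and check the
manual before relying on them).

```python
from dataclasses import dataclass

# VMCB TLB_CONTROL encodings (assumed from the AMD manual, see note above).
TLB_CONTROL_DO_NOTHING = 0      # keep all TLB entries on VMRUN
TLB_CONTROL_FLUSH_ALL_ASID = 1  # flush the whole TLB, every ASID
TLB_CONTROL_FLUSH_ASID = 3      # flush only this guest's entries


@dataclass
class Vcpu:
    """Hypothetical per-VCPU state: its ASID and the pending TLB control."""
    asid: int
    tlb_control: int = TLB_CONTROL_DO_NOTHING


def flush_guest_tlb(vcpu, next_free_asid, has_flush_by_asid):
    """Arrange for vcpu's TLB entries to be dropped; return the next free ASID."""
    if has_flush_by_asid:
        # New scheme: keep the ASID and ask the CPU to flush just its entries.
        vcpu.tlb_control = TLB_CONTROL_FLUSH_ASID
        return next_free_asid
    # Old scheme: abandon the stale entries by moving to a fresh ASID.
    vcpu.asid = next_free_asid
    return next_free_asid + 1
```

With flush by ASID the VCPU keeps its ASID and no new one is consumed; without
it every flush uses up one ASID from a finite pool.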


* Re: [RFC PATCH 0/2] ASID: Flush by ASID
  2011-01-11 17:55 [RFC PATCH 0/2] ASID: Flush by ASID Wei Wang2
@ 2011-01-12 10:17 ` Tim Deegan
  2011-01-12 12:41   ` Wei Wang2
  0 siblings, 1 reply; 7+ messages in thread
From: Tim Deegan @ 2011-01-12 10:17 UTC (permalink / raw)
  To: Wei Wang2; +Cc: xen-devel@lists.xensource.com

At 17:55 +0000 on 11 Jan (1294768552), Wei Wang2 wrote:
> Future AMD SVM hardware supports a new feature called flush by ASID. It 
> allows the CPU to flush the TLB entries associated with the ASID assigned to a 
> guest VM, so the hypervisor doesn't have to assign a new ASID in order to flush 
> a guest VCPU's TLB entries.  Please review it.

What advantage does the new system have?  Intuitively it seems like it
might be a tiny bit fairer and a tiny bit faster (by explicitly flushing
instead of relying on LRU) but I'm not convinced that it will be visible
in macro-benchmarks.  Have you measured it?

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


* Re: [RFC PATCH 0/2] ASID: Flush by ASID
  2011-01-12 10:17 ` Tim Deegan
@ 2011-01-12 12:41   ` Wei Wang2
  2011-01-12 12:48     ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Wang2 @ 2011-01-12 12:41 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel@lists.xensource.com

Hi Tim,
Flush by ASID provides more flexible control of TLB flushing. The main 
advantage is that it lets the hypervisor flush the tagged TLB selectively. 
Using this feature, the hypervisor can flush the TLB entries associated with a 
guest VM directly instead of allocating a new ASID. Fewer ASID allocations 
also mean fewer whole-TLB flushes.

So far, we have not measured a drastic performance improvement in testing with 
kernbench and X11perf. In fact, we found that reducing the TLB flushes that 
accompany vmrun does not improve performance very much. Last week we sent out 
a patch to optimize hvm_flush_guest_tlbs, which eliminates over 90% of the TLB 
flushes on vmrun, and even with it we cannot see a significant speedup. Maybe 
the latency of vmrun is so large that the overhead of the TLB flush is 
negligible?

Thanks,
Wei
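
The link between ASID allocation and whole-TLB flushes can be illustrated with
a small counting model in Python. It is a simplified sketch under stated
assumptions, not Xen's allocator: a single VCPU, ASID 0 reserved for the host,
and a whole-TLB flush whenever the pool of fresh ASIDs runs out. The function
name and parameters are hypothetical.

```python
def count_full_flushes(flush_requests, num_asids, has_flush_by_asid):
    """Count whole-TLB flushes caused by running out of fresh ASIDs.

    Model: the VCPU starts on ASID 1. Without flush by ASID, every guest
    flush request takes the next fresh ASID; when none is left, the whole
    TLB is flushed and numbering restarts at 1. With flush by ASID, the
    VCPU keeps its ASID, so the pool is never exhausted.
    """
    max_asid = num_asids - 1   # ASID 0 is assumed to belong to the host
    next_asid = 2              # ASID 1 is already in use by the VCPU
    full_flushes = 0
    for _ in range(flush_requests):
        if has_flush_by_asid:
            continue           # flush only this ASID's entries, keep the ASID
        if next_asid > max_asid:
            full_flushes += 1  # pool exhausted: flush everything, start over
            next_asid = 1
        next_asid += 1         # hand out one fresh ASID
    return full_flushes
```

For example, with 64 ASIDs and 1000 flush requests the model reports 15
whole-TLB flushes without flush by ASID and none with it. Whether that
difference is visible in a benchmark is a separate question.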


* Re: [RFC PATCH 0/2] ASID: Flush by ASID
  2011-01-12 12:41   ` Wei Wang2
@ 2011-01-12 12:48     ` Keir Fraser
  2011-01-12 13:23       ` Wei Wang2
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2011-01-12 12:48 UTC (permalink / raw)
  To: Wei Wang2, Tim Deegan; +Cc: xen-devel@lists.xensource.com

It begs the question whether it's worth complicating code for an
optimisation with no measurable benefit, doesn't it?

 -- Keir


* Re: [RFC PATCH 0/2] ASID: Flush by ASID
  2011-01-12 12:48     ` Keir Fraser
@ 2011-01-12 13:23       ` Wei Wang2
  2011-01-12 13:38         ` Keir Fraser
  0 siblings, 1 reply; 7+ messages in thread
From: Wei Wang2 @ 2011-01-12 13:23 UTC (permalink / raw)
  To: xen-devel; +Cc: Keir Fraser, Tim Deegan

Keir,
Sure, that is a good question :) . 
Actually, finding a benchmark that scales well with ASIDs is not easy. 
A benchmark like kernbench, which has a large working set, will fill all TLB 
entries under its own ASID. In this case, even disabling ASIDs is not harmful.
We only tested a single guest with multiple VCPUs. Maybe using multiple guests 
or other benchmarks will show a better result?
Thanks,
Wei



* Re: [RFC PATCH 0/2] ASID: Flush by ASID
  2011-01-12 13:23       ` Wei Wang2
@ 2011-01-12 13:38         ` Keir Fraser
  2011-01-12 17:22           ` Wei Huang
  0 siblings, 1 reply; 7+ messages in thread
From: Keir Fraser @ 2011-01-12 13:38 UTC (permalink / raw)
  To: Wei Wang2, xen-devel; +Cc: Tim Deegan

Our gut feeling has always been that the major benefit is having two ASIDs,
allowing one for host and one for current guest and thus avoiding TLB flush
on every VM entry/exit. Unless your TLB is very large, or guest vcpus run
only for very short periods, it's likely that a heavy guest workload
displaces all other ASIDs (guest VCPUs) from the TLB anyway.

We're interested in benchmark numbers that can disprove the gut feeling, of
course!

 -- Keir
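
The displacement effect described above can be reproduced with a toy model in
Python. It assumes a fully associative, ASID-tagged TLB with strict LRU
replacement, which is a simplification of real hardware; the function name,
the ASID numbers and the page counts are all hypothetical.

```python
from collections import OrderedDict


def surviving_entries(tlb_size, idle_pages, busy_pages):
    """Return how many of the idle guest's TLB entries survive.

    The idle guest (ASID 1) touches idle_pages pages, then the busy guest
    (ASID 2) touches busy_pages distinct pages in the same TLB.
    """
    tlb = OrderedDict()  # key: (asid, page), ordered oldest to newest use

    def touch(asid, page):
        key = (asid, page)
        if key in tlb:
            tlb.move_to_end(key)         # hit: mark as most recently used
            return
        if len(tlb) >= tlb_size:
            tlb.popitem(last=False)      # miss on a full TLB: evict the LRU entry
        tlb[key] = True

    for page in range(idle_pages):
        touch(1, page)
    for page in range(busy_pages):
        touch(2, page)
    return sum(1 for asid, _ in tlb if asid == 1)
```

In this model, once the busy guest's working set reaches the TLB size, none of
the other guest's entries are left, so keeping its ASID across the switch buys
nothing. Only when the busy working set is small do the other entries survive.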


* Re: [RFC PATCH 0/2] ASID: Flush by ASID
  2011-01-12 13:38         ` Keir Fraser
@ 2011-01-12 17:22           ` Wei Huang
  0 siblings, 0 replies; 7+ messages in thread
From: Wei Huang @ 2011-01-12 17:22 UTC (permalink / raw)
  To: Keir Fraser; +Cc: Wang2, Wei, xen-devel@lists.xensource.com, Tim Deegan

This feature isn't ground-breaking, so we don't expect a 
significant performance improvement on many benchmarks. But it ought to 
have a niche for certain workloads. We will collect more 
performance results for the next submission. The bottom line is that it must 
not slow down the existing ASID implementation.

Thanks,
-WeiH

