From: George Dunlap <george.dunlap@citrix.com>
To: "Wu, Feng" <feng.wu@intel.com>, Jan Beulich <JBeulich@suse.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>, Keir Fraser <keir@xen.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Dario Faggioli <dario.faggioli@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: Ideas Re: [PATCH v14 1/2] vmx: VT-d posted-interrupt core logic handling
Date: Wed, 9 Mar 2016 11:25:25 +0000
Message-ID: <56E00825.4000605@citrix.com>
In-Reply-To: <E959C4978C3B6342920538CF579893F00C369BFF@SHSMSX104.ccr.corp.intel.com>

On 09/03/16 05:22, Wu, Feng wrote:
> 
> 
>> -----Original Message-----
>> From: George Dunlap [mailto:george.dunlap@citrix.com]
>> Sent: Wednesday, March 9, 2016 1:06 AM
>> To: Jan Beulich <JBeulich@suse.com>; George Dunlap
>> <George.Dunlap@eu.citrix.com>; Wu, Feng <feng.wu@intel.com>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; Dario Faggioli
>> <dario.faggioli@citrix.com>; Tian, Kevin <kevin.tian@intel.com>; xen-
>> devel@lists.xen.org; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Keir
>> Fraser <keir@xen.org>
>> Subject: Re: [Xen-devel] Ideas Re: [PATCH v14 1/2] vmx: VT-d posted-interrupt
>> core logic handling
>>
>> On 08/03/16 15:42, Jan Beulich wrote:
>>>>>> On 08.03.16 at 15:42, <George.Dunlap@eu.citrix.com> wrote:
>>>> On Tue, Mar 8, 2016 at 1:10 PM, Wu, Feng <feng.wu@intel.com> wrote:
>>>>>> -----Original Message-----
>>>>>> From: George Dunlap [mailto:george.dunlap@citrix.com]
>>>> [snip]
>>>>>> It seems like there are a couple of ways we could approach this:
>>>>>>
>>>>>> 1. Try to optimize the reverse look-up code so that it's not a linear
>>>>>> linked list (getting rid of the theoretical fear)
>>>>>
>>>>> Good point.
>>>>>
>>>>>>
>>>>>> 2. Try to test engineered situations where we expect this to be a
>>>>>> problem, to see how big of a problem it is (proving the theory to be
>>>>>> accurate or inaccurate in this case)
>>>>>
>>>>> Maybe we can run a SMP guest with all the vcpus pinned to a dedicated
>>>>> pCPU, we can run some benchmark in the guest with VT-d PI and without
>>>>> VT-d PI, then see the performance difference between these two scenarios.
>>>>
>>>> This would give us an idea what the worst-case scenario would be.
>>>
>>> How would a single VM ever give us an idea about the worst
>>> case? Something getting close to worst case is a ton of single
>>> vCPU guests all temporarily pinned to one and the same pCPU
>>> (could be multi-vCPU ones, but the more vCPU-s the more
>>> artificial this pinning would become) right before they go into
>>> blocked state (i.e. through one of the two callers of
>>> arch_vcpu_block()), the pinning removed while blocked, and
>>> then all getting woken at once.
>>
>> Why would removing the pinning be important?
>>
>> And I guess it's actually the case that it doesn't need all VMs to
>> actually be *receiving* interrupts; it just requires them to be
>> *capable* of receiving interrupts, for there to be a long chain all
>> blocked on the same physical cpu.
>>
>>>
>>>>  But
>>>> pinning all vcpus to a single pcpu isn't really a sensible use case we
>>>> want to support -- if you have to do something stupid to get a
>>>> performance regression, then as far as I'm concerned it's not a
>>>> problem.
>>>>
>>>> Or to put it a different way: If we pin 10 vcpus to a single pcpu and
>>>> then pound them all with posted interrupts, and there is *no*
>>>> significant performance regression, then that will conclusively prove
>>>> that the theoretical performance regression is of no concern, and we
>>>> can enable PI by default.
>>>
>>> The point isn't the pinning. The point is what pCPU they're on when
>>> going to sleep. And that could involve quite a few more than just
>>> 10 vCPU-s, provided they all sleep long enough.
>>>
>>> And the "theoretical performance regression is of no concern" is
>>> also not a proper way of looking at it, I would say: Even if such
>>> a situation would happen extremely rarely, if it can happen at all,
>>> it would still be a security issue.
>>
>> What I'm trying to get at is -- exactly what situation?  What actually
>> constitutes a problematic interrupt latency / interrupt processing
>> workload, how many vcpus must be sleeping on the same pcpu to actually
>> risk triggering that latency / workload, and how feasible is it that
>> such a situation would arise in a reasonable scenario?
>>
>> If 200us is too long, and it only takes 3 sleeping vcpus to get there,
>> then yes, there is a genuine problem we need to try to address before we
>> turn it on by default.  If we say that up to 500us is tolerable, and it
>> takes 100 sleeping vcpus to reach that latency, then this is something I
>> don't really think we need to worry about.
>>
>> "I think something bad may happen" is really difficult to work with.
>> "I want to make sure that even a high number of blocked cpus won't cause
>> the interrupt latency to exceed 500us; and I want it to be basically
>> impossible for the interrupt latency to exceed 5ms under any
>> circumstances" is a concrete target someone can either demonstrate that
>> they meet, or aim for when trying to improve the situation.
>>
>> Feng: It should be pretty easy for you to:
> 
> George, thanks a lot for pointing out a possible way to move forward.
> 
>> * Implement a modified version of Xen where
>>  - *All* vcpus get put on the waitqueue
> 
> So this means, all the vcpus are blocked, and hence waiting in the
> blocking list, right?

No.

For testing purposes, we need a lot of vcpus on the list, but we only
need one vcpu to actually be woken up to see how long it takes to
traverse the list.

At the moment, a vcpu will only be put on the list if it has the
arch_block callback defined; and it will have the arch_block callback
defined only if the domain it's a part of has a device assigned to it.
But it would be easy enough to make it so that *all* VMs have the
arch_block callback defined; then all vcpus would end up on the
pi_blocked list when they're blocked, even if they don't have a device
assigned.

That way you could have a really long pi_blocked list while only needing
a single device to pass through to the guest.
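
To make the cost concrete, here is a minimal user-space sketch in C of a
linear wakeup scan. It is not Xen code: struct blocked_vcpu,
build_blocked_list(), wakeup_scan() and free_blocked_list() are hypothetical
names, and the assumption that the handler visits every entry and wakes only
those with a pending notification is a simplified model of the loop in
pi_wakeup_interrupt. It illustrates that a single wakeup still walks the
whole list, so the list length drives the cost rather than the number of
vcpus actually receiving interrupts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a blocked vcpu; this is not Xen's struct vcpu. */
struct blocked_vcpu {
    struct blocked_vcpu *next;
    bool on;     /* assumed model of a "notification outstanding" flag */
    bool woken;  /* set once the scan has woken this entry */
};

/*
 * Build a singly linked list of n blocked vcpus. Only the entry at
 * position pending_idx (0 = head) has a notification pending; passing
 * pending_idx >= n leaves every entry idle.
 */
struct blocked_vcpu *build_blocked_list(size_t n, size_t pending_idx)
{
    struct blocked_vcpu *head = NULL;

    for (size_t i = n; i-- > 0; ) {
        struct blocked_vcpu *v = calloc(1, sizeof(*v));

        assert(v != NULL);
        v->on = (i == pending_idx);
        v->next = head;
        head = v;
    }
    return head;
}

/*
 * Walk the entire list, waking every entry whose flag is set.
 * Returns the number of entries visited; *woken receives the number
 * of entries actually woken.
 */
size_t wakeup_scan(struct blocked_vcpu *head, size_t *woken)
{
    size_t visited = 0;

    *woken = 0;
    for (struct blocked_vcpu *v = head; v != NULL; v = v->next) {
        visited++;
        if (v->on) {
            v->on = false;
            v->woken = true;
            (*woken)++;
        }
    }
    return visited;
}

/* Release every node of a list created by build_blocked_list(). */
void free_blocked_list(struct blocked_vcpu *head)
{
    while (head != NULL) {
        struct blocked_vcpu *next = head->next;

        free(head);
        head = next;
    }
}
```

With 500 entries and one pending notification at the tail,
wakeup_scan() visits all 500 entries to wake a single vcpu, which is the
situation the test setup above is meant to produce.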

>>  - Measure how long it took to run the loop in pi_wakeup_interrupt
>> * Have one VM receiving posted interrupts on a regular basis.
>> * Slowly increase the number of vcpus blocked on a single cpu (e.g., by
>> creating more guests), stopping when you either reach 500us or 500
>> vcpus. :-)
> 
> This may depend on the environment. I was using a 10G NIC for the
> test; if we increase the number of guests, I need more NICs to assign
> to the guests. I will see if I can get them.

...which is why I suggested setting the arch_block() callback for all
domains, even those which don't have devices assigned, so that you could
get away with a single passed-through device. :-)
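
The stopping rule of the experiment (grow the number of blocked vcpus until
the measured loop time reaches 500us or the count reaches 500) could be
driven by something like the sketch below. Everything here is hypothetical:
sweep_blocked_vcpus(), struct sweep_result and the measure_fn callback are
illustrative names, and fake_linear_cost_ns() is a made-up placeholder used
only to exercise the sweep logic. Real numbers have to come from timing the
loop on actual hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback returning the measured scan latency, in ns, for a list length. */
typedef uint64_t (*measure_fn)(size_t nr_blocked);

struct sweep_result {
    size_t nr_blocked;    /* last list length that was measured */
    uint64_t latency_ns;  /* latency reported for that length */
    int hit_budget;       /* 1 if the latency budget was reached */
};

/*
 * Increase the number of blocked vcpus in increments of `step`, stopping
 * as soon as the measured latency reaches budget_ns or the next increment
 * would exceed max_vcpus.
 */
struct sweep_result sweep_blocked_vcpus(measure_fn measure, size_t step,
                                        size_t max_vcpus, uint64_t budget_ns)
{
    struct sweep_result r = { 0, 0, 0 };

    if (measure == NULL || step == 0)
        return r;  /* nothing sensible to sweep */

    for (size_t n = step; n <= max_vcpus; n += step) {
        r.nr_blocked = n;
        r.latency_ns = measure(n);
        if (r.latency_ns >= budget_ns) {
            r.hit_budget = 1;
            break;
        }
    }
    return r;
}

/*
 * Placeholder cost model: 2000 ns per list entry. This figure is invented
 * for illustration and is NOT a measurement of any real system.
 */
uint64_t fake_linear_cost_ns(size_t nr_blocked)
{
    return (uint64_t)nr_blocked * 2000u;
}
```

In a real run the callback would block the requested number of vcpus on one
pcpu, deliver a posted interrupt, and return the time spent in the loop; the
result then says either "budget reached at N vcpus" or "still under budget
at the maximum count".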

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
