* Re: [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling
@ 2015-09-21  5:08 Wu, Feng
  2015-09-21  9:18 ` George Dunlap
  0 siblings, 1 reply; 56+ messages in thread
From: Wu, Feng @ 2015-09-21  5:08 UTC (permalink / raw)
  To: George Dunlap, Dario Faggioli
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	xen-devel@lists.xen.org, Jan Beulich, Wu, Feng



> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@citrix.com]
> Sent: Thursday, September 17, 2015 5:38 PM
> To: Dario Faggioli; Wu, Feng
> Cc: xen-devel@lists.xen.org; Tian, Kevin; Keir Fraser; George Dunlap; Andrew
> Cooper; Jan Beulich
> Subject: Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic
> handling
> 
> On 09/17/2015 09:48 AM, Dario Faggioli wrote:
> > On Thu, 2015-09-17 at 08:00 +0000, Wu, Feng wrote:
> >
> >>> -----Original Message-----
> >>> From: Dario Faggioli [mailto:dario.faggioli@citrix.com]
> >
> >>> So, I guess, first of all, can you confirm whether or not it's exploding
> >>> in debug builds?
> >>
> >> Does the following information in Config.mk mean it is a debug build?
> >>
> >> # A debug build of Xen and tools?
> >> debug ?= y
> >> debug_symbols ?= $(debug)
> >>
> > I think so. But as I said in my other email, I was wrong, and this is
> > probably not an issue.
> >
> >>> And in either case (just tossing out ideas) would it be
> >>> possible to deal with the "interrupt already raised when blocking" case:
> >>
> >> Thanks for the suggestions below!
> >>
> > :-)
> >
> >>>  - later in the context switching function ?
> >> In this case, we might need to set a flag in vmx_pre_ctx_switch_pi() instead
> >> of calling vcpu_unblock() directly, then when it returns to context_switch(),
> >> we can check the flag and don't really block the vCPU.
> >>
> > Yeah, and that would still be rather hard to understand and maintain,
> > IMO.
> >
> >> But I don't have a clear
> >> picture about how to achieve this, here are some questions from me:
> >> - When we are in context_switch(), we have done the following changes to
> >> vcpu's state:
> >> 	* sd->curr is set to next
> >> 	* vCPU's running state (both prev and next ) is changed by
> >> 	  vcpu_runstate_change()
> >> 	* next->is_running is set to 1
> >> 	* periodic timer for prev is stopped
> >> 	* periodic timer for next is setup
> >> 	......
> >>
> >> So at what point should we perform the action to _unblock_ the vCPU? We
> >> need to roll back the former changes to the vCPU's state, right?
> >>
> > Mmm... not really. Not blocking prev does not mean that prev would be
> > kept running on the pCPU, and that's true for your current solution as
> > well! As you say yourself, you're already in the process of switching
> > between prev and next, at a point where it's already a thing that next
> > will be the vCPU that will run. Not blocking means that prev is
> > reinserted to the runqueue, and a new invocation to the scheduler is
> > (potentially) queued as well (via raising SCHEDULE_SOFTIRQ, in
> > __runq_tickle()), but it's only when such new scheduling happens that
> > prev will (potentially) be selected to run again.
> >
> > So, no, unless I'm fully missing your point, there would be no
> > rollback required. However, I still would like the other solution (doing
> > stuff in vcpu_block()) better (see below).
> >
> >>>  - with another hook, perhaps in vcpu_block() ?
> >>
> >> We could check this in vcpu_block(), however, the logic here is that before
> >> vCPU is blocked, we need to change the posted-interrupt descriptor,
> >> and during changing it, if 'ON' bit is set, which means VT-d hardware
> >> issues a notification event because interrupts from the assigned devices
> >> is coming, we don't need to block the vCPU and hence no need to update
> >> the PI descriptor in this case.
> >>
> > Yep, I saw that. But could it be possible to do *everything* related to
> > blocking, including the update of the descriptor, in vcpu_block(), if no
> > interrupt has been raised yet at that time? I mean, would you, if
> > updating the descriptor in there, still get the event that allows you to
> > call vcpu_wake(), and hence vmx_vcpu_wake_prepare(), which would undo
> > the blocking, no matter whether that resulted in an actual context
> > switch already or not?
> >
> > I appreciate that this narrows the window for such an event to happen by
> > quite a bit, making the logic itself a little less useful (it makes
> > things more similar to "regular" blocking vs. event delivery, though,
> > AFAICT), but if it's correct, and if it allows us to save the ugly
> > invocation of vcpu_unblock from context switch context, I'd give it a
> > try.
> >
> > After all, this PI thing requires actions to be taken when a vCPU is
> > scheduled or descheduled because of blocking, unblocking and
> > preemptions, and it would seem natural to me to:
> >  - deal with blockings in vcpu_block()
> >  - deal with unblockings in vcpu_wake()
> >  - deal with preemptions in context_switch()
> >
> > This does not mean being able to consolidate some of the cases (like
> > blockings and preemptions, in the current version of the code) were not
> > a nice thing... But we don't want it at all costs . :-)
> 
> So just to clarify the situation...
> 
> If a vcpu is configured for the "running" state (i.e., NV set to
> "posted_intr_vector", notifications enabled), and an interrupt happens
> in the hypervisor -- what happens?
> 
> Is it the case that the interrupt is not actually delivered to the
> processor, but that the pending bit will be set in the pi field, so that
> the interrupt will be delivered the next time the hypervisor returns
> into the guest?
> 
> (I am assuming that is the case, because if the hypervisor *does* get an
> interrupt, then it can just unblock it there.)
> 
> This sort of race condition -- where we get an interrupt to wake up a
> vcpu as we're blocking -- is already handled for "old-style" interrupts
> in vcpu_block:
> 
> void vcpu_block(void)
> {
>     struct vcpu *v = current;
> 
>     set_bit(_VPF_blocked, &v->pause_flags);
> 
>     /* Check for events /after/ blocking: avoids wakeup waiting race. */
>     if ( local_events_need_delivery() )
>     {
>         clear_bit(_VPF_blocked, &v->pause_flags);
>     }
>     else
>     {
>         TRACE_2D(TRC_SCHED_BLOCK, v->domain->domain_id, v->vcpu_id);
>         raise_softirq(SCHEDULE_SOFTIRQ);
>     }
> }
> 
> That is, we set _VPF_blocked, so that any interrupt which would wake it
> up actually wakes it up, and then we check local_events_need_delivery()
> to see if there were any that came in after we decided to block but
> before we made sure that an interrupt would wake us up.
> 
> I think it would be best if we could keep all the logic that does the
> same thing in the same place.  Which would mean in vcpu_block(), after
> calling set_bit(_VPF_blocked), changing the NV to pi_wakeup_vector, and
> then extending local_events_need_delivery() to also look for pending PI
> events.
> 
> Looking a bit more at your states, I think the actions that need to be
> taken on all the transitions are these (collapsing 'runnable' and
> 'offline' into the same state):
> 
> blocked -> runnable (vcpu_wake)
>  - NV = posted_intr_vector
>  - Take vcpu off blocked list
>  - SN = 1
> runnable -> running (context_switch)
>  - SN = 0

We need to set the 'NDST' field to the right destination vCPU as well.

> running -> runnable (context_switch)
>  - SN = 1
> running -> blocked (vcpu_block)
>  - NV = pi_wakeup_vector
>  - Add vcpu to blocked list

We need to set the 'NDST' field to the pCPU which owns the blocking list,
so we can wake up the vCPU from the right blocking list in the wakeup
event handler.
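
The four transitions above, with the two 'NDST' additions, can be sketched as a toy model. This is only an illustration under stated assumptions, not code from the patch series: the struct layouts, the `toy_*` names and the vector numbers are all hypothetical, and 'NDST' is modeled as a plain pCPU id, following Feng's later description of it as the cpu the vCPU is going to run on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vector numbers, chosen only for this sketch. */
enum { POSTED_INTR_VECTOR = 0xf2, PI_WAKEUP_VECTOR = 0xf1 };

/* Toy stand-in for the control fields of the PI descriptor. */
struct toy_pi_desc {
    bool     sn;    /* 'SN': suppress notification */
    uint8_t  nv;    /* 'NV': notification vector */
    uint32_t ndst;  /* 'NDST': notification destination (a pCPU id here) */
};

struct toy_vcpu {
    struct toy_pi_desc pi;
    bool     on_blocked_list;
    uint32_t blocked_list_cpu;  /* pCPU whose blocked list holds the vCPU */
};

/* running -> blocked (vcpu_block) */
static void toy_block(struct toy_vcpu *v, uint32_t this_cpu)
{
    v->pi.nv = PI_WAKEUP_VECTOR;
    v->pi.ndst = this_cpu;          /* pCPU that owns the blocked list */
    v->on_blocked_list = true;
    v->blocked_list_cpu = this_cpu;
}

/* blocked -> runnable (vcpu_wake) */
static void toy_wake(struct toy_vcpu *v)
{
    v->pi.nv = POSTED_INTR_VECTOR;
    v->on_blocked_list = false;
    v->pi.sn = true;
}

/* runnable -> running (context switch in) */
static void toy_switch_to(struct toy_vcpu *v, uint32_t dest_cpu)
{
    v->pi.ndst = dest_cpu;          /* notify the pCPU we are about to run on */
    v->pi.sn = false;
}

/* running -> runnable (context switch out, preempted) */
static void toy_switch_from(struct toy_vcpu *v)
{
    v->pi.sn = true;
}
```

The sketch deliberately leaves out locking and the atomic update of the descriptor that real hardware-visible state would need; it only records which field changes on which transition.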

> 
> This actually has a few pretty nice properties:
> 1. You have a nice pair of complementary actions -- block / wake, run /
> preempt
> 2. The potentially long actions with lists happen in vcpu_wake and
> vcpu_block, not on the context switch path
> 
> And at this point, we don't have the "lazy context switch" issue
> anymore, do we?  Because we've handled the "blocking" case in
> vcpu_block(), we don't need to do anything in the main context_switch()
> (which may do the lazy context switching into idle).  We only need to do
> something in the actual __context_switch().

I think the handling for lazy context switch is not only for the blocking case.
We still need to do something for lazy context switch even if we handle the
blocking case in vcpu_block(), such as:
1. For non-idle -> idle
- set 'SN'

2. For idle -> non-idle
- clear 'SN'
- set the 'NDST' field to the cpu the vCPU is going to run on. (Maybe
this one doesn't belong to lazy context switch: if the cpu of the non-idle
vCPU was changed, then per_cpu(curr_vcpu, cpu) != next in context_switch(),
hence it will go to __context_switch() directly, right?)
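
The two cases above, and the question in the parenthesis, can be illustrated with a small model. It is a sketch under assumptions: the `toy_*` names are hypothetical, the lazy-switch test is a simplified reading of the condition in context_switch() (skip the full switch when next is idle or is already the vCPU whose state is loaded on this pCPU), and blocked vCPUs and state syncing on migration are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vcpu {
    int  id;
    bool is_idle;
    bool sn;     /* models the 'SN' bit */
    int  ndst;   /* models the 'NDST' field as a pCPU id */
};

/* curr_vcpu models per_cpu(curr_vcpu, cpu): whose state is loaded here. */
struct toy_pcpu {
    int cpu;
    struct toy_vcpu *curr_vcpu;
};

/* Returns true if the full switch path (__context_switch()) is taken. */
static bool toy_context_switch(struct toy_pcpu *p, struct toy_vcpu *next)
{
    if (p->curr_vcpu == next || next->is_idle)
        return false;          /* lazy: previously loaded state stays put */
    p->curr_vcpu = next;       /* stands in for __context_switch() */
    return true;
}

/* Applies the SN/NDST updates listed above, then picks the switch path. */
static bool toy_schedule_to(struct toy_pcpu *p, struct toy_vcpu *prev,
                            struct toy_vcpu *next)
{
    if (!prev->is_idle && next->is_idle)
        prev->sn = true;       /* 1. non-idle -> idle: set 'SN' */
    if (prev->is_idle && !next->is_idle) {
        next->sn = false;      /* 2. idle -> non-idle: clear 'SN' */
        next->ndst = p->cpu;   /*    and point 'NDST' at this pCPU */
    }
    return toy_context_switch(p, next);
}
```

In this model a vCPU that goes idle and comes back on the same pCPU takes the lazy path both times, so updates placed only in the full switch path would be skipped. A vCPU that resumes on a different pCPU fails the curr_vcpu comparison there and takes the full path, which matches the reasoning in the parenthesis.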

Thanks,
Feng

> 
> And at that point, could we actually get rid of the PI-specific context
> switch hooks altogether, and just put the SN state changes required for
> running->runnable and runnable->running in vmx_ctxt_switch_from() and
> vmx_ctxt_switch_to()?
> 
> If so, then the only hooks we need to add are vcpu_block and vcpu_wake.
>  To keep these consistent with other scheduling-related functions, I
> would put these in arch_vcpu, next to ctxt_switch_from() and
> ctxt_switch_to().
> 
> Thoughts?
> 
>  -George

* Re: [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling
@ 2015-09-21  5:09 Wu, Feng
  2015-09-21  9:54 ` George Dunlap
  0 siblings, 1 reply; 56+ messages in thread
From: Wu, Feng @ 2015-09-21  5:09 UTC (permalink / raw)
  To: George Dunlap, Dario Faggioli
  Cc: Tian, Kevin, Keir Fraser, Andrew Cooper, George Dunlap,
	xen-devel@lists.xen.org, Jan Beulich, Wu, Feng



> -----Original Message-----
> From: dunlapg@gmail.com [mailto:dunlapg@gmail.com] On Behalf Of George
> Dunlap
> Sent: Friday, September 18, 2015 10:34 PM
> To: Dario Faggioli
> Cc: Jan Beulich; George Dunlap; Tian, Kevin; Keir Fraser; Andrew Cooper;
> xen-devel@lists.xen.org; Wu, Feng
> Subject: Re: [Xen-devel] [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic
> handling
> 
> On Fri, Sep 18, 2015 at 3:31 PM, George Dunlap
> <George.Dunlap@eu.citrix.com> wrote:
> >> As said, me too. Perhaps we can go for option 1, which is simpler,
> >> cleaner and more consistent, considering the current status of the
> >> code. We can always investigate, in future, whether and how to
> >> implement the optimization for all the blockings, if beneficial and
> >> feasible, or have them diverge, if deemed worthwhile.
> >
> > Sounds like a plan.
> 
> Er, just in case that idiom wasn't clear: Option 1 sounds like a
> *good* plan, so unless Feng disagrees, let's go with that. :-)

Sorry for the late response, I was on leave last Friday.

Thanks for your discussions and suggestions. I have one question about option 1.
I find that there are two places where '_VPF_blocked' can get set: vcpu_block()
and do_poll(). After putting the logic in vcpu_block(), do we need to care about
do_poll()? I don't know the purpose of do_poll() or its use case.
Dario/George, could you please share some knowledge about it? Thanks a lot!
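
To make the question concrete, here is a hypothetical sketch of how two paths that both set '_VPF_blocked' could share one PI step. It does not describe what do_poll() actually does or what the series ended up doing; every `toy_*` name and field is an assumption for illustration. It keeps the ordering from the vcpu_block() code quoted earlier in the thread: mark blocked first, then check for anything that already arrived.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vcpu {
    bool blocked;        /* models _VPF_blocked */
    bool event_pending;  /* models a pending event channel upcall */
    bool pi_on;          /* models the 'ON' bit of the PI descriptor */
    bool pi_armed;       /* models NV switched to the wakeup vector */
    bool polling;        /* set only on the poll path */
};

/* Hypothetical arch hooks: arm or disarm the wakeup notification. */
static void toy_arch_block(struct toy_vcpu *v)   { v->pi_armed = true; }
static void toy_arch_unblock(struct toy_vcpu *v) { v->pi_armed = false; }

/* Event check extended to also look at pending posted interrupts. */
static bool toy_events_need_delivery(const struct toy_vcpu *v)
{
    return v->event_pending || v->pi_on;
}

/* Shared tail for both paths; returns true if the vCPU stays blocked. */
static bool toy_finish_block(struct toy_vcpu *v)
{
    toy_arch_block(v);
    /* Check /after/ arming, so an interrupt arriving in between is seen. */
    if (toy_events_need_delivery(v)) {
        v->blocked = false;
        toy_arch_unblock(v);
        return false;
    }
    return true;
}

static bool toy_vcpu_block(struct toy_vcpu *v)
{
    v->blocked = true;
    return toy_finish_block(v);
}

static bool toy_do_poll(struct toy_vcpu *v)
{
    v->blocked = true;
    v->polling = true;
    if (!toy_finish_block(v)) {
        v->polling = false;
        return false;
    }
    return true;
}
```

Whether the poll path needs this at all depends on how do_poll() is used, which is exactly what is being asked here.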

Thanks,
Feng


> 
>  -George

* [PATCH v7 00/17] Add VT-d Posted-Interrupts support
@ 2015-09-11  8:28 Feng Wu
  2015-09-11  8:29 ` [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling Feng Wu
  0 siblings, 1 reply; 56+ messages in thread
From: Feng Wu @ 2015-09-11  8:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Feng Wu

VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

Feng Wu (17):
  VT-d Posted-intterrupt (PI) design
  Add cmpxchg16b support for x86-64
  iommu: Add iommu_intpost to control VT-d Posted-Interrupts feature
  vt-d: VT-d Posted-Interrupts feature detection
  vmx: Extend struct pi_desc to support VT-d Posted-Interrupts
  vmx: Add some helper functions for Posted-Interrupts
  vmx: Initialize VT-d Posted-Interrupts Descriptor
  vmx: Suppress posting interrupts when 'SN' is set
  VT-d: Remove pointless casts
  vt-d: Extend struct iremap_entry to support VT-d Posted-Interrupts
  vt-d: Add API to update IRTE when VT-d PI is used
  x86: move some APIC related macros to apicdef.h
  Update IRTE according to guest interrupt config changes
  vmx: Properly handle notification event when vCPU is running
  vmx: VT-d posted-interrupt core logic handling
  VT-d: Dump the posted format IRTE
  Add a command line parameter for VT-d posted-interrupts

 docs/misc/vtd-pi.txt                   | 332 +++++++++++++++++++++++++++++++++
 docs/misc/xen-command-line.markdown    |   9 +-
 xen/arch/x86/domain.c                  |  21 +++
 xen/arch/x86/hvm/hvm.c                 |   6 +
 xen/arch/x86/hvm/vlapic.c              |   5 -
 xen/arch/x86/hvm/vmx/vmcs.c            |  24 +++
 xen/arch/x86/hvm/vmx/vmx.c             | 312 ++++++++++++++++++++++++++++++-
 xen/common/schedule.c                  |   2 +
 xen/drivers/passthrough/io.c           | 118 +++++++++++-
 xen/drivers/passthrough/iommu.c        |  16 +-
 xen/drivers/passthrough/vtd/intremap.c | 213 ++++++++++++++++-----
 xen/drivers/passthrough/vtd/iommu.c    |  14 +-
 xen/drivers/passthrough/vtd/iommu.h    |  51 +++--
 xen/drivers/passthrough/vtd/utils.c    |  42 +++--
 xen/include/asm-arm/domain.h           |   2 +
 xen/include/asm-x86/apicdef.h          |   3 +
 xen/include/asm-x86/domain.h           |   3 +
 xen/include/asm-x86/hvm/hvm.h          |   4 +
 xen/include/asm-x86/hvm/vmx/vmcs.h     |  25 ++-
 xen/include/asm-x86/hvm/vmx/vmx.h      |  27 +++
 xen/include/asm-x86/iommu.h            |   2 +
 xen/include/asm-x86/x86_64/system.h    |  31 +++
 xen/include/xen/iommu.h                |   2 +-
 23 files changed, 1176 insertions(+), 88 deletions(-)
 create mode 100644 docs/misc/vtd-pi.txt

-- 
2.1.0


end of thread, other threads:[~2015-09-24  8:03 UTC | newest]

Thread overview: 56+ messages
2015-09-21  5:08 [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling Wu, Feng
2015-09-21  9:18 ` George Dunlap
2015-09-21 11:59   ` Wu, Feng
2015-09-21 13:31     ` Dario Faggioli
2015-09-21 13:50       ` Wu, Feng
2015-09-21 14:11         ` Dario Faggioli
2015-09-22  5:10           ` Wu, Feng
2015-09-22 10:43             ` George Dunlap
2015-09-22 10:46               ` George Dunlap
2015-09-22 13:25                 ` Wu, Feng
2015-09-22 13:40                   ` Dario Faggioli
2015-09-22 13:52                     ` Wu, Feng
2015-09-22 14:15                       ` George Dunlap
2015-09-22 14:38                         ` Dario Faggioli
2015-09-23  5:52                           ` Wu, Feng
2015-09-23  7:59                             ` Dario Faggioli
2015-09-23  8:11                               ` Wu, Feng
2015-09-22 14:28                   ` George Dunlap
2015-09-23  5:37                     ` Wu, Feng
  -- strict thread matches above, loose matches on Subject: below --
2015-09-21  5:09 Wu, Feng
2015-09-21  9:54 ` George Dunlap
2015-09-21 12:22   ` Wu, Feng
2015-09-21 14:24     ` Dario Faggioli
2015-09-22  7:19       ` Wu, Feng
2015-09-22  8:59         ` Jan Beulich
2015-09-22 13:40           ` Wu, Feng
2015-09-22 14:01             ` Jan Beulich
2015-09-23  9:44               ` George Dunlap
2015-09-23 12:35                 ` Wu, Feng
2015-09-23 15:25                   ` George Dunlap
2015-09-23 15:38                     ` Jan Beulich
2015-09-24  1:50                     ` Wu, Feng
2015-09-24  3:35                       ` Dario Faggioli
2015-09-24  7:51                       ` Jan Beulich
2015-09-24  8:03                         ` Wu, Feng
2015-09-22 10:26         ` George Dunlap
2015-09-23  6:35           ` Wu, Feng
2015-09-23  7:11             ` Dario Faggioli
2015-09-23  7:20               ` Wu, Feng
2015-09-11  8:28 [PATCH v7 00/17] Add VT-d Posted-Interrupts support Feng Wu
2015-09-11  8:29 ` [PATCH v7 15/17] vmx: VT-d posted-interrupt core logic handling Feng Wu
2015-09-16 16:00   ` Dario Faggioli
2015-09-16 17:18   ` Dario Faggioli
2015-09-16 18:05     ` Dario Faggioli
2015-09-17  8:00     ` Wu, Feng
2015-09-17  8:48       ` Dario Faggioli
2015-09-17  9:16         ` Wu, Feng
2015-09-17  9:38         ` George Dunlap
2015-09-17  9:39           ` George Dunlap
2015-09-17 11:44           ` George Dunlap
2015-09-17 12:40             ` Dario Faggioli
2015-09-17 14:30               ` George Dunlap
2015-09-17 16:36                 ` Dario Faggioli
2015-09-18  6:27                 ` Jan Beulich
2015-09-18  9:22                   ` Dario Faggioli
2015-09-18 14:31                     ` George Dunlap
2015-09-18 14:34                       ` George Dunlap
