* [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE
@ 2016-11-18 1:58 Feng Wu
2016-11-18 4:44 ` Tian, Kevin
2016-11-22 10:03 ` Jan Beulich
0 siblings, 2 replies; 6+ messages in thread
From: Feng Wu @ 2016-11-18 1:58 UTC (permalink / raw)
To: xen-devel
Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
dario.faggioli, jbeulich
We don't set the affinity for posted format IRTE, since the
destination of these interrupts is vCPU and the vCPU affinity
is set during vCPU scheduling.
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v8:
- Changes based on [6/7]
v7:
- Compare all the field in IRTE to justify whether we can suppress the update
v6:
- Make pi_can_suppress_irte_update() a check-only function
- Introduce another function pi_get_new_irte() to update the 'new_ire' if needed
v5:
- Only suppress affinity related IRTE updates for PI
v4:
- Keep the construction of new_ire and only modify the hardware
IRTE when it is not in posted mode.
xen/drivers/passthrough/vtd/intremap.c | 87 ++++++++++++++++++++--------------
1 file changed, 52 insertions(+), 35 deletions(-)
diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index fd2a49a..0cb8c37 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -600,27 +600,41 @@ static int msi_msg_to_remap_entry(
if ( !pi_desc )
{
- /* Set interrupt remapping table entry */
- new_ire.remap.fpd = 0;
- new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
- new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
- new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
- /* Hardware require RH = 1 for LPR delivery mode */
- new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
- new_ire.remap.avail = 0;
- new_ire.remap.res_1 = 0;
- new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
- MSI_DATA_VECTOR_MASK;
- new_ire.remap.res_2 = 0;
- if ( x2apic_enabled )
- new_ire.remap.dst = msg->dest32;
- else
- new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
- & 0xff) << 8;
+ /*
+ * We are here because we are trying to update the IRTE to remapped mode,
+ * we only need to update the remapped specific fields for the following
+ * two cases:
+ * 1. When we create a new IRTE. A new IRTE is created when we create a
+ * new irq, so a new IRTE is always initialized with remapped format.
+ * 2. When the old IRTE is present and in remapped mode. Since if the old
+ * IRTE is in posted mode, we cannot update it to remapped mode and
+ * this is what we need to suppress. So we don't update the remapped
+ * specific fields here, we only update the commom field.
+ */
+ if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
+ {
+ /* Set interrupt remapping table entry */
+ new_ire.remap.fpd = 0;
+ new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
+ new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
+ new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
+ /* Hardware require RH = 1 for LPR delivery mode */
+ new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
+ new_ire.remap.avail = 0;
+ new_ire.remap.res_1 = 0;
+ new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
+ MSI_DATA_VECTOR_MASK;
+ new_ire.remap.res_2 = 0;
+ if ( x2apic_enabled )
+ new_ire.remap.dst = msg->dest32;
+ else
+ new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
+ & 0xff) << 8;
- new_ire.remap.res_3 = 0;
- new_ire.remap.res_4 = 0;
- new_ire.remap.p = 1; /* finally, set present bit */
+ new_ire.remap.res_3 = 0;
+ new_ire.remap.res_4 = 0;
+ new_ire.remap.p = 1; /* finally, set present bit */
+ }
}
else
{
@@ -657,25 +671,28 @@ static int msi_msg_to_remap_entry(
remap_rte->address_hi = 0;
remap_rte->data = index - i;
- if ( !pi_desc )
- memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
- else
+ if ( iremap_entry->val != new_ire.val )
{
- __uint128_t ret;
+ if ( !pi_desc )
+ memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
+ else
+ {
+ __uint128_t ret;
- old_ire = *iremap_entry;
- ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
+ old_ire = *iremap_entry;
+ ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
- /*
- * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
- * and the hardware cannot update the IRTE behind us, so the return value
- * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
- */
- ASSERT(ret == old_ire.val);
- }
+ /*
+ * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
+ * and the hardware cannot update the IRTE behind us, so the return value
+ * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
+ */
+ ASSERT(ret == old_ire.val);
+ }
- iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
- iommu_flush_iec_index(iommu, 0, index);
+ iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+ iommu_flush_iec_index(iommu, 0, index);
+ }
unmap_vtd_domain_page(iremap_entries);
spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
--
2.1.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE
2016-11-18 1:58 [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
@ 2016-11-18 4:44 ` Tian, Kevin
2016-11-22 4:31 ` Wu, Feng
2016-11-22 10:03 ` Jan Beulich
1 sibling, 1 reply; 6+ messages in thread
From: Tian, Kevin @ 2016-11-18 4:44 UTC (permalink / raw)
To: Wu, Feng, xen-devel@lists.xen.org
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
dario.faggioli@citrix.com, jbeulich@suse.com
> From: Wu, Feng
> Sent: Friday, November 18, 2016 9:59 AM
>
> We don't set the affinity for posted format IRTE, since the
> destination of these interrupts is vCPU and the vCPU affinity
> is set during vCPU scheduling.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
> v8:
> - Changes based on [6/7]
[5/7]?
>
> v7:
> - Compare all the field in IRTE to justify whether we can suppress the update
>
> v6:
> - Make pi_can_suppress_irte_update() a check-only function
> - Introduce another function pi_get_new_irte() to update the 'new_ire' if needed
>
> v5:
> - Only suppress affinity related IRTE updates for PI
>
> v4:
> - Keep the construction of new_ire and only modify the hardware
> IRTE when it is not in posted mode.
>
> xen/drivers/passthrough/vtd/intremap.c | 87
> ++++++++++++++++++++--------------
> 1 file changed, 52 insertions(+), 35 deletions(-)
>
> diff --git a/xen/drivers/passthrough/vtd/intremap.c
> b/xen/drivers/passthrough/vtd/intremap.c
> index fd2a49a..0cb8c37 100644
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -600,27 +600,41 @@ static int msi_msg_to_remap_entry(
>
> if ( !pi_desc )
> {
> - /* Set interrupt remapping table entry */
> - new_ire.remap.fpd = 0;
> - new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) &
> 0x1;
> - new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> - new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) &
> 0x1;
> - /* Hardware require RH = 1 for LPR delivery mode */
> - new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> - new_ire.remap.avail = 0;
> - new_ire.remap.res_1 = 0;
> - new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> - MSI_DATA_VECTOR_MASK;
> - new_ire.remap.res_2 = 0;
> - if ( x2apic_enabled )
> - new_ire.remap.dst = msg->dest32;
> - else
> - new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
> - & 0xff) << 8;
> + /*
> + * We are here because we are trying to update the IRTE to remapped mode,
> + * we only need to update the remapped specific fields for the following
> + * two cases:
you said "we only need to update the remapped specific fields", while later...
> + * 1. When we create a new IRTE. A new IRTE is created when we create a
> + * new irq, so a new IRTE is always initialized with remapped format.
> + * 2. When the old IRTE is present and in remapped mode. Since if the old
> + * IRTE is in posted mode, we cannot update it to remapped mode and
> + * this is what we need to suppress. So we don't update the remapped
> + * specific fields here, we only update the commom field.
you said "we don't update remapped specific fields"...
It's also unclear to me why we cannot change irte from posted mode back to
remapped mode. Is it defined as a VT-d arch limitation? What about the other
direction from remapped mode to posted mode?
> + */
> + if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
> + {
> + /* Set interrupt remapping table entry */
> + new_ire.remap.fpd = 0;
> + new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT)
> & 0x1;
> + new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> + new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT)
> & 0x1;
> + /* Hardware require RH = 1 for LPR delivery mode */
> + new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> + new_ire.remap.avail = 0;
> + new_ire.remap.res_1 = 0;
> + new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> + MSI_DATA_VECTOR_MASK;
> + new_ire.remap.res_2 = 0;
> + if ( x2apic_enabled )
> + new_ire.remap.dst = msg->dest32;
> + else
> + new_ire.remap.dst = ((msg->address_lo >>
> MSI_ADDR_DEST_ID_SHIFT)
> + & 0xff) << 8;
>
> - new_ire.remap.res_3 = 0;
> - new_ire.remap.res_4 = 0;
> - new_ire.remap.p = 1; /* finally, set present bit */
> + new_ire.remap.res_3 = 0;
> + new_ire.remap.res_4 = 0;
> + new_ire.remap.p = 1; /* finally, set present bit */
> + }
what about old_ire is in posted mode? If it cannot happen from posted
to remap as you explained earlier, then should be an ASSERT here.
Otherwise you just leave a condition unhandled...
> }
> else
> {
> @@ -657,25 +671,28 @@ static int msi_msg_to_remap_entry(
> remap_rte->address_hi = 0;
> remap_rte->data = index - i;
>
> - if ( !pi_desc )
> - memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> - else
> + if ( iremap_entry->val != new_ire.val )
> {
> - __uint128_t ret;
> + if ( !pi_desc )
> + memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> + else
> + {
> + __uint128_t ret;
>
> - old_ire = *iremap_entry;
> - ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
> + old_ire = *iremap_entry;
> + ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
>
> - /*
> - * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
> - * and the hardware cannot update the IRTE behind us, so the return value
> - * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
> - */
> - ASSERT(ret == old_ire.val);
> - }
> + /*
> + * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
> + * and the hardware cannot update the IRTE behind us, so the return value
> + * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
> + */
> + ASSERT(ret == old_ire.val);
> + }
>
> - iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> - iommu_flush_iec_index(iommu, 0, index);
> + iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> + iommu_flush_iec_index(iommu, 0, index);
> + }
>
> unmap_vtd_domain_page(iremap_entries);
> spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
> --
> 2.1.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE
2016-11-18 4:44 ` Tian, Kevin
@ 2016-11-22 4:31 ` Wu, Feng
0 siblings, 0 replies; 6+ messages in thread
From: Wu, Feng @ 2016-11-22 4:31 UTC (permalink / raw)
To: Tian, Kevin, xen-devel@lists.xen.org
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
dario.faggioli@citrix.com, Wu, Feng, jbeulich@suse.com
> -----Original Message-----
> From: Tian, Kevin
> Sent: Friday, November 18, 2016 12:44 PM
> To: Wu, Feng <feng.wu@intel.com>; xen-devel@lists.xen.org
> Cc: jbeulich@suse.com; andrew.cooper3@citrix.com;
> george.dunlap@eu.citrix.com; dario.faggioli@citrix.com
> Subject: RE: [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format
> IRTE
>
> > From: Wu, Feng
> > Sent: Friday, November 18, 2016 9:59 AM
> >
> > We don't set the affinity for posted format IRTE, since the
> > destination of these interrupts is vCPU and the vCPU affinity
> > is set during vCPU scheduling.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > ---
> > v8:
> > - Changes based on [6/7]
>
> [5/7]?
Sorry, should be [4/7]
>
> >
> > v7:
> > - Compare all the field in IRTE to justify whether we can suppress the update
> >
> > v6:
> > - Make pi_can_suppress_irte_update() a check-only function
> > - Introduce another function pi_get_new_irte() to update the 'new_ire' if
> needed
> >
> > v5:
> > - Only suppress affinity related IRTE updates for PI
> >
> > v4:
> > - Keep the construction of new_ire and only modify the hardware
> > IRTE when it is not in posted mode.
> >
> > xen/drivers/passthrough/vtd/intremap.c | 87
> > ++++++++++++++++++++--------------
> > 1 file changed, 52 insertions(+), 35 deletions(-)
> >
> > diff --git a/xen/drivers/passthrough/vtd/intremap.c
> > b/xen/drivers/passthrough/vtd/intremap.c
> > index fd2a49a..0cb8c37 100644
> > --- a/xen/drivers/passthrough/vtd/intremap.c
> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> > @@ -600,27 +600,41 @@ static int msi_msg_to_remap_entry(
> >
> > if ( !pi_desc )
> > {
> > - /* Set interrupt remapping table entry */
> > - new_ire.remap.fpd = 0;
> > - new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT)
> &
> > 0x1;
> > - new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> > - new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT)
> &
> > 0x1;
> > - /* Hardware require RH = 1 for LPR delivery mode */
> > - new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> > - new_ire.remap.avail = 0;
> > - new_ire.remap.res_1 = 0;
> > - new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> > - MSI_DATA_VECTOR_MASK;
> > - new_ire.remap.res_2 = 0;
> > - if ( x2apic_enabled )
> > - new_ire.remap.dst = msg->dest32;
> > - else
> > - new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
> > - & 0xff) << 8;
> > + /*
> > + * We are here because we are trying to update the IRTE to remapped
> mode,
> > + * we only need to update the remapped specific fields for the following
> > + * two cases:
>
> you said "we only need to update the remapped specific fields", while later...
>
> > + * 1. When we create a new IRTE. A new IRTE is created when we create
> a
> > + * new irq, so a new IRTE is always initialized with remapped format.
> > + * 2. When the old IRTE is present and in remapped mode. Since if the old
> > + * IRTE is in posted mode, we cannot update it to remapped mode and
> > + * this is what we need to suppress. So we don't update the remapped
> > + * specific fields here, we only update the commom field.
>
> you said "we don't update remapped specific fields"...
Sorry for the confusion. When I said "So we don't update the remapped specific
fields here, we only update the common field" above, I actually would like to
mean for the case "when we are updating remapped IRTE -> posted IRTE".
I will reword this to make it clear.
>
> It's also unclear to me why we cannot change irte from posted mode back to
> remapped mode. Is it defined as a VT-d arch limitation? What about the other
> direction from remapped mode to posted mode?
It is not a VT-d arch limitation. We don't need that and currently don't have that
path to trigger the transition from posted IRTE to remapped one. And in fact that
is the main purpose of this patch to suppress affinity update to a posted mode
IRTE. And the other direction from remapped mode to posted mode is
straightforward, when IRTE is created with remapped mode, sometime later,
our code will update it to posted mode if the device is assigned to guest and
the guest updates the affinity.
>
> > + */
> > + if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
> > + {
> > + /* Set interrupt remapping table entry */
> > + new_ire.remap.fpd = 0;
> > + new_ire.remap.dm = (msg->address_lo >>
> MSI_ADDR_DESTMODE_SHIFT)
> > & 0x1;
> > + new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> > + new_ire.remap.dlm = (msg->data >>
> MSI_DATA_DELIVERY_MODE_SHIFT)
> > & 0x1;
> > + /* Hardware require RH = 1 for LPR delivery mode */
> > + new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> > + new_ire.remap.avail = 0;
> > + new_ire.remap.res_1 = 0;
> > + new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> > + MSI_DATA_VECTOR_MASK;
> > + new_ire.remap.res_2 = 0;
> > + if ( x2apic_enabled )
> > + new_ire.remap.dst = msg->dest32;
> > + else
> > + new_ire.remap.dst = ((msg->address_lo >>
> > MSI_ADDR_DEST_ID_SHIFT)
> > + & 0xff) << 8;
> >
> > - new_ire.remap.res_3 = 0;
> > - new_ire.remap.res_4 = 0;
> > - new_ire.remap.p = 1; /* finally, set present bit */
> > + new_ire.remap.res_3 = 0;
> > + new_ire.remap.res_4 = 0;
> > + new_ire.remap.p = 1; /* finally, set present bit */
> > + }
>
> what about old_ire is in posted mode? If it cannot happen from posted
> to remap as you explained earlier, then should be an ASSERT here.
> Otherwise you just leave a condition unhandled...
If the old_ire is in posted mode, it is the case which we need to suppress
the affinity update. So here we don't update the 'new_ire' to remapped
mode and just leave it as the old new, and update the common field
in set_msi_source_id() if needed after the if-else part. And this is the
main purpose of this patch: suppress the affinity related update to
posted mode IRTE, but update the common fields if need. Back to
your comments in [4/7] about whether we need copy to old IRTE
value to 'new_ire', I think it is needed in this case.
Thanks,
Feng
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE
2016-11-18 1:58 [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
2016-11-18 4:44 ` Tian, Kevin
@ 2016-11-22 10:03 ` Jan Beulich
2017-02-22 7:24 ` Chao Gao
1 sibling, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2016-11-22 10:03 UTC (permalink / raw)
To: Feng Wu
Cc: george.dunlap, andrew.cooper3, dario.faggioli, kevin.tian,
xen-devel
>>> On 18.11.16 at 02:58, <feng.wu@intel.com> wrote:
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -600,27 +600,41 @@ static int msi_msg_to_remap_entry(
>
> if ( !pi_desc )
> {
> - /* Set interrupt remapping table entry */
> - new_ire.remap.fpd = 0;
> - new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
> - new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> - new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
> - /* Hardware require RH = 1 for LPR delivery mode */
> - new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> - new_ire.remap.avail = 0;
> - new_ire.remap.res_1 = 0;
> - new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> - MSI_DATA_VECTOR_MASK;
> - new_ire.remap.res_2 = 0;
> - if ( x2apic_enabled )
> - new_ire.remap.dst = msg->dest32;
> - else
> - new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
> - & 0xff) << 8;
> + /*
> + * We are here because we are trying to update the IRTE to remapped mode,
> + * we only need to update the remapped specific fields for the following
> + * two cases:
> + * 1. When we create a new IRTE. A new IRTE is created when we create a
> + * new irq, so a new IRTE is always initialized with remapped format.
> + * 2. When the old IRTE is present and in remapped mode. Since if the old
> + * IRTE is in posted mode, we cannot update it to remapped mode and
> + * this is what we need to suppress. So we don't update the remapped
> + * specific fields here, we only update the commom field.
> + */
> + if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
> + {
> + /* Set interrupt remapping table entry */
> + new_ire.remap.fpd = 0;
> + new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
> + new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
> + new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
> + /* Hardware require RH = 1 for LPR delivery mode */
> + new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
> + new_ire.remap.avail = 0;
> + new_ire.remap.res_1 = 0;
> + new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
> + MSI_DATA_VECTOR_MASK;
> + new_ire.remap.res_2 = 0;
> + if ( x2apic_enabled )
> + new_ire.remap.dst = msg->dest32;
> + else
> + new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
> + & 0xff) << 8;
>
> - new_ire.remap.res_3 = 0;
> - new_ire.remap.res_4 = 0;
> - new_ire.remap.p = 1; /* finally, set present bit */
> + new_ire.remap.res_3 = 0;
> + new_ire.remap.res_4 = 0;
> + new_ire.remap.p = 1; /* finally, set present bit */
> + }
I disagree with this entire change, including namely point 2 of the
comment. There should be no special casing of either transition
here - the caller should provide you with input correctly identifying
which of the two formats is to be generated. This certainly is
connected to one of the comments I've just made on patch 4.
> @@ -657,25 +671,28 @@ static int msi_msg_to_remap_entry(
> remap_rte->address_hi = 0;
> remap_rte->data = index - i;
>
> - if ( !pi_desc )
> - memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> - else
> + if ( iremap_entry->val != new_ire.val )
> {
> - __uint128_t ret;
> + if ( !pi_desc )
> + memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
> + else
> + {
> + __uint128_t ret;
>
> - old_ire = *iremap_entry;
> - ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
> + old_ire = *iremap_entry;
> + ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
>
> - /*
> - * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
> - * and the hardware cannot update the IRTE behind us, so the return value
> - * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
> - */
> - ASSERT(ret == old_ire.val);
> - }
> + /*
> + * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
> + * and the hardware cannot update the IRTE behind us, so the return value
> + * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
> + */
> + ASSERT(ret == old_ire.val);
> + }
>
> - iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> - iommu_flush_iec_index(iommu, 0, index);
> + iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
> + iommu_flush_iec_index(iommu, 0, index);
> + }
>
> unmap_vtd_domain_page(iremap_entries);
> spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
This second hunk is imo all this patch should consist of.
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE
2016-11-22 10:03 ` Jan Beulich
@ 2017-02-22 7:24 ` Chao Gao
0 siblings, 0 replies; 6+ messages in thread
From: Chao Gao @ 2017-02-22 7:24 UTC (permalink / raw)
To: Jan Beulich
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
dario.faggioli@citrix.com, Tian, Kevin, xen-devel@lists.xen.org
On Tue, Nov 22, 2016 at 06:03:51PM +0800, Jan Beulich wrote:
>>>> On 18.11.16 at 02:58, <feng.wu@intel.com> wrote:
>> --- a/xen/drivers/passthrough/vtd/intremap.c
>> +++ b/xen/drivers/passthrough/vtd/intremap.c
>> @@ -600,27 +600,41 @@ static int msi_msg_to_remap_entry(
>>
>> if ( !pi_desc )
>> {
>> - /* Set interrupt remapping table entry */
>> - new_ire.remap.fpd = 0;
>> - new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
>> - new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
>> - new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
>> - /* Hardware require RH = 1 for LPR delivery mode */
>> - new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
>> - new_ire.remap.avail = 0;
>> - new_ire.remap.res_1 = 0;
>> - new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
>> - MSI_DATA_VECTOR_MASK;
>> - new_ire.remap.res_2 = 0;
>> - if ( x2apic_enabled )
>> - new_ire.remap.dst = msg->dest32;
>> - else
>> - new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
>> - & 0xff) << 8;
>> + /*
>> + * We are here because we are trying to update the IRTE to remapped mode,
>> + * we only need to update the remapped specific fields for the following
>> + * two cases:
>> + * 1. When we create a new IRTE. A new IRTE is created when we create a
>> + * new irq, so a new IRTE is always initialized with remapped format.
>> + * 2. When the old IRTE is present and in remapped mode. Since if the old
>> + * IRTE is in posted mode, we cannot update it to remapped mode and
>> + * this is what we need to suppress. So we don't update the remapped
>> + * specific fields here, we only update the commom field.
>> + */
>> + if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
>> + {
>> + /* Set interrupt remapping table entry */
>> + new_ire.remap.fpd = 0;
>> + new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
>> + new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
>> + new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
>> + /* Hardware require RH = 1 for LPR delivery mode */
>> + new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
>> + new_ire.remap.avail = 0;
>> + new_ire.remap.res_1 = 0;
>> + new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
>> + MSI_DATA_VECTOR_MASK;
>> + new_ire.remap.res_2 = 0;
>> + if ( x2apic_enabled )
>> + new_ire.remap.dst = msg->dest32;
>> + else
>> + new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
>> + & 0xff) << 8;
>>
>> - new_ire.remap.res_3 = 0;
>> - new_ire.remap.res_4 = 0;
>> - new_ire.remap.p = 1; /* finally, set present bit */
>> + new_ire.remap.res_3 = 0;
>> + new_ire.remap.res_4 = 0;
>> + new_ire.remap.p = 1; /* finally, set present bit */
>> + }
>
>I disagree with this entire change, including namely point 2 of the
>comment. There should be no special casing of either transition
>here - the caller should provide you with input correctly identifying
>which of the two formats is to be generated. This certainly is
>connected to one of the comments I've just made on patch 4.
>
If we reach an agreement on patch 4, the logic will be more general and the problem
will disappear.
>> @@ -657,25 +671,28 @@ static int msi_msg_to_remap_entry(
>> remap_rte->address_hi = 0;
>> remap_rte->data = index - i;
>>
>> - if ( !pi_desc )
>> - memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
>> - else
>> + if ( iremap_entry->val != new_ire.val )
>> {
>> - __uint128_t ret;
>> + if ( !pi_desc )
>> + memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
>> + else
>> + {
>> + __uint128_t ret;
>>
>> - old_ire = *iremap_entry;
>> - ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
>> + old_ire = *iremap_entry;
>> + ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
>>
>> - /*
>> - * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
>> - * and the hardware cannot update the IRTE behind us, so the return value
>> - * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
>> - */
>> - ASSERT(ret == old_ire.val);
>> - }
>> + /*
>> + * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
>> + * and the hardware cannot update the IRTE behind us, so the return value
>> + * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
>> + */
>> + ASSERT(ret == old_ire.val);
>> + }
>>
>> - iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
>> - iommu_flush_iec_index(iommu, 0, index);
>> + iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
>> + iommu_flush_iec_index(iommu, 0, index);
>> + }
>>
>> unmap_vtd_domain_page(iremap_entries);
>> spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
>
>This second hunk is imo all this patch should consist of.
>
>Jan
>
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@lists.xen.org
>https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v8 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list
@ 2016-11-18 1:57 Feng Wu
2016-11-18 1:57 ` [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
0 siblings, 1 reply; 6+ messages in thread
From: Feng Wu @ 2016-11-18 1:57 UTC (permalink / raw)
To: xen-devel
Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
dario.faggioli, jbeulich
The current VT-d PI related code may operate incorrectly in the
following scenarios:
1. When the last assigned device is dettached from the domain, all
the PI related hooks are removed then, however, the vCPU can be
blocked, switched to another pCPU, etc, all without the aware of
PI. After the next time we attach another device to the domain,
which makes the PI realted hooks avaliable again, the status
of the pi descriptor is not true. Beside that, the blocking vcpu
may still remain in the per-cpu blocking in this case. Patch [1/7]
and [2/7] handle this.
2. [4/7] unify the code path of PI mode update and remapped mode update
2. When IRTE is in posted mode, we don't need to set the irq
affinity for it, since the destination of these interrupts is
vCPU and the vCPU affinity is set during vCPU scheduling. Patch
[5/7] handles this.
4. [6/7] is a cleanup patch
5. When a pCPU is unplugged, and there might be vCPUs on its
list. Since the pCPU is offline, those vCPUs might not be woken
up again. [7/7] addresses it.
Feng Wu (7):
VMX: Permanently assign PI hook vmx_pi_switch_to()
VMX: Properly handle pi when all the assigned devices are removed
VMX: Make sure PI is in proper state before install the hooks
VT-d: Use one function to update both remapped and posted IRTE
VT-d: No need to set irq affinity for posted format IRTE
VT-d: Some cleanups
VMX: Fixup PI descriptor when cpu is offline
xen/arch/x86/hvm/vmx/vmcs.c | 14 +--
xen/arch/x86/hvm/vmx/vmx.c | 133 ++++++++++++++++++++++-
xen/drivers/passthrough/pci.c | 14 +++
xen/drivers/passthrough/vtd/intremap.c | 193 +++++++++++++++------------------
xen/include/asm-x86/hvm/vmx/vmx.h | 3 +
5 files changed, 240 insertions(+), 117 deletions(-)
--
2.1.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE
2016-11-18 1:57 [PATCH v8 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
@ 2016-11-18 1:57 ` Feng Wu
0 siblings, 0 replies; 6+ messages in thread
From: Feng Wu @ 2016-11-18 1:57 UTC (permalink / raw)
To: xen-devel
Cc: kevin.tian, Feng Wu, george.dunlap, andrew.cooper3,
dario.faggioli, jbeulich
We don't set the affinity for posted format IRTE, since the
destination of these interrupts is vCPU and the vCPU affinity
is set during vCPU scheduling.
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
v8:
- Changes based on [6/7]
v7:
- Compare all the field in IRTE to justify whether we can suppress the update
v6:
- Make pi_can_suppress_irte_update() a check-only function
- Introduce another function pi_get_new_irte() to update the 'new_ire' if needed
v5:
- Only suppress affinity related IRTE updates for PI
v4:
- Keep the construction of new_ire and only modify the hardware
IRTE when it is not in posted mode.
xen/drivers/passthrough/vtd/intremap.c | 87 ++++++++++++++++++++--------------
1 file changed, 52 insertions(+), 35 deletions(-)
diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index fd2a49a..0cb8c37 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -600,27 +600,41 @@ static int msi_msg_to_remap_entry(
if ( !pi_desc )
{
- /* Set interrupt remapping table entry */
- new_ire.remap.fpd = 0;
- new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
- new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
- new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
- /* Hardware require RH = 1 for LPR delivery mode */
- new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
- new_ire.remap.avail = 0;
- new_ire.remap.res_1 = 0;
- new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
- MSI_DATA_VECTOR_MASK;
- new_ire.remap.res_2 = 0;
- if ( x2apic_enabled )
- new_ire.remap.dst = msg->dest32;
- else
- new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
- & 0xff) << 8;
+ /*
+ * We are here because we are trying to update the IRTE to remapped mode,
+ * we only need to update the remapped specific fields for the following
+ * two cases:
+ * 1. When we create a new IRTE. A new IRTE is created when we create a
+ * new irq, so a new IRTE is always initialized with remapped format.
+ * 2. When the old IRTE is present and in remapped mode. Since if the old
+ * IRTE is in posted mode, we cannot update it to remapped mode and
+ * this is what we need to suppress. So we don't update the remapped
+ * specific fields here, we only update the commom field.
+ */
+ if ( !iremap_entry->remap.p || !iremap_entry->remap.im )
+ {
+ /* Set interrupt remapping table entry */
+ new_ire.remap.fpd = 0;
+ new_ire.remap.dm = (msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
+ new_ire.remap.tm = (msg->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
+ new_ire.remap.dlm = (msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT) & 0x1;
+ /* Hardware require RH = 1 for LPR delivery mode */
+ new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio);
+ new_ire.remap.avail = 0;
+ new_ire.remap.res_1 = 0;
+ new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) &
+ MSI_DATA_VECTOR_MASK;
+ new_ire.remap.res_2 = 0;
+ if ( x2apic_enabled )
+ new_ire.remap.dst = msg->dest32;
+ else
+ new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT)
+ & 0xff) << 8;
- new_ire.remap.res_3 = 0;
- new_ire.remap.res_4 = 0;
- new_ire.remap.p = 1; /* finally, set present bit */
+ new_ire.remap.res_3 = 0;
+ new_ire.remap.res_4 = 0;
+ new_ire.remap.p = 1; /* finally, set present bit */
+ }
}
else
{
@@ -657,25 +671,28 @@ static int msi_msg_to_remap_entry(
remap_rte->address_hi = 0;
remap_rte->data = index - i;
- if ( !pi_desc )
- memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
- else
+ if ( iremap_entry->val != new_ire.val )
{
- __uint128_t ret;
+ if ( !pi_desc )
+ memcpy(iremap_entry, &new_ire, sizeof(struct iremap_entry));
+ else
+ {
+ __uint128_t ret;
- old_ire = *iremap_entry;
- ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
+ old_ire = *iremap_entry;
+ ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire);
- /*
- * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
- * and the hardware cannot update the IRTE behind us, so the return value
- * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
- */
- ASSERT(ret == old_ire.val);
- }
+ /*
+ * In the above, we use cmpxchg16 to atomically update the 128-bit IRTE,
+ * and the hardware cannot update the IRTE behind us, so the return value
+ * of cmpxchg16 should be the same as old_ire. This ASSERT validate it.
+ */
+ ASSERT(ret == old_ire.val);
+ }
- iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
- iommu_flush_iec_index(iommu, 0, index);
+ iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry));
+ iommu_flush_iec_index(iommu, 0, index);
+ }
unmap_vtd_domain_page(iremap_entries);
spin_unlock_irqrestore(&ir_ctrl->iremap_lock, flags);
--
2.1.0
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-02-22 7:24 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-11-18 1:58 [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
2016-11-18 4:44 ` Tian, Kevin
2016-11-22 4:31 ` Wu, Feng
2016-11-22 10:03 ` Jan Beulich
2017-02-22 7:24 ` Chao Gao
-- strict thread matches above, loose matches on Subject: below --
2016-11-18 1:57 [PATCH v8 0/7] VMX: Properly handle pi descriptor and per-cpu blocking list Feng Wu
2016-11-18 1:57 ` [PATCH v8 5/7] VT-d: No need to set irq affinity for posted format IRTE Feng Wu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).