* [PATCH 0/2] vhost: IOTLB fixes for -rc1
From: Maxime Coquelin @ 2017-10-12 15:38 UTC
To: yliu, thomas, dev; +Cc: Maxime Coquelin

These two patches fix issues faced when running the VM on a different
socket than DPDK. In this case, the numa_realloc() function is called
to reallocate the virtqueue and the virtio-net device structs on the
VM's socket.

The problem is that doing so corrupts the IOTLB cache list, as the list
head is being reallocated, but the first entry in the list is not
updated to point to the new list head. As a result, every new IOTLB
entry that needs to be inserted before the first entry in the list is
leaked, because the new head still points to the entry that was first
at the time the realloc happened.

Patch 2 addresses this issue by re-initializing the IOTLB cache
completely. Doing this also re-creates the IOTLB mempool on the new
socket.

This first issue helped to highlight a deadlock that patch 1 fixes.
As inserting an entry before the first entry in the list resulted in a
leak, it ended up flooding QEMU with IOTLB misses for the same address.

The deadlock happens because an optimization was done to take the iotlb
cache lock once per packet burst instead of once per translation. It
means that when an IOTLB miss is sent, it is sent with the lock held.
The problem is that sending an IOTLB miss can block if the socket
buffer is full, and this buffer is emptied by the same QEMU thread
which is waiting for an IOTLB update to be completed. But the update
never completes, because DPDK waits for the iotlb lock in order to
insert the update into the iotlb cache, hence the deadlock.

The fix consists of simply releasing the iotlb lock while sending the
IOTLB miss, which is safe as sending does not access the iotlb list
the lock protects.

Maxime Coquelin (2):
  vhost: fix deadlock on IOTLB miss
  vhost: fix IOTLB on NUMA realloc

 lib/librte_vhost/iotlb.c      |  1 -
 lib/librte_vhost/vhost.c      | 12 ++++++++++++
 lib/librte_vhost/vhost_user.c |  3 +++
 3 files changed, 15 insertions(+), 1 deletion(-)

-- 
2.13.6

* [PATCH 1/2] vhost: fix deadlock on IOTLB miss
From: Maxime Coquelin @ 2017-10-12 15:38 UTC
To: yliu, thomas, dev; +Cc: Maxime Coquelin

An optimization was done to only take the iotlb cache lock
once per packet burst instead of once per IOVA translation.

With this, IOTLB miss requests are sent to QEMU with the lock
held, which can cause a deadlock if the socket buffer is full
and QEMU is waiting for an IOTLB update to be done.

Holding the lock is not necessary when sending an IOTLB miss
request, as sending does not manipulate the IOTLB cache list,
which is what the lock protects. Let's just release it while
sending the IOTLB miss.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 54a1864eb..4f8b73a09 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -55,6 +55,7 @@
 
 struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
 
+/* Called with iotlb_lock read-locked */
 uint64_t
 __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		    uint64_t iova, uint64_t size, uint8_t perm)
@@ -71,8 +72,19 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		return vva;
 
 	if (!vhost_user_iotlb_pending_miss(vq, iova + tmp_size, perm)) {
+		/*
+		 * iotlb_lock is read-locked for a full burst,
+		 * but it only protects the iotlb cache.
+		 * In case of IOTLB miss, we might block on the socket,
+		 * which could cause a deadlock with QEMU if an IOTLB update
+		 * is being handled. We can safely unlock here to avoid it.
+		 */
+		vhost_user_iotlb_rd_unlock(vq);
+
 		vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
 		vhost_user_iotlb_miss(dev, iova + tmp_size, perm);
+
+		vhost_user_iotlb_rd_lock(vq);
 	}
 
 	return 0;
-- 
2.13.6

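The locking pattern of this patch can be illustrated outside of DPDK with a small single-threaded model. This is only a sketch under stated assumptions, not the vhost code: `demo_rwlock`, `demo_peer_update()`, `demo_miss()` and `demo_burst()` are hypothetical names, the lock is a toy C11-atomics reader-writer lock, and the write-lock attempt inside `demo_peer_update()` stands in for the IOTLB update that has to be inserted into the cache before the miss request can complete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock: cnt > 0 counts readers, -1 marks a writer. */
struct demo_rwlock {
	atomic_int cnt;
};

static void
demo_rd_lock(struct demo_rwlock *l)
{
	for (;;) {
		int c = atomic_load(&l->cnt);

		/* Only enter when no writer holds the lock. */
		if (c >= 0 && atomic_compare_exchange_weak(&l->cnt, &c, c + 1))
			return;
	}
}

static void
demo_rd_unlock(struct demo_rwlock *l)
{
	atomic_fetch_sub(&l->cnt, 1);
}

static bool
demo_wr_trylock(struct demo_rwlock *l)
{
	int expected = 0;

	/* Succeeds only when there is no reader and no writer. */
	return atomic_compare_exchange_strong(&l->cnt, &expected, -1);
}

static void
demo_wr_unlock(struct demo_rwlock *l)
{
	atomic_store(&l->cnt, 0);
}

/*
 * Stand-in for the update path: the update can only be inserted into
 * the cache once the write lock is available.
 */
static void
demo_peer_update(struct demo_rwlock *l, bool *updated)
{
	if (demo_wr_trylock(l)) {
		*updated = true;
		demo_wr_unlock(l);
	}
}

/* Called with the read lock held; returns with it held again. */
static bool
demo_miss(struct demo_rwlock *l, bool release_while_sending)
{
	bool updated = false;

	if (release_while_sending)
		demo_rd_unlock(l);

	/* Models the potentially blocking miss request. */
	demo_peer_update(l, &updated);

	if (release_while_sending)
		demo_rd_lock(l);

	return updated;
}

/* One "burst": read-lock once, hit a miss, unlock at the end. */
bool
demo_burst(bool release_while_sending)
{
	struct demo_rwlock l;
	bool updated;

	atomic_init(&l.cnt, 0);
	demo_rd_lock(&l);
	updated = demo_miss(&l, release_while_sending);
	demo_rd_unlock(&l);
	assert(atomic_load(&l.cnt) == 0);

	return updated;
}
```

In this model, `demo_burst(true)` lets the update go through, while `demo_burst(false)` leaves it stuck behind the read lock. In the real code that second case does not return at all, because both sides block.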
* Re: [PATCH 1/2] vhost: fix deadlock on IOTLB miss
From: Jens Freimann @ 2017-10-13 11:32 UTC
To: Maxime Coquelin; +Cc: yliu, thomas, dev

On Thu, Oct 12, 2017 at 03:38:49PM +0000, Maxime Coquelin wrote:
>An optimization was done to only take the iotlb cache lock
>once per packet burst instead of once per IOVA translation.
>
>With this, IOTLB miss requests are sent to Qemu with the lock
>held, which can cause a deadlock if the socket buffer is full,
>and if Qemu is waiting for an IOTLB update to be done.
>
>Holding the lock is not necessary when sending an IOTLB miss
>request, as it is not manipulating the IOTLB cache list, which
>the lock protects. Let's just release it while sending the
>IOTLB miss.
>
>Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>---
> lib/librte_vhost/vhost.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>

Seems to be safe, because in case of an IOTLB miss we only take a
different lock.

Reviewed-by: Jens Freimann <jfreimann@redhat.com>

* [PATCH 2/2] vhost: fix IOTLB on NUMA realloc
From: Maxime Coquelin @ 2017-10-12 15:38 UTC
To: yliu, thomas, dev; +Cc: Maxime Coquelin

In case of NUMA reallocation, the virtqueue's iotlb list is broken:
its head changes, but the first iotlb entry in the list still points
to the previous head.

Also, in case of reallocation, we want the IOTLB cache mempool to be
on the new socket.

This patch performs a full re-init of the IOTLB cache when the mempool
already exists, and calls the IOTLB cache init function in case
the virtqueue is being reallocated on a new socket.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/iotlb.c      | 1 -
 lib/librte_vhost/vhost_user.c | 3 +++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/iotlb.c b/lib/librte_vhost/iotlb.c
index 05c278040..b74cc6a78 100644
--- a/lib/librte_vhost/iotlb.c
+++ b/lib/librte_vhost/iotlb.c
@@ -309,7 +309,6 @@ vhost_user_iotlb_init(struct virtio_net *dev, int vq_index)
 		 */
 		vhost_user_iotlb_cache_remove_all(vq);
 		vhost_user_iotlb_pending_remove_all(vq);
-		return 0;
 	}
 
 #ifdef RTE_LIBRTE_VHOST_NUMA
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 9acac6125..1dfb234ca 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -314,6 +314,9 @@ numa_realloc(struct virtio_net *dev, int index)
 	dev->virtqueue[index] = vq;
 	vhost_devices[dev->vid] = dev;
 
+	if (old_vq != vq)
+		vhost_user_iotlb_init(dev, index);
+
 	return dev;
 }
 #else
-- 
2.13.6

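The list corruption described in this patch can be reproduced with a plain `<sys/queue.h>` tail queue. This is a minimal sketch, not the vhost code: `demo_entry`, `demo_vq` and `demo_insert_before_first_is_visible()` are hypothetical names, a `memcpy()` of the enclosing struct stands in for the reallocation, and re-adding the first entry after `TAILQ_INIT()` models the cache being repopulated after the re-init. The old struct is deliberately kept alive so the stale back-pointer stays valid for the demonstration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical cache entry and queue-like struct embedding the list head. */
struct demo_entry {
	TAILQ_ENTRY(demo_entry) next;
	int id;
};

TAILQ_HEAD(demo_list, demo_entry);

struct demo_vq {
	struct demo_list iotlb_list;
};

/*
 * Copy a struct that embeds a TAILQ head, then insert a new entry
 * before the first one. Returns true if the new entry is reachable
 * from the copied head.
 */
bool
demo_insert_before_first_is_visible(bool reinit)
{
	struct demo_vq old_vq, vq;
	struct demo_entry first = { .id = 1 };
	struct demo_entry fresh = { .id = 0 };

	TAILQ_INIT(&old_vq.iotlb_list);
	TAILQ_INSERT_TAIL(&old_vq.iotlb_list, &first, next);

	/* Stand-in for the reallocation: the head moves, entries do not. */
	memcpy(&vq, &old_vq, sizeof(vq));

	if (reinit) {
		/* Re-initialize the moved head, then repopulate the list. */
		TAILQ_INIT(&vq.iotlb_list);
		TAILQ_INSERT_TAIL(&vq.iotlb_list, &first, next);
	}

	/*
	 * Without the re-init, first.next.tqe_prev still points into
	 * old_vq, so this insert updates the old head, not the new one.
	 */
	TAILQ_INSERT_BEFORE(&first, &fresh, next);

	return TAILQ_FIRST(&vq.iotlb_list) == &fresh;
}
```

Without the re-init, the new entry is linked through the stale back-pointer and never becomes reachable from the copied head, which matches the leak described in the cover letter. With the re-init, the insert updates the new head as expected.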
* Re: [PATCH 2/2] vhost: fix IOTLB on NUMA realloc
From: Jens Freimann @ 2017-10-13 11:23 UTC
To: Maxime Coquelin; +Cc: yliu, thomas, dev

On Thu, Oct 12, 2017 at 03:38:50PM +0000, Maxime Coquelin wrote:
>In case of NUMA reallocation, virtqueue's iotlb list is broken,
>has its head changes but first iotlb entry in the list still points
>to the previous head pointer.
>
>Also, in case of reallocation, we want the IOTLB cache mempool to be
>on the new socket.
>
>This patch perform a full re-init of the IOTLB cache when mempool
>already exists, and calls the IOTLB cache init function in case
>the virtqueue is being reallocated on a new socket.
>
>Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>---
> lib/librte_vhost/iotlb.c      | 1 -
> lib/librte_vhost/vhost_user.c | 3 +++
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/lib/librte_vhost/iotlb.c b/lib/librte_vhost/iotlb.c
>index 05c278040..b74cc6a78 100644
>--- a/lib/librte_vhost/iotlb.c
>+++ b/lib/librte_vhost/iotlb.c
>@@ -309,7 +309,6 @@ vhost_user_iotlb_init(struct virtio_net *dev, int vq_index)
> 		 */
> 		vhost_user_iotlb_cache_remove_all(vq);
> 		vhost_user_iotlb_pending_remove_all(vq);
>-		return 0;
> 	}
>
> #ifdef RTE_LIBRTE_VHOST_NUMA

Reviewed-by: Jens Freimann <jfreimann@redhat.com>

* Re: [PATCH 0/2] vhost: IOTLB fixes for -rc1
From: Thomas Monjalon @ 2017-10-13 18:37 UTC
To: Maxime Coquelin; +Cc: dev, yliu

> Maxime Coquelin (2):
>   vhost: fix deadlock on IOTLB miss
>   vhost: fix IOTLB on NUMA realloc

Applied, thanks