From mboxrd@z Thu Jan  1 00:00:00 1970
From: Darek Stojaczyk
Subject: [PATCH] dev: fix attach rollback of a device that was already attached
Date: Fri, 23 Nov 2018 15:45:06 +0100
Message-ID: <20181123144506.95367-1-dariusz.stojaczyk@intel.com>
Cc: thomas@monjalon.net, Darek Stojaczyk, qi.z.zhang@intel.com
To: dev@dpdk.org
List-Id: DPDK patches and discussions

When the primary process receives an IPC attach request for a device
that is already attached locally, it doesn't set up its variables
properly and is prone to segfaulting on a subsequent rollback.

`ret = local_dev_probe(req->devargs, &dev)`

The above function sets the `dev` pointer to the proper device *unless*
it returns with an error. One of those errors is -EEXIST, which the
hotplug function explicitly ignores. For -EEXIST, it proceeds with
attaching the device and expects the dev pointer to be valid, although
nothing has initialized it.

Although this patch is a fix, it also introduces a design decision:
when any secondary process fails to attach a device, a primary process
that already had the device attached won't attempt to detach that
device locally as part of the rollback routine. The primary process
will already have printed a message "Failed to [...] on secondary" and
now it will also print a warning "Devices may not be in sync [...]".
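As an illustration only, the failure mode and the guard can be modelled
outside of DPDK. This is a minimal sketch, not the EAL code: `struct
device`, `probe_stub()` and `attach_with_failed_secondary()` are
hypothetical stand-ins for `struct rte_device`, `local_dev_probe()` and
the rollback path of `__handle_secondary_request()`. The only behaviour
taken from the description above is that the probe leaves the output
pointer untouched when it returns -EEXIST.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rte_device. */
struct device {
	int attached;
};

/*
 * Hypothetical stand-in for local_dev_probe(): *out is written only on
 * success. On -EEXIST it is deliberately left untouched.
 */
static int
probe_stub(struct device *d, struct device **out)
{
	if (d->attached)
		return -EEXIST;
	d->attached = 1;
	*out = d;
	return 0;
}

/*
 * Models an attach request where a secondary process then fails, so
 * the primary has to roll back.
 * Returns 1 if the rollback detached the device locally, 0 if the
 * device was left attached, or a negative errno if the probe failed.
 */
static int
attach_with_failed_secondary(struct device *d)
{
	struct device *dev = NULL; /* without this, dev is indeterminate */
	int ret = probe_stub(d, &dev);

	if (ret != 0 && ret != -EEXIST)
		return ret;

	/* rollback path */
	if (dev == NULL) {
		/* device was attached before the request, leave it alone */
		fprintf(stderr, "warning: devices may not be in sync\n");
		return 0;
	}
	dev->attached = 0;
	return 1;
}
```

With a device that was already attached, the sketch returns 0 and
leaves it attached. With a fresh device, the rollback detaches it and
the sketch returns 1.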

Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary")
Cc: qi.z.zhang@intel.com

Signed-off-by: Darek Stojaczyk
---
 lib/librte_eal/common/hotplug_mp.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
index 7c9fcc46c..7ee074a31 100644
--- a/lib/librte_eal/common/hotplug_mp.c
+++ b/lib/librte_eal/common/hotplug_mp.c
@@ -88,7 +88,7 @@ __handle_secondary_request(void *param)
 		(const struct eal_dev_mp_req *)msg->param;
 	struct eal_dev_mp_req tmp_req;
 	struct rte_devargs *da;
-	struct rte_device *dev;
+	struct rte_device *dev = NULL;
 	struct rte_bus *bus;
 	int ret = 0;
 
@@ -168,7 +168,15 @@ __handle_secondary_request(void *param)
 	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
 		tmp_req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK;
 		eal_dev_hotplug_request_to_secondary(&tmp_req);
-		local_dev_remove(dev);
+		if (dev == NULL) {
+			/* device was already attached at the time we got the
+			 * request, don't detach it now.
+			 */
+			RTE_LOG(WARNING, EAL,
+				"Devices in secondary may not sync with primary\n");
+		} else {
+			local_dev_remove(dev);
+		}
 	} else {
 		tmp_req.t = EAL_DEV_REQ_TYPE_DETACH_ROLLBACK;
 		eal_dev_hotplug_request_to_secondary(&tmp_req);
-- 
2.17.1