From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: [PATCH net-next 1/1] hv_netvsc: fix deadlock on hotplug
Date: Wed, 6 Sep 2017 09:36:38 -0700
Message-ID: <20170906093638.2074f455@xeon-e3>
References: <20170906151925.15221-1-sthemmin@microsoft.com>
 <20170906151925.15221-2-sthemmin@microsoft.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: "devel@linuxdriverproject.org", Stephen Hemminger, "netdev@vger.kernel.org"
To: Haiyang Zhang
Errors-To: driverdev-devel-bounces@linuxdriverproject.org
Sender: "devel"
List-Id: netdev.vger.kernel.org

On Wed, 6 Sep 2017 16:23:45 +0000
Haiyang Zhang wrote:

> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Wednesday, September 6, 2017 11:19 AM
> > To: KY Srinivasan; Haiyang Zhang; Stephen Hemminger
> > Cc: devel@linuxdriverproject.org; netdev@vger.kernel.org
> > Subject: [PATCH net-next 1/1] hv_netvsc: fix deadlock on hotplug
> >
> > When a virtual device is added dynamically (via host console), then
> > the vmbus sends an offer message for the primary channel. The processing
> > of this message for networking causes the network device to then
> > initialize the sub channels.
> >
> > The problem is that setting up the sub channels needs to wait until
> > the subsequent subchannel offers have been processed. These offers
> > come in on the same ring buffer and work queue as where the primary
> > offer is being processed; leading to a deadlock.
> >
> > This did not happen in older kernels, because the sub channel waiting
> > logic was broken (it wasn't really waiting).
> >
> > The solution is to do the sub channel setup in its own work queue
> > context that is scheduled by the primary channel setup; and then
> > happens later.
> >
> > Fixes: 732e49850c5e ("netvsc: fix race on sub channel creation")
> > Reported-by: Dexuan Cui
> > Signed-off-by: Stephen Hemminger
> > ---
> > Should also go to stable, but this version does not apply cleanly
> > to 4.13. Have another patch for that.
> >
> >  drivers/net/hyperv/hyperv_net.h   |   1 +
> >  drivers/net/hyperv/netvsc_drv.c   |   8 +--
> >  drivers/net/hyperv/rndis_filter.c | 106 ++++++++++++++++++++++++++------------
> >  3 files changed, 74 insertions(+), 41 deletions(-)
>
> The patch looks good overall. I just have a question:
>
> With this patch, after module load and probe is done, there may still be
> subchannels being processed. If rmmod runs immediately, the subchannel offers
> may hit half-way removed device structures... Do we also need to add
> cancel_work_sync(&dev->subchan_work) to the top of netvsc_remove()?
>
> unregister_netdevice() includes device close, but it's only called later
> in the netvsc_remove() when rndis is already removed.
>
> Thanks,
> - Haiyang

Good catch.
If the driver called unregister_netdevice() before
rndis_filter_device_remove(), that would solve the problem.
It wouldn't cause additional problems, and it makes sense
to close the network layer first.
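
The deadlock described in the quoted commit message, and the fix of
running sub channel setup from its own work item, can be illustrated
with a small user-space C model. This is only a sketch under
simplifying assumptions, not driver code: every name in it
(struct model, run_model(), primary_offer_inline(),
primary_offer_deferred(), NUM_SUBCHANNELS) is hypothetical, a single
loop stands in for the work queue that processes offers, and "waiting"
is reduced to a flag that is set when setup would have to sleep on
offers queued behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORK 8
#define NUM_SUBCHANNELS 2   /* hypothetical number of sub channels */

struct model;
typedef void (*work_fn)(struct model *m);

/* Toy state: one ordered offer queue plus one slot for deferred work. */
struct model {
	work_fn offer_queue[MAX_WORK]; /* all offers share this queue */
	size_t head, tail;
	work_fn deferred;              /* work handed to a separate context */
	int opened;                    /* sub channel offers processed so far */
	bool setup_done;
	bool deadlocked;
};

static void enqueue_offer(struct model *m, work_fn fn)
{
	assert(m->tail < MAX_WORK);
	m->offer_queue[m->tail++] = fn;
}

static void subchannel_offer(struct model *m)
{
	m->opened++;
}

/* Asking for sub channels makes their offers arrive on the same queue. */
static void request_subchannels(struct model *m)
{
	for (int i = 0; i < NUM_SUBCHANNELS; i++)
		enqueue_offer(m, subchannel_offer);
}

/* Setup can only finish once every sub channel offer has been handled. */
static void subchannel_setup(struct model *m)
{
	if (m->opened < NUM_SUBCHANNELS)
		m->deadlocked = true; /* would sleep on work queued behind us */
	else
		m->setup_done = true;
}

/* Problem case: setup runs inside the primary offer handler. */
void primary_offer_inline(struct model *m)
{
	request_subchannels(m);
	subchannel_setup(m);
}

/* Fixed case: the handler only schedules setup and returns. */
void primary_offer_deferred(struct model *m)
{
	request_subchannels(m);
	m->deferred = subchannel_setup;
}

/* Drain the offer queue in order, then run any deferred work. */
bool run_model(struct model *m, work_fn primary)
{
	*m = (struct model){0};
	enqueue_offer(m, primary);
	while (m->head < m->tail && !m->deadlocked) {
		work_fn fn = m->offer_queue[m->head++];
		fn(m);
	}
	if (!m->deadlocked && m->deferred)
		m->deferred(m);
	return m->setup_done;
}
```

Calling run_model() with primary_offer_inline leaves deadlocked set and
setup_done clear, because the sub channel offers are still queued behind
the handler. With primary_offer_deferred both offers are processed first
and setup completes. The model does not cover the removal race raised
above; in the driver the deferred work would still have to be cancelled
or ordered against teardown.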